From time to time I will receive a note from a Db2 for z/OS DBA, asking for help in understanding why there is a larger-than-expected amount of Db2 catalog access activity associated with package authorization checks, and what can be done to address that situation. In my experience there are two primary causes of this observed Db2 behavior, one more straightforward than the other. In this blog entry I'll describe these two drivers of head-scratchingly high numbers for catalog accesses related to package authorization, and I'll provide associated mitigating actions that you can take.
Before proceeding, a couple of points of clarification. When I mention "package authorization," what I'm talking about, in the context of this blog entry, is authorization to execute a package (not authorization to create a package by way of a BIND PACKAGE operation - that's a different subject). Also, this whole discussion is relevant primarily to packages that are the compiled and executable form of what Db2 people refer to as static SQL statements (there are other packages that essentially enable preparation and execution of dynamic SQL statements). Such a static SQL package could be associated with a Db2 stored procedure, or with a CICS-Db2 transaction program, or with a batch job. If the application process requesting execution of a static SQL package is local to the Db2 subsystem (i.e., it is not accessing the Db2 system by way of a network connection and the Db2 distributed data facility), the package will be executed in conjunction with a Db2 plan, with which it will be associated by way of the plan's package list (more on that to come).
OK, on to the first of the two aforementioned drivers of unexpectedly high catalog access activity related to package authorization checking.
A too-small package authorization cache
In a Db2 system in which the workload involves execution of a lot of static SQL statements, there can be a LOT of package authorization checks - like, hundreds per second. To help make this authorization checking as efficient as it can be, Db2 has, in its database services address space, an in-memory cache of package authorization information. When Db2 checks to see if ID ABC is authorized to execute package XYZ, it will check the package authorization cache first, to see if the authorization information is there. If that information is not found in the cache, Db2 checks the catalog (the SYSPACKAUTH table, for example) to see if ID ABC is authorized to execute package XYZ. The information found in the catalog is then placed by Db2 in the package authorization cache in memory, so that the next time a check of ID ABC's authorization to execute package XYZ is needed, the check can be very quickly and efficiently accomplished via the cache.
All well and good, but sometimes there's a bit of a problem here, and it has to do with the size of the in-memory package authorization cache. If it's too small, there could be a lot of cache "misses" for package authorization checks, and those misses will drive catalog accesses. How can you tell if this is happening? Use your Db2 monitor to generate a statistics long report for the Db2 subsystem of interest, covering a busy hour of a busy day (depending on the Db2 monitor in use at your site, this report might be called a statistics detail report). In that report, you should see a set of fields under the heading AUTHORIZATION MANAGEMENT, such as those shown in the report snippet below (headings and field names might vary slightly from one Db2 monitor to another).
AUTHORIZATION MANAGEMENT QUANTITY /SECOND
--------------------------- -------- -------
PKG-AUTH SUCC-W/O CATALOG 5885.1K 1634.76
PKG-AUTH UNSUCC-CACHE 43464.00 12.07
These numbers look good. Why do I say that? Because more than 99% of the time, a package authorization check was completed without a requirement for catalog access (1634.76 + 12.07 package authorization checks per second, with only 12.07 of those requiring catalog access). There were very few package authorization cache "misses" (as indicated by the smallness of the values in the PKG-AUTH UNSUCC-CACHE row relative to the values in the PKG-AUTH SUCC-W/O CATALOG row).
Sometimes, the numbers in this part of a Db2 monitor statistics long report don't look so good. I've seen that the value in the PKG-AUTH UNSUCC-CACHE row can be as high as several hundred per second (even over 1000 per second). When that's the case, it can be an indication that the package authorization cache is way smaller that it should be. Why might that be the case? Well, the size of the cache is determined by the value of the CACHEPAC parameter in ZPARM, and prior to Db2 10 the default CACHEPAC value was only 100 KB. Sometimes, an existing ZPARM value is carried forward when a Db2 system is migrated to a new version, and that old and small CACHEPAC value did indeed go forward at quite a few sites; so, check the value of CACHEPAC for your Db2 system. If it is anything less than the max value of 10M (meaning, 10 MB), change the value to 10M.
Having said all this, I'll tell you that depending on the maintenance level of your Db2 system, you might look for CACHEPAC in your ZPARM listing and NOT SEE IT. Why is that? It's because the fix for APAR PH28280, which came out in the latter part of 2020, removed CACHEPAC from ZPARM and set its value internally to 10M (that fix did the same for CACHERAC, the parameter that specifies the size of the routine authorization cache, used to check authorization to execute Db2 routines such as stored procedures and user-define functions).
OK, so a too-small package authorization cache can lead to high levels of catalog access activity related to package authorization checks. You can make the size of that cache what it should be, which is 10M (or IBM will do that for you, via the fix for the aforementioned APAR), but there's something else that could cause a lot of catalog access for package authorization checks...
Location asterisks in a plan's PKLIST
I noted earlier that a static SQL package used by a local-to-Db2 application will be executed in conjunction with a Db2 plan (remote client applications that access Db2 via the Db2 distributed data facility are associated, by default, with plan DISTSERV, which is not a plan in the traditional sense). The plan for a local-to-Db2 application has something called a PKLIST, which is short for package list. In a plan's PKLIST are the packages that can be executed by way of the plan. You can make use of asterisks in a plan's PKLIST in a couple of ways. One way, which is very common and helpful, is to use an asterisk for the packages in a given package collection; so, if packages in collection COLL1 are to be executed through plan PLAN1, the PKLIST for PLAN1 can have the entry COLL1.*. That covers all the packages in collection COLL1, and it's a lot more convenient than putting each individual package in COLL1 in the PKLIST for PLAN1.
Another use of asterisks in a plan's PKLIST can be problematic. I'm referring here to the location of a collection. You could have, in the PKLIST for PLAN1, the entry *.COLL1.*, and here's what that high-order asterisk means: it means that PLAN1 can be used to execute packages in collection COLL1 at the local Db2 system and at any remote location at which a collection named COLL1 might exist. Sometimes, there actually is a collection COLL1 at a remote server, and there is a need for packages in the remote COLL1 collection to be executable through PLAN1. Usually, that is not the case - the plan is only going to be used with local packages. "So what?" you might say, "If PLAN1 is only going to execute packages in the local COLL1 collection, it can do that it it has an *.COLL1.* entry in its PKLIST." True, but here's the deal: if there is an asterisk in the location-name part of an entry in a plan's PKLIST, there will be a package authorization check done every time a package in that collection is executed via the plan. That, in turn, could lead to a lot of associated catalog access activity. If, instead, the entry is the plan's PKLIST is COLL1.* (i.e., nothing - not even an asterisk - is specified for the collection's location), that will be as though the local Db2 had been explicitly specified as the collection's location, and that single-location specificity will mean that authorization to execute the packages in COLL1 is checked at BIND PLAN time and will NOT be checked when packages in COLL1 are subsequently executed via the plan (the authorization checked at BIND PLAN time is that of the ID of the process issuing the BIND PLAN command - for example, that ID may have been granted EXECUTE ON PACKAGE COLL1.*).
With this in mind, if for a local-to-Db2 application involving static SQL execution you see an unexpectedly high level of catalog accesses (specifically, accesses to catalog tables such as SYSPACKAUTH), check the entries in the plan's PKLIST. If you see an entry of the form *.COLL1.*, you might want to do one of two things: remove the asterisk in front of the collection's name if the collection is local; or, if packages in a collection with the specified name at remote location LOC2 will be executed via the plan, change an entry of the form *.COLL1.* to LOC2.COLL1.*. What if the plan will be used to execute packages both in a local collection named COLL1 and in a COLL1 collection at location LOC2? In that case, have a COLL1.* entry in the plan's PKLIST for the local collection COLL1, and a LOC2.COLL1.* entry for the collection COLL1 at location LOC2. Yes, this is more of a hassle than just putting an *.COLL1.* entry in the plan's PKLIST, but it can result in elimination of execution-time package authorization checks (and associated catalog access activity).
And there you have it. To make package authorization checking as efficient as it can be, make the size of the package authorization cache as large as it can be, which is 10M (if you have not already done this or if Db2 has not already done that for you via the fix for APAR PH28280), and - for local-to-Db2 applications using static SQ packages - don't use an asterisk for the location of the packages' collection in the relevant plan's PKLIST (instead, don't specify anything - not even an asterisk - for the location of a local collection, and explicitly specify the remote location name of a not-local collection).
I hope this information will be useful for you.
Rob, can you tell me about inflight ,indoubt,postponed abort,incommit threads. Difference between those and short insight would help .
ReplyDeleteEach of these terms is defined on this page in the Db2 for z/OS Knowledge Center on IBM's Web site: https://www.ibm.com/support/knowledgecenter/SSEPEK_12.0.0/admin/src/tpc/db2z_consistencyafterfailure.html
DeleteRobert
Robert,can you tell what causes the object to go in psrbd additonal to rbdp state ?I see an index with with rbdp,psrbdp.But I see there was no alter done to the tables.Last run was load.
ReplyDeleteRDBP (REBUILD-pending) is set for a logical or physical index partition. PSRBD (page set REBUILD-pending) is set for a non-partitioned secondary index, and it affects the index in its entirety. See the descriptions of these statuses here: https://www.ibm.com/support/knowledgecenter/SSEPEK_12.0.0/ugref/src/tpc/db2z_loadpendingstatus.html.
DeleteA non-partitioned secondary index can definitely be in both RDBP and PSRBD status.
LOAD can certainly place indexes in REBUILD-pending status. That could happen if a LOAD job were terminated prior to completion of the INDEXVAL stage (see https://www.ibm.com/support/knowledgecenter/SSEPEK_12.0.0/ugref/src/tpc/db2z_resetrebuildpending.html).
LOAD can also put indexes in REBUILD-pending status when the utility is executed with the INDEXDEFER option (go to this page, and do a search in the page on INDEXDEFER: https://www.ibm.com/support/knowledgecenter/SSEPEK_12.0.0/ugref/src/tpc/db2z_loadsyntax.html).
Robert
Thanks Robert.That helped me
DeleteHappy to hear that - you're welcome.
DeleteRobert