In part 1 of this three-part series on leveraging Big Memory (i.e., a large z/OS LPAR real storage resource) to enhance DB2 workload CPU efficiency, I focused on getting the most out of a DB2 buffer pool configuration without increasing its size (this by adjusting certain threshold settings and redistributing buffers between low-I/O and high-I/O pools). In part 2, I offered guidance on being bold -- but not reckless -- in asking for more memory for a DB2 subsystem's buffer pool configuration. In this, the concluding entry of the series, I'll look at memory-for-MIPS moves that involve DB2 pools other than the buffer pools -- starting with the EDM pool.
Actually, that "starting with the EDM pool" bit is an "it depends" thing, depending on the version of DB2 for z/OS you're using. What I want to talk about here is improving CPU efficiency via the use of the RELEASE(DEALLOCATE) parameter of the BIND (and REBIND) PACKAGE command. RELEASE(DEALLOCATE), when used for an SQL-issuing program that is executed via a thread that persists through commits, saves CPU cycles by causing DB2 to retain certain items associated with package execution, such as table space locks and thread copies of package sections, until thread deallocation, as opposed to releasing these resources at each commit (and very likely having to re-acquire them as the application process continues to execute). Threads that persist through commits include CICS-DB2 protected entry threads, threads used by batch jobs, and -- starting with DB2 10 -- high-performance database access threads used for DRDA clients (I blogged about high-performance DBATs last year). The less SQL work an application process performs between commits, the greater the proportional CPU cost of at-commit resource release and re-acquisition that occurs when RELEASE(COMMIT) is in effect. It is for such application processes -- high volume, relatively simple transactions, and batch jobs that issue lots of commits -- that RELEASE(DEALLOCATE) can deliver significant CPU savings (sometimes on the order of 10%). There are operational considerations associated with RELEASE(DEALLOCATE) and persistent threads. These include utility execution scheduling (a utility might not be able to execute if an application process holds a lock -- even a non-exclusive, intent-type lock -- on the target table space, and with RELEASE(DEALLOCATE) that table space -level lock will be held until thread deallocation) and package bind and DDL activity scheduling (same reason -- you could end up having to wait for deallocation of persistent threads used for the execution of packages bound with RELEASE(DEALLOCATE)). Keep these considerations in mind. It's likely that you can deal with them -- lots of folks have (and the PKGREL option of the -MODIFY DDF command can help in this regard when you use high-performance DBATs).
Another operational consideration associated with the use (or increased use) of the RELEASE(DEALLOCATE) bind option has to do with increased virtual and real storage utilization, and that's where the "it depends" at the start of the preceding paragraph comes in. There's no "it depends" with respect to increased memory utilization -- that WILL happen when you use RELEASE(DEALLOCATE) with persistent threads. The dependency is on WHERE that increased memory utilization happens in the DB2 database services address space (aka DBM1). See, one of the thread-allocated resources that DB2 will retain until thread deallocation when RELEASE(DEALLOCATE) is in effect for a package is the copies of package sections associated with the SQL statements issued by the program. Here's how that works: when a program issues an SQL statement, the section of the associated package containing the bound statement (if it's static) will be copied from the "skeleton" part of the EDM pool to another part of DBM1 (if it's a dynamic SQL statement, the relevant package could be, for example, one of the IBM Data Server Driver packages used for
dynamic SQL statements issued by network-attached client-server
applications via JDBC calls). WHERE in DBM1 will that thread copy of the package be stored? THAT'S the version-dependent part of this equation. Prior to DB2 10, the thread's copy of a package section (or a plan section, in the case of a DBRM bound directly into a plan -- something that is not supported in a DB2 10 environment) was stored in a part of the EDM pool. With DB2 10, the package section allocated to a thread is copied to an agent local pool that is exclusive to the thread in question (and there's some plan-related thread storage, too, even in a DB2 10 system -- it's mostly package storage, but not all). This DB2 10 change is a good thing, in that it eliminates latching that occurred when EDM pool space was used for copies of packages allocated to threads. It also moved this storage above the 2 GB bar for packages bound in a DB2 10 system, and THAT gives you a WHOLE lot more virtual storage head room for using RELEASE(DEALLOCATE) to reduce the CPU cost of DB2-accessing programs. [In a DB2 Version 8 system, none of this virtual storage used for thread copies of package and plan sections went above the 2 GB bar, and with DB2 9 only some of it went above the 2 GB bar.]
The bottom line, then, is that RELEASE(DEALLOCATE) can do you more good, in terms of boosting CPU efficiency, in a DB2 10 environment because you almost certainly will be able to use this bind option more extensively than before. Now, this doesn't mean that you can or should bind ALL of your packages with RELEASE(DEALLOCATE) in a DB2 10 system, because real storage resources, though potentially extensive, or nonetheless finite. If your DB2 monitor supports DB2 10, it will report to you (in a statistics detail report or an online display) the amount of storage -- above and below the 2 GB bar -- used for plans and packages (these are copies associated with threads -- storage used for skeleton package and cursor table purposes is still in the EDM pool, and is reported as such). Keep an eye on that storage utilization, and of course keep an eye on the demand paging rate for the z/OS LPAR in which the DB2 subsystem is running: if that's in the single digits or less per second, on average, during busy periods of the processing day, real storage is not being pressured to an undesirable extent. In a DB2 9 or Version 8 system, keep an eye on the EDM pool storage used for PT and CT, and consider enlarging that part of the pool (or cutting back on your use of RELEASE(DEALLOCATE)) if the number of free pages there is indicated to be less than 10% of the total number of pages in that part of the pool (and that's actually two parts in a DB2 9 environment -- one above the 2 GB bar and one below).
The other EDM-pool related memory-for-MIPS play has to do with dynamic statement caching. With respect to the global (versus the local) statement cache, this has been above-the-bar storage since DB2 V8 (the global statement cache, allocated when the ZPARM parameter CACHEDYN is set to YES, is very commonly used; local statement caching, associated with the KEEPDYAMIC(YES) package bind option, is less commonly used). Your DB2 monitor will show you the hit ratio for the dynamic statement cache. The larger this section of the EDM pool, the greater the hit ratio is likely to be, and the larger that hit ratio is the more you save through avoidance of full prepares of dynamic SQL statements. With this said, there's probably a point of diminishing returns that you could hit in your system with regard to enlarging the dynamic statement cache (accomplished by updating the EDMSTMTC parameter in ZPARM). If the hit ratio for the dynamic statement cache is north of 90% in your environment, making the cache larger may not move the needle much further to the right.
Turning now the RID pool, used by DB2 for RID list processing (done for things like index ANDing and ORing, hybrid join, and list prefetch): what you want is for as many of these operations as practically possible to complete in memory, and a larger RID pool will help in accomplishing that objective. Your DB2 monitor will show you the number of times that RID processing operations were terminated because of a lack of storage for such processing, and a RID pool enlargement can be accomplished by increasing the value of the MAXRBLK parameter in ZPARM (the RID pool has been allocated in above-the-bar virtual storage since DB2 V8). DB2 10 delivered a couple of important changes related to RID list processing: first, the default size of the RID pool in a DB2 10 system is 400 MB, versus 8 MB with DB2 9 and DB2 V8 (this is one of a number of reasons why you can expect DB2 real storage utilization to increase when you migrate to DB2 10). Important change number 2 concerns what happens when RID pool space is insufficient for the completion of a RID list processing operation. Prior to DB2 10, such a storage-related RID list processing failure would cause DB2 to abandon RID list processing for the SQL statement in question and go instead with a table space scan for the object being accessed. With DB2 10, insufficient RID pool space for a RID list processing operation causes DB2 to keep on trucking with the operation, using 32K work file space for the RID list. That's likely to be a performance-benefiting approach versus abandoning RID list processing for a table space scan, and even though processing a RID list using work file space is not quite as CPU-efficient as getting it all done in the RID pool, the overhead of overflow is mitigated if you have a large buffer pool dedicated to your 32K work file table spaces (fewer I/Os means less CPU consumption). Note that the new DB2 10 ZPARM parameter MAXTEMPS_RID can be used to limit the amount of space in the work file database that DB2 can use for a single RID list processing operation.
Lastly, a word about the DB2 sort pool. This is space in memory (above the bar since DB2 V8) that DB2 uses for SQL-related (versus utility-related) sort operations. The larger the sort pool, the more CPU-efficient DB2 SQL-related sorts tend to be. If you're considering a sort pool enlargement, keep a couple of things in mind: first, this pool (sized via the SRTPOOL parameter in ZPARM), is the maximum size of the sort work area that DB2 will allocate for each concurrent sort user (so a sort pool size of X MB could result in DB2 using 20X MB of virtual storage for sort pool space if there are 20 concurrent sort operations in progress). Second, the default size of the sort pool went from 2 MB in DB2 V8 and DB2 9 systems to 10 MB with DB2 10 (again, one of multiple reasons for DB2 10's greater use of memory versus DB2 9 and DB2 V8). If you want to increase the size of the sort pool on your system (the upper limit is 128 MB, but the largest value I've seen on a DB2 system is 30 MB), keep that per concurrent sort item in mind, and as always keep your eye (perhaps with the help of a z/OS systems programmer) on the z/OS LPAR's demand paging rate. I'm fine with a rate in the low single digits per second or less. A demand paging rate in the higher single digits per second is yellow-light territory in my book, and a rate in excess of 10 per second would tell me that the z/OS LPAR's memory resource may be over-taxed, and I might take steps to reduce DB2's use of that resource (via buffer pool, EDM pool, RID pool, or sort pool adjustments, or some combination of these).
Modern mainframe systems tend to have a lot of real storage, and the size of that resource for the average z/OS LPAR keeps getting larger, based on what I'm seeing. I'd like for you to use System z memory to make your DB2 workload more CPU-efficient, and I hope that the information in this three-part blog entry will help you to do that.
I want to leave a comments stating that our account at a large Data Services Outsourcing is getting benefit out of this and many other articles. I hope you know we appreciate it.
ReplyDeleteThanks,
Appreciative Mainframe Programmer