Sunday, December 29, 2013

DB2 for z/OS: Want to use High-Performance DBATs? Check your MAXDBAT Value

Of the features introduced with DB2 10 for z/OS, high-performance DBATs is one of my favorites. It enabled (finally) DDF-using applications to get the CPU efficiency benefit that comes from combining thread reuse with the RELEASE(DEALLOCATE) package bind option -- a performance tuning action that has long been leveraged for CICS-DB2 workloads. Implementing high-performance DBATs is pretty easy: in a DB2 10 (or 11) environment, when a package bound with RELEASE(DEALLOCATE) is executed by way of a DBAT (i.e., a database access thread -- the kind used for DRDA requesters that connect to DB2 via the distributed data facility), that thread becomes a high-performance DBAT (if it isn't one already). Before jumping into this, however, you should consider some things that are impacted by the use of high-performance DBATs. One of those things is the DBAT pool. That's where the MAXDBAT parameter of ZPARM comes in, and that's what this blog entry is about.

The value of MAXDBAT determines the maximum number of DBATs that can be concurrently active for a DB2 subsystem. The default value is 200, and at many sites that value, or one that's a little larger, has effectively supported a much greater number of DB2 client-server application connections (the default value for CONDBAT in ZPARM -- the maximum number of connections through DDF to a DB2 subsystem -- is 10,000). How so? Well, if your system is set up to allow for inactive connections (CMTSTAT = INACTIVE has been the default in ZPARM since DB2 V8), when a DDF transaction completes the associated connection will go into an inactive state (a very low-overhead transition, as is the transition back to the active state) and the DBAT used for the transaction will go into the DBAT pool, ready to service another transaction. That can happen because a "regular" DBAT is only associated with a particular DB2 connection while it is being used to execute a request from said connection. Because it is common for only a small percentage of DDF connections to a DB2 subsystem to be active (i.e., associated with in-flight transactions) at any given moment, a large ratio of connections to DBATs has historically been no problem at all.

Bring high-performance DBATs into the picture, and things change. In particular, a high-performance DBAT, once instantiated, will remain dedicated to the connection through which it was instantiated until it's been reused by 200 units of work (at which point it will be terminated, so as to free up resources allocated to the thread). That high-performance DBAT, therefore, will NOT go into the DBAT pool when a transaction using the thread completes. When a request associated with another connection comes in (i.e., from a connection other than the one through which the high-performance DBAT was instantiated), the high-performance DBAT won't be available to service that request. Some other DBAT will have to be used, and guess what? If that DBAT isn't a high-performance DBAT, it will become one if the package associated with the incoming request (and that could be a DB2 Connect or IBM Data Server Driver package) was bound with RELEASE(DEALLOCATE). The DBAT pool thus becomes progressively smaller as high-performance DBATs are instantiated. Know what else happens? The number of active DBATs goes up -- maybe sharply. Why? Because a "regular" DBAT is active only while it is being used to execute a DDF transaction. A high-performance DBAT, on the other hand, is considered to be active as long as it exists -- that will be 200 units of work, as mentioned previously, and when a high-performance DBAT is waiting to be reused, it's an active DBAT.

This last point -- about the number of active DBATs potentially rising sharply when high-performance DBATs are utilized -- is illustrated by some information I recently received from a DB2 professional. At this person's shop, high-performance DBATs were "turned on" for a DB2 subsystem (the PKGREL option of the -MODIFY DDF command can be used as a "switch," telling DB2 to either honor RELEASE(DEALLOCATE) for packages executed via DBATs -- thereby enabling instantiation of high-performance DBATs -- or not), and the number of active DBATs for the subsystem went from the usual 60 or so to about 700. Because the MAXDBAT value for the DB2 subsystem was already at 750, these folks didn't run out of DBATs, but the pool of "regular" DBATs got pretty small. In response to the big increase in active DBATs seen when high-performance DBAT functionality was enabled, the MAXDBAT value for the DB2 system in question was increased to 2000. Was this OK? Yes: When packages are bound or rebound in a DB2 10 for z/OS environment, almost all thread-related virtual storage goes above the 2 GB "bar" in the DBM1 address space, and that allows for a 5- to 10-times increase in the number of threads that can be concurrently active for the DB2 subsystem.

So, if you're thinking about using high-performance DBATs (and you should), check your subsystem's MAXDBAT value, and consider making that value substantially larger than it is now. Additionally, take steps to enable selective use of high-performance DBATs by your network-attached, DB2-accessing applications. For programs that contain embedded SQL statements and, therefore, have their own packages (e.g., DB2 stored procedures -- both external and native), use RELEASE(DEALLOCATE) for the most frequently executed of these packages. For the packages associated with DB2 Connect and/or the IBM Data Server Driver, use two collections: The default NULLID collection, into which you'd bind the DB2 Connect and/or IBM Data Server Driver packages with RELEASE(COMMIT), and another collection (named as you want) into which you'd bind these packages with RELEASE(DEALLOCATE). Then, by way of a data source or connection string specification on the client side, direct DDF-using applications to NULLID or the other collection name, depending on whether or not you want high-performance DBATs to be used for a given application.

To keep an eye on DBAT usage for a DB2 subsystem, periodically issue the command -DISPLAY DDF DETAIL. In the output of that command you'll see a field, labeled QUEDBAT, that shows the number of times (since the DB2 subsystem was last started) that requests were delayed because the MAXDBAT limit had been reached. If the value of this field is non-zero, consider increasing MAXDBAT for the subsystem. You might also want to look at the value of the field DSCDBAT in the output of the -DISPLAY DDF DETAIL command. This value shows you the current number of DBATs in the pool for the subsystem. As I've pointed out, maintaining the "depth" of the DBAT pool as high-performance DBAT functionality is put to use might require increasing MAXDBAT for your DB2 subsystem.

DDF activity can also be tracked by way of your DB2 monitor. I particularly like to use a DB2 monitor-generated Statistics Long Report to see if the connection limit for a DB2 subsystem (specified via the CONDBAT parameter in ZPARM) is sufficiently high. In the section of the report under the heading "Global DDF Activity," I'll check the value of the field labeled CONN REJECTED-MAX CONNECTED (or something similar -- fields in reports generated by different DB2 monitors might be labeled somewhat differently). A non-zero value in this field is an indication that the CONDBAT limit has been hit, and in that case you'd probably want to set CONDBAT to a larger number to allow more connections to the DB2 subsystem.

So there you go. Using high-performance DBATs can improve the CPU efficiency of your DB2 for z/OS client-server workload, but if you do leverage high-performance DBAT functionality then you might need to boost the DBAT limit for your DB2 subsystem in order to maintain the depth of your DBAT pool, because as high-performance DBATs increase in number, pooled DBATs decrease in number (unless you've upped your MAXDBAT value to compensate for this effect). Boosting MAXDBAT in a DB2 10 (or 11) environment is OK, as thread-related virtual storage in such an environment is almost entirely above the 2 GB "bar" in the DBM1 address space (assuming that packages have been bound or rebound with DB2 at the Version 10 or 11 level). Of course, you need real storage to back virtual storage, so if you increase the MAXDBAT value keep an eye on the z/OS LPAR's demand paging rate and make sure that this doesn't get out of hand (if the demand paging rate is in the low single digits or less per second, it's not out of hand).

Wednesday, December 18, 2013

DB2 for z/OS: Monitoring Prefetch Read Activity

Not long ago, a colleague sent to me some DB2 for z/OS buffer pool activity data that a DB2-using organization had sent to him. The DB2 people at this site were alarmed by some of the prefetch-related numbers in the data, and they had requested a review of the information. My analysis of the figures suggested that the system administrators were concerned by numbers that were in fact positive in nature; however, I also saw prefetch items that really were troublesome, and I subsequently learned that these were likely due to a buffer pool configuration change that had been made to improve application performance but ended up working towards the opposite end. In this blog entry I'll explain why buffer pool statistics related to prefetch activity are sometimes misinterpreted, and how "tuning" actions intended to bolster synchronous read performance can have unexpectedly negative consequences.

First, the misinterpreted figures. A system administrator at the aforementioned DB2 for z/OS site was doing a good thing: reviewing the output of the DB2 command -DISPLAY BUFFERPOOL(bpname) DETAIL for two of the key buffer pools allocated in a production subsystem. He calculated some ratios, and was particularly concerned about a couple of the calculated values. For both of the pools, the ratio of pages read via dynamic prefetch to the number of dynamic prefetch requests was quite low: about 2 to 1 for one pool, and only 0.13 to 1 for the other pool. This had the appearance to the system administrator of something being not right, as he knew that a dynamic prefetch request would generally be for 32 pages.

In fact, these dynamic prefetch requests versus pages read via dynamic prefetch numbers were just fine, but they are often viewed incorrectly. One factor that contributes to these incorrect readings of -DISPLAY BUFFERPOOL data is the fact that many people look at prefetch requests when they should be focusing instead on prefetch I/Os. The really important thing to understand here is this: a prefetch request is just that: a request to read from disk a particular set of pages (typically 32 contiguous table space or index pages in the case of dynamic prefetch). If all of the pages associated with a prefetch request are already in the buffer pool, the prefetch request will not result in a prefetch I/O. What that should tell you is, a high ratio of prefetch requests to prefetch reads is a good thing -- it means that pages are being effectively cached in the buffer pool, so that when a chunk of said pages is asked for by way of a prefetch request, an associated I/O will often not be necessary because all of the pages in the prefetch request are already in memory. Sure, a prefetch I/O will often be preferred to a synchronous I/O, because the latter always results in application wait time, while in the case of the former it is hoped that the prefetch I/O operation will complete before the application process (the one on behalf of which the I/O is being driven) asks for rows (or index entries, as the case may be) from the pages being read from disk. Even so, however, a "non-I/O" is preferable versus a prefetch I/O.

Pages read via prefetch, then, should be compared to prefetch I/Os, not to prefetch requests. Looked at in that light, pages read via dynamic prefetch for the buffer pools of interest here might still appear to be on the low side: about 13 pages per prefetch I/O for the one buffer pool and about 14 pages per prefetch I/O for the other pool. It would be better if those figures were closer to 32, the size of a typical dynamic prefetch request, right? WRONG. It's the same thing as a high ratio of prefetch requests to prefetch read I/Os: fewer pages read into memory per prefetch I/O is a good thing. It means that (again) pages are being effectively cached in the buffer pool, so that when a prefetch request for 32 pages (for example) is issued, far fewer than 32 pages will have to be read into memory from disk because close to 20 of those 32 pages (on average for these particular buffer pools during a particular time period) are already in the buffer pool.

Thus it is that the high ratio of dynamic prefetch requests to dynamic prefetch read I/Os seen by the system administrator for the two buffer pools in his DB2 system, and the low ratio of pages read via dynamic prefetch to dynamic prefetch read I/Os, are not "uh-oh" numbers -- they are "Yes!" numbers. At the same time, however, there were indeed some "uh-oh" numbers in the -DISPLAY BUFFERPOOL output that caught my eye. Specifically, I saw that prefetch had been disabled for one of the pools, due to a lack of buffer resources, 164 times during the 28 minutes of activity captured in the command output. [The output of -DISPLAY BUFFERPOOL DETAIL contains a timestamp for each buffer pool listed, showing the start of the activity-capture time period (the end of the period is the time at which the command was issued). This timestamp will indicate either the time at which a pool was last allocated, or when the -DISPLAY BUFFERPOOL command was previously and most recently issued for the pool; so, if the command is issued once for a pool and then issued again for the pool an hour later, the timestamp value in the output of the second issuance of the command will be one hour before the time at which the command was issued for the second time.] That's not good. Because prefetch reads, when appropriate, are generally preferred over synchronous reads (this because, as previously mentioned, they are "anticipatory" reads, intended to bring pages into memory before they are needed by an application process), you really don't want to see prefetch disabled because of a shortage of buffers available to support prefetch reads.

Why would there be such a buffer shortage for this pool? I had a suspicion, and that was confirmed when I saw that the sequential threshold for the pool (also known as VPSEQT, short for the virtual pool sequential threshold) was set at 25, versus the default value of 80. What did that mean? It meant that instead of the usual 80% of the pool's buffers being available to hold pages brought into memory via prefetch, only 25% of the buffers were available for this purpose (the other 75% of the buffers in the pool were reserved exclusively for pages read into memory via synchronous reads). Why had the sequential threshold been lowered for the pool? The thinking had been that this action would reduce synchronous read activity (and we like to reduce that because, as I pointed out, a synchronous read always involves application wait time) by providing more synchronous-read-only buffers in the pool. Sounds logical, right? And, it might have had the desired effect if there hadn't been a lot of prefetch requests for the pool. In fact, there were more than 800 prefetch requests per second for this pool, and dramatically reducing the buffer resources available to support prefetch I/Os associated with the prefetch requests ended up negatively impacting synchronous read activity -- precisely the opposite of the hoped-for effect.

Here's the connection between a non-zero value for PREFETCH DISABLED - NO BUFFER in -DISPLAY BUFFERPOOL output and synchronous read activity: when prefetch is disabled, the pages that would have been read into memory via a prefetch read I/O will likely still be sitting on disk when an application process needs them. What happens then? They are synchronously read into the buffer pool (while the application process waits). Thus, incidents of prefetch being disabled can increase synchronous read activity; furthermore, with far fewer buffers (versus the default) in this pool being available for holding prefetch-read pages, pages that had been brought into memory via prefetch were quickly being flushed out of the pool to make room for other prefetch-read pages, and that meant that they couldn't be re-referenced without driving more I/Os, and that also worked to drive up synchronous read activity for the pool.

So, I told these folks (through their local IBM technical specialist) to significantly increase the VPSEQT setting for this buffer pool, and explained the rationale behind the recommendation. They ended up changing the sequential threshold for the pool back to the default value of 80. The results (shown in the buffer pool section of a DB2 monitor statistics long report for the DB2 subsystem) were quite positive and, I think, interesting: the frequency of prefetch being disabled due to a lack of buffer resources went from about 6 per minute to zero, and synchronous read activity dropped by 43%, even though the percentage of the pool's buffers set aside exclusively for synchronous reads went from 75% to 20%. With the VPSEQT change, more pages could be read into memory in an anticipatory fashion, and those pages, once prefetched into the pool, were likely to stay resident in the pool for a longer period of time, and the combination of these effects reduced the need to synchronously read pages from disk.

Here are what I see as the key take-aways from this case:
  • Do not discount the importance of prefetch reads. Some DB2 for z/OS people are overly focused on synchronous read activity, to the point that they will do things that negatively impact prefetch read operations for a buffer pool. That can be a mistake, leading in some cases to increased synchronous read activity.
  • Be careful about reducing VPSEQT for a buffer pool. Such an adjustment might have an overall positive performance impact for a pool that has relatively little prefetch activity. Here, I'd look at the volume of prefetch requests (sequential, list, and dynamic) per second for the pool of interest. If that number is high (e.g., hundreds or more per second), DB2 has determined that asynchronous reads are the right choice for bringing many of the pages of objects assigned to the pool that are needed by programs into memory. If you squeeze down the resources with which DB2 can work in servicing prefetch requests, the result could be an undesirable increase in synchronous read activity. Even if you do think that you have a pool for which a lowering of the VPSEQT setting could be beneficial, don't go overboard. Don't change from VPSEQT=80 to some much-lower value in one fell swoop -- make more modest adjustments to VPSEQT and monitor results as you go along.
  • Monitor buffer pool activity in an ongoing fashion, using information from your DB2 monitor and/or the output of the -DISPLAY BUFFERPOOL DETAIL command (a form of this command that I like to use is -DISPLAY BUFFERPOOL(ACTIVE) DETAIL, and I like to issue that command once and then again in an hour to have an hour's worth of activity captured in the output of the second issuance of the command). Look for incidents of PREFETCH DISABLED - NO BUFFER and PREFETCH DISABLED - NO READ ENGINE (the latter can also be related to a lack of buffer resources, if that lack of resources leads to elevated levels of prefetch read I/O activity). If you see non-zero values in these fields for a buffer pool, first check the VPSEQT value for the pool, and consider changing that value to 80 if it is less than 80. If you already have VPSEQT=80 for the pool, consider making the pool larger, if you have the real storage in the z/OS LPAR to back a larger pool.

If you're a DB2 person, prefetch is your friend. Don't hobble it.