Monday, September 30, 2019

Db2 12 for z/OS - Let's Talk About the MAXDBAT ZPARM

"ZPARMs" is a reference to the parameters in the Db2 for z/OS module called DSNZPARM - these are essentially the configuration parameters for a Db2 subsystem. Two of the ZPARMs that are closely related to each other are CONDBAT and MAXDBAT. CONDBAT specifies the maximum number of connections that network-attached applications (i.e., applications that access the Db2 system via the Db2 distributed data facility, aka DDF) can have with the Db2 system at any one time. MAXDBAT is, essentially, the number of those application connections that can be "in-use" at one time. ["In-use" has to do with the processing of transactions that are originated by the connected applications - for such a transaction to be processed by Db2, the associated connection has to be paired with a type of Db2 "thread" called a database access thread, or DBAT (think of a DBAT as a DDF thread).]

As a rule, you don't want a Db2 system's CONDBAT limit to be reached. Why? Because when that happens, the next connection request from a DDF-using application will fail with an error code. How can you tell if a Db2 system's CONDBAT limit has been reached (aside from noting that an application encountered a connection failure)? You can see that in a statistics detail report generated by your Db2 monitor (depending on the monitor, that might be called a statistics long report). In that report, you'd see a section with DDF-related information, and a field labeled CONN REJECTED-MAX CONNECTED (or something similar) - that's the second line in the excerpt below:

GLOBAL DDF ACTIVITY          QUANTITY
---------------------------  --------
DBAT/CONN QUEUED-MAX ACTIVE      0.00
CONN REJECTED-MAX CONNECTED      0.00

If the CONN REJECTED-MAX CONNECTED number is non-zero, the CONDBAT limit for the Db2 system was hit during the time interval captured in the report. If I saw that CONDBAT was hit on my Db2 system, I'd increase the CONDBAT value. I wouldn't make the CONDBAT value way higher than it needs to be (it can be as high as 150,000 for a Db2 subsystem), but neither would I be stingy in this department (an application connection not in an "in-use" state is placed in an inactive status by Db2; an inactive connection has a very small virtual storage footprint, and snapping it back to active status when needed is a very low-overhead operation).
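If you do need to raise CONDBAT (and the same goes for MAXDBAT, discussed below), keep in mind that both are among the ZPARMs that can be changed online. At most sites that means updating the value in the DSNTIJUZ job, running that job to reassemble and relink the DSNZPARM module, and then activating the change with a command along these lines (no recycling of the Db2 subsystem required):

-SET SYSPARM RELOAD

(-SET SYSPARM LOAD(module-name) is the alternative form of the command, for when you want to name the parameter load module explicitly.)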

Simple enough, but what about MAXDBAT? Do you also want that limit to not be reached? The answer to this question is a little less straightforward than it is for CONDBAT. I'd say that in most cases you'd want the MAXDBAT value to be high enough so as not to be reached, but that's not necessarily so in all cases. First, how can you see whether MAXDBAT has or has not been reached for a Db2 system? One way would be to check the aforementioned statistics detail (or statistics long) report generated by your Db2 monitor - the relevant field is the one labeled DBAT/CONN QUEUED-MAX ACTIVE, the first line in the excerpt below:

GLOBAL DDF ACTIVITY          QUANTITY
---------------------------  --------
DBAT/CONN QUEUED-MAX ACTIVE      0.00
CONN REJECTED-MAX CONNECTED      0.00

A non-zero value in the DBAT/CONN QUEUED-MAX ACTIVE field indicates that the MAXDBAT limit was reached for the Db2 subsystem during the time interval captured in the report. Another way to check on this would be to issue the Db2 command -DISPLAY DDF DETAIL. In the output of that command you'd see this field and an accompanying value:

QUEDBAT=      0

The QUEDBAT value indicates the cumulative number of times that the MAXDBAT limit was hit for the Db2 subsystem since DDF was last started (which probably would be when the Db2 subsystem was last "bounced," or stopped and restarted).
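While you have that -DISPLAY DDF DETAIL output in front of you, note that it also shows the CONDBAT and MAXDBAT values currently in effect for the subsystem - you should see them in the DSNL090I line of the output, labeled CONDBAT= and MDBAT=. The excerpt below is purely illustrative (the values are made up, and the exact formatting of the message can vary a bit from one Db2 level to another):

DSNL090I  DT=A  CONDBAT=    10000 MDBAT=      500

That's a handy way to confirm what the limits actually are for a subsystem without going back to the DSNZPARM source.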

What happens when MAXDBAT is hit? In that case, the request for a DBAT (required in order to service a transaction request coming via an application connection) is queued. No error, at least not right away - but if the request waits too long, the application server might time the transaction out, and we likely don't want that. So, you don't want MAXDBAT to be hit, right? If it is hit, you want to increase the MAXDBAT value, right? Probably right, but not always.

Here's one of the "not always" scenarios: suppose you have a DDF workload characterized by the occasional surge of transaction volume (such as might occur during a certain part of a month). You could make the MAXDBAT value large enough to accommodate that surge, and that would be OK if the Db2 system's processing capacity is sufficient to effectively process the surge of transactions. What if that is not the case? What if the surge of transactions, if allowed to flow right into the Db2 system, would max the system out, taking the utilization of the Z server's general-purpose "engines" (processors) to something close to 100%? The z/OS system won't fail (z/OS is famous for staying up in extreme processing situations), but work could get really backed up, so much so that response times could soar, leading to performance complaints from application users (and maybe to monetary penalties if a service level agreement is violated).

If you have a situation in which a DDF transaction surge overwhelms a server's processing capacity, you might be better off with a MAXDBAT value that induces some transaction queueing during surges. If, for example, the number of concurrently-executing DDF transactions on a system is almost always below 2000, and surges above that have severely impacted response times, a MAXDBAT value of 2000 could be beneficial. Yes, if a surge comes along and MAXDBAT is hit then transactions will start queueing up, waiting for a DBAT to come free (when an in-process transaction completes), but the system, shielded from the negative impact of transaction overload, will continue to process work quickly and efficiently. That, in turn, will cause in-use DBATs to free up quickly, and that could mean that the time a DBAT-awaiting transaction spends in the queue will be very small. While there would be some response time elongation due to some transactions having to queue up for a DBAT, the performance impact could be reduced versus the "let 'em all in when they arrive" situation.
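To put some purely illustrative numbers on that (these are made-up figures, not measurements from any system): if MAXDBAT is 2000 and the average in-Db2 time for a transaction holding a DBAT is around 50 milliseconds during a surge, DBATs are freeing up at a rate of roughly 2000 / 0.05 = 40,000 per second - on average, one every 25 microseconds. As long as the surge's arrival rate doesn't outrun that service rate for an extended period, a transaction that has to queue for a DBAT will typically wait only a tiny fraction of a second.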

Bottom line: if a Z server has the processing capacity to efficiently handle "peak of the peak" DDF transaction volumes, make the MAXDBAT value high enough to avoid transaction wait-for-DBAT queueing. If, on the other hand, the occasional really-big DDF transaction surge causes the system to get severely bogged down so that transaction service times shoot way beyond the acceptable level, you could actually improve performance and throughput via a MAXDBAT value that maintains efficient processing by forcing a degree of transaction queueing. Note, too, that if you want to induce surge-time queuing only for transactions associated with a particular DDF-using application or applications, you can accomplish that via the Db2 profile tables (SYSIBM.DSN_PROFILE_TABLE and SYSIBM.DSN_PROFILE_ATTRIBUTES), which enable the setting up of DBAT limits (and/or connection limits and/or idle thread timeout values) in a granular fashion (as described in an entry I posted to this blog a couple of years ago).
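To make the profile table approach a bit more concrete, here is a rough sketch of what such a set-up could look like - the application name 'APP1', the profile ID of 101 and the limit of 200 are all made-up example values, and you'll want to check the Db2 documentation for the exact column, keyword and attribute requirements at your Db2 12 level:

-- Identify the application (here, by its client application name) to which
-- the profile applies:
INSERT INTO SYSIBM.DSN_PROFILE_TABLE
  (PROFILEID, CLIENT_APPLNAME, PROFILE_ENABLED)
VALUES (101, 'APP1', 'Y');

-- Cap that application's concurrently active DBATs at 200; with the
-- EXCEPTION attribute, requests beyond the limit are queued rather than
-- getting a DBAT right away:
INSERT INTO SYSIBM.DSN_PROFILE_ATTRIBUTES
  (PROFILEID, KEYWORDS, ATTRIBUTE1, ATTRIBUTE2)
VALUES (101, 'MONITOR THREADS', 'EXCEPTION', 200);

After the rows are in place, the Db2 command -START PROFILE activates (or picks up changes to) the profiles.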

I hope that this information will be helpful for you.