Db2 13 for z/OS, which became generally available about seven months ago, introduced two interesting features that are similar in some ways but differ in one important aspect (about which I'll comment momentarily). These new features allow an application (or, more broadly, a "process") to set its own lock timeout limit and/or its own deadlock priority. With this blog entry I aim to provide related information that will be useful for you.
Application-level lock timeout
First, let's establish the need for this Db2 13 enhancement. Historically, there has been one lock timeout limit - specified via the IRLMRWT parameter in ZPARM - that applies to all processes interacting with a Db2 subsystem. While IRLMRWT is still there in a Db2 13 environment, it became apparent some time ago that "one size fits all" will often NOT be ideal when it comes to lock timeout in a Db2 system. Think about it. Suppose the value of IRLMRWT is at the default of 30 seconds for a production Db2 system at your site. You might have a developer of a Db2-accessing online application say, "What? NO! This app has a mobile front-end and users can get VERY frustrated if they have to wait more than a few seconds for a transaction to complete. It would be TERRIBLE to have a transaction sit and wait for 30 seconds to get a Db2 lock. We need the lock timeout value to be WAY lower than 30 seconds." At the same time, a developer of a long-running batch application might say, "What? NO! This job HAS to complete once it gets started or we miss SLAs and have angry customers. The job typically runs for five hours, and maybe it's been running for four hours and you want to time it out because it's been waiting for a lock for 30 seconds? 30 seconds is NOTHING as far as this job's concerned. The Db2 lock timeout value should be SUBSTANTIALLY greater than 30 seconds." Both of the developers are expressing legit concerns. How can those disparate concerns be addressed?
They can be addressed via the new (with Db2 13) special register named CURRENT LOCK TIMEOUT (available for use when Db2 13 function level 500 has been activated). Here are some things to know about CURRENT LOCK TIMEOUT:
- The value of the special register - expressed in seconds - can be anything between -1 and 32767 (or a site-specified upper bound - see the next item in this list). A value if -1 means that the process will not be timed out if it ends up waiting for a lock - it will wait until it gets the requested lock or becomes deadlocked with some other process. A value of 0 means that the process does not want to wait at all for a lock - it wants to get an error message if a requested lock can't be obtained immediately (this basically makes available for application use a formerly Db2-internal mechanism known as a conditional lock request).
- If the default upper-limit value of 32767 seconds is deemed by a Db2-using organization to be too high, a different max value can be provided via the new (with Db2 13) ZPARM parameter SPREG_LOCK_TIMEOUT_MAX. If you set that value to (for example) 1800, no process will be able to set the CURRENT LOCK TIMEOUT special register to a value greater than 1800 seconds.
- If a lock timeout occurs and an application-level timeout limit was in effect for the lock requester and/or for the lock holder, that will be reflected in the information provided via the DSNT376I lock timeout message generated by Db2.
- The value of the CURRENT LOCK TIMEOUT special register can be set automatically for an application by way of the Db2 profile tables, and not just for DDF-using applications (more information on this is provided below).
- The maximum value for DEADLOCK_RESOLUTION_PRIORITY is 255 (the range of acceptable values for the global variable is 0-255).
- If you're a Db2 DBA, you might think, "I have a process that I think of as 'should-complete,' versus 'must-complete.' I want that process to generally be the winner in a deadlock situation, but I don't want it to get in the way of a 'must-complete' process. If 255 is a good DEADLOCK_RESOLUTION_PRIORITY value for a 'must-complete' process, what would be a reasonable priority value for a 'should-complete' process?" There is not a totally straightforward answer to that question. What you could do is this: start with some value for the 'should complete' process (maybe 150, or maybe 200, for example), and see if it ends up becoming the loser in a deadlock situation. If that happens, you can see how the priority of the "winner" process compared to the priority that you assigned to your 'should-complete' process, and potentially adjust your process's priority accordingly. How could you see the deadlock priority of a process that "beat" your process? That information is available via the IFCID 172 Db2 trace record. Activating IFCID 172 should involve very little overhead, as the trace record captures information about deadlocks, and deadlocks tend to be unusual in most Db2 systems I've seen. By the way, you should be able to use your Db2 monitor to generate a report with formatted information from IFCID 172 trace records (if you use IBM's OMEGAMON for Db2 monitor, the report to use for this purpose is called the Record Trace Report - that report can format the information in most any Db2 trace record).
- I mentioned previously that there are exceptions to the "255 always wins" rule. Even if DEADLOCK_RESOLUTION_PRIORITY has been set to 255 for a process, that process could be the loser if it gets deadlocked with a process that is changing data in a table space defined with NOT LOGGED (hard for Db2 to roll back a unit of work when there are no associated undo records in the log), or if it gets deadlocked with a rollback or an abort or a backout process.