Tuesday, February 27, 2024

Db2 13 for z/OS: Now You Can Dynamically Remove, as Well as Add, Active Log Data Sets

Db2 10 for z/OS (which came out back in 2010) provided a new capability related to management of a Db2 subsystem's active log data sets (known, along with the archive log data sets, as the subsystem's "log inventory"). The enhancement: the NEWLOG option of the Db2 command -SET LOG. With this added functionality, a Db2 for z/OS system administrator could add new active log data sets to a Db2 subsystem's log inventory, without having to bring the subsystem down (the system administrator would probably, in fact, add new pairs of active log data sets, as you always want to use dual logging to avoid a single point of failure for system and data recovery operations). Prior to this Db2 10 enhancement, adding active log data sets to a subsystem's log inventory could only be accomplished through execution of the DSNJU003 utility (also referred to as the "change log inventory" utility), and DSNJU003 can only be executed when the target Db2 subsystem is down.
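To make that concrete: assuming the new underlying VSAM data sets have already been defined (and, ideally, preformatted with the DSNJLOGF utility), the dynamic add of one new active log data set pair might look something like the two commands below - the data set names are made up for illustration, and one command is issued per log copy.

  -SET LOG NEWLOG(DSNP.LOGCOPY1.DS21) COPY(1)
  -SET LOG NEWLOG(DSNP.LOGCOPY2.DS21) COPY(2)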

The ability to dynamically add pairs of active log data sets to a Db2 subsystem's log inventory was welcomed by many Db2 people, and you can probably imagine why. A Db2 subsystem's active log data sets can be thought of, logically, as a ring of data sets around the Db2 subsystem. Suppose there are 20 pairs of active log data sets in this logical ring. Db2 writes information to pair #1, and when that pair of data sets is filled up then information is written to pair #2, and when that pair is filled up then information is written to pair #3, and so on around the ring. Meanwhile, not long after the filling up of active log data set pair #1, the information written to that pair of data sets will be copied to a pair of archive log data sets, and that action will make active log data set pair #1 reusable, so that new information can be written to that pair of data sets when Db2 comes back around the ring to them. The same archive operation is performed for other active log data set pairs after they have been filled up, making them reusable when their turn in the rotation comes up again to be the current active log data set pair.

All well and good - unless something goes wrong with the archive log write process. If filled-up active log data set pairs can't be archived, they can't be made reusable, and when Db2 has gone around the ring and comes back to the not-reusable active log data set pairs, logging will stop, and when logging stops just about everything stops. In a pre-Db2 10 environment, you could add active log data set pairs to a subsystem's log inventory to buy more time (by providing more space for logging) as you worked to fix whatever was impeding the log archiving process, but at the cost of stopping the Db2 subsystem in order to execute the DSNJU003 utility. Not good. Being able to buy extra fix-the-archiving-problem time by dynamically adding new pairs of active log data sets to a Db2 subsystem's log inventory, while the subsystem was still up and running, made for a much better situation.

Fast-forward to Db2 13 for z/OS, and now we get (once function level V13R1M500 has been activated) the ability to dynamically remove active log data set pairs, thanks to the new REMOVELOG option of the -SET LOG command. The value of dynamic removal of active log data set pairs (i.e., removal while the Db2 subsystem is up and running) lies chiefly in complementing the dynamic-add functionality we've had since Db2 10. Together, the NEWLOG and REMOVELOG options of the -SET LOG command provide a capability that can be very useful - namely, online replacement of a Db2 subsystem's active log data set pairs with better data set pairs.

"Better?" How so? Well, usually this will mean bigger and/or encrypted. Let's take the data set size case. Suppose you have a production Db2 subsystem that has 20 pairs of active log data sets, each data set being 2 GB in size. You're going through those active log data sets faster than you'd like - maybe filling up three or four (or more) pairs in an hour when the system is busy. You'd rather have active log data sets that are 8 GB apiece, versus 2 GB (Db2 12 for z/OS took the maximum size of an active log data set from 4 GB to 768 GB). Can you go from 2 GB active log data sets to 8 GB active log data sets without stopping the Db2 subsystem? With Db2 13, you can. Here's how that would work:

  1. You dynamically add 20 pairs of active log data sets that are sized at 8 GB apiece, using the NEWLOG option of the -SET LOG command (a Db2 subsystem can have up to 93 pairs of active log data sets).
  2. After the older and smaller active log data sets have been archived, dynamically remove them from the Db2 subsystem's log inventory via the new (with Db2 13) REMOVELOG option of the -SET LOG command.
Now you have 20 pairs of active log data sets, each sized at 8 GB, when before you had 20 pairs of active log data sets sized at 2 GB apiece, and in getting from A to B you never had to stop the Db2 subsystem.
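For step 2 of that A-to-B move, the removal of one of the old 2 GB pairs might look something like this (data set names again made up; one command per log copy, issued once the data sets are in REUSABLE status):

  -SET LOG REMOVELOG(DSNP.LOGCOPY1.DS01) COPY(1)
  -SET LOG REMOVELOG(DSNP.LOGCOPY2.DS01) COPY(2)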

The same approach could be used to go from 20 pairs (for example) of unencrypted active log data sets to 20 pairs of encrypted active log data sets in an online way (referring here to exploitation of the data set encryption feature of z/OS):

  1. Dynamically add 20 pairs of active log data sets with which an encryption key label was associated at data set creation time.
  2. When the older unencrypted data sets have been archived, dynamically remove them from the Db2 subsystem's log inventory.
In these example use cases, I've used the phrase, "when the older (smaller and/or unencrypted) active log data sets have been archived, dynamically remove them." That suggests that trying to dynamically remove a not-yet-archived active log data set could be problematic. Do you need to worry about this? No. Why not? Because Db2 won't let you accidentally shoot yourself in the foot when using the REMOVELOG option of -SET LOG. Specifically:

  • Db2 won't let you remove an active log data set to which it is currently writing information.
  • Db2 won't let you remove a log data set in the pair that is next in line for the writing of log information.
  • Db2 won't let you remove an active log data set that has not been archived (i.e., an active log data set that is not in REUSABLE status).
  • Db2 won't let you remove an active log data set that is currently in use (for example, an active log data set that is being read by a RECOVER utility job).
If you try to dynamically remove an active log data set to which Db2 is currently writing, or one that is next in line for writing, or one that has not been archived (i.e., is not in the REUSABLE state), the -SET LOG command will fail with the message DSNJ391I. If the active log data set you're trying to dynamically remove does not have one of these characteristics but is currently in use by some process, that data set will be marked as REMOVAL PENDING, and message DSNJ393I will be issued. In that case, you can remove the data set from the log inventory by issuing -SET LOG with REMOVELOG again when the data set is no longer in use. Alternatively, if the Db2 subsystem is standalone in nature (as opposed to being a member of a Db2 data sharing group), the data set will be removed from the log inventory automatically when the subsystem is next recycled (in a data sharing environment, subsequent re-issuance of -SET LOG with the REMOVELOG option is required to remove a REMOVAL PENDING data set from the log inventory). Note that if an active log data set has been marked as REMOVAL PENDING, it will not be used again by Db2 for read or write purposes. Note also that information about an active log data set that is in REMOVAL PENDING status can be checked via the output of the Db2 command -DISPLAY LOG DETAIL (the DETAIL option was added with function level 500 of Db2 13). When you see, in the output of -DISPLAY LOG DETAIL, that an active log data set in REMOVAL PENDING status has 0 readers, you know that it is no longer in use and can be physically removed from the log inventory with another issuance of -SET LOG with REMOVELOG.
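In command terms, handling a data set that has gone into REMOVAL PENDING status could look something like this (data set name made up): issue the first command to confirm that the data set shows 0 readers, then the second to complete its removal from the log inventory.

  -DISPLAY LOG DETAIL
  -SET LOG REMOVELOG(DSNP.LOGCOPY1.DS01) COPY(1)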

One more thing: I have been referring to removal of an active log data set "from the log inventory" of a Db2 subsystem. In the Db2 documentation, you'll see references to removal of an active log data set "from the BSDS" of a Db2 subsystem. The documentation is saying the same thing I'm saying. The BSDS - short for bootstrap data set - contains information about a Db2 subsystem's active and archive log data sets.

OK, there you have it. If you want to upgrade your active log data sets in one or more ways - maybe bigger than they are now, maybe encrypted versus unencrypted - then the REMOVELOG option of -SET LOG (thanks, Db2 13), together with the NEWLOG option (thanks, Db2 10) is your ticket for getting that done without having to stop the Db2 subsystem in question. Just another way that Db2 for z/OS enables you to take high availability higher than ever before.

Wednesday, January 31, 2024

Db2 for z/OS: Stop Making APPLCOMPAT in ZPARM More Important Than It Is

The APPLCOMPAT option of the Db2 for z/OS BIND and REBIND PACKAGE commands is really important - that's why I posted part 1 and part 2 blog entries on the topic back in 2019. The APPLCOMPAT parameter in ZPARM, on the other hand (referring to DSNZPARM, the Db2 subsystem parameter module that contains a Db2 subsystem's configuration parameter settings), is less important. I pointed this out in part 1 of the aforementioned two-part blog entry on APPLCOMPAT, but I still find that plenty of Db2 for z/OS people ascribe significance to the ZPARM parameter APPLCOMPAT that just doesn't jibe with reality. That being the case, I am writing this blog entry in the hope that it will help to drive home the point that the ZPARM parameter called APPLCOMPAT should (generally speaking) not be the main focus of your APPLCOMPAT-related concerns.

To illustrate the point that plenty of people continue to over-inflate the importance of the APPLCOMPAT parameter in ZPARM, I'll share with you a question that a Db2 for z/OS person sent to me by way of one of my colleagues. The question was, basically, "We are getting ready to activate Db2 12 function level V12R1M510 (a prerequisite for migration from Db2 12 to Db2 13 for z/OS). Can we be pretty much assured that doing this will not cause SQL behavioral changes if we leave the value of APPLCOMPAT in ZPARM unchanged (they had this ZPARM parameter set to V10R1)?" In responding to this question, I explained that in advancing a Db2 system's active function level, one can indeed protect application programs from the risk of SQL behavioral changes (I'll explain what that means in a moment), but, I noted, this SQL behavioral change protection is provided by the APPLCOMPAT package bind specification, NOT by the APPLCOMPAT parameter in ZPARM. You can take a Db2 system's active function level as high as you want, and that will not lead to application-affecting SQL behavioral changes as long as you don't change the APPLCOMPAT value of your applications' Db2 packages. The value of the APPLCOMPAT parameter in ZPARM is only somewhat relevant to this discussion.

OK, what's all this about "SQL behavioral changes?" The term refers to this situation: same SQL statement, same data, different result. You might think, "How could that happen?" Well, every now and then, the Db2 for z/OS development team decides that the behavior of a given SQL statement should change, for one reason or another (and it's well-considered - these changes are not effected lightly). That change can be introduced with a new version or a new function level of Db2. My favorite example of a Db2 for z/OS SQL behavioral change is one that happened with Db2 11. In a Db2 10 environment, you could use a SQL statement to cast an eight-byte store clock value (a time value that a program can obtain from the z/OS operating system) as a Db2 timestamp value. In a Db2 11 system, that same SQL statement - cast an eight-byte store clock value as a Db2 timestamp - would fail with a -180 SQL error code. Same SQL statement, same data, different result in a Db2 11 versus a Db2 10 environment.

Here's one reason I really like this example: how many programs do you have that need to cast an eight-byte store clock value as a Db2 timestamp? Probably none - and this is typically the case for Db2 SQL behavioral changes - they usually affect either zero or very few of an organization's Db2-accessing programs. Alright, but what if you did have programs that needed the Db2 10 behavior of the Db2 TIMESTAMP function? Would you have been up a creek when your Db2 system went from Db2 10 to Db2 11? No - you would have been fine in that case, because you could just bind the Db2 packages used by those programs with APPLCOMPAT(V10R1), and that would mean that the programs would execute with Db2 10 SQL behavior, and that would mean that those programs could cast an eight-byte store clock value as a Db2 timestamp. See - it's the APPLCOMPAT package bind specification that provides protection (when needed) from Db2 SQL behavioral changes.

[By the way, in the Db2 for z/OS documentation, what I have been calling "SQL behavioral changes" are referred to as "SQL incompatibilities." These are documented for each Db2 application compatibility level, going back to V10R1 (that's as far back as Db2 application compatibility goes).]

So, I said up front that the APPLCOMPAT parameter in ZPARM is not as important as the APPLCOMPAT specification for your Db2 packages. Does that mean that the ZPARM has no significance? No. What is the purpose of the APPLCOMPAT parameter in ZPARM? It's this: the ZPARM parameter provides the default value for a package's APPLCOMPAT setting when the BIND PACKAGE command is issued without an APPLCOMPAT specification. That's it. I tell people to think of APPLCOMPAT in ZPARM as being like a cubbyhole. A BIND PACKAGE command may be issued without an APPLCOMPAT specification. The package in question needs an APPLCOMPAT value. Where is Db2 going to get that value, when the value was not provided via the BIND PACKAGE command? Db2 in that case is going to look in the cubbyhole labeled APPLCOMPAT in ZPARM. In that cubbyhole is a piece of paper (figuratively speaking) on which (for example) V12R1M509 is written. OK, that will be the package's APPLCOMPAT value.
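To make the cubbyhole point concrete, consider these two BIND PACKAGE commands (collection and member names made up for illustration):

  BIND PACKAGE(COLL_A) MEMBER(PROG1) APPLCOMPAT(V12R1M510)
  BIND PACKAGE(COLL_A) MEMBER(PROG1)

The first command gives the package an APPLCOMPAT value of V12R1M510, no matter what the ZPARM parameter says. The second command omits APPLCOMPAT, so the package gets whatever value is in the cubbyhole - V12R1M509, in the example above.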

[You might wonder: what if REBIND PACKAGE is issued without an APPLCOMPAT specification? Will the rebound package in that case get the APPLCOMPAT value to which the ZPARM parameter APPLCOMPAT has been set? Probably not. Why not? Because it is very likely that a package being rebound already has an APPLCOMPAT value, and in that case if the REBIND PACKAGE command is issued without an APPLCOMPAT specification then the package's current APPLCOMPAT value will be retained. For REBIND PACKAGE, then, the APPLCOMPAT parameter in ZPARM is relevant only when the REBIND PACKAGE command is issued without an APPLCOMPAT specification and the package in question does not already have an APPLCOMPAT value (again, unlikely, though not impossible - you can check on this via a query of the SYSIBM.SYSPACKAGE catalog table, which has a column named APPLCOMPAT).]
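If you want to check, a catalog query along these lines (just a sketch - add predicates as needed for your environment) will show each package's APPLCOMPAT value; for a package last bound under an older Db2 release, the column may simply be blank:

  SELECT COLLID, NAME, VERSION, APPLCOMPAT
  FROM SYSIBM.SYSPACKAGE
  ORDER BY COLLID, NAME;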

Given that APPLCOMPAT in ZPARM simply provides the default value for APPLCOMPAT when BIND PACKAGE is issued without an APPLCOMPAT specification, what should the value of this ZPARM parameter be? There isn't a right or wrong answer to this question - it's up to you. Personally, I'd lean towards making the value of APPLCOMPAT in ZPARM as high as it can be, which would be equal to the currently active function level in a Db2 system. Why would I want that? Because APPLCOMPAT, in addition to providing protection (when needed) from Db2 SQL incompatibilities, also enables use of newer SQL syntax and functionality. If I have APPLCOMPAT in ZPARM set to, for example, V10R1, and BIND PACKAGE at my site is typically issued without an APPLCOMPAT specification, I am limiting application programmers to SQL syntax and functionality only up to the Db2 10 level - can't use newer built-in functions such as LISTAGG and PERCENTILE_CONT, can't use Db2 global variables, can't use Db2 arrays, can't use newer special registers such as CURRENT LOCK TIMEOUT, etc. Is that what you want? Sure, if a program using one of those newer SQL capabilities fails at bind time because of the default V10R1 APPLCOMPAT value, you can fix that problem by issuing BIND PACKAGE a second time with an APPLCOMPAT specification that is current enough to support the desired functionality, but again, is that what you want?

At some Db2 for z/OS sites, APPLCOMPAT in ZPARM is indeed set at V10R1. Why so low? One reason may be the misunderstanding (to which I've referred) of the purpose of the ZPARM parameter. Alternatively, maybe APPLCOMPAT in ZPARM is set at V10R1 because of concern about BIND PACKAGE issued for programs that aren't net new but rather have had a bit of an SQL change (which would then require a new precompile and BIND in the case of a static SQL-issuing program, as opposed to a REBIND). A person might think, "What if there's an existing program with 20 static SQL statements, and a programmer changes just one of those statements? When there is a BIND PACKAGE (with ADD or REPLACE, as the case may be) for that program's Db2 package, and the BIND PACKAGE is issued without an APPLCOMPAT specification, I want a from-the-ZPARM-parameter default APPLCOMPAT value that will have the 19 SQL statements that weren't changed behaving as they always have." OK. I get that. Like I said, it's up to you. Just keep in mind that the risk of adverse impact on your programs from Db2 SQL incompatibilities is usually very low - these incompatibilities are relatively few and far between, and they tend to affect few if any of your Db2-accessing programs.

The main point I want to make is this: when you change the value of the APPLCOMPAT parameter in ZPARM, that action is not, in and of itself, going to cause Db2-accessing programs in your environment to suddenly start behaving differently. All you've done with the ZPARM parameter update is change the APPLCOMPAT value that a package will get if a BIND PACKAGE command is issued without an APPLCOMPAT specification. Consider the ZPARM value change in that light, and act accordingly.

Wednesday, December 27, 2023

Db2 for z/OS: Code Level, Catalog Level, Function Level, and More

In a Db2 for z/OS context, the terms "code level," "catalog level," and "function level" were introduced when the Db2 for z/OS development team went to the continuous delivery mechanism for delivering new product functionality in-between the availability dates of new versions of the DBMS. That was a little over 7 years ago, referring to the general availability of Db2 12 - the first continuous-delivery version of Db2 for z/OS. And yet, there remains a good bit of misunderstanding among some in the Db2 for z/OS user community regarding basic concepts that are of foundational importance in a continuous-delivery sense. That being the case, I'll try to shed some clarifying light on the subject via this blog entry.


Think about an upside-down hierarchy

By "upside-down" hierarchy, I mean one that goes bottom-up from a dependency perspective. At the bottom - the foundation - you have the Db2 system's code level. This has to do largely with the currency of the code in the Db2 system's load library (or libraries, if this is a Db2 data sharing group - it's a best practice in that case for each member subsystem to have its own load library). This is the code that is loaded into memory when the Db2 system is started. You obviously don't have availability of Db2 functionality if that functionality is not present in the Db2 system's code; so, if a Db2 system's code level is 121505 (indicating code that includes functionality delivered up through Db2 version 12 function level 505), you can't create a stored procedure with CREATE OR REPLACE syntax because that syntax was introduced with Db2 12 function level (FL) 507 - by definition, a Db2 12 FL505 code level does not include functionality first delivered by Db2 12 FL507 code.

I mentioned that a Db2 for z/OS system's code level is generally reflective of the currency of the Db2 code in question. Here's what that means: over the course of time, it's normal for the code of a Db2 system (and for other subsystems in a z/OS LPAR and for the z/OS LPAR itself) to be taken to a more-current maintenance level - ideally, this will be done 2-4 times per year, and often the aim is to take the code in the z/OS LPAR (Db2 code included) to a higher RSU level (RSU - short for Recommended Service Upgrade - is a packaging of z/OS and z/OS-related software maintenance that facilitates upgrading the service currency of a z/OS system). This process involves application of PTFs ("fixes," in z/OS parlance) to code in a z/OS system, including Db2 code. Maybe, in the course of one of these service-upgrade procedures, the fix for APAR PH33727 is applied to the system's Db2 code (that which a fix "fixes" is described via the associated APAR, i.e., the APAR describes what is changed or enhanced by the fix). APAR PH33727 is the one associated with Db2 12 function level 510, and when the corresponding PTF gets applied to a Db2 system's code then that system's Db2 code level will go to 121510. Does that mean that functionality delivered through Db2 12 function level 510 is now available in the system? No - there are further dependencies in the bottom-up hierarchy.


Next Db2 level up from code: catalog

The Db2 catalog is the set of tables that basically contain metadata - "data about the data," and about the related Db2 structures (e.g., tables, table spaces, indexes) and other associated database objects (e.g., packages, routines). Sometimes, a Db2 function level introduces new Db2 features that have catalog dependencies - in other words, these are new features that cannot be used until some Db2 catalog changes that support the new features have been effected. Take, for example, Db2 12 function level 509. That function level introduced the ability to specify a data compression type at the individual table space level, or at the partition level for a range-partitioned table space (two data compression types are available in a Db2 for z/OS system - one, which is based on the Lempel-Ziv compression algorithm, is referred to as fixed-length, and the other is Huffman compression). For a Db2 DBA to be able to utilize this feature, the first requirement is the ability to specify COMPRESS YES FIXEDLENGTH or COMPRESS YES HUFFMAN in a CREATE or ALTER TABLESPACE statement. That ability is provided in the Db2 code starting with code level 121509; however, the new forms of the COMPRESS YES clause can't be used unless Db2 can record in the catalog the fact that fixed-length or Huffman compression is used for a given table space or table space partition. That cataloging capability is provided by the COMPRESS_USED column that is added to the catalog table SYSIBM.SYSTABLEPART when the Db2 catalog level goes to V12R1M509 - hence, getting the catalog level to V12R1M509 is required for compression-type specification at the table space or partition level in a Db2 12 system (by the way, "fixed length," in a Db2 data compression context, does not refer to the length of rows in a table - it refers to the length of substitution values in a compression dictionary).
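By way of illustration (database, table space and partition identifiers made up), the new forms of the clause look something like this - the first statement at the table space level, the second at the partition level for a range-partitioned table space:

  ALTER TABLESPACE DB1.TS1 COMPRESS YES HUFFMAN;
  ALTER TABLESPACE DB1.TS2 ALTER PARTITION 3 COMPRESS YES FIXEDLENGTH;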

When there is a requirement to take a Db2 catalog to a higher level, that change is accomplished via execution of the Db2 utility called CATMAINT, with a specification of (for example) UPDATE LEVEL(V12R1M509). Note that if a Db2 system's catalog is currently at, say, the V12R1M500 level, it can be taken straight to the V12R1M509 level with one execution of CATMAINT - that one execution of the utility would make the catalog changes associated with level 509, and also the changes associated with other catalog levels between 500 and 509.
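The utility control statement for that example would simply be:

  CATMAINT UPDATE LEVEL(V12R1M509)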

Sometimes, a Db2 function level introduces new capabilities that do not require catalog changes. In such cases, the catalog only has to be at the level related to the last preceding function level that did require catalog changes. For example, the features of Db2 12 function level 510 have no catalog dependencies; thus, there is no 510 catalog level, and use of Db2 12 FL510 functionality can be available when the Db2 system's catalog level is V12R1M509 (the description of a function level in the Db2 for z/OS documentation always lets you know if the function level requires catalog changes).

I mentioned in the preceding paragraph that Db2 12 FL510 functionality "can be" available when the Db2 system's catalog level is V12R1M509. Does that mean that something other than a catalog level change can be required to use the features of a Db2 function level? Yep - that's exactly what that means.


Next level up: activated function level

For the continuous delivery mechanism for Db2 new-function delivery to work in a practical sense, the "turning on" of a function level's new features had to be made an asynchronous event with respect to the up-leveling of Db2 code that would introduce the features to the Db2 subsystem's load library. If this were not the case - if, instead, a Db2 code level's new features were instantly available once present from a load library perspective - then Db2 for z/OS systems programmers might hesitate to upgrade the maintenance level of a Db2 system out of concern about readiness to provide support and guidance in the use of the new features. That would not be a good thing - z/OS and z/OS subsystems function best when they are at a relatively current level of maintenance.

The means through which adding new features to Db2 code is made asynchronous to having that new code be usable in a Db2 system is the Db2 command -ACTIVATE FUNCTION LEVEL; so, a Db2 system's code level might be 121509, and the system's Db2 catalog level might be V12R1M509, but the previously-mentioned ability to issue ALTER TABLESPACE (or CREATE TABLESPACE) with a COMPRESS YES HUFFMAN specification won't be there until a Db2 administrator has issued the command -ACTIVATE FUNCTION LEVEL (V12R1M509). Thanks to the -ACTIVATE FUNCTION LEVEL command, a Db2-using organization can decide when they want the features introduced in a Db2 code level to be usable in their Db2 environment.
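For reference, the command itself is short and sweet:

  -ACTIVATE FUNCTION LEVEL (V12R1M509)

The result of the activation can then be verified with the -DISPLAY GROUP command, which I'll get to in a moment.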

So, does the -ACTIVATE FUNCTION LEVEL command put us at the top of our upside-down Db2 continuous delivery hierarchy? Not quite. One to go...


The last level: application compatibility

In a typical production Db2 for z/OS system, there's a lot going on - lots of different applications accessing Db2 for z/OS-managed data, lots of DBA activity related to administering the system, lots of new-program deployment action, etc. In light of that fact, the -ACTIVATE FUNCTION LEVEL command is a pretty big switch. What if the immediate need that an organization has for a given Db2 function level is related to exploitation of a new feature for a single application, or for one particular database administration task? Db2 application compatibility levels provide a way to very selectively exercise functionality that has been newly activated in a Db2 system. Db2 application compatibility levels are managed primarily through a Db2 package bind parameter called APPLCOMPAT (you might want to check out the part 1 and part 2 entries on APPLCOMPAT that I posted to this blog a few years ago). Returning to the previously used example, let's say that a Db2 DBA wants to alter a table space to use Huffman compression. Is it enough for the Db2 system's code level to be 121509, and for the catalog level to be V12R1M509, and for V12R1M509 to be the activated function level? No - that's not enough. The DBA will issue the ALTER TABLESPACE statement with a COMPRESS YES HUFFMAN specification by way of a Db2 package (there is always a package associated with execution of a Db2 SQL statement). That package might be related to one of the Db2-provided programs often used by DBAs to do their work - maybe SPUFI, or DSNTEP2. The package, like all packages, will have an APPLCOMPAT specification. For the ALTER TABLESPACE with COMPRESS YES HUFFMAN to execute successfully, the package through which the statement is issued - DSNTEP2, let's say - must have an APPLCOMPAT specification of not less than V12R1M509.

As this example suggests, a package's APPLCOMPAT value enables a program that issues SQL through the package to utilize SQL syntax that was introduced with a given Db2 function level. That is one purpose of the APPLCOMPAT package bind specification. The other purpose of APPLCOMPAT is to enable a program to get the SQL behavior of an earlier version and function level of Db2 for z/OS, if that older SQL behavior is needed. See, there are times when, going from one version or function level of Db2 to another, the behavior of a SQL statement will change. What does that mean? It means same SQL statement, same data, different result. This kind of change is referred to in the Db2 for z/OS documentation as a SQL incompatibility. There are times when a program executing in a Db2 system with function level X activated needs the behavior that a SQL statement had with a Db2 version or function level that is older than X. APPLCOMPAT can deliver, for this program, that older Db2 behavior. Here's an example: suppose that a DBA named Steve needs to create a non-universal table space in a Db2 system that he administers, and let's say that the activated function level for this system is V12R1M510. It's a fact that, starting with function level V12R1M504, a CREATE TABLESPACE statement can only create a universal table space. Is Steve stuck? No. Steve can create the needed non-universal table space by using a program (we'll say the Db2-provided DSNTEP2) whose package has an APPLCOMPAT value of V12R1M503. What if the DSNTEP2 package at Steve's shop has an APPLCOMPAT value of V12R1M504 or higher? No problem: Steve just needs to make sure that the first SQL statement issued by his DSNTEP2 job is SET CURRENT APPLICATION COMPATIBILITY = 'V12R1M503'; then, a CREATE TABLESPACE statement can be issued to create a non-universal table space (this scenario is described in an entry I posted to this blog in 2020). Note that SET CURRENT APPLICATION COMPATIBILITY can be used (with a dynamic SQL-issuing program) to dynamically take a program's application compatibility level to something below - but not above - the APPLCOMPAT level of the program's Db2 package.
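In SQL terms, the input to Steve's DSNTEP2 job might look something like this (object names made up; at application compatibility level V12R1M503, a CREATE TABLESPACE with SEGSIZE and no MAXPARTITIONS or NUMPARTS specification yields a non-universal, segmented table space):

  SET CURRENT APPLICATION COMPATIBILITY = 'V12R1M503';
  CREATE TABLESPACE TS_LEGACY IN DB_LEGACY SEGSIZE 32;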

For many Db2 DDF-using client-server applications (applications known as DRDA requesters in Db2-speak), the Db2 packages used will be those related to the IBM Data Server Driver. These packages reside, by default, in a collection named NULLID. If the NULLID packages have an APPLCOMPAT value of X, and some DRDA requester application requires an APPLCOMPAT value that is higher or lower than X, this need is often satisfied by BIND COPY-ing the packages in NULLID into an alternate collection (with the required APPLCOMPAT specification), and then using the Db2 profile tables to automatically direct the DRDA requester application in question to the alternate IBM Data Server Driver package collection. This technique is described in an entry I posted to this blog a while back, and while that entry refers to IBM Data Server Driver packages BIND COPY-ed into an alternate collection with an alternate RELEASE specification, the same thing can be done for IBM Data Server Driver packages that have an alternate APPLCOMPAT specification.
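Hedging on the particulars (the target collection name here is made up, and the driver package names at your site will depend on the driver level in use), the BIND COPY action for one such package might look something like this:

  BIND PACKAGE(NULLID_M510) COPY(NULLID.SYSLH200) APPLCOMPAT(V12R1M510) ACTION(REPLACE)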

OK, so remember this bottom-up thing:

  1. First, the feature you want to use needs to be present in your Db2 system's code - that's the code level.
  2. The feature you want to use may have a catalog level requirement - that's the catalog level. You can't take the catalog level to X (via execution of the Db2 CATMAINT utility) unless the system's Db2 code level is at least X.
  3. When the code and catalog levels are right for the Db2 feature you want to use, you need to make sure that the appropriate function level has been activated on the Db2 system - that's the activated function level. Function level X cannot be activated unless the Db2 system's code level is at least X and the Db2 catalog level is at least X (or, if function level X has no catalog dependencies, the catalog level has to be at least the level of the last preceding function level that did have catalog dependencies).
  4. For your program to use the Db2 feature of interest, the application compatibility level has to be set as needed - that's done via the APPLCOMPAT value of the program's Db2 package (or by execution of SET CURRENT APPLICATION COMPATIBILITY, if the program issues dynamic SQL statements and if you need to take the application compatibility level lower than the package's APPLCOMPAT value). A package's application compatibility level cannot be set to X unless the activated function level of the Db2 system is at least X.

Checking on all this in your Db2 for z/OS environment

To see the upside-down hierarchical lay of the land in your Db2 environment, issue the Db2 command -DISPLAY GROUP. The output will look something like this (and don't be misled by the word GROUP in the command - it is applicable to a standalone Db2 subsystem as well as to a Db2 data sharing group):

*** BEGIN DISPLAY OF GROUP(DSNPROD ) CATALOG LEVEL(V13R1M100)
                  CURRENT FUNCTION LEVEL(V13R1M100)
                  HIGHEST ACTIVATED FUNCTION LEVEL(V13R1M100)
                  HIGHEST POSSIBLE FUNCTION LEVEL(V13R1M500)
                  PROTOCOL LEVEL(2)
GROUP ATTACH NAME(DSNP)
-----------------------------------------------------------------
DB2        SUB                    DB2    SYSTEM   IRLM
MEMBER  ID SYS  CMDPREF  STATUS   LVL    NAME     SUBSYS IRLMPROC
------- -- ---- -------- -------- ------ -------- ----   --------
DB1P     1 DB1P DB1P     ACTIVE   131503 SYS1     IR1P   DB1PIRLM
DB2P     2 DB2P DB2P     ACTIVE   131503 SYS2     IR2P   DB2PIRLM
-----------------------------------------------------------------
SCA STRUCTURE SIZE:   36864 KB,  STATUS= AC,  SCA IN USE:     1 %
LOCK1 STRUCTURE SIZE: 17408 KB
NUMBER LOCK ENTRIES:   2097152
NUMBER LIST ENTRIES:     21415,  LIST ENTRIES IN USE:           3
SPT01 INLINE LENGTH:     32138
*** END DISPLAY OF GROUP(DSNPROD )

For this 2-member Db2 data sharing group, the code level - shown in the DB2 LVL column for each member - is 131503 (Db2 for z/OS Version 13, function level 503 code). The catalog level, shown in the CATALOG LEVEL field at the top of the output, is V13R1M100. The activated function level, shown in the HIGHEST ACTIVATED FUNCTION LEVEL field, is V13R1M100. As for a package's APPLCOMPAT level, you can see that via a query of the Db2 catalog table SYSIBM.SYSPACKAGE (check the value in the APPLCOMPAT column for the package's row in the table).

I hope that this information will be useful for you. The end of 2023 is around the corner. I'll post more in '24.

Wednesday, November 29, 2023

Db2 13 for z/OS: Autobind Phase-In

Db2 13 function level 504 became available last month (October 2023), via the fix for APAR PH54919. One of the new capabilities delivered with FL504 is something called autobind phase-in. I like that new feature a lot, and I think you will, too - especially if you're a Db2 for z/OS DBA. In this blog entry I'll explain what autobind phase-in is, why it's a very welcome addition to Db2 functionality, and how you can get ready to leverage the feature, even before you've activated function level V13R1M504.

First, a shout-out to my coworker Dengfeng Gao, a member of the IBM Db2 for z/OS development team. Dengfeng had a lot to do with making autobind phase-in a reality, and she provided me with much of the information I'm now passing along to you. Thanks, Dengfeng!

OK, to begin the story...


The way things were

Way, way back (early 1990s, as I recall), Db2 for z/OS introduced packages. For a Db2-accessing program that issues static SQL statements, you can think of the associated package as being, in essence, the compiled and executable form of the program's SQL statements (what distinguishes static SQL statements: they are prepared for execution prior to being issued by a program, via a Db2 process known as bind). When a static SQL-issuing program executes (this could be, for example, a CICS transaction, or a batch job, or a stored procedure, or a Db2 REST service), the program's package is allocated to the Db2 thread being used by the application process, and the part of the package corresponding to a particular SQL statement is executed when it's time to run that SQL statement.

SQL statements, by and large, reference Db2 tables; thus, packages are dependent on the tables referenced by SQL statements associated with the packages. Packages are also dependent on database objects that are not referenced in SQL statements (indexes are a prime example), when those objects are part of a SQL statement's access plan (i.e., the paths and mechanisms by which data targeted by a SQL statement will be accessed - for example, via a nested loop join that will employ certain indexes on the outer and inner tables). The dependencies of packages on database objects (tables, table spaces, views, indexes, etc.) are recorded in the SYSIBM.SYSPACKDEP table in the Db2 catalog.

Sometimes, a database object on which a package depends is changed in a way that requires regeneration of the package; or, an object on which the package depends (such as an index) might be dropped; or, a privilege needed by the package's owner might be revoked. In such situations, the package in question is marked as "invalid" by Db2 (such a package will have a value of 'N' in the VALID column of the SYSIBM.SYSPACKAGE catalog table). When that happens (in a pre-Db2 13 FL504 environment), the package cannot be executed again until it is regenerated by Db2. That regeneration could be accomplished through a REBIND command issued for the package by a DBA (or issued from a batch job coded by a DBA); but, what if there is a request to execute an invalidated package before that package has been regenerated through a REBIND command? In that case, Db2 will automatically regenerate the package, and that process is called autobind (it's sometimes referred to as auto-rebind).
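As an aside, a catalog query along these lines (a sketch - add predicates as appropriate for your environment) can be used to find packages that have been invalidated and have not yet been rebound or autobound:

  SELECT COLLID, NAME, VERSION, LASTUSED
  FROM SYSIBM.SYSPACKAGE
  WHERE VALID = 'N'
  ORDER BY COLLID, NAME;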

When autobind happens for a package (again, we're talking about a pre-Db2 13 FL504 environment), it can be disruptive for the application(s) that drive execution of the package. This disruption can take several forms:

  • The application process whose request for execution of an invalidated package triggered the autobind has to wait until the autobind completes.
  • If another application process also requests execution of the invalidated package before the first autobind completes, that will result in a second attempt to autobind the package. That second attempt will have to wait, because it requires a lock on the package held by the first autobind process; thus, this second requester of the package will also sit and wait (if the first autobind finishes successfully and the second requester has not timed out in the meantime, the second requester will use the package as regenerated by the initial autobind process).
  • If the autobind fails (this could happen as the result of an authorization issue, among other things), the package will be marked as "inoperative" by Db2 (indicated by the value 'N' in the OPERATIVE column of the package's row in SYSIBM.SYSPACKAGE). In that case, any attempt to execute the package will fail until the package is explicitly rebound (usually, by a DBA).


This not-good situation changes dramatically (for the better) with Db2 13 FL504 autobind phase-in functionality. Before getting to that, I'll cover some prep work in which DBAs will want to engage.


Laying the autobind phase-in groundwork: a new catalog table, and a new BIND/REBIND option

When function level 500 has been activated in a Db2 13 system, the CATMAINT utility can be executed to take the catalog level to V13R1M501. When that happens, some new tables get added to the catalog. One of those new catalog tables is SYSIBM.SYSPACKSTMTDEP. As the name implies, Db2 will use this table to record static SQL dependencies on database objects at the statement level. Does that just happen? Nope - and this is where DBA action comes in.

When function level V13R1M502 has been activated, new packages can be bound - and existing packages can be rebound - with the new DEPLEVEL option. If you bind or rebind a package with DEPLEVEL(STATEMENT) then Db2 will record statement-level dependency information in the SYSPACKSTMTDEP catalog table, in addition to recording package-level dependency information in SYSPACKDEP (if you bind or rebind with a specification of DEPLEVEL(PACKAGE), it'll be business as usual - only package-level dependency information will be recorded in the catalog).
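A rebind requesting statement-level dependency recording would look something like this (collection and package names made up):

  REBIND PACKAGE(COLL_A.PROG1) DEPLEVEL(STATEMENT)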

Would you like to make DEPLEVEL(STATEMENT) the default for package BIND and REBIND actions? If so, set the value of the ZPARM parameter PACKAGE_DEPENDENCY_LEVEL to STATEMENT.

Is there any downside to binding or rebinding packages with DEPLEVEL(STATEMENT)? A small one (in my opinion): because of the extra work of recording statement-level dependency information in SYSPACKSTMTDEP, binding or rebinding with DEPLEVEL(STATEMENT) will somewhat increase elapsed and CPU time for the bind/rebind operation. My expectation is that in a typical production Db2 for z/OS system, CPU consumption related to BIND and REBIND activity is a very small fraction of total CPU consumption.

Is there an upside to binding and rebinding packages with DEPLEVEL(STATEMENT)? Oh, yeah...


Enter autobind phase-in

Once a Db2 13 system's activated function level is V13R1M504 or higher, this is what happens when a package that has been bound or rebound with DEPLEVEL(STATEMENT) is invalidated: the first request to execute the package following invalidation (assuming the invalidated package wasn't explicitly rebound before that execution request) will trigger an autobind of the package.

"Wait," you might think. "Isn't that the same thing that happened before the advent of autobind phase-in?" Yes, but the autobind itself takes a very different path, and has a very different workload impact, versus the prior process. To wit:

  • The package will be regenerated in the background. The process that requested execution of the invalidated package will be allowed to execute the package - it will not have to wait for the autobind to complete.
  • When the invalidated package is executed, statements that were not invalidated by the action that invalidated the package (e.g., an ALTER of a table that is referenced by some - but not all - of the package's statements) will continue to execute as they did before the invalidation of the package.
  • Also when the invalidated package is executed, statements that were invalidated by the (for example) ALTER action will be incrementally bound when issued by the associated program. This means that they will be dynamically prepared for execution, and that will mean a temporary additional CPU cost (temporary until the in-the-background autobind completes the regeneration of the package that had been invalidated), but the statements will be executed.
  • And if, before the in-the-background autobind completes, there is a second request to execute the invalidated package, will that trigger a second autobind action? Nope - the one autobind is for any and all package requesters. That second requester will be allowed to execute the invalidated package, just as was the case for the requester that triggered the in-the-background autobind - still-valid statements will execute as usual, and invalidated statements will be incrementally bound and then executed.
  • When the in-the-background autobind has finished its work, the newly regenerated package will be phased into use, in much the same way that the rebind phase-in functionality introduced with Db2 12 FL505 phases a newly rebound package into use: the first request for execution of the package following completion of the in-the-background autobind will get the regenerated (and now valid) package. Eventually, processes that had been executing the previous instance of the package (the instance that had been invalidated) will be done with that, and all processes will be using the regenerated package when they request its execution.
  • If the in-the-background autobind fails, will the invalidated package be marked inoperative (with attendant error situations for processes that request execution of the package)? Nope. In that case, the package will be marked with rebind-advisory status ('R' in the OPERATIVE column for the package's row in SYSPACKAGE). The package can still be executed (as described above: as-usual for not-invalidated statements, incremental bind for invalidated statements), but an explicit REBIND is recommended to get the package regenerated and back into a valid status.


Bottom line: with autobind phase-in, autobind activity will have hugely less impact on throughput and service levels for applications that execute packages that have been invalidated.

Note that the above-described much-better autobind process applies only to packages that have been bound or rebound with DEPLEVEL(STATEMENT) - and you can start doing that (as previously mentioned) once you've activated function level 502 in a Db2 13 system.

One other item of information: the in-the-background autobind done for autobind phase-in will execute in access-path-reuse mode - in other words, Db2 will reuse the access paths previously utilized for the package's SQL statements, if that can be done (it of course could not be done for all statements if, for example, package invalidation resulted from the dropping of an index on which some of the package's SQL statements depended). The same goes for the incremental bind of invalidated SQL statements when the invalidated package is requested for execution before the in-the-background autobind has completed - access paths will be reused if possible.

OK, so if you've gotten to Db2 13 at your site, give major consideration to rebinding packages (and binding new packages) with DEPLEVEL(STATEMENT), once function level V13R1M502 or higher has been activated; and, look forward to a much more application workload-friendly autobind process when you get function level V13R1M504 activated.

Monday, October 30, 2023

Db2 13 for z/OS: A New Means for Managing RELEASE(DEALLOCATE) Packages

It has long been understood by many Db2 for z/OS DBAs that a combination of the RELEASE(DEALLOCATE) package bind specification (especially for frequently executed Db2 packages that consume little CPU time per execution) and persistent threads can significantly reduce in-Db2 CPU time for Db2-accessing applications (a "persistent thread" is one that persists through commits - examples include CICS-Db2 protected threads, the threads between IMS wait-for-input regions and Db2, the Db2 threads associated with batch jobs, and high-performance database access threads, aka high-performance DBATs). The CPU efficiency benefit of RELEASE(DEALLOCATE) + persistent threads comes mainly from avoiding the cost of constantly releasing the package in question (i.e., separating it from the thread) at a commit point, only to reallocate it to the thread when it is subsequently (often, very soon - maybe a fraction of a second later) again requested for execution (some additional CPU savings are achieved via retention across commits of "parent" locks acquired in the execution of the package - these are table space- or partition-level locks, and they are almost always non-exclusive in nature).

Always nice to get a Db2 workload CPU efficiency boost, but those CPU savings came at one time with several "flip side" concerns. One of those concerns - conflict with Db2 utilities caused by retained parent locks - was addressed a long time ago (back in the mid-1990s) with the advent of the drain locking mechanism that utilities can use to gain exclusive access to a database object. Another concern from days past had to do with virtual storage constraint - the combination of RELEASE(DEALLOCATE) and persistent threads causes said threads to consume more virtual storage, and when that virtual storage was within the quite-limited confines of the EDM pool, there was a real risk of that getting filled up and causing application failures if one were not careful. Thankfully, that virtual storage-related risk was eliminated with Db2 10, when the virtual storage space used for allocation of packages to threads for execution moved from the EDM pool to above-the-bar agent local pool space, of which there is a very large quantity.

That left us with one more operational challenge associated with the RELEASE(DEALLOCATE) + persistent threads combo: conflict between in-use packages and processes that need to either rebind or invalidate a package. See, a package can't be rebound or invalidated when it is in-use, and a RELEASE(DEALLOCATE) package allocated for execution to a persistent thread is considered by Db2 to be continuously in-use until the thread is terminated, and that could be a while. This last operational challenge was partly addressed with the rebind phase-in functionality that was introduced with function level 505 of Db2 12 for z/OS. OK, great - you can successfully and non-disruptively rebind a package even when the package is in-use at the time of the issuance of the REBIND PACKAGE command; but, what about the situation in which a package needs to be invalidated, perhaps as a result of execution of an ALTER statement targeting an object on which the package is dependent (or as a result of an online REORG that materializes a pending DDL change)? Db2 11, in new-function mode, provided an assist in this case: Db2 gained the ability to detect when a package-invalidating action was being blocked by a dependent package bound with RELEASE(DEALLOCATE) and allocated to a persistent thread - detecting that situation, Db2 could automatically, dynamically and temporarily change the package's behavior to RELEASE(COMMIT) at its next commit point. This enhancement, though welcome, did not fully eliminate the problem.

Why did Db2 11's automatic detection and reaction to a RELEASE(DEALLOCATE) package being allocated to a persistent thread not totally resolve the blocked package invalidation problem? Two reasons:

  1. The persistent thread to which the RELEASE(DEALLOCATE) package is allocated may be outside of Db2 - being, perhaps, in a between-the-last-and-the-next-transaction situation (this can particularly be an issue for high-performance DBATs).
  2. There might be several different dependent packages bound with  RELEASE(DEALLOCATE), and copies of these various packages could be allocated to a large number of threads, such that Db2 is not able to separate all the packages from all those threads in time to keep the package-invalidating action from timing out.
For these situations, Db2 13 provides the solution in the form of a new option that can be exercised via the profile tables (SYSIBM.DSN_PROFILE_TABLE and SYSIBM.DSN_PROFILE_ATTRIBUTES). The new option, available when Db2 13 function level 500 has been activated, comes in the form of a new keyword for the DSN_PROFILE_ATTRIBUTES table: RELEASE_PACKAGE. This keyword has one available associated ATTRIBUTE1 value: COMMIT, and specification of that value will cause Db2 to override RELEASE(DEALLOCATE) behavior for the package (or packages) associated with a profile to RELEASE(COMMIT) behavior, until further notice. And, this new profile table option can be used for local-to-Db2 applications (e.g., CICS-Db2 or IMS TM-Db2 or batch Db2) as well as for DDF-using applications (the profile tables had previously, essentially, been relevant only to DDF-using applications).

Let's consider a use-case scenario to illustrate exploitation of this Db2 13-delivered capability. Suppose you're a Db2 DBA, and you need to issue an ALTER TABLE statement that will invalidate several packages, in collection COLL_A, that are bound with RELEASE(DEALLOCATE) and are executed by way of persistent threads. Not wanting this ALTER TABLE statement to time out due to conflict with the dependent RELEASE(DEALLOCATE) packages, you take this approach (and we'll assume that this is a Db2 13 system with function level 500 or higher activated):

  1. Sometime prior to the time you want to issue the ALTER TABLE statement (maybe 15 minutes ahead of that time, as an example), you insert a row in DSN_PROFILE_TABLE to identify a profile that is associated with the collection COLL_A (you could also do this for a specific package in COLL_A, but we'll do this for the whole collection in this case). In doing this, you put the value 'Y' in the PROFILE_ENABLED column, to let Db2 know that this profile (and its associated attributes) is "live" (i.e., in effect).
  2. The just-created profile, like all profiles, has an ID associated with it (an integer value). We'll say that the ID for this COLL_A-related profile is 5. In the DSN_PROFILE_ATTRIBUTES table, you insert a row for profile 5. In that row, you specify 'RELEASE_PACKAGE' for the KEYWORDS column value, and 'COMMIT' for the ATTRIBUTE1 value. For the ATTRIBUTE2 column you specify a value of 1, because (in this example) you want the RELEASE(DEALLOCATE) override action to apply only to local-to-Db2 processes (a value of NULL in the ATTRIBUTE2 column would indicate that the RELEASE(DEALLOCATE) override action is to be taken only for DDF-using processes, and a value of 2 would mean, "Take this action for all related processes, whether local-to-Db2 or involving DDF").
  3. You issue the Db2 command -START PROFILE, so that Db2 will load the information in the profile tables into memory. Db2 sees that profile 5 (among, possibly, others) is enabled, and takes action: every time a package in COLL_A is loaded for execution, it will be treated as though bound with RELEASE(COMMIT), even if RELEASE(DEALLOCATE) had been specified for the most recent bind or rebind of the package.
  4. Because you took these steps 15 minutes prior to the time for issuing the ALTER TABLE statement, Db2 had plenty of time to switch to RELEASE(COMMIT) behavior for every instance of a RELEASE(DEALLOCATE) package in COLL_A that is allocated for execution to a persistent thread. You issue the ALTER TABLE statement, and it succeeds because there are no dependent packages bound with RELEASE(DEALLOCATE) and allocated to persistent threads to block execution of the statement. Note that the application workload associated with the RELEASE(DEALLOCATE) packages continues to execute - just with RELEASE(COMMIT) behavior in effect for those packages. That means you temporarily do without the CPU efficiency benefit of RELEASE(DEALLOCATE) for the associated application(s).
  5. With the ALTER TABLE statement having been successfully executed, you update the row for profile 5 in DSN_PROFILE_TABLE to have 'N' in the PROFILE_ENABLED column, to show Db2 that this profile (and its attributes) is no longer in effect. A subsequent issuance of the -START PROFILE command lets Db2 know of this profile status change by re-loading the profile table information into memory.
And that's it. A problem that might have been standing in your way is taken care of, easily and non-disruptively.
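Here is a rough SQL sketch of steps 1, 2 and 5 of that scenario (the profile ID, collection name and column lists are illustrative - verify the DSN_PROFILE_TABLE and DSN_PROFILE_ATTRIBUTES column definitions in your Db2 13 catalog before adapting this, and remember that -START PROFILE is issued as a Db2 command after each change):

  -- Step 1: create and enable a profile for collection COLL_A (profile ID 5 is made up)
  INSERT INTO SYSIBM.DSN_PROFILE_TABLE
    (PROFILEID, COLLID, PROFILE_ENABLED, PROFILE_TIMESTAMP)
  VALUES (5, 'COLL_A', 'Y', CURRENT TIMESTAMP);

  -- Step 2: request RELEASE(COMMIT) behavior for local-to-Db2 processes (ATTRIBUTE2 = 1)
  INSERT INTO SYSIBM.DSN_PROFILE_ATTRIBUTES
    (PROFILEID, KEYWORDS, ATTRIBUTE1, ATTRIBUTE2, ATTRIBUTE_TIMESTAMP)
  VALUES (5, 'RELEASE_PACKAGE', 'COMMIT', 1, CURRENT TIMESTAMP);

  -- Step 5: after the ALTER TABLE has succeeded, disable the profile
  UPDATE SYSIBM.DSN_PROFILE_TABLE
    SET PROFILE_ENABLED = 'N'
  WHERE PROFILEID = 5;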

Of course, this story is not yet over. With the ALTER TABLE statement having been successfully executed, packages dependent on the table are invalidated. What happens after that? The invalidated packages will be auto-rebound by Db2 when next requested for execution, if you don't explicitly rebind them before that happens. Db2 13 function level 504, which came out just a few days ago, delivers big news on the auto-rebind front. I'll post a blog entry on that enhancement within the next few weeks.

Wednesday, September 27, 2023

Db2 for z/OS: Two Stories of Temporary Tables

Db2 for z/OS provides, for your use, two different temporary table types: declared and created. I recently worked with Db2 for z/OS people at a couple of different sites, providing assistance in understanding and effectively utilizing Db2 temporary tables, and I think information related to these experiences might be useful to folks in the larger user community - thus this blog entry. The first story points up an advantage of declared global temporary tables, while the second illustrates advantageous use of a created global temporary table. I hope that what follows will be of interest to you.


Db2 declared global temporary tables - the index advantage

A Db2 declared global temporary table (DGTT) is so-called because it comes into existence when it is declared - usually in an application program. There can be plenty of reasons for a program to declare a temporary Db2 table. For example, an application might need to get a preliminary set of result set rows from Db2 and then operate further on that data - easily done by inserting the result set rows into a declared temporary table and subsequently issuing various SQL statements that target the DGTT.

The other day, a Db2 for z/OS DBA sent me some information about a use case at his site that made a declared global temporary table the right choice. Because the temporary table in this case was going to be fairly large, and because DELETE statements targeting the table would be coded with predicates, there would be a performance advantage in using a declared rather than a created global temporary table (CGTT): an index can be defined on a DGTT, but not on a CGTT (this and other differences between declared and created global temporary tables are listed in the Db2 for z/OS online documentation).
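
To make that concrete, here is a minimal sketch - with made-up table, column and index names - of the kind of thing the DBA's team could do (a DGTT, and an index on a DGTT, are always qualified with SESSION):

   -- declare the temporary table (hypothetical names)
   DECLARE GLOBAL TEMPORARY TABLE SESSION.WORK_ORDERS
     (ORDER_NO  INTEGER       NOT NULL,
      CUST_ID   INTEGER       NOT NULL,
      ORDER_AMT DECIMAL(11,2) NOT NULL)
     ON COMMIT PRESERVE ROWS;

   -- an index can be defined on the DGTT (not possible for a CGTT)
   CREATE INDEX SESSION.WORK_ORDERS_IX1
     ON SESSION.WORK_ORDERS (CUST_ID);

   -- predicate-coded DELETEs (and SELECTs) can then benefit from the index
   DELETE FROM SESSION.WORK_ORDERS
     WHERE CUST_ID = 12345;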

So, with the available option of indexing, DGTTs will generally be preferred over CGTTs, right? Not necessarily...


When a CGTT can majorly reduce elapsed and CPU time for a process

The second Db2 temporary table story I'll relate came to me by way of a Db2 for z/OS systems programmer who works for a financial services organization. In this case, a z/OS-based COBOL batch program was retrieving data from a VSAM file and inserting related information into a Db2 temporary table. Rows were then SELECTed from the temporary table for further processing, after which the rows in the temporary table were deleted (via a predicate-less DELETE). These insert/retrieve/delete actions involving the temporary table were repeated about 300,000 times in an execution of the batch job. At first, the COBOL program declared and utilized a declared global temporary table. The problem? That approach did not perform as needed: the job consumed around 20 minutes of CPU time, and the SQL activity drove approximately 23 million GETPAGE requests (a GETPAGE happens when Db2 needs to examine the contents of a page of a database object, and GETPAGE activity is a major factor in the CPU cost of SQL statement execution). The Db2 sysprog also noticed that the batch process generated a lot of PREPARE activity (referring to the dynamic preparation of SQL statements for execution by Db2 - something that can significantly add to an application program's CPU consumption).

To try to reduce the CPU cost of the batch job, the Db2 team at this financial services organization switched from using a DGTT to a "permanent" (i.e., a "regular") Db2 table. Performance indeed got way better: GETPAGE requests dropped by over 70%, and CPU time for the job went from about 20 minutes to about 4 minutes. Why the big drop in GETPAGEs and CPU time? Probably this had to do with elimination of expensive SQL statement preparation activity. See, you might think that the SQL statements hard-coded in your COBOL (for example) program are all static ("static" SQL statements are prepared by Db2 for execution prior to program execution, via a process known as BIND), but when those statements refer to a DGTT they have to be dynamically prepared for execution when issued by the program because there is no definition of a DGTT in the Db2 catalog.

This big improvement in CPU efficiency notwithstanding, there was a side-effect of the switch from a DGTT to a permanent table that did not sit well with the Db2 team at the financial services company: as previously noted, the set of SQL statements targeting (initially) the DGTT and then the permanent Db2 table involved a delete of all rows in the table, and that happened 300,000 times in the execution of the batch job. When a permanent table was used in place of the DGTT, these 300,000 mass DELETEs (a mass DELETE is a DELETE without predicates) caused 300,000 rows for the permanent table to be inserted into the SYSCOPY table in the Db2 catalog. Running a MODIFY RECOVERY utility job to clear those rows out of SYSCOPY, and having to take an image copy of the permanent table to keep it from going into COPY-pending status, were viewed as significant hassles by the Db2 team. Was a still-better way forward available to them?

Indeed so. I suggested going with a created global temporary table. [Like a permanent table, a CGTT is defined by way of a CREATE statement, and there is information about a CGTT in the Db2 catalog. When a program references the CGTT, it gets its own instance of the CGTT, which (like a DGTT) is physically provisioned in the Db2 work file database.] The Db2 team did that, and the results were very positive. CPU time for the job - originally about 20 minutes with the DGTT and then about 4 minutes with the permanent table - went down to just over 2 minutes with the CGTT (as with the permanent table, no SQL statement dynamic preparation was needed, thanks to the CGTT being defined in the Db2 catalog); there were no inserts into SYSCOPY in association with the repeated mass DELETEs (same as with the DGTT); and there was no need for an image copy of the CGTT, because the instance of the table goes away automatically when the process using the table completes (same as with the DGTT). So, the CGTT in this case provided the advantages of a DGTT and of a permanent Db2 table, minus the relative disadvantages of those options (the dynamic statement preparation costs of the DGTT, and the mass-DELETE-related SYSCOPY inserts and image copy requirements of the permanent table).
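
For reference, here is a minimal sketch (again, with made-up names) of what the CGTT approach might look like - the CREATE is a one-time action, and because the table is defined in the Db2 catalog, the program's hard-coded statements referencing it can be bound as static SQL:

   -- one-time DDL (hypothetical table and column names)
   CREATE GLOBAL TEMPORARY TABLE BATCH.ACCT_WORK
     (ACCT_NO  CHAR(10)      NOT NULL,
      TRAN_AMT DECIMAL(11,2) NOT NULL);

   -- in the COBOL program, statements like these are static SQL - each
   -- program instance works with its own instance of the CGTT
   INSERT INTO BATCH.ACCT_WORK (ACCT_NO, TRAN_AMT)
     VALUES (:ACCT-NO, :TRAN-AMT);
   DECLARE C1 CURSOR FOR
     SELECT ACCT_NO, TRAN_AMT FROM BATCH.ACCT_WORK;
   DELETE FROM BATCH.ACCT_WORK;  -- mass delete: no SYSCOPY insert, no image copy needed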


The bottom-line message

Declared and created global temporary tables both have their places in a Db2 for z/OS application environment. When considering the use of a Db2 temporary table for an application, be careful not to jump too quickly to a DGTT versus a CGTT decision (though sometimes there will be only one choice - as when, for example, UPDATE access to the temporary table is needed - something that can only be done with a DGTT). Consider the DGTT versus CGTT choice in light of the particulars of the use case, and choose accordingly. A thoughtful choice can yield a substantial performance advantage - so use your head.

Thursday, August 31, 2023

Db2 for z/OS: An Important Difference Between Data in Memory and Data on Disk

For the past several years, I've tried to post an entry per month to this blog. Sometimes, it will be very near the end of a month, and I haven't posted anything since the previous month, and I don't have any good ideas for a new entry. Then, I'll have an exchange with someone - could be via email, a phone call, a face-to-face discussion, whatever - and BOOM! Right there I'll find the inspiration for a blog entry. That happened again yesterday - the day before the last day of the month. Whew!

Here's what happened: in an email message, an IT professional recounted to me a situation that had her perplexed. The scenario: 11,000 rows of data were loaded into a table using the IBM Db2 for z/OS LOAD utility. Immediately after the completion of that load job, a program that updated 8 of those just-loaded rows ran successfully to completion. Right after that, an unload job for the table in question was executed. This unload was performed using an IBM software product called Db2 High Performance Unload for z/OS, or HPU for short (see https://www.ibm.com/docs/en/dhpufz/5.1.0?topic=documentation-db2-high-performance-unload-overview). HPU has two modes of operation: it can perform an unload by operating directly on the VSAM data set(s) associated with the target table, or it can do the unload through Db2, in which case the data is accessed in memory (i.e., in the buffer pool to which the table's table space is assigned). This unload was done in the former of these modes - operating directly on the VSAM data set(s) associated with the table's table space. The result of the unload surprised the person who emailed me. How so? Well, the unload was done using a predicate (the WHERE clause that you might see in a query), and the update program that ran between the load (of the 11,000 rows) and the unload changed values in a way that should have caused 8 of the 11,000 loaded rows to be filtered out by the unload process's predicate (the other 10,992 rows would be qualified by the predicate). The person who emailed me expected 10,992 records in the unload data set, but there were in fact 11,000 records in that data set. The updates that should have caused 8 rows to be not-qualified by the unload process's predicate were committed before the unload job ran, so why was this update action not reflected in the contents of the unload data set? Consternation increased when another unload of the table, executed a few hours later (again, with the unload process using a predicate and operating directly on the table's associated VSAM data set(s)), generated an unload data set that did contain the expected 10,992 records.

What in the world was going on here?

Here's what was going on: this all has to do with a big difference between a committed data change (which has relevance for Db2 data-in-memory) and an externalized data change (which relates to Db2 data-on-disk). What's important to know is that Db2 for z/OS does not externalize data changes (i.e., does not write changed data to the associated VSAM data set) as part of commit processing. Instead, database write I/O operations (to externalize data changes to VSAM data sets on disk) are done in a deferred way (and usually asynchronously, at that). This aspect of Db2's operation is critically important to scalability when it comes to data-change operations (e.g., INSERT, UPDATE and DELETE). If Db2 had to write changed pages to disk at commit time, data-change throughput would be majorly impacted in a negative way. In the scenario described above, the first unload generated by HPU (done right after the programmatic update of 8 of the 11,000 rows previously LOAD-ed into the table), operating directly on the table space's underlying VSAM data set(s), did not reflect the post-LOAD update of the 8 rows because the page(s) changed by the updating program were not written to the underlying VSAM data set(s) at commit time. The changed page(s) were externalized later by Db2 via deferred write processing, and that is why the second unload process, also operating directly on the table space's VSAM data set(s), reflected the aforementioned updates of 8 of the 11,000 table rows.

If Db2 deferred write action did eventually get the changed pages (associated with the updating of 8 of the table's rows) written to the associated VSAM data sets on disk - and it did - then what caused that deferred write action to happen? Usually, Db2 deferred write operations for a given buffer pool are driven by one of two deferred write thresholds being reached for the pool. The deferred write queue threshold (abbreviated as DWQT) is expressed as a percentage of a pool's buffers: it is reached when that percentage of the buffers are occupied by changed-but-not-externalized pages (the default value is 30). The vertical deferred write queue threshold (VDWQT) is similarly expressed as a percentage of the pool's buffers, but it applies to changed-but-not-externalized pages that belong to a particular data set (the default value is 5). Whenever either of those limits is reached (and it's usually the VDWQT limit), deferred write activity is triggered. The deferred write I/Os, by the way, are generally multi-page in nature (multiple pages written to disk in one I/O operation), and that is good for CPU efficiency on a per-page basis. The CPU time associated with database writes (usually not much - I/Os don't require many CPU cycles) is charged to the Db2 database services address space (aka the DBM1 address space).
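
By the way, you can see a buffer pool's DWQT and VDWQT settings with the -DISPLAY BUFFERPOOL command, and change them with -ALTER BUFFERPOOL. A quick sketch (BP10 is just an example pool name, and the values shown are the defaults):

   -DISPLAY BUFFERPOOL(BP10) DETAIL
   -ALTER BUFFERPOOL(BP10) DWQT(30) VDWQT(5,0)

The second VDWQT value is an absolute number of buffers, which is used in place of the percentage when the percentage value is set to 0.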

What about synchronous database write actions? Are they also deferred? Yes, they are - they're just triggered by something besides deferred write queue thresholds being reached. In the case of synchronous writes, the trigger is Db2 checkpoint processing. How that works: when Db2 executes a system checkpoint (which it does, by default, every 3 minutes), it notes all pages in each buffer pool that are in changed-but-not-yet-externalized status. When the next system checkpoint rolls around, Db2 checks to see if any of the changed-but-not-yet-externalized pages noted at the last checkpoint have still not been written to disk. If there are any such pages then they will be synchronously written to disk as part of checkpoint processing. Here, "synchronous" means that Db2 will immediately start writing those pages to disk, and it will continue to do that until they are all externalized.
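
If you are curious about your Db2 system's checkpoint frequency, by the way, the -DISPLAY LOG command shows it, and it can be changed dynamically via the CHKTIME option of the -SET LOG command. A minimal sketch (the value shown is just an example, setting a checkpoint interval of 3 minutes):

   -DISPLAY LOG
   -SET LOG CHKTIME(3)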

OK, back to the story that prompted this blog entry. Is there a way that the initial HPU unload (the one executed very shortly after the programmatic update of 8 of the 11,000 rows LOAD-ed into the table) could have generated an unload data set with the desired 10,992 rows? Yes. In fact, there were at least two options for getting that done. One option would be to execute the Db2 QUIESCE utility for the table's table space prior to running the HPU unload. This would cause Db2 to write all changed-but-not-yet-externalized pages of the table's table space to disk, and then an HPU unload operating directly on the table space's VSAM data sets would have reflected the update of the 8 rows.
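
A minimal sketch of that first option, with a made-up database and table space name (WRITE YES, which is the default, is what drives the changed pages out to disk):

   QUIESCE TABLESPACE DBPAY01.TSPAY01 WRITE YES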

The second option would be to have HPU do the unload through Db2, as opposed to operating directly on the table space's underlying VSAM data sets - this is something that can be done through an HPU keyword. That, in turn, would have caused the HPU unload to be accomplished using data in memory (i.e., in the table space's assigned buffer pool) - any of the to-be-unloaded pages that were not already in memory would have been read into memory as part of the unload process. This approach would have reflected the programmatic update of the 8 rows because those updates had been committed, and Db2 data-in-memory is always in a transactionally consistent state (any in-memory data that is not transactionally consistent because of an in-flight - that is, not-yet-completed - data change operation is blocked from access by X-type locks, taken at a page or a row level, that are not released until the data changes in question are committed).

Which of these options would you choose? It would depend on what is most important for you. The QUIESCE option would allow the HPU unload to operate directly on the VSAM data set(s) associated with the table space, and that would yield a CPU efficiency benefit, but the QUIESCE itself could be at least somewhat disruptive for applications accessing the target table. The "through Db2" option would not disrupt any application processes that were accessing the table at the time, but it would cause the HPU unload operation to consume some additional CPU time.

By the way, you might be wondering, "If committed data changes are written to disk in a deferred way, how is loss of committed data changes prevented in the event of an abnormal termination (i.e., a "crash") of the Db2 subsystem when there are changed-but-not-yet-externalized pages in memory?" Worry not - data recorded in the Db2 transaction log is used to process those "pending writes" as part of the "roll-forward" phase of Db2 restart processing following a subsystem failure.

One more thing: the mechanics of all this are different in a Db2 data sharing environment (involving group buffer pool writes and associated castout operations to eventually get changed pages written to VSAM data sets on disk), but the net effect is the same.

And there you have it. I'm thankful for your visiting this blog, and I'm thankful for interesting questions that come in when I'm trying to figure out what I'm going to blog about.