Monday, August 30, 2021

Db2 for z/OS: Package-Related Things of Which You Should be Aware

Db2 for z/OS packages have been around for a long time - since, I believe, V2R3, back in the early 1990s. Even so, I've found myself talking more about packages to people lately than I have in quite some years. These discussions - conducted in meetings, in phone calls and through email - have highlighted for me some related topics and items of information of which more folks should be aware. I've packaged (no pun intended) these informational nuggets in this blog entry, and I hope you'll find them to be useful. Without further ado, and in no particular order...

Many Db2 for z/OS packages are associated with static SQL, but "static SQL" is a term not universally understood by people who work with Db2.  Sure, Db2 DBAs are familiar with the concept of static SQL. So, too, are application developers who code SQL statements in certain programs - typically, programs that 1) execute in the same z/OS LPAR as the target Db2 system, 2) are written in COBOL, and 3) execute as batch jobs or CICS transactions (though they could be stored procedure programs, and might in that case be written in SQL PL).  Application developers who code Db2-accessing SQL statements in JDBC or ODBC form may not be so familiar with the static SQL concept. For those folks: a static SQL statement is one that has already been prepared for execution, via the Db2 BIND process, before it is ever even issued for execution by a program. A Db2-related SQL statement that is not static is referred to as dynamic - it is prepared for execution when Db2 first sees it coming in from an application program (statement preparation can be relatively expensive, so the prepared form of a dynamic SQL statement is stored in-memory by Db2 in its dynamic statement cache - that prepared form of the statement can be reused if the same dynamic statement comes along again). SQL in JDBC or ODBC form is dynamic from the Db2 perspective. If a Db2-accessing program contains static SQL statements, that program has an associated package, and the package in that case can be thought of as the compiled, executable form of the program's static SQL statements.

Dynamic SQL statements are also associated with packages. Any SQL statement executed in a Db2 for z/OS system will always have a package associated with it. If it's a dynamic SQL statement, the associated package will likely be one of the IBM Data Server Driver packages (sometimes referred to as the Db2 Connect packages - entitlement to use the IBM Data Server Driver is by way of a license for Db2 Connect).

Db2 packages are associated with collections. As far as I'm concerned, this is one of the best ways to conceptualize a collection: a package's collection is kind of like a table's schema name (i.e., its high-level qualifier). Does that mean that the same package can be in more than one collection? Yes - and hold onto that thought.

SQL statements of the DDL variety are also associated with packages. As previously mentioned, any Db2-targeting SQL statement will be associated with a package. That being the case, a DDL statement (data definition language, including ALTER and CREATE) will be issued by way of a package, same as a DML statement (data manipulation language, including SELECT, INSERT and UPDATE). Here's where this sometimes trips up even an experienced Db2 DBA: starting with Db2 12 for z/OS, a package's APPLCOMPAT value affects DDL as well as DML statements (with Db2 11, APPLCOMPAT was relevant only for DML statements). Want to ALTER a table with a KEY LABEL specification, to leverage the data set encryption feature of z/OS? Make sure that the APPLCOMPAT value of the package through which the ALTER TABLE statement will be issued (that might be a SPUFI or a DSNTEP2 package) has an APPLCOMPAT value of V12R1M502 or higher, because the KEY LABEL option of ALTER TABLE (and ALTER STOGROUP) was introduced with Db2 12 function level 502.

Access path changes: don't forget the PLANMGMT safety net, and don't cut a hole through that safety net if you don't have to. For a static SQL statement, access path selection (e.g., will a certain index be used in the evaluation of a predicate, will two tables be joined using the nested loop or merge scan or hybrid join method, etc.) is done at package bind time (for a dynamic SQL statement, access path selection is done when the statement is issued by an application program, unless the statement was previously prepared and its prepared form is still in Db2's dynamic statement cache). If a static SQL-containing package is rebound, access paths for the package's SQL statements could change (the APREUSE option of REBIND PACKAGE can be used to tell Db2 to stick with a package's existing SQL access paths, if possible). If a rebind action results in access path changes, the expectation is that the changes will be beneficial from a performance perspective, but that is not always the case. Sometimes, an access path change will have a negative impact on performance (incomplete or inaccurate catalog statistics, for example, could cause the Db2 query optimizer to choose an access path that appears to be good for performance but in fact is not). In such a case, the plan management feature of Db2, which is "on" by default (check the value of PLANMGMT in ZPARM), provides a safety net: it causes Db2 to retain the previous instance of a package when the package is rebound. In the event of a performance degradation caused by a rebind-related change in one or more query access paths, the previous and better-performing instance of the package can be quickly restored via another REBIND PACKAGE action with SWITCH(PREVIOUS) specified. That's great, but sometimes an organization will cut a big hole in this safety net. How? By unnecessarily doing BIND REPLACE for packages, versus REBIND. BIND REPLACE is needed when a program's static SQL statements are modified, because that necessitates a new precompile of the source program, which generates a new DBRM (a file containing the program's static SQL statements) that subsequently has to be bound to replace the prior version of the related package. If a BIND REPLACE results in a performance problem that is of an access path nature, REBIND with SWITCH(PREVIOUS) is not an option, because that restores the previous instance of the same version of the package, and the BIND REPLACE (assuming it was preceded by a Db2 precompile of the program source) generates a new version of the package. Here's where the "unnecessary" comes in: quite often, an organization will have a process whereby a static SQL-containing program will be re-precompiled and BIND REPLACE-ed whenever a program's source is changed, even when the source change does not change the program's static SQL statements. When source code other then SQL is changed, a new precompile and an associated BIND REPLACE is not necessary. Moral of the story: don't do program re-precompiles and BIND REPLACEs when you don't need to do that. And while I'm on the topic of unnecessary package-related actions...

Don't do unnecessary rebinds of packages. At many Db2 for z/OS sites, certain Db2 table spaces are REORGed quite frequently (maybe on a weekly basis), and catalog statistics are regenerated for the table spaces, and packages with static SQL statements targeting tables in the table spaces are rebound. DON'T DO THAT (the package rebinds, I mean), absent a really good reason. Here's the thing: if you have a package of well-performing static SQL statements, keep in mind that one of the really good things about static SQL is that the good performance is locked in, from an access path perspective. Why mess with that? Leave your well-performing packages alone. REORGing and RUNSTATing table spaces from time to time? Good idea. Rebinding dependent packages after running those utilities? Not such a good idea. Am I saying that you shouldn't ever rebind packages? Of course not. Just do it when it makes sense. When does it make sense? Well, it's a good idea to rebind all of your packages (and plans, too) soon after migrating to a new version of Db2, while still at the 100 function level (what we used to call conversion mode), because a) the new version's optimizer enhancements are available at that time, and b) even when access paths don't change, the package code generated by the new version of Db2 (packages are executable code) is likely to be at least a little more CPU-efficient than the code generated for the same package by the previous version of Db2. It also makes sense to rebind a package when you want a query associated with the package to use a newly created index on a table (that would of course require query re-optimization, which a rebind does). And certainly you'd rebind if you wanted or needed to change a package characteristic such as its RELEASE setting (COMMIT or DEALLOCATE) or its isolation level (e.g., cursor stability or uncommitted read) or its APPLCOMPAT value. Bottom line: have a reason for rebinding a package or packages (and "Because we just REORGed and RUNSTATed a table space on which the package depends" is usually not a good reason).

Remember that EXPLAIN information for a package is always available. Do you need to get access path information for a package that was bound without a specification of EXPLAIN(YES)? No problem: just use the EXPLAIN PACKAGE statement (this will extract access path information from the package itself - that information has been part of a package's structure since Db2 9 for z/OS). Want to know what a package's access path would be, if the package were to be rebound? No problem: just rebind the package with EXPLAIN(ONLY) - you'll get the access path information, but Db2 won't actually regenerate the package. More information about EXPLAIN PACKAGE and rebind with EXPLAIN(ONLY) can be found in an entry I posted to this blog a few months ago.

Get different behaviors for the same package by using different bind specifications and different collections. I mentioned previously that a collection is somewhat akin to a schema name for a package. Just as tables that are logically identical and have the same unqualified name can be in different schemas, so packages that are generated from the same SQL statements can be in different collections. This can be very handy when you want different behaviors for the same package in different situations. For example: suppose that in some cases you want queries associated with package PROGXYZ to be executed with cursor stability isolation, while in other cases you want the same queries to be executed with uncommitted read isolation. No prob: bind the package with ISOLATION(CS) into collection CSCOLL (or whatever name you want to use for the collection) and with ISOLATION(UR) into collection URCOLL (and that second bind action might be a BIND COPY from CSCOLL into URCOLL, with the different ISOLATION value specified). How do you "point" an application that uses the PROGXYZ package to the CSCOLL or the URCOLL collection, as needed? For a static SQL package used with (for example) a batch job or a CICS transaction, you could do this with a plan's PKLIST. You could also let the program itself operate the "switch in the tracks" by using the SET CURRENT PACKAGESET special register; or, for applications that are DRDA requesters you could point to one collection or another by way of the Db2 profile tables (and here we could be talking about alternate collections of the IBM Data Server Driver / Db2 Connect packages). You could turn statement concentration (automatic parameterizing by Db2 of dynamic query predicates that reference literal values) on or off for different applications in this way, or use different APPLCOMPAT values, or provide a different high-level qualifier for unqualified table names in SQL statements, or do a number of other things (to get the ideas flowing, check out the list of package bind options in the online Db2 documentation).

OK, that's enough package stuff for one blog entry. I hope that this information will be helpful for you. As always, thanks for stopping by.