Wednesday, September 27, 2023

Db2 for z/OS: Two Stories of Temporary Tables

Db2 for z/OS provides, for your use, two different temporary table types: declared and created. I recently worked with Db2 for z/OS people at a couple of different sites, providing assistance in understanding and effectively utilizing Db2 temporary tables, and I think information related to these experiences might be useful to folks in the larger user community - thus this blog entry. The first story points up an advantage of declared global temporary tables, while the second illustrates advantageous use of a created global temporary table. I hope that what follows will be of interest to you.


Db2 declared global temporary tables - the index advantage

A Db2 declared global temporary table (DGTT) is so-called because it comes into existence when it is declared - usually in an application program. There can be plenty of reasons for a program to declare a temporary Db2 table. For example, an application might need to get a preliminary set of result set rows from Db2 and then operate further on that data - easily done by inserting the result set rows into a declared temporary table and subsequently issuing various SQL statements that target the DGTT.

A Db2 for z/OS DBA sent me the other day some information about a use case at his site that made a declared global temporary table the right choice. Because the temporary table in this case was going to be fairly large, and because DELETE statements targeting the table would be coded with predicates, there would be a performance advantage to the use of a declared versus a created global temporary table (CGTT): an index can be defined on a DGTT, but not on a CGTT (this and other differences between declared and created global temporary tables are listed in the Db2 for z/OS online documentation).

So, with the available option of indexing, DGTTs will generally be preferred over CGTTs, right? Not necessarily...


When a CGTT can majorly reduce elapsed and CPU time for a process

The second Db2 temporary table story I'll relate came to me by way of a Db2 for z/OS systems programmer who works for a financial services organization. In this case, a z/OS-based COBOL batch program was retrieving data from a VSAM file and inserting related information into a Db2 temporary table. Rows were then SELECTed from the temporary table for further processing, after which the rows in the temporary table were deleted (via a predicate-less DELETE). These insert/retrieve/delete actions involving the temporary table were repeated about 300,000 times in an execution of the batch job. At first, the COBOL program declared and utilized a declared global temporary table. The problem? That approach did not perform as needed: the job consumed around 20 minutes of CPU time, and the SQL activity drove approximately 23 million GETPAGE requests (a GETPAGE happens when Db2 needs to examine the contents of a page of a database object, and GETPAGE activity is a major factor in the CPU cost of SQL statement execution). The Db2 sysprog also noticed that the batch process generated a lot of PREPARE activity (referring to the dynamic preparation of SQL statements for execution by Db2 - something that can significantly add to an application program's CPU consumption).

To try to reduce the CPU cost of the batch job, the Db2 team at this financial services organization switched from using a DGTT to a "permanent" (i.e., a "regular") Db2 table. Performance indeed got way better: GETPAGE requests dropped by over 70%, and CPU time for the job went from about 20 minutes to about 4 minutes. Why the big drop in GETPAGEs and CPU time? Probably this had to do with elimination of expensive SQL statement preparation activity. See, you might think that the SQL statements hard-coded in your COBOL (for example) program are all static ("static" SQL statements are prepared by Db2 for execution prior to program execution, via a process known as BIND), but when those statements refer to a DGTT they have to be dynamically prepared for execution when issued by the program because there is no definition of a DGTT in the Db2 catalog.

This big improvement in CPU efficiency notwithstanding, there was a side-effect of the switch from a DGTT to a permanent table that did not sit well with the Db2 team at the financial services company: as previously noted, the set of SQL statements targeting (initially) the DGTT and then the permanent Db2 table involved a delete of all rows in the table, and that happened 300,000 times in the execution of the batch job. When a permanent table was used in place of the DGTT, these 300,000 mass DELETEs (a mass DELETE is a DELETE without predicates) caused 300,000 rows to be inserted for the permanent table in the SYSCOPY table in the Db2 system's catalog. Running a MODIFY RECOVERY utility job to clear those rows out of SYSCOPY, and having to take an image copy of the permanent table to keep it from going into COPY-pending status, were viewed as significant hassles by the Db2 team. Was a still-better way forward available to them?

Indeed so. I suggested going with a created global temporary table. [Like a permanent table, a CGTT is defined by way of a CREATE statement, and there is information about a CGTT in the Db2 catalog. When a program references the CGTT it gets its own instance of the CGTT which (like a DGTT) is physically provisioned in the Db2 work file database.] The Db2 team did that, and the results were very positive. CPU time for the job - originally about 20 minutes with the DGTT and then about 4 minutes with the permanent table, went down to just over 2 minutes with the CGTT (as with the permanent table, no SQL statement dynamic preparation was needed, thanks to the CGTT being defined in the Db2 catalog); and, there were no inserts into SYSCOPY in association with the repeated mass DELETEs (same as with the DGTT); and, there was no need for an image copy of the CGTT because the instance of the table goes away automatically when the process using the table completes (same as with the DGTT). So, the CGTT in this case provided the advantages of a DGTT and of a permanent Db2 table, minus the relative disadvantages of those options (the dynamic statement preparation costs of the DGTT, and the mass-DELETE-related SYSCOPY inserts and the image copy requirements of the permanent table).


The bottom-line message

Declared and created global temporary tables both have their places in a Db2 for z/OS application environment. When considering the use of a Db2 temporary table for an application, be careful not to jump too quickly to a DGTT versus a CGTT decision (though sometimes there will be only one choice - as when, for example, UPDATE access to the temporary table is needed - something that can only be done with a DGTT). Consider the DGTT versus CGTT choice in light of the particulars of the use case, and choose accordingly. A thoughtful choice can yield a substantial performance advantage - so use your head.