Wednesday, September 27, 2023

Db2 for z/OS: Two Stories of Temporary Tables

Db2 for z/OS provides, for your use, two different temporary table types: declared and created. I recently worked with Db2 for z/OS people at a couple of different sites, providing assistance in understanding and effectively utilizing Db2 temporary tables, and I think information related to these experiences might be useful to folks in the larger user community - thus this blog entry. The first story points up an advantage of declared global temporary tables, while the second illustrates advantageous use of a created global temporary table. I hope that what follows will be of interest to you.


Db2 declared global temporary tables - the index advantage

A Db2 declared global temporary table (DGTT) is so-called because it comes into existence when it is declared - usually in an application program. There can be plenty of reasons for a program to declare a temporary Db2 table. For example, an application might need to get a preliminary set of result set rows from Db2 and then operate further on that data - easily done by inserting the result set rows into a declared temporary table and subsequently issuing various SQL statements that target the DGTT.

A Db2 for z/OS DBA sent me the other day some information about a use case at his site that made a declared global temporary table the right choice. Because the temporary table in this case was going to be fairly large, and because DELETE statements targeting the table would be coded with predicates, there would be a performance advantage to the use of a declared versus a created global temporary table (CGTT): an index can be defined on a DGTT, but not on a CGTT (this and other differences between declared and created global temporary tables are listed in the Db2 for z/OS online documentation).

So, with the available option of indexing, DGTTs will generally be preferred over CGTTs, right? Not necessarily...


When a CGTT can majorly reduce elapsed and CPU time for a process

The second Db2 temporary table story I'll relate came to me by way of a Db2 for z/OS systems programmer who works for a financial services organization. In this case, a z/OS-based COBOL batch program was retrieving data from a VSAM file and inserting related information into a Db2 temporary table. Rows were then SELECTed from the temporary table for further processing, after which the rows in the temporary table were deleted (via a predicate-less DELETE). These insert/retrieve/delete actions involving the temporary table were repeated about 300,000 times in an execution of the batch job. At first, the COBOL program declared and utilized a declared global temporary table. The problem? That approach did not perform as needed: the job consumed around 20 minutes of CPU time, and the SQL activity drove approximately 23 million GETPAGE requests (a GETPAGE happens when Db2 needs to examine the contents of a page of a database object, and GETPAGE activity is a major factor in the CPU cost of SQL statement execution). The Db2 sysprog also noticed that the batch process generated a lot of PREPARE activity (referring to the dynamic preparation of SQL statements for execution by Db2 - something that can significantly add to an application program's CPU consumption).

To try to reduce the CPU cost of the batch job, the Db2 team at this financial services organization switched from using a DGTT to a "permanent" (i.e., a "regular") Db2 table. Performance indeed got way better: GETPAGE requests dropped by over 70%, and CPU time for the job went from about 20 minutes to about 4 minutes. Why the big drop in GETPAGEs and CPU time? Probably this had to do with elimination of expensive SQL statement preparation activity. See, you might think that the SQL statements hard-coded in your COBOL (for example) program are all static ("static" SQL statements are prepared by Db2 for execution prior to program execution, via a process known as BIND), but when those statements refer to a DGTT they have to be dynamically prepared for execution when issued by the program because there is no definition of a DGTT in the Db2 catalog.

This big improvement in CPU efficiency notwithstanding, there was a side-effect of the switch from a DGTT to a permanent table that did not sit well with the Db2 team at the financial services company: as previously noted, the set of SQL statements targeting (initially) the DGTT and then the permanent Db2 table involved a delete of all rows in the table, and that happened 300,000 times in the execution of the batch job. When a permanent table was used in place of the DGTT, these 300,000 mass DELETEs (a mass DELETE is a DELETE without predicates) caused 300,000 rows to be inserted for the permanent table in the SYSCOPY table in the Db2 system's catalog. Running a MODIFY RECOVERY utility job to clear those rows out of SYSCOPY, and having to take an image copy of the permanent table to keep it from going into COPY-pending status, were viewed as significant hassles by the Db2 team. Was a still-better way forward available to them?

Indeed so. I suggested going with a created global temporary table. [Like a permanent table, a CGTT is defined by way of a CREATE statement, and there is information about a CGTT in the Db2 catalog. When a program references the CGTT it gets its own instance of the CGTT which (like a DGTT) is physically provisioned in the Db2 work file database.] The Db2 team did that, and the results were very positive. CPU time for the job - originally about 20 minutes with the DGTT and then about 4 minutes with the permanent table, went down to just over 2 minutes with the CGTT (as with the permanent table, no SQL statement dynamic preparation was needed, thanks to the CGTT being defined in the Db2 catalog); and, there were no inserts into SYSCOPY in association with the repeated mass DELETEs (same as with the DGTT); and, there was no need for an image copy of the CGTT because the instance of the table goes away automatically when the process using the table completes (same as with the DGTT). So, the CGTT in this case provided the advantages of a DGTT and of a permanent Db2 table, minus the relative disadvantages of those options (the dynamic statement preparation costs of the DGTT, and the mass-DELETE-related SYSCOPY inserts and the image copy requirements of the permanent table).


The bottom-line message

Declared and created global temporary tables both have their places in a Db2 for z/OS application environment. When considering the use of a Db2 temporary table for an application, be careful not to jump too quickly to a DGTT versus a CGTT decision (though sometimes there will be only one choice - as when, for example, UPDATE access to the temporary table is needed - something that can only be done with a DGTT). Consider the DGTT versus CGTT choice in light of the particulars of the use case, and choose accordingly. A thoughtful choice can yield a substantial performance advantage - so use your head.

10 comments:

  1. From my point of view, the NOT LOGGED attribute option for the DGTT's is a big win over CGTT's.

    ReplyDelete
  2. Do you mean because NOT LOGGED will suppress logging for insert, update and delete actions targeting a declared global temporary table? If so, I do not see that as an advantage of DGTTs over CGTTs, because logging is not done for actions involving CGTTs (see the information on logging in the Db2 for z/OS documentation page at https://www.ibm.com/docs/en/db2-for-zos/12?topic=tables-distinctions-between-db2-base-temporary).

    Robert

    ReplyDelete
  3. thanks, you are right, I thought all the CGTT activity had to be logged.

    ReplyDelete
  4. Thank you Robert ! Always valuable information.

    ReplyDelete
    Replies
    1. You are most welcome. I appreciate the kind words.

      Robert

      Delete
  5. Once the view definition gets created successfully,why does the ddl of view shows
    SET PATH = "SYSIBM","SYSFUN","SYSPROC","SYSIBMADM";
    in the very first line.What does this mean.When I recreate the same view should i include above lines as well?

    ReplyDelete
    Replies
    1. I do not know what "view definition" you're talking about. This blog entry does not refer to views.

      Robert

      Delete
  6. Robert,
    Great info in your blog as always... We have a lot of stored procedure calls which execute INSERT INTO CGTT_table SELECT FROM... complex SQL.. Typically these SPs are passing CGTT to another SP call. Occasionally these INSERT INTO CGTT_table calls get lock contention and SQLcode -913 timeouts. Similar SELECT ... WITH UR; has no contention. INSERT INTO cgtt_table will not all WITH UR; we're forced to use isolation CS. It seems pointless to prevent WITH UR in this case. Is there any way around this restriction? Thanks in advance.

    ReplyDelete
    Replies
    1. I think the issue has to do with the fact that isolation level, when specified for an individual SQL statement, applies to the statement as a whole - not to a part of the statement. That means that WITH UR, when specified for a subselect within an INSERT, would have to apply to the entirety of the statement, not just to the subselect. That, in turn, would require that WITH UR be applied to the INSERT, and that is not possible.

      If you need to have WITH UR in effect for the SELECT that generates the result set that will be be inserted into the CGTT, you might need to make the SELECT and the INSERT into the CGTT two different statements, versus the current INSERT with a subselect. How that could work: the SELECT, with a specification of WITH UR, is declared in a CURSOR. The cursor is opened, and a row is fetched (a FETCH INTO that populates variables with column values of a result set row). Then, the fetched row values are inserted into the CGTT. Then, you repeat the FETCH and the INSERT sequence until all result set rows have been retrieved (via the WITH UR cursor) and inserted into the CGTT. Obviously, that's more-complex code versus the iNSERT with the subselect, but I don't know how else you could get the WITH UR behavior for the SELECT.

      If you wanted to stay with the INSERT + subselect, you might need to attack the lock contention problem. Maybe something like use of row-level locking for the table space in question would be an option, or maybe some process getting update locks on the table space needs to commit more frequently.

      Robert

      Delete