Friday, November 11, 2022

A Case Study: Using Db2 for z/OS Monitor Reports to Zero In on a Performance Problem

I had an interesting exchange recently with a Db2 for z/OS systems programmer. This individual had been asked to assist in determining the cause of a performance problem impacting a Db2-accessing application. The sysprog shared with me a Db2 monitor-generated accounting report showing activity for the application process, and a Db2 monitor statistics report covering the same time period for the same Db2 subsystem. In those two reports were the clues that pointed to the source of the application performance problem, and with the probable problem source identified the associated mitigating actions were readily determined. In this blog entry, I'll take you through my analysis of the Db2 monitor-provided accounting and statistics information, and the rationale behind my recommended steps for resolving the problem. My aim is not only to shed light on a particular performance-affecting issue and related remediating moves, but also to illustrate a methodical approach for analyzing Db2 application performance issues in general.


Input for analysis

I find Db2 monitor-generated accounting and statistics reports to be extremely useful for analysis of application performance problems. In both cases, what you want is the detailed form of the report. In the case of IBM's OMEGAMON for Db2 performance monitor (the one with which I'm most familiar), you're talking about the ACCOUNTING REPORT - LONG and the STATISTICS REPORT - LONG (for other Db2 monitor products, these reports might have titles like, "detailed summary of accounting information" or "statistics detail report"). For the accounting report, your preference is to see activity pertaining exclusively to the application process for which the performance issue has arisen. This is usually done by using the monitor's data-filtering capabilities to include, for report-generation purposes, only the Db2 accounting records of interest (an accounting report is basically information from Db2 accounting trace records, formatted for readability). Db2 accounting trace records have all kinds of identifier fields, so you can get pretty specific. In the case about which I'm writing here, the relevant filtering criteria were the authorization ID of the application process of interest, and the FROM and TO times that bracketed the period during which the performance problem occurred.
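By way of illustration, filtering of this kind is typically specified through the monitor's batch report commands. Below is a rough sketch of what that could look like for OMEGAMON for Db2 - the authorization ID and the FROM/TO timestamps are made-up placeholders, and exact command syntax varies by monitor product and release, so treat this as illustrative rather than definitive:

ACCOUNTING
  REPORT
    LAYOUT(LONG)
    INCLUDE(PRIMAUTH(APPAUTH1))
    FROM(11/04/22,08:00:00.00)
    TO(11/04/22,10:00:00.00)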

As for the statistics report, what you want is one that covers the same time period as the accounting report (same FROM and TO times), for the same Db2 subsystem.


Looking for clues

What the Db2 sysprog had been told by the development team is that the application in question started out performing well, and then slowed down a lot (important input). The sysprog and I looked first at the accounting report, and in doing that we focused initially on the "class 3 suspension" information (this information, obtained via Db2 accounting trace class 3, has to do with "known" wait events, as opposed to "not accounted for" time, about which I'll comment momentarily). Why did we look there first? Because (in my experience), when a Db2-accessing process starts out performing well and then slows way down, it's often due to a substantial increase in one or more of the "wait times" captured by accounting trace class 3 (yes, an application slowdown could be related to a significant CPU time increase, but I've seen that less often than I've seen large wait time increases).

Looking at the class 3 times for the application process, what jumped out was a very large value for the average DB2 LATCH time ("average" is average per accounting trace record, which typically equates to average per transaction or average per batch job, depending on the workload type). Here, I mean "large" in terms of average DB2 LATCH time being a large percentage of average TOTAL CLASS 3 time. Usually, DB2 LATCH time is a very small percentage of TOTAL CLASS 3 time, with "wait time" categories such as SYNCHRON DATABASE I/O and OTHER READ I/O accounting for the bulk of TOTAL CLASS 3 time. A (proportionately) really large DB2 LATCH time is usually an indicator that something's not right.

The first thing I look at when I see unusually large DB2 LATCH time for a Db2-accessing process is the "in-Db2 not-accounted-for time" for that process. Several Db2 monitor products calculate that for you - in an IBM OMEGAMON for Db2 accounting report, the field is labeled NOTACC, and it's shown, on a sideways bar chart at the top of an accounting report, as a percentage of average in-Db2 elapsed time. If you need to calculate this figure for yourself, the denominator is average in-Db2 elapsed time (aka "class 2" elapsed time), and the numerator is average in-Db2 elapsed time minus in-Db2 CPU time (general-purpose plus zIIP, or "specialty engine," CPU time) minus TOTAL CLASS 3 time. In other words, it's the percentage of in-Db2 elapsed time that is not CPU time and not "identifiable" wait time. For a transactional application process (as was the case for the situation about which I'm writing), as a general rule you want in-Db2 not-accounted-for time to be less than 10%. If that figure is substantially greater than 10%, it's indicative of a CPU-constrained environment, and if the environment is highly CPU-constrained then DB2 LATCH time can get really large (as pointed out in an entry I posted to this blog a few years ago).
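To make that arithmetic concrete, here is a worked example using made-up numbers (these are not the figures from the report discussed in this entry):

average in-Db2 (class 2) elapsed time:  0.200 seconds
average in-Db2 CPU time (GP + zIIP):    0.090 seconds
average TOTAL CLASS 3 time:             0.080 seconds

in-Db2 not-accounted-for time = 0.200 - (0.090 + 0.080) = 0.030 seconds
as a percentage of in-Db2 elapsed time: 0.030 / 0.200 = 15%

A 15% figure like that for a transactional workload would point to a degree of CPU constraint worth investigating.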

Well, in this particular case the average in-Db2 not-accounted-for time for the application process was 11% of in-Db2 elapsed time - a little on the high side for a transactional process, but not high enough to explain a really large DB2 LATCH time. With that cause of elevated DB2 LATCH time pretty much eliminated, it was time to turn to the breakdown of latch wait events for different latch categories, and that's where the statistics report comes in. In a Db2 monitor-generated statistics long report, the latch suspend count information will likely look something like this (what you see below is from an OMEGAMON for Db2 statistics long report, but it is NOT from the report I reviewed with the Db2 systems programmer - we jointly viewed that report in a web meeting, and I do not have a copy of the report):

LATCH CNT   /SECOND   /SECOND   /SECOND   /SECOND
---------  --------  --------  --------  --------
LC01-LC04      0.00      0.00      0.00      0.00
LC05-LC08      0.00      0.74      0.00      0.41
LC09-LC12      0.00      0.02      0.00      0.32
LC13-LC16      0.00     12.89      0.00      0.00
LC17-LC20      0.00      0.00      0.01      0.00
LC21-LC24      0.04      0.00      1.96      2.84
LC25-LC28      0.12      0.02      0.01      0.00
LC29-LC32      0.06      0.04      0.00      0.28
LC254          0.00

What I saw in the report I reviewed in the web meeting with the Db2 sysprog (and again, that's NOT what you see above - the snippet above is provided so that you can see what the latch suspend count information looks like in a statistics report) was a particularly high value for latch class 6 suspend events (in the sample snippet above, that would be the second value on the LC05-LC08 line - the LC06 position). What is latch class 6? It has to do with index page split activity in a Db2 data sharing environment (by the way, the OMEGAMON for Db2 documentation includes a handy page listing the activities associated with the various Db2 latch classes).

Let's unpack that. An index page split occurs when Db2 has to insert an entry in an index page because of an insert (or an update of an indexed column) and that page is full. In that situation, a portion of the entries in the page will be moved to what had been an empty page in the index, so that there will be room in the formerly-full page for the new entry. What does Db2 data sharing have to do with this (and in the environment about which I'm writing, Db2 is running in data sharing mode)? In a data sharing system (versus a standalone Db2 subsystem), an index page split action has a greater impact on throughput because it forces a log-write operation.

Seeing the high level of index page split activity suggested by the numerous latch class 6 wait events, we turned again to the accounting report to see the average number of insert operations executed by the performance-impacted application. Sure enough, we saw that this was an insert-intensive process - more in terms of the average number of rows inserted than the number of INSERT statements executed (per the accounting report, the average number of rows inserted was about 100 times the average number of INSERT statements executed, indicating the application's use of block-level, i.e., multi-row, inserts).

The elevated count of latch class 6 suspend events (related to index page splits) and the insert-intensive nature of the process also dovetailed with the observed "started out fine, then slowed down" behavior of the application: in all probability, when the process started there was a pretty good amount of free space in leaf pages of indexes on the table(s) into which rows were being inserted. After a while these "holes" in index leaf pages filled up, and that resulted in a large number of index page split actions to make space for new index entries, and THAT - partly due to the fact that this was a Db2 data sharing system - had a majorly negative impact on application performance (the keys of affected indexes were clearly not of the continuously-ascending variety, because index page split actions are not required for an index defined on a continuously-ascending key).

With the source of the performance problem identified, the next matter to consider was...


What to do about it?

The Db2 sysprog and I discussed two problem-mitigating actions - one an "absolutely do" and the other a "maybe do." The "absolutely do" step was to increase the amount of free space in indexes to accommodate new entries. That step, in turn, comprises two sub-steps, each applicable to indexes defined on non-continuously-ascending keys.

Sub-step one: increase the index's PCTFREE value. Whereas the default PCTFREE value for an index is 10, a value of 20 or 25 might make more sense for an index defined on a non-continuously-ascending key for an insert-heavy table.

Sub-step two: increase the index's FREEPAGE value. The default FREEPAGE value is 0. Here's why boosting the FREEPAGE value - for example, to 5 (in which case there would be an empty index page after every 5 pages containing index entries) - can be helpful for an index defined on a non-continuously-ascending key for an insert-heavy table: as previously mentioned, when an index page is split, a portion of that page's entries are moved to a previously empty page in the index. If FREEPAGE 0 (the default) is in effect, the only empty index pages will be at the very end of the index - potentially a long way from the page that was split. That situation creates a drag on performance through degradation of the index's organization (reflected in the LEAFDIST value for the index - or index partition, in the case of a partitioned index - in the SYSIBM.SYSINDEXPART catalog table). With a non-zero and relatively low value for FREEPAGE (meaning, an empty page following each relatively small group of populated pages), when an index split does occur there should be a "nearby" empty page into which entries from the full page can be moved.

Both PCTFREE and FREEPAGE can be changed for an index via an ALTER INDEX statement, and both take effect when the index is subsequently reorganized (or loaded). The larger PCTFREE value will reduce index page split activity between REORGs, and the non-zero FREEPAGE value will reduce the impact of page splits if they do occur.
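In statement form, those two sub-steps might look like the following (the index name used here, PRODSCHM.IX1, is a made-up placeholder, and the PCTFREE and FREEPAGE values are just the examples discussed above):

ALTER INDEX PRODSCHM.IX1
  PCTFREE 25
  FREEPAGE 5;

And then, to put the new values into effect, a reorganization of the index - for example, via this REORG utility control statement:

REORG INDEX PRODSCHM.IX1 SHRLEVEL CHANGE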

And what about the "maybe do" step? That would be an increase in the size of the index's pages, from the default of 4 KB to maybe the maximum of 32 KB. How can that help? Here's how: because a 32 KB index page (for example) can hold 8 times as many entries as a 4 KB page, going to 32 KB-sized pages for an index (via an ALTER INDEX statement that assigns the index to a 32K buffer pool) can potentially result in an 87.5% reduction (seven eighths) in page split activity for an index, other things being equal (e.g., same rate of inserts for the underlying table). Why is this a "maybe do" thing versus an "absolutely do" thing? Because if access to table rows through the index (e.g., for queries) is truly random in nature with respect to key values, 32 KB-sized pages could mean somewhat less-effective use of buffer pool resources versus 4 KB-sized pages. It's a matter, then, of what's more important in a particular situation: is it minimizing index page split activity, or maximizing the effectiveness of buffer pool resources for random patterns of row access by applications?
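Statement-wise, the "maybe do" change might look like this (again with the made-up index name; note that a buffer pool change that alters an index's page size is typically a pending definition change - depending on Db2 level and table space type - put into effect by a subsequent online REORG):

ALTER INDEX PRODSCHM.IX1
  BUFFERPOOL BP32K;

REORG INDEX PRODSCHM.IX1 SHRLEVEL CHANGE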


In conclusion...

I hope that this entry has provided you with some "news you can use." Utilize Db2 monitor accounting and statistics reports to determine the source of an application performance problem, then take appropriate remedial action.

One more thing: Db2 13 for z/OS made it a lot easier to verify that index page splits are an issue, through enhanced instrumentation (new IFCID 396, associated with statistics trace class 3, which is on by default, indicates when an index page split operation takes more than 1 second, which would be unusually long) and through the new REORGTOTALSPLITS, REORGSPLITTIME and REORGEXCSPLITS columns of the SYSIBM.SYSINDEXSPACESTATS real-time statistics table in the Db2 catalog.
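By way of example, a query along these lines (a sketch - add schema and object-name predicates as appropriate for your environment) could be used on a Db2 13 system to surface the indexes with the most page split activity since their last REORG:

SELECT DBNAME, INDEXSPACE, NAME, PARTITION,
       REORGTOTALSPLITS, REORGSPLITTIME, REORGEXCSPLITS
FROM SYSIBM.SYSINDEXSPACESTATS
WHERE REORGTOTALSPLITS > 0
ORDER BY REORGTOTALSPLITS DESC
FETCH FIRST 20 ROWS ONLY;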


18 comments:

  1. Hi Rob, when I ran REORG with SHRLEVEL CHANGE, the job failed, stating REORG is not allowed. Upon checking Db2 message code DSNU1152I, it states: remove SHRLEVEL CHANGE or SHRLEVEL REFERENCE for a 'not logged tablespace'. I removed the SHRLEVEL option and ran the job; it still failed. But the job ran fine with SHRLEVEL REFERENCE. How come SHRLEVEL REFERENCE got around the error state? As per the message code, both SHRLEVEL options need to be eliminated? How do we handle REORG for a 'not logged' table space?

    1. If you look at the information about the DSNU1152I message in the Db2 for z/OS online documentation (https://www.ibm.com/docs/en/db2-for-zos/12?topic=messages-dsnu1152i), you will see that for some scenarios involving REORG of a NOT LOGGED table space, SHRLEVEL CHANGE will not work, and in another scenario (partition-level REORG for a NOT LOGGED table space on which at least one non-partitioned index is defined) SHRLEVEL REFERENCE will not work.

      In your case (presumably, a full-table space REORG of a NOT LOGGED table space), SHRLEVEL CHANGE caused a problem because that option involves retrieving data-change information from the Db2 log, and no such information exists for a NOT LOGGED table space. REORG with SHRLEVEL REFERENCE worked because REORG does not have a LOG phase of execution in that case (unless SHRLEVEL REFERENCE is specified along with the PART option - see the information under "Execution phases of REORG TABLESPACE" on the Db2 for z/OS online documentation page at https://www.ibm.com/docs/en/db2-for-zos/12?topic=utilities-reorg-tablespace).

      Robert

    2. Thanks for the detailed info, Rob. In my case it was not a partitioned table space - I didn't specify the PART option in the REORG. The IBM link suggests the following:
      Take the appropriate action for the object-type value:

      NOT LOGGED TABLESPACE
      Remove the SHRLEVEL(CHANGE) or SHRLEVEL(REFERENCE) option from the utility statement.
      As per the above, both SHRLEVEL options should be skipped for a 'not logged' table space. SHRLEVEL REFERENCE worked - strange. So the suggested action in the IBM manual may not be appropriate?

    3. The manual's recommended action is correct. It would be more clearly understood, perhaps, if it were more wordy. What is stated is, "Remove the SHRLEVEL(CHANGE) or SHRLEVEL(REFERENCE) option from the utility statement." That does NOT mean, "Remove the SHRLEVEL(CHANGE) AND the SHRLEVEL(REFERENCE) option from the utility statement." It means, "If the problem was due to SHRLEVEL(CHANGE) being specified for the REORG, use SHRLEVEL(REFERENCE), instead. If the problem was caused by SHRLEVEL(REFERENCE) being specified, either use SHRLEVEL(NONE) or keep the SHRLEVEL(REFERENCE) and execute the utility without the PART option."

      Robert

  2. Hi Rob,

    I want to store the orphan records of a table that result after a LOAD.
    I have come across exception tables, but if we have to work with a high number of tables, I think creating the exception tables would be time-consuming. Also, I have read that the DISCARD data set option cannot be used when a larger number of tables are interrelated. Can you share your insight on the DISCARD option, and also suggest ways to store discarded/orphan/mismatched rows?

    1. Sure, using the DISCARD option of LOAD, and having a discard data set, is a good way to retain rows rejected due to RI constraint violations. I'm not sure about the "no use of a DISCARD data set when multiple tables are interrelated" - I'm not aware of the restriction to which you are referring. If you want to make sure that rows in a discard data set pertain to one and only one table, use different LOAD jobs to load data into different tables.

      Robert

    2. In order to store the discarded rows via SYSDISC, do I need to have ENFORCE YES in effect while loading the tables?

    3. ENFORCE CONSTRAINTS needs to be in effect, but you do not have to explicitly include this option in the LOAD utility control statement - this is because ENFORCE CONSTRAINTS is an "above-the-line" option for LOAD (referring to the LOAD utility syntax diagram). That means ENFORCE CONSTRAINTS will be in effect even if you don't specify that option (of course, ENFORCE CONSTRAINTS will not be in effect if you specify ENFORCE NO). For a relevant example, see the example under the heading, "Enforcing referential constraints when loading data," on this page in the online Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-sample-control-statements.
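      As a minimal sketch (the table name and DD names here are placeholders), a LOAD control statement that retains RI-rejected rows in a discard data set might look like this:

      LOAD DATA INDDN SYSREC
        DISCARDDN SYSDISC
        ENFORCE CONSTRAINTS
        INTO TABLE MYSCHEMA.T1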

      Robert

  3. When tables are related such that T1 is parent to T2, and T3 is a child of T2 as well as a parent to T1 - when we do a LOAD of a table that is both parent and child to many tables, can the orphan records in the discard data set (with orphan rows stored automatically even when we don't specify ENFORCE explicitly, as you said) be used to correct, repopulate and validate the same table after load completion? Since the rows can belong to many interrelated tables, not just one table, SYSDISC is not useful for storing orphan rows in this case, right?

    1. For this load of tables that have a "circular" RI relationship (T1 is parent of T2, T2 is parent of T3, T3 is parent of T1), are you talking about LOAD REPLACE (replace any data already in the table space with new data) or LOAD RESUME (add data to existing data in the table space)?

  4. It's LOAD REPLACE. Can you also give insight into this discard case, for this circular RI relationship, for LOAD RESUME as well?

    1. If the three tables start out empty, I don't see that a discard data set would be of any use - since each of the three tables is a dependent table in this "circular RI" arrangement, ALL rows loaded into a table would be rejected with ENFORCE CONSTRAINTS in effect, because there would be no matching parent table rows (because the tables start out empty). In that case, the right way forward might be to load all three tables with ENFORCE NO and then use the CHECK DATA utility to identify any rows that are in violation of RI constraints.

      If the tables do not start out empty, and you are using LOAD REPLACE, a discard data set (and ENFORCE CONSTRAINTS behavior) might be reasonable, IF the new rows being added to a table (to replace existing rows) contain foreign key values that are at least mostly (if not entirely) matched by values in rows of the associated parent table; otherwise, it would be like starting with empty tables: why use a discard data set if all rows will be rejected because they lack (at the time of load) matching parent table values?

      Similar story with LOAD RESUME: if rows being added to a table contain foreign key values that are at least mostly covered by values in existing (at the time of load) parent table rows, a discard data set could be appropriate; otherwise, if rows being added to a dependent table have foreign key values that will only have matches in the parent table after rows have been added to the parent table, a discard data set would not be helpful - all rows that you want to add to the dependent table will be rejected.

      Bottom line: it seems to me that LOAD with ENFORCE CONSTRAINTS and a discard data set makes sense when the expectation is that RI constraint violations (with respect to rows being loaded) will be the exception versus the rule.

      Robert

    2. Will the orphan records get collected only if I explicitly define the SYSDISC data set in the JCL? What if I don't define it - will the discard data set automatically get created for all LOAD jobs, in order to collect any orphan rows that are generated?

    3. You have two options. Option 1: define the discard data set yourself, and reference it in a DD statement in the JCL of your LOAD utility job (see the information under "DISCARDDN" on the Db2 12 for z/OS documentation page at https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-syntax-options-control-statement).

      Option 2: use a Db2 template to dynamically allocate the discard data set. In that case, you would reference the template name via the DISCARDDN option of LOAD (information on the TEMPLATE Db2 utility control statement is at https://www.ibm.com/docs/en/db2-for-zos/12?topic=utilities-template).
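      As a sketch of option 2 (the template name and the data set name pattern are made-up examples):

      TEMPLATE DISCTMPL
        DSN 'MYHLQ.&DB..&TS..DISCARD'
      LOAD DATA INDDN SYSREC
        DISCARDDN DISCTMPL
        ENFORCE CONSTRAINTS
        INTO TABLE MYSCHEMA.T1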

      Robert

  5. Hello, is there a way to know/identify how much of a table's FREEPAGE and PCTFREE space has been used/exhausted?
    How do I estimate the best PCTFREE and FREEPAGE values for a table space?

    1. You might be able to figure out how much free space is left in a table space, but I do not think that that would be a very productive use of your time. Free space is there to preserve clustering between REORGs, so it is best to focus on the degree to which inserts are starting to go where they should not go, from a clustering perspective. The SYSIBM.SYSTABLESPACESTATS catalog table can be helpful for monitoring this situation (see https://www.ibm.com/docs/en/db2-for-zos/12?topic=tables-systablespacestats). The REORGINSERTS column of that table will tell you how many rows have been inserted into the table space (or table space partition) since the last REORG, and the REORGUNCLUSTINS column will tell you how many of those inserts were not near the target page (the target page is identified via the table's clustering index). If you see that a growing number of inserts are not going into pages near the target page, it's time for a REORG (if maintaining clustering is important for the table space).
      You could start with a PCTFREE value in the range of 10-20, and adjust that value up or down based on the target REORG frequency (e.g., if you want to REORG the table space once per month, and PCTFREE 20 keeps the table space very well clustered between REORGs, you might be able to adjust the PCTFREE value downward somewhat if you want to boost the efficiency of disk space usage).

      Note that FREEPAGE is not so useful for a table space as it is for an index; and, for an index, neither FREEPAGE > 0 nor PCTFREE > 0 is needed if the key is continuously-ascending in nature. Similarly, if a table's clustering index is continuously-ascending in nature, the table space should stay well-clustered without a need for free space.
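      As a quick sketch, a query like the following could flag table spaces where a growing share of inserts are badly placed (the 10% threshold here is arbitrary - pick one that fits your REORG strategy):

      SELECT DBNAME, NAME, PARTITION,
             REORGINSERTS, REORGUNCLUSTINS
      FROM SYSIBM.SYSTABLESPACESTATS
      WHERE REORGINSERTS > 0
        AND REORGUNCLUSTINS * 10 > REORGINSERTS
      ORDER BY REORGUNCLUSTINS DESC;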

      Robert

    2. That was helpful. Thanks, Rob!
