Thursday, May 30, 2024

Db2 for z/OS: Really Big Buffer Pools are Great, but Don't Stop There

 Back in 2018, I reviewed an organization's production Db2 for z/OS environment, and saw at that time the largest z/OS LPAR real storage size I'd ever seen: 1100 GB. The Db2 subsystem running in that LPAR had (not surprisingly) the largest buffer pool configuration I'd ever seen: 879 GB (referring to the aggregate size of all of the Db2 subsystem's buffer pools). Within that buffer pool configuration was (again, not surprising) the lagest single Db2 buffer pool I'd ever seen: approximately 262 GB (66,500,000 4K buffers).

Those "biggest I've ever seen" figures sustained that status until just recently, when I reviewed another production Db2 for z/OS system. The new "biggest I've ever seen" numbers are way larger than the previous maximums:

  • A z/OS LPAR with 2.4 terabytes of real storage
  • A Db2 subsystem with 1.7 terabytes of buffer pool resources.
  • An individual buffer pool of approximately 560 GB in size (140,000,000 4K buffers)
What I saw is great, and it's the future, folks. That humongous buffer pool configuration is eliminating a humongous number of read I/Os (synchronous and asynchronous), and that, in turn, boosts CPU efficiency and throughput for the Db2-accessing application workload. z/OS LPAR real storage sizes are getting bigger and bigger, in part because mainframe memory keeps getting less expensive on a per-GB basis, and in part because z/OS can utilize ever-larger amounts of real storage (z/OS 3.1 supports up to 16 TB of memory for one LPAR - though with the way memory is currently packaged on an IBM Z server, you're advised to limit the size of one LPAR's real storage resource to 10 TB), and in part because loading up on real storage does not increase the cost of software that runs in a z/OS system (the cost of that software is based on mainframe general-purpose CPU utilization - not real storage size).

Organizations that run production Db2 subsystems in z/OS LPARs that have large amounts of real storage are increasingly going with really large Db2 buffer pool configurations. That's a smart thing to do. Db2 allows for up to 16 TB of buffer pool space for a single subsystem, and going big is a definite plus for Db2 workload performance and CPU efficiency (just be sure that you don't over-burden a z/OS LPAR's real storage resource: you want to keep the LPAR's demand paging rate - available via an RMF Summary report - below 1 per second).

Here's the point I want to make with this blog entry: while really big buffer pools are great for Db2 application performance, don't focus solely on the size of a Db2 subsystem's buffer pool configuration. In addition to going big, take other steps to maximize the positive performance benefits of a large Db2 buffer pool configuration:

  • Use large real storage page frames for the busier buffer pools - In my view, a buffer pool's GETPAGE rate (obtainable from a Db2 monitor-generated statistics long report) is the best indicator of a pool's busy-ness. My recommendation is to use large real storage page frames for every pool that has a GETPAGE rate (during a busy hour of the processing day) in excess of 1000 per second. Large real storage page frames enhance CPU efficiency for page access by reducing the CPU cost of translating virtual storage addresses to real storage addresses. Some related things to note:
    • Page-fixing of buffers in a pool is a prerequisite for using large real storage page frames - This is done for a pool via a specification (in an -ALTER BUFFERPOOL command) of PGFIX(YES). Note that actually changing from not-page-fixed to page-fixed requires deallocation and reallocation of a pool - this usually happens as a consequence of recycling (i.e., stopping and then restarting) the associated Db2 subsystem. That deallocation and reallocation is also required to go from the default real storage frame size of 4K to a larger frame size.
    • Use the appropriate real storage frame size - For a buffer pool that is at least 20 GB in size (a little over 5,000,000 buffers, if we're talking about a 4K pool), I recommend the use of 2 GB page frames (this is accomplished via a specification of FRAMESIZE(2G) in an -ALTER BUFFERPOOL command). For a pool that is smaller than 20 GB, I recommend the use of 1 MB page frames (FRAMESIZE(1M)). Note that if a pool is defined with PGSTEAL(NONE) (see below for information about such pools, which are known as "contiguous" pools), 2 GB frames cannot be used - you need to go with FRAMESIZE(1M) for a PGSTEAL(NONE) pool, unless objects assigned to the pool are quite small (e.g., less than 100 4K pages), in which case the default 4K frame size would be appropriate (for a PGSTEAL(NONE) pool, Db2 will use a given large real storage page frame for buffers that hold pages of one object assigned to the pool).
    • Set the value of the z/OS parameter LFAREA appropriately - LFAREA, a parameter in the IEASYSnn member of the z/OS data set named SYS1.PARMLIB, specifies the amount of a z/OS LPAR's real storage resource that is to be managed in 1 MB (and, possibly, 2 GB) frames. I generally like to see an LFAREA specification that provides enough 1 MB (and 2 GB, when appropriate) frames to fully back FRAMESIZE(1M) (and, if relevant, FRAMESIZE(2G)) pools, but not much more than that. Why not much more? Because a lot of processes in a z/OS system can only use 4 KB page frames. If you think you might want to later enlarge some Db2 buffer pools that have a FRAMESIZE(1M) or a FRAMESIZE(2G) specification, you can make the LFAREA value for the frame size in question larger than you need at present - just don't go overboard with that.
    • Use the output of the Db2 command -DISPLAY BUFFERPOOL(ACTIVE) DETAIL to verify that pools that have a preferred frame size of 1M or 2G are fully backed by frames of the desired size - If you see, in the output of that command, that a pool with a VPSIZE of 100,000 buffers and a preferred frame size of 1M has 70,000 buffers allocated in 1M frames and 30,000 buffers allocated in 4K frames, it means that there aren't enough 1 MB frames in the system to fully back the pool. Note that for a pool that has a preferred frame size of 2G, you might see some buffers allocated in 1M frames, even if the system has enough 2 GB frames to "fully" back the pool. Why might that be the case? Well, in the interest of not wasting a lot of space in a 2 GB page frame, Db2 won't use such a frame if it can't fill at least about 95% of the frame with buffers. Suppose you have a pool that is about 41 GB in size. Db2 will use twenty 2 GB frames for that pool, and the remaining 1 GB of buffers will be allocated in 1 MB frames, As far as I'm concerned, that is not at all a big deal. No problem.

  • If you have any PGSTEAL(NONE) buffer pools, aim for zero activity in the overflow area of those pools - A PGSTEAL(NONE) buffer pool is also known as a "contiguous" buffer pool. Such a pool is intended to be used to completely cache in memory the objects assigned to the pool. A contiguous buffer pool can optimize efficiency for page access, but some of that advantage is lost if there is any activity in the pool's overflow area, which is where any buffer stealing would occur (if the pool were in fact not large enough to hold every page of every object assigned to the pool). Output of the Db2 command -DISPLAY BUFFERPOOL (ACTIVE) DETAIL will show if there is any activity in the overflow area of a PGSTEAL(NONE) pool. If you see that there is activity in the overflow area of a PGSTEAL(NONE) pool (unless it's a really small amount of activity), either take one or more objects out of the pool (by reassigning them to other pools) or make the pool larger (always keeping in mind that you want the demand paging rate of the z/OS LPAR to be less than 1 per second - don't over-burden the LPAR's real storage resource).
And there you have it. For a Db2 for z/OS buffer pool configuration, really big is really great, but take the time to go the extra mile by optimally configuring the pools in the configuration.

15 comments:

  1. Hi Robert, we had problems in the past with PGSTEAL(NONE), when a REORG would take a long time to refresh the REORGed tablespace's pages.
    Can you please tell us if this is a problem and what do we need to look out for?
    Mare details on how Db2 handles Utilities against tablespaces in PGSTEAL(NONE) buffer pools?

    ReplyDelete
    Replies
    1. OK, the thing to keep in mind is this: when a table space assigned to a PGSTEAL(NONE) buffer pool is online REORG-ed, the shadow data sets (which will become the new "original" data sets for the table space at the conclusion of the online REORG) are, of course, assigned to the same buffer pool as the table space being REORGed. If the table space is fairly large (or if the PGSTEAL(NONE) buffer pools is fairly small), that can put a lot of pressure on the PGSTEAL(NONE) pool, by potentially driving a lot of activity in the pool's overflow area. What to do about this? A few thoughts:
      * Assign the table space's indexes to a buffer pool other than the one to which the table space is assigned (because indexes are REORG-ed, too, when you online-REORG a table space).
      * Make the PGSTEAL(NONE) pool larger, if the z/OS LPAR's real storage resource would permit that (as noted in the blog entry, you want the LPAR's demand paging rate to be less than 1 per second).
      * If the table space in question is partitioned, do partition-level REORGs to reduce space requirements in the PGSTEA(NONE) buffer pool.
      * Consider whether the table space in question even needs to be reorganized. At some sites, table spaces are periodically REORG-ed whether that is needed or not. REORG when needed, and don't when it's not needed.
      * Try not to online REORG several objects assigned to the same PGSTEAL(NONE) pool at the same time.
      * If none of the above are helpful or possible, you might need to consider reassigning the table space in question to a standard PGSTEAL(LRU) pool.

      Robert

      Delete
  2. Hello Robert
    I think I heard in a DB2 webcast, due to actual hardware capacity (cpu, disk controllers, real storage) , PGSTEAL(FIFO) can be better option than LRU because simplify BPOOL algorithm to keep it.
    Can this be true?

    In a bad scenario, if DBM1 takes a system dump (or IBM asked for one) these big BPOOL will freeze lpar for several seconds, correct?
    (of course we don´t expect this to happen...)

    Thanks!


    ReplyDelete
    Replies
    1. I'd say that in the very large majority of situations, the CPU cost associated with tracking references to pages in a buffer pool when PGSTEAL(LRU) is in effect is way more than offset by the CPU savings related to the reduced read I/O activity typically delivered by the LRU buffer steal algorithm. I see PGSTEAL(FIFO) as very much a niche option - beneficial only in unusual situations. PGSTEAL(FIFO) MIGHT be beneficial for.a buffer pool when pages are rarely re-referenced after initially being read into memory - like I said, an unusual situation.

      As for a DBM1 dump, the impact of such an occurrence on a z/OS LPAR will depend on a number of factors, including the LPAR's real storage resource and the number of engines (processors) configured for the LPAR. I think that in most cases the impact of a DBM1 dump will be secondary to the impact of the Db2 failure situation that triggered the dump.

      Robert

      Delete
    2. Thank you very much Robert

      Delete
  3. Hello Robert
    Can this increase in bufferpools size affect db2 commit (gets heavier)?
    Do we need to review CHKFREQ in ZPARM?

    ReplyDelete
    Replies
    1. I don't see that larger buffer pools would make commit processing more expensive, nor do I see that larger buffer pools would impact the preferred value of CHKFREQ in ZPARM. With regard to CHKFREQ, my preference is to go with time-based Db2 checkpoint frequency, and I like the default specification of 3 minutes between checkpoints.

      What larger buffer pools could impact is the preferred value of the VDWQT specification for a given pool. A larger buffer pool might cause VDWQT to be hit less frequently, and that could lead to more synchronous write actions for the buffer pool than you'd like. As a general rule, I like to go with the default VDWQT value of 5 for a buffer pool. I might want to adjust that value downwards if I saw (via my Db2 monitor or in the output of the Db2 command -DISPLAY BUFFERPOOL(ACTIVE) DETAIL) that synchronous writes were outnumbering asynchronous writes for a given pool. An exception to that rule: I wouldn't worry about synchronous writes for a pool outnumbering asynchronous writes if the volume of write activity were quite low for the pool (e.g., in the single digits per second).

      Robert

      Delete
    2. Thank's Robert!

      Delete
  4. LFAREA for 1M frames is no longer a pre-allocation since z/OS 2.3 - it's now a limit. So you can set this higher than your current allocations safely without affecting the availability of 4K pages - or worry about pressure on 4K pages causing 1M pages to be broken down.

    ReplyDelete
    Replies
    1. That's very useful information - thanks for posting this comment.

      Robert

      Delete
  5. Hi Robert.
    We have a contiguous bufferpool of 280K pages. One object in the pool takes up a majority of the pages (TS: 104K pages, 4 IX's totaling another 100K pages). This object is reorged infrequently, however when it does we receive the messages that pages are being read into overflow for the pool. The documentation indicates that when pages are read into overflow, they remain there until the object is stopped. Are you able to clarify how the utility processes this object during reorg processing? In particular, will the objects reside entirely in the contiguous BP following reorg completion and no longer remain in the overflow? Does DB2 'tidy-up' the contiguous BP following reorg and if so, can you give a brief overview of the process?

    ReplyDelete
    Replies
    1. Online REORGs are an important consideration when planning for the use of Db2 for z/OS contiguous buffer pool functionality. What people often fail to think about are the shadow data sets created by Db2 as part of online REORG processing.

      In your situation, you have a contiguous pool with a VPSIZE of 280,000. The overflow area for that pool will be 6400 buffers, which means the contiguous part of the pool has (280,000 - 6400) = 273,600 buffers. 204,000 of those 273,600 buffers will be occupied by the table space in question and its associated indexes. When you online REORG the table space, shadow data sets roughly equal in size to the "original" table space and data sets will be created by Db2, and those will of course be assigned to the same contiguous buffer pool to which the "original" table space and index data sets are assigned. The approximately 204,000 buffers needed for those shadow data sets will of course overwhelm the 69,600 buffers available in the contiguous part of the pool (and that 69,600 number, which is 273,600 buffers of contiguous space minus the 204,000 buffers needed for the table space and its indexes, assumes that the table space and its indexes are the ONLY objects assigned to the contiguous pool). That, in turn, will lead to a lot of I/O activity for the overflow area of the pool, since that is where buffer stealing occurs.

      Bottom line: in the scenario you have described, if you want to be able to online REORG the table space while minimizing (or even eliminating) associated activity in the pool's overflow area, you will need to substantially enlarge the pool so that Db2 will be able to work with the shadow data sets in the contiguous part of the pool, versus having to pound the overflow area.

      Robert

      Delete
    2. Thanks Robert. Appreciate the prompt response.
      We will proceed with increasing the contiguous BP.
      Just a quick follow-up. What/How does DB2 manage the contiguous pool following completion of the reorg if the shadow object has been allocated to the overflow area and/or paged out?

      Delete
    3. I'm not 100% sure on that, but my thinking is that a) when the SWITCH phase of online REORG has completed and the formerly "original" data sets of the objects REORGed (table space, indexes) are no longer open and allocated to Db2, the buffers in the PGSTEAL(NONE) pool that held pages of the data sets will be freed; and, b) (if necessary), the pages of the former shadow data sets (now the new original data sets) will be asynchronously read into the contiguous part of the pool (I put "if necessary" because this should not be necessary if the pages of the shadow data sets could all go in the contiguous part of the pool in the first place.

      I'd say this is the way I hope it works.

      Robert

      Delete
    4. Thanks again Robert. That is also what I expect/hope is the case.

      Delete