I want to start out this memory-for-MIPS series with information on getting more performance benefit from the buffer pool virtual storage resource that you already have, for three reasons: 1) the buffer pool configuration is typically the primary DB2 consumer of a system's memory resource, 2) you can really do some good by tuning what you've got, and 3) before asking for more memory for DB2 buffers, you ought to be making good use of the memory you've got.
Following, then, are some of my favorite hints and tips for getting your DB2 buffer pool house in order:
A buffer pool dedicated to the work file table spaces is a different animal. Let me start by saying that it's a REALLY good idea to have buffer pools that are dedicated to your work file table spaces (one for the 4K table spaces, and one for the 32K table spaces -- the latter being, often, MUCH more heavily utilized in a DB2 9 or 10 environment, versus a DB2 V8 system). In configuring these buffer pools, keep a couple of things in mind:
- Almost all of the reads of data in work file table spaces will be of the prefetch variety. Why should you care about this? Because the default value for the virtual pool sequential steal threshold, VPSEQT, is 80. Basically, that setting means that only about 80% of the buffers in a pool dedicated to work file table spaces will get a lot of use, with the other 20% being reserved for relatively infrequent random (single-page) reads. Raise that threshold value significantly, and it's like getting more memory for free (you should see fewer prefetch read I/Os for the pool in question, and fewer read I/Os means less CPU spent in driving I/O operations). I've often seen a VPSEQT value in the 95-99 range used to good effect for a work file-dedicated buffer pool (a sample command appears after this list).
- The motivation for externalizing updated work file table space pages to disk is not the same as it is for application table spaces and indexes. For application data (in table spaces and indexes), you don't want too much time to pass between the updating of a page and the externalization of that changed page to disk. Why? Because if the DB2 subsystem should fail, restart processing will be elongated if a large number of pending writes have to be processed (part of restart following abnormal termination involves DB2 making sure that committed updates made to pages that were in memory and not yet externalized at the time of the failure are reflected in the data on disk; otherwise, the data on disk would be inconsistent). It's a different story for pages belonging to work file table spaces. Those table spaces are conceptually like scratch pads and don't have to be processed during restart as do application table spaces and indexes and the DB2 catalog. So, for the work file table spaces, page externalization is about ensuring that the associated buffer pools don't get too full of non-stealable pages. That being the case, the deferred write and vertical deferred write queue thresholds (DWQT and VDWQT) can be raised for work file-dedicated buffer pools without adversely affecting DB2 restart performance. Raising these thresholds for the pools used for work file table spaces reduces disk write activity and that reduces CPU consumption. Some folks specify 70 and 40 for DWQT and VDWQT, respectively, some go with 80 and 50, and some go with even higher values (the defaults are 30 for DWQT and 5 for VDWQT). A word of caution: at one site I saw that DWQT and VDWQT for a work file-dedicated buffer pool were both set to 90, and the data manager threshold (indicating that 95% of a pool's buffers are unavailable) was hit a great many times per hour for that pool. That is REALLY BAD for performance (big spike in GETPAGE activity), so watch that you don't go overboard with choosing elevated values for the deferred write thresholds for your work file buffer pools (a DB2 monitor will tell you if the data manager threshold has been reached for a buffer pool; so, too, will output of the DB2 command -DISPLAY BUFFERPOOL DETAIL).
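To illustrate, suppose your 4K work file table spaces are assigned to BP7 (a stand-in name; your shop's pool assignments will likely differ). The threshold changes described in the two items above could then be made with a single command along these lines (the values shown are just one reasonable combination, not a recommendation for every site):

-ALTER BUFFERPOOL(BP7) VPSEQT(95) DWQT(70) VDWQT(50)

Threshold changes made this way take effect right away, so you can start with moderate values and nudge them up later while keeping an eye on the pool's read and write I/O activity and on data manager threshold occurrences.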
Steal smart. By default, DB2 utilizes a least-recently-used (LRU) algorithm in identifying buffers that are to be stolen to accommodate new pages being brought into memory from disk. That's the right algorithm to use in most cases; however, if you have a DB2 9 (or earlier) system and you "pin" some objects in memory by assigning those objects to a pool that has at least as many buffers as the objects have pages (e.g., you assign index A, with 200 pages, and index B, with 300 pages, to a pool that has 500 or more buffers), then you should use the first-in, first-out (FIFO) page steal algorithm for the pin-pool. Why? Because FIFO is a simpler, and therefore less costly (CPU-wise), page steal algorithm than LRU. For a pool used for object pinning, little to no buffer stealing will occur (depending on whether or not you actually have enough buffers in the pool to hold all the pages of the objects assigned to the pool), so why go with the more expensive page steal algorithm?
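By way of example, if the pool used for pinning objects in memory is BP10 (a hypothetical name), the switch could be made with:

-ALTER BUFFERPOOL(BP10) PGSTEAL(FIFO)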
In a DB2 10 environment you have a new choice here: PGSTEAL(NONE). When that option is specified for a buffer pool (a sample command appears after the list below), here's what happens:
- When an object assigned to a PGSTEAL(NONE) pool is first accessed, DB2 will, in the background, prefetch every page of that object into the pool (the application process that first accessed the object doesn't have to wait for this prefetch-all process to finish -- it gets the pages that it needs and goes on from there).
- For further access to the now-fully-in-memory object, the DB2 optimizer will assume that no I/Os will be required, and this assumption will factor into SQL statement access path selection.
- If in fact the pool does not have enough buffers to hold all of the pages of an object assigned to the pool, buffers will be stolen as needed based on the FIFO page steal algorithm.
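As with the FIFO example above, the pool name here is hypothetical; in a DB2 10 system the new option would be specified like so:

-ALTER BUFFERPOOL(BP11) PGSTEAL(NONE)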
Use page-fixing of buffers appropriately. Since DB2 V8, you've had the option of fixing a pool's buffers in memory via the PGFIX(YES) option of the -ALTER BUFFERPOOL command. Page-fixing a pool's buffers (meaning that the page frames holding those buffers can't be stolen by z/OS) delivers CPU savings by eliminating the need to fix in memory and then release a buffer every time it receives a page read in from disk or provides a page to be written out to disk (those page fix and release actions become unnecessary when the buffers are always in a fixed-in-memory state). Page-fixing, then, can boost a DB2 system's CPU efficiency when it's used for pools that have a high level of I/O activity (and the focus here is usually on read I/O activity). If there are 10 or 20 read I/O operations per second for a buffer pool, fixing that pool's buffers in memory won't have much of an impact on CPU consumption. If the rate of read I/O activity for a pool is hundreds or thousands per second, page-fixing buffers can deliver a nice CPU savings benefit. You might think that it would be easier to specify PGFIX(YES) for all of a DB2 subsystem's buffer pools, but unless you're really awash in mainframe memory I wouldn't recommend that approach. I think that it's a good idea to have some pools -- those with lower rates of read I/O activity -- for which buffers are not page-fixed. That approach provides z/OS with something of a safety valve: stealable page frames holding DB2 buffers that can be utilized if the system memory resource becomes overtaxed.
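Here's what the change looks like for a hypothetical high-read-I/O pool named BP8:

-ALTER BUFFERPOOL(BP8) PGFIX(YES)

Keep in mind that, unlike the threshold changes discussed earlier, a PGFIX change doesn't take effect until the pool is next allocated (for most sites, that means the next time the DB2 subsystem is stopped and restarted).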
You may have heard that DB2 10 will use 1 MB page frames (if available) for page-fixed buffer pools. That's true, and this feature of DB2 10 further boosts the CPU savings effect of fixing buffers in memory by improving the efficiency of virtual-to-real storage address translation (versus a 4 KB page frame usage scenario). Keep this in mind, though: DB2 9 can't use 1 MB page frames for buffer pools, whether page-fixed or not. What this means is that you don't want to configure an overly large portion of your z/OS system's real storage resource to be managed using 1 MB page frames (that's done via the LFAREA parameter of the IEASYSxx member of PARMLIB) until you've migrated to DB2 10 and are likely past the point of needing to fall back to DB2 9. If you want to be conservative, migrate to DB2 10 and THEN configure your z/OS system for 1 MB page frames.
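For reference, the large frame area is defined with the LFAREA parameter in the IEASYSxx member, using a specification along the lines of LFAREA=2G (that value is purely illustrative) or a percentage of real storage. The exact forms accepted depend on your z/OS level, so check the MVS Initialization and Tuning Reference before making the change, and size the area with your page-fixed DB2 10 buffer pools in mind.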
Consider robbing Peter to pay Paul, if Peter has more than he needs and Paul doesn't have enough. As I mentioned up front, this first entry in my three-part memory-for-MIPS series is focused on making the best use of the memory already allocated for your DB2 buffer pools. That doesn't mean that you can't make a pool larger -- you can, if you decrease the size of another pool by the same amount. Suppose that buffer pools BP2 and BP3 are both sized at 40,000 buffers in your system. If the read I/O rate for BP2 is 20 per second and the rate for BP3 is 2000 per second, I'd seriously consider taking a big chunk of BP2's buffers away and giving them to BP3 (for example, you might take BP2 from 40,000 down to 20,000 buffers, and take BP3 from 40,000 buffers to 60,000). Such a move might put a serious dent in the high read I/O rate seen for BP3 without substantially increasing the read I/O rate associated with BP2.
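A sketch of the commands for that hypothetical rebalancing (VPSIZE specifies the number of buffers in a pool):

-ALTER BUFFERPOOL(BP2) VPSIZE(20000)
-ALTER BUFFERPOOL(BP3) VPSIZE(60000)

The size changes take effect dynamically, and shrinking the donor pool along with growing the receiver keeps the overall real storage demand of the buffer pool configuration where it was.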
As for these read I/O rates I've mentioned, you can get those from a DB2 monitor statistics detail report, or perhaps from a monitor's online display of buffer pool activity. You can also get this information from the output of the DB2 command -DISPLAY BUFFERPOOL(ACTIVE) DETAIL (and I specify ACTIVE in the command to get information for pools that are actually being used). To get per-second figures using the command output, do the following:
- Issue -DISPLAY BUFFERPOOL(ACTIVE) DETAIL at the beginning of some busy hour of the day (referring to workload being processed by the DB2 subsystem). The output of this issuance of the command does not have to be retained.
- Issue -DISPLAY BUFFERPOOL(ACTIVE) DETAIL a second time, one hour after issuing the command the first time. Retain the output from this issuance of the command.
- A few minutes after issuing the command the second time, issue it a third time. All you want from the output of this issuance of the command is the timestamp value that appears in the DSNB409I part of the output (e.g., "INCREMENTAL STATISTICS SINCE 10:00:48 MAR 19, 2012"). That's the "to" time for the interval of activity captured in the second issuance of the -DISPLAY BUFFERPOOL(ACTIVE) DETAIL command. The "from" time for this interval is the timestamp value found in the output of the second issuance of the command. Subtract the "from" timestamp value from the "to" value, and you have the number of seconds in the interval. Divide the read I/O figures in the output of the second issuance of the -DISPLAY BUFFERPOOL(ACTIVE) DETAIL command by this number of seconds, and you have per-second rates of activity (a worked example follows this list).
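To make the arithmetic concrete, here's a worked example with made-up numbers: if the "from" timestamp (from the output of the second issuance of the command) is 10:00:48 and the "to" timestamp (from the third issuance) is 11:02:12, the interval is 1 hour, 1 minute, and 24 seconds, or 3,684 seconds. If the second issuance of the command shows, say, 5,526,000 sequential prefetch read I/Os for a given pool during that interval, the rate for that category of read activity is 5,526,000 / 3,684 = 1,500 per second.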
Whatever the source of your numbers (DB2 monitor or DB2 command output), what you want, for each pool, is the TOTAL read I/O rate per second. That's the sum of synchronous read I/Os (random + sequential) and asynchronous read I/Os (sequential prefetch + list prefetch + dynamic prefetch). This rate is your guide to appropriate use of buffer page-fixing, and your measuring stick for gauging the impact of, for example, increasing VPSEQT for a work file-dedicated pool, or increasing the size of a high-read-I/O pool by adding buffers taken from a low-read-I/O pool. In part two of this three-part memory-for-MIPS series, we'll revisit read I/O rates in the context of boosting the aggregate size of a buffer pool configuration. Look for that next entry to be posted in a week or two.