Recently, a DB2 for z/OS professional I’ve known for some years sent to me a question about the relationship between the PGSTEAL and PGFIX options of the DB2 command -ALTER BUFFERPOOL. It took a few iterations of question and answer to get things straightened out, and I thought, “Hmm. If this person, who has lots of DB2 for z/OS experience and knowledge, needed a little help in getting PGSTEAL and PGFIX straightened out in his mind, it's likely that other DB2 people are in the same boat.” And so I’m writing this blog entry, in the hope that it will be helpful to people who have some uncertainty about the dependencies – if any – between PGSTEAL and PGFIX.
OK, the “if any” in the preceding sentence might suggest to you that maybe there isn’t anything in the way of interdependency with regard to PGSTEAL and PGFIX. In fact, there really isn't. Yes, there’s a recommended combination of PGSTEAL and PGFIX settings, and I’ll get to that, but the suggested combination is about the way in which PGSTEAL and PGFIX can both support a particular performance objective, as opposed to having anything to do with the way these two buffer pool specifications affect each other, because they don't affect each other.
“What?” you might ask. “How is it that two buffer pool configuration options that both have 'page' in their long names (‘PG’ is short for ‘page’) don’t really have anything to do with each other?” This is so because PGSTEAL and PGFIX have very different functions. This functional difference might have been more readily apparent to people if PGSTEAL had instead been labeled something like BSTEAL – short for “buffer steal,” because PGSTEAL is about management of buffers, which is a DB2 responsibility, and PGFIX is about management of real storage – a responsibility of the z/OS operating system.
Let me make this distinction even more clear: PGSTEAL is related to virtual storage management – DB2’s management of virtual storage space that belongs to DB2 – and PGFIX is related to real storage management – something in z/OS’s bailiwick.
Consider the possible values of PGSTEAL: LRU, FIFO, and NONE. The first two are specified when you anticipate that there will be some buffer-steal activity for the pool in question (i.e., there are more pages in objects assigned to the pool than there are buffers in the pool). You’d go with LRU (least recently used) if you anticipate that there will be quite a bit of buffer-steal activity for the pool (“OK, DB2, you’re probably going to have to frequently steal buffers for this pool. When you do that – when you have to replace a table or index page currently in a buffer with another page – steal the buffer holding the page that’s gone the longest time without being referenced.”). You’d choose FIFO (first in, first out) when you expect some – but very little – buffer-stealing for the pool (“Hey, DB2. You’re going to have to do a very small amount of buffer stealing for this pool. That being the case, don’t waste CPU cycles keeping track of which buffers hold the pages that have gone the longest time without being referenced. Just steal the buffer holding the page brought into memory the longest time ago.”).
How about PGSTEAL(NONE)? When would you go with that option? You’d go this route if you anticipate that there will be NO buffer-steal activity for a pool – in other words, the pool has more buffers than there are pages belonging to objects assigned to the pool (“Hey, DB2. I’m planning on using this pool to cache one or more table spaces and/or indexes in memory in their entirety. I’m doing that because I want maximum performance for these babies. Help me out.”). And, given the message you’ve sent to DB2, DB2 will help you out: it will asynchronously read into the pool every page belonging to an object assigned to the PGSTEAL(NONE) pool when the object is first referenced after a DB2 start-up, and in optimizing SQL statements targeting objects assigned to the pool, it will perform access path selection with the assumption that read I/O costs will be zero (not counting possible reads associated with use of work file table spaces).
Next, PGFIX settings. There are two: YES and NO. When NO (the default) is in effect, a real storage page frame occupied by a DB2 buffer can be stolen by z/OS (i.e., the contents of the page frame can be moved to auxiliary storage, also known as the page data sets) in order to make room for some other page that has to be brought into real storage (in performing such page steal actions, z/OS utilizes a least-recently-used algorithm). To prevent a z/OS real storage page frame steal action from interfering with a buffer read I/O (DB2 copies a table space or index page from disk – or maybe from a group buffer pool, in a data sharing system – into a buffer) or write I/O operation (DB2 copies a page in a buffer to disk or to a group buffer pool), a buffer will be fixed in memory (i.e., made non-page-able) prior to the I/O action, and released (made page-able again) after the I/O action. When PGFIX(YES) is in effect for a pool, buffers belonging to the pool are fixed in memory (made non-page-able) as soon as the pool is allocated, and they remain fixed in memory as long as the pool is allocated. Because the buffers are always in a fixed-in-memory state, the page-fix/page-release actions required for all buffer read and write I/O operations when PGFIX(NO) is used are not necessary. That makes buffer read and write I/Os more CPU-efficient when PGFIX is set to YES, and that makes PGFIX(YES) particularly beneficial for pools with high rates of read and write I/Os.
PGFIX(YES) is also a prerequisite if you want a DB2 buffer pool to be backed by large real storage page frames (1 MB or, starting with DB2 11 for z/OS, 2 GB frames, versus the traditional 4 KB frames). When a pool’s buffers are located in large real storage page frames, further CPU savings (beyond less-expensive I/O operations) are realized, because large real storage page frames make virtual-storage-to-real-storage address translation more efficient.
If we take a look at some combinations of PGSTEAL and PGFIX settings, you’ll see what I meant about there being essentially nothing in the way of interdependency. Take PGSTEAL(LRU) and PGFIX(YES). Will the fact that the pool’s buffers are fixed in memory have any impact on the level of buffer stealing done by DB2 for this pool? NO! REGARDLESS of whether a pool’s buffers are fixed in memory or not, DB2 will ALWAYS steal a buffer in the pool WHENEVER IT HAS TO, and a buffer HAS TO BE STOLEN whenever all the pool’s buffers are occupied (by table space and/or index pages) and a page has to be read into the pool from disk (or from a group buffer pool in a coupling facility). The PGFIX(YES) setting for this pool simply means that z/OS will not move any of the pool’s buffers out of real storage to auxiliary storage – it in no way restricts DB2’s ability to steal buffers so that pages not yet in the pool can be brought in from disk (or from a coupling facility).
Change the combination of PGSTEAL and PGFIX settings to NONE and NO, respectively, and again you’ll see that there is nothing in the way of an interdependency. PGSTEAL(NONE) means that DB2 does not EXPECT to have to do any buffer stealing for the pool, because there is an assumption that the number of pages belonging to objects assigned to the pool does not exceed the quantity of buffers in the pool (i.e., does not exceed the pool’s VPSIZE). Of course, if the number of pages in objects assigned to the pool does exceed the pool’s VPSIZE (perhaps VPSIZE was too small to begin with, or it was OK initially but objects assigned to the pool have gotten larger), and a buffer has to be stolen because all buffers are occupied and a page has to be brought into the pool, a buffer WILL BE STOLEN because (as noted previously) a buffer will ALWAYS BE STOLEN when a buffer HAS TO BE STOLEN (when PGSTEAL is set to NONE for a pool and some buffer stealing has to be performed, it will be done on a FIFO basis – first in, first out). What about the PGFIX(NO) setting for this pool? Will the pool’s PGSTEAL(NONE) setting in any way restrict the ability of z/OS to move buffers belonging to the pool out of real storage to auxiliary storage, as necessary, to free up real storage page frames so that new pages can be brought into real storage? NO! With PGFIX(NO) in effect, a pool’s buffers are absolutely fair game for being paged out to auxiliary storage, REGARDLESS of the pool’s PGSTEAL setting (that said, buffers in PGFIX(NO) pools are not often paged out of real storage, because z/OS, as noted, steals real storage page frames on a least-recently-used basis, and DB2 tends to “touch” its buffers with a frequency that makes them unlikely to be among the least-recently-used pages in a z/OS LPAR’s real storage).
DB2, then, manages its buffers as it needs to (with some guidance provided by you via a pool’s PGSTEAL setting), and z/OS manages its real storage resource as it needs to (with the understanding that buffers in a PGFIX(YES) pool are off-limits with regard to page-out actions), and the DBMS and the OS attend to these respective tasks pretty much independently. PGSTEAL settings do not impact z/OS page-frame-stealing, and PGFIX settings do not impact DB2 buffet-stealing.
Now, early on in this post I mentioned that while there essentially aren’t any interdependencies between the various settings of PGSTEAL and PGFIX for a buffer pool, there is a recommendation I have concerning a particular combination of PGSTEAL and PGFIX values. Here it is: if you are going to specify PGSTEAL(NONE) for a buffer pool, specify PGFIX(YES) for that same pool, unless the demand paging rate of the z/OS LPAR is higher than it should be (if the demand paging rate – available by way of a z/OS monitor – is in the low single digits or less per second, it’s OK). Why I make this recommendation: presumably, you assign objects to a PGSTEAL(NONE) buffer pool because you in fact want them to be cached in memory in their entirety. You would do this, I imagine, for objects for which maximum performance is really important. If that’s the case, why not really max out performance as it pertains to accessing these objects? Make the pool PGFIX(YES) as well as PGSTEAL(NONE), so that you can get the CPU efficiency benefit of large real storage page frames (of course, to get that benefit you need to have large page frames available in the LPAR to back the PGFIX(YES) pool – information about that can be found in an entry I posted to this blog a fewmonths ago).
And there you have it. Just remember that PGSTEAL is related to DB2’s management of its buffers, while PGFIX is related to z/OS’s management of the LPAR’s real storage resource. Two different buffer pool configuration settings, for two different aspects of DB2 performance.
Very good explanation. We do not yet have large page frames defined but are looking into that too. I have a small bufferpool that contains only small, high read only activity. I'm going to alter it to PGSTEAL(NONE). My question is: will db2 load the pages from the tablespaces after making this change or is there something else that needs to be done? Your comments say the pages will be loaded the first time the object is referenced after a db2 startup. What about if db2 has been up for quite a while already?ReplyDelete
Hi Robert. As an update to my previous question, I found that when I set PGSTEAL=NONE on a bufferpool, it does load all pages. The problem is that it loads all NACTIVE pages, not just the pages containing data. So all the preformatted NACTIVE pages that do not contain data are also loaded into the bufferpool. I've asked IBM about this but haven't gotten a response yet.ReplyDelete
Thanks for relating this experience, Russell (and apologies for taking so long to respond).Delete
I know I'm late to this question - but do you have any recommendations for audit or statistics reporting to determine tables that would be good candidates for PGSTEAL(NONE)?ReplyDelete
Generally speaking, tracing that would provide data-access statistics at the table level would be pretty expensive, overhead-wise.Delete
You might want to make use of information in the real-time statistics tables in the Db2 catalog (SYSTABLESPACESTATS and SYSINDEXSPACESTATS). What you're looking for are objects that are not super-big (might not make sense to try to completely cache a really big table or index in memory), and that are accessed very frequently. NACTIVE in SYSTABLESPACESTATS and SYSINDEXSPACESTATS gives you an idea of an object's size. To get an idea as to frequency of access, check REORGSCANACCESS in SYSTABLESPACESTATS and REORGINDEXACCESS in SYSINDEXSPACESTATS.