Recently, a DB2 for z/OS professional I’ve known for some
years sent to me a question about the relationship between the PGSTEAL and
PGFIX options of the DB2 command -ALTER BUFFERPOOL. It took a few iterations of
question and answer to get things straightened out, and I thought, “Hmm. If
this person, who has lots of DB2 for z/OS experience and knowledge, needed a
little help in getting PGSTEAL and PGFIX straightened out in his mind, it's likely that other
DB2 people are in the same boat.” And so I’m writing this blog entry, in
the hope that it will be helpful to people who have some uncertainty about the dependencies
– if any – between PGSTEAL and PGFIX.
OK, the “if any” in the preceding sentence might suggest to
you that maybe there isn’t anything
in the way of interdependency with regard to PGSTEAL and PGFIX. In fact, there
really isn't. Yes, there’s a recommended combination of PGSTEAL and PGFIX
settings, and I’ll get to that, but the suggested combination is about the way
in which PGSTEAL and PGFIX can both support a particular performance objective,
as opposed to having anything to do with the way these two buffer pool
specifications affect each other, because
they don't affect each other.
“What?” you might ask. “How is it that two buffer pool
configuration options that both have 'page' in their long names (‘PG’ is short
for ‘page’) don’t really have anything to do with each other?” This is so
because PGSTEAL and PGFIX have very different functions. This functional
difference might have been more readily apparent to people if PGSTEAL had
instead been labeled something like BSTEAL – short for “buffer steal,” because
PGSTEAL is about management of buffers, which is a DB2 responsibility, and
PGFIX is about management of real storage – a responsibility of the z/OS
operating system.
Let me make this distinction even more clear: PGSTEAL is
related to virtual storage management
– DB2’s management of virtual storage space that belongs to DB2 – and PGFIX is
related to real storage management –
something in z/OS’s bailiwick.
Consider the possible values of PGSTEAL: LRU, FIFO, and
NONE. The first two are specified when you anticipate that there will be some
buffer-steal activity for the pool in question (i.e., there are more pages in
objects assigned to the pool than there are buffers in the pool). You’d go with
LRU (least recently used) if you anticipate that there will be quite a bit of
buffer-steal activity for the pool (“OK, DB2, you’re probably going to have to
frequently steal buffers for this pool. When you do that – when you have to
replace a table or index page currently in a buffer with another page – steal
the buffer holding the page that’s gone the longest time without being referenced.”).
You’d choose FIFO (first in, first out) when you expect some – but very little
– buffer-stealing for the pool (“Hey, DB2. You’re going to have to do a very
small amount of buffer stealing for this pool. That being the case, don’t waste
CPU cycles keeping track of which buffers hold the pages that have gone the
longest time without being referenced. Just steal the buffer holding the page
brought into memory the longest time ago.”).
How about PGSTEAL(NONE)? When would you go with that option?
You’d go this route if you anticipate that there will be NO buffer-steal
activity for a pool – in other words, the pool has more buffers than there are pages
belonging to objects assigned to the pool (“Hey, DB2. I’m planning on using
this pool to cache one or more table spaces and/or indexes in memory in their
entirety. I’m doing that because I want maximum performance for these babies.
Help me out.”). And, given the message you’ve sent to DB2, DB2 will help
you out: it will asynchronously read into the pool every page belonging to an
object assigned to the PGSTEAL(NONE) pool when the object is first referenced
after a DB2 start-up, and in optimizing SQL statements targeting objects
assigned to the pool, it will perform access path selection with the assumption
that read I/O costs will be zero (not counting possible reads associated with
use of work file table spaces).
Next, PGFIX settings. There are two: YES and NO. When NO (the
default) is in effect, a real storage page frame occupied by a DB2 buffer can
be stolen by z/OS (i.e., the contents of the page frame can be moved to
auxiliary storage, also known as the page data sets) in order to make room for
some other page that has to be brought into real storage (in performing such
page steal actions, z/OS utilizes a least-recently-used algorithm). To prevent
a z/OS real storage page frame steal action from interfering with a buffer read
I/O (DB2 copies a table space or index page from disk – or maybe from a group
buffer pool, in a data sharing system – into a buffer) or write I/O operation (DB2
copies a page in a buffer to disk or to a group buffer pool), a buffer will be
fixed in memory (i.e., made non-page-able) prior to the I/O action, and
released (made page-able again) after the I/O action. When PGFIX(YES) is in
effect for a pool, buffers belonging to the pool are fixed in memory (made non-page-able)
as soon as the pool is allocated, and they remain fixed in memory as long as
the pool is allocated. Because the buffers are always in a fixed-in-memory
state, the page-fix/page-release actions required for all buffer read and write
I/O operations when PGFIX(NO) is used are not necessary. That makes buffer read
and write I/Os more CPU-efficient when PGFIX is set to YES, and that makes
PGFIX(YES) particularly beneficial for pools with high rates of read and write
I/Os.
PGFIX(YES) is also a prerequisite if you want a DB2 buffer
pool to be backed by large real storage page frames (1 MB or, starting with DB2
11 for z/OS, 2 GB frames, versus the traditional 4 KB frames). When a pool’s
buffers are located in large real storage page frames, further CPU savings (beyond
less-expensive I/O operations) are realized, because large real storage page
frames make virtual-storage-to-real-storage address translation more efficient.
If we take a look at some combinations of PGSTEAL and PGFIX
settings, you’ll see what I meant about there being essentially nothing in the
way of interdependency. Take PGSTEAL(LRU) and PGFIX(YES). Will the fact that
the pool’s buffers are fixed in memory have any impact on the level of buffer
stealing done by DB2 for this pool? NO! REGARDLESS of whether a pool’s buffers
are fixed in memory or not, DB2 will ALWAYS steal a buffer in the pool WHENEVER
IT HAS TO, and a buffer HAS TO BE STOLEN whenever all the pool’s buffers are
occupied (by table space and/or index pages) and a page has to be read into the
pool from disk (or from a group buffer pool in a coupling facility). The PGFIX(YES)
setting for this pool simply means that z/OS will not move any of the pool’s
buffers out of real storage to auxiliary storage – it in no way restricts DB2’s
ability to steal buffers so that pages not yet in the pool can be brought in
from disk (or from a coupling facility).
Change the combination of PGSTEAL and PGFIX settings to NONE
and NO, respectively, and again you’ll see that there is nothing in the way of
an interdependency. PGSTEAL(NONE) means that DB2 does not EXPECT to have to do
any buffer stealing for the pool, because there is an assumption that the
number of pages belonging to objects assigned to the pool does not exceed the quantity
of buffers in the pool (i.e., does not exceed the pool’s VPSIZE). Of course, if
the number of pages in objects assigned to the pool does exceed the
pool’s VPSIZE (perhaps VPSIZE was too small to begin with, or it was OK
initially but objects assigned to the pool have gotten larger), and a buffer
has to be stolen because all buffers are occupied and a page has to be brought
into the pool, a buffer WILL BE STOLEN because (as noted previously) a buffer
will ALWAYS BE STOLEN when a buffer HAS TO BE STOLEN (when PGSTEAL is set to
NONE for a pool and some buffer stealing has to be performed, it will be done
on a FIFO basis – first in, first out). What about the PGFIX(NO) setting for
this pool? Will the pool’s PGSTEAL(NONE) setting in any way restrict the
ability of z/OS to move buffers belonging to the pool out of real storage to
auxiliary storage, as necessary, to free up real storage page frames so that
new pages can be brought into real storage? NO! With PGFIX(NO) in effect, a
pool’s buffers are absolutely fair game for being paged out to auxiliary
storage, REGARDLESS of the pool’s PGSTEAL setting (that said, buffers in
PGFIX(NO) pools are not often paged out of real storage, because z/OS, as
noted, steals real storage page frames on a least-recently-used basis, and DB2
tends to “touch” its buffers with a frequency that makes them unlikely to be
among the least-recently-used pages in a z/OS LPAR’s real storage).
DB2, then, manages its buffers as it needs to (with some
guidance provided by you via a pool’s PGSTEAL setting), and z/OS manages its
real storage resource as it needs to (with the understanding that buffers in a
PGFIX(YES) pool are off-limits with regard to page-out actions), and the DBMS
and the OS attend to these respective tasks pretty much independently. PGSTEAL
settings do not impact z/OS page-frame-stealing, and PGFIX settings do not
impact DB2 buffet-stealing.
Now, early on in this post I mentioned that while there
essentially aren’t any interdependencies between the various settings of
PGSTEAL and PGFIX for a buffer pool, there is a recommendation I have
concerning a particular combination of PGSTEAL and PGFIX values. Here it is: if
you are going to specify PGSTEAL(NONE) for a buffer pool, specify PGFIX(YES)
for that same pool, unless the demand paging rate of the z/OS LPAR is higher
than it should be (if the demand paging rate – available by way of a z/OS
monitor – is in the low single digits or less per second, it’s OK). Why I make
this recommendation: presumably, you assign objects to a PGSTEAL(NONE) buffer
pool because you in fact want them to be cached in memory in their entirety. You
would do this, I imagine, for objects for which maximum performance is really
important. If that’s the case, why not really max out performance as it
pertains to accessing these objects? Make the pool PGFIX(YES) as well as
PGSTEAL(NONE), so that you can get the CPU efficiency benefit of large real
storage page frames (of course, to get that benefit you need to have
large page frames available in the LPAR to back the PGFIX(YES) pool –
information about that can be found in an entry I posted to this blog a fewmonths ago).
And there you have it. Just remember that PGSTEAL is related
to DB2’s management of its buffers, while PGFIX is related to z/OS’s management
of the LPAR’s real storage resource. Two different buffer pool configuration
settings, for two different aspects of DB2 performance.
Very good explanation. We do not yet have large page frames defined but are looking into that too. I have a small bufferpool that contains only small, high read only activity. I'm going to alter it to PGSTEAL(NONE). My question is: will db2 load the pages from the tablespaces after making this change or is there something else that needs to be done? Your comments say the pages will be loaded the first time the object is referenced after a db2 startup. What about if db2 has been up for quite a while already?
ReplyDeleteHi Robert. As an update to my previous question, I found that when I set PGSTEAL=NONE on a bufferpool, it does load all pages. The problem is that it loads all NACTIVE pages, not just the pages containing data. So all the preformatted NACTIVE pages that do not contain data are also loaded into the bufferpool. I've asked IBM about this but haven't gotten a response yet.
ReplyDeleteThanks for relating this experience, Russell (and apologies for taking so long to respond).
DeleteRobert
I know I'm late to this question - but do you have any recommendations for audit or statistics reporting to determine tables that would be good candidates for PGSTEAL(NONE)?
ReplyDeleteGenerally speaking, tracing that would provide data-access statistics at the table level would be pretty expensive, overhead-wise.
DeleteYou might want to make use of information in the real-time statistics tables in the Db2 catalog (SYSTABLESPACESTATS and SYSINDEXSPACESTATS). What you're looking for are objects that are not super-big (might not make sense to try to completely cache a really big table or index in memory), and that are accessed very frequently. NACTIVE in SYSTABLESPACESTATS and SYSINDEXSPACESTATS gives you an idea of an object's size. To get an idea as to frequency of access, check REORGSCANACCESS in SYSTABLESPACESTATS and REORGINDEXACCESS in SYSINDEXSPACESTATS.
Robert