Robert's Db2 blog: DB2 for z/OS Data Sharing: The Evolution of a GBP Sizing Formula

Sunday, July 21, 2013

DB2 for z/OS Data Sharing: The Evolution of a GBP Sizing Formula

Recently I had an e-mail exchange with my friend, colleague, and fellow DB2 blogger Willie Favero, on the subject of DB2 for z/OS group buffer pool sizing. One of my messages was a little wordy (imagine that), and Willie responded with, "[This] reads like it could be your next blog post." And so it is. In this entry I'll share with you what I think is an interesting story, along with some information that I hope will help you with group buffer pool monitoring and tuning.

For those of you who aren't familiar with the term "group buffer pool," a bit of introduction: DB2 data sharing is technology that enables multiple DB2 for z/OS subsystems to share concurrent read/write access to a single database. The DB2 subsystems are members of a data sharing group that runs on a cluster of z/OS systems called a Parallel Sysplex. Within the Parallel Sysplex are resources known as coupling facilities. These are essentially shared memory devices. Within a coupling facility one would find several structures, and among these would be the DB2 group buffer pools, or GBPs. Basically, group buffer pools are used for two things:

Page registration -- When a DB2 member subsystem reads into a local buffer pool a page belonging to a GBP-dependent data set (i.e., a table space or index -- or a partition of a partitioned table space or index -- in which there is inter-DB2 read/write interest), it registers that page in a directory entry of the corresponding GBP (so, a page read into buffer pool BP1 would be registered in GBP1). DB2 member X registers a locally cached page of a GBP-dependent data set so that it can be informed if that page is changed by a process running on DB2 member Y. Such a change effected on another member of the data sharing group would cause the copy of the page cached locally in a buffer pool of DB2 member X to be marked invalid. On re-referencing the page, DB2 member X would see that its copy of the page is invalid, and would then look for the current version of the page in the associated GBP (and would retrieve that current version from disk if it were not found in the GBP).
Caching of changed pages -- If a process running on DB2 member Y of the data sharing group changes a page belonging to a GBP-dependent data set, that member will write the changed page to the associated GBP in a coupling facility LPAR (usually at commit time, but sometimes before a commit). The changed page will remain in the GBP -- from which it can be retrieved by a member DB2 subsystem in a few microseconds -- for some time following the update. Eventually the page will be overwritten by another changed page, but before that happens the GBP data entry occupied by the page will be made stealable via a process called castout, through which changed pages written to a GBP are externalized to disk.
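If it helps to see those two roles in motion, here is a toy Python sketch of page registration and changed-page caching. It is only an illustration of the flow described above, not a model of how DB2 or the coupling facility is implemented: the class names (ToyGroupBufferPool, ToyMember), the dictionary used as "disk," and the version strings are all made up for the example, and details such as GBP-dependency tracking, commit timing, and castout are left out.

```python
class ToyGroupBufferPool:
    """Toy stand-in for one GBP: a page directory plus cached changed pages."""

    def __init__(self):
        self.directory = {}     # page id -> set of members that registered the page
        self.data_entries = {}  # page id -> most recent changed version of the page

    def register(self, member, page):
        self.directory.setdefault(page, set()).add(member)

    def write_changed_page(self, writer, page, version):
        # Cache the changed page, then mark other members' local copies invalid.
        self.data_entries[page] = version
        for member in self.directory.get(page, set()):
            if member is not writer:
                member.local[page]["valid"] = False


class ToyMember:
    """Toy stand-in for one DB2 member with a single local buffer pool."""

    def __init__(self, name, gbp):
        self.name = name
        self.gbp = gbp
        self.local = {}  # page id -> {"version": ..., "valid": bool}

    def read(self, page, disk):
        cached = self.local.get(page)
        if cached is not None and cached["valid"]:
            return cached["version"]
        # Missing or invalid locally: look in the GBP first, then fall back to disk.
        if page in self.gbp.data_entries:
            version = self.gbp.data_entries[page]
        else:
            version = disk[page]
        self.local[page] = {"version": version, "valid": True}
        self.gbp.register(self, page)
        return version

    def update(self, page, version):
        self.local[page] = {"version": version, "valid": True}
        self.gbp.register(self, page)
        self.gbp.write_changed_page(self, page, version)


disk = {"P1": "v0"}
gbp1 = ToyGroupBufferPool()
member_x = ToyMember("X", gbp1)
member_y = ToyMember("Y", gbp1)

first_read = member_x.read("P1", disk)             # X caches and registers the page
member_y.read("P1", disk)
member_y.update("P1", "v1")                        # Y changes the page
x_copy_invalid = not member_x.local["P1"]["valid"]
reread = member_x.read("P1", disk)                 # X picks up the current version from the GBP
```

In this toy run, member X first sees "v0", has its local copy marked invalid when member Y changes the page, and gets "v1" from the group buffer pool on re-reference.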

OK, now for the story: back in the mid-1990s, when data sharing was introduced with DB2 Version 4 for z/OS, an organization that was one of the early adopters of the technology wanted to know how large a GBP ought to be. I was working at the time in IBM's DB2 for z/OS national technical support group, and I fielded this question and thought that "as large as possible" was an unacceptably imprecise answer; so, I set about trying to determine the right size for a GBP in a quantitative way. I focused on GBPs with 4K-sized data entries because buffer pools that hold 4K-sized pages were (and to a somewhat lesser extent still are) dominant versus buffer pools that hold 8K, 16K, or 32K-sized pages. Further, I assumed that the default ratio of five GBP directory entries for every one data entry would be in effect. I also knew that a GBP directory entry occupied about 200 bytes of space (that was then -- I'll get to the current size of a GBP directory entry in a bit).

The last piece of the puzzle was a GBP sizing objective. A GBP sized right would be beneficial because... what? That "what," I decided, should be 1) avoidance of GBP write failures due to lack of storage (something that can occur if a GBP is so small that casting out of changed pages to disk -- required to make GBP data entries stealable -- cannot keep up with the volume of writes of changed pages to the GBP) and 2) avoidance of GBP directory entry reclaims (if a page must be registered in a GBP but all of that GBP's directory entries are in use, a directory entry has to be reclaimed, and when that happens the copies of the page cached in local buffer pools of member DB2 subsystems have to be preemptively marked invalid). Knowing that GBP write failures due to lack of storage are most likely to occur for a way-undersized GBP, I decided to concentrate on determining a GBP size that would virtually eliminate the possibility of directory entry reclaims, my thinking being that a GBP so sized would also be large enough to avoid GBP write failures due to lack of storage (and that has proven to be the case, in my experience).

How, then, to avoid GBP directory entry reclaims? I realized that the key to accomplishing that goal was having at least as many GBP directory entries as there were "slots" into which different pages of table spaces and indexes could be cached. These slots would be the buffers of local pools, PLUS the data entries in the corresponding GBP. If, for example, a data sharing group had two member DB2 subsystems, and if BP1 on each member had 30,000 buffers, you'd have 60,000 page slots right there. 60,000 directory entries in GBP1 would cover that base, right? Yeah, but with the standard 5:1 ratio of GBP directory entries to data entries, those 60,000 directory entries would bring along 12,000 data entries in GBP1. That's 12,000 additional slots in which different pages could be cached, so you'd need 12,000 more directory entries to eliminate the possibility of directory entry reclaims. Well, those 12,000 GBP directory entries would bring along another 2400 GBP data entries (with the aforementioned 5:1 ratio in effect). Providing 2400 more directory entries to account for those page slots would give you 480 more GBP data entries, so you'd need 480 additional directory entries, and so on. I came up with a formula into which I plugged numerical values (200 bytes for a directory entry, 4K for a data entry, 5 directory entries for every 1 data entry), and saw that I had a converging sequence. That sequence converged to approximately 0.3125, meaning that combining the size (in megabytes) of local BPn buffer pools and multiplying that figure by 0.3125 would give you a size for GBPn that would virtually ensure zeroes for directory entry reclaims and for GBP write failures due to lack of storage (something that could be verified using the output of the command -DISPLAY GROUPBUFFERPOOL(GBPn) GDETAIL(*)).
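For readers who like to see a converging sequence actually converge, here is a minimal Python sketch of that calculation, worked per local buffer. The function and variable names are mine, and I am assuming that "4K" is treated as a round 4000 bytes, which is what reproduces 0.3125 exactly (with 4096 bytes the result comes out a shade lower, at about 0.311).

```python
DIR_ENTRY_BYTES = 200     # approximate directory entry size in the mid-1990s
DATA_ENTRY_BYTES = 4000   # assumption: "4K" rounded to 4000 bytes
RATIO = 5                 # default directory-to-data-entry ratio


def sizing_factor(dir_entry_bytes=DIR_ENTRY_BYTES, rounds=25):
    """Return (final factor, factor after each round) per byte of local buffer pool."""
    dir_entries = 0.0   # directory entries needed per local buffer
    new_slots = 1.0     # page slots not yet covered by a directory entry
    history = []
    for _ in range(rounds):
        dir_entries += new_slots        # cover the uncovered slots
        new_slots = new_slots / RATIO   # those entries bring along new data entries
        data_entries = dir_entries / RATIO
        gbp_bytes = dir_entries * dir_entry_bytes + data_entries * DATA_ENTRY_BYTES
        history.append(gbp_bytes / DATA_ENTRY_BYTES)
    return history[-1], history


factor, steps = sizing_factor()
for round_number, value in enumerate(steps[:6], start=1):
    print(f"round {round_number}: factor = {value:.5f}")
print(f"converged factor = {factor:.4f}")
```

The first few rounds climb from 0.25 toward 0.3125 quickly, which is why carrying the sequence out only a handful of terms is enough in practice.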

Nice, but I was concerned that 0.3125 was a number that people would not be able to readily bring to mind, and I really wanted a GBP sizing formula that a person could carry around in his or her head; so, I nudged 0.3125 up to 0.33, and thus arrived at the GBP sizing rule of thumb that came to be known as "add 'em up and divide by 3" (example: given a 4-way DB2 data sharing group, with a BP3 sized at 30,000 buffers on each member, and the default 5:1 ratio of directory entries to data entries in the GBP, a good initial size for GBP3 would be one-third of the combined size of the BP3 pools, or (30,000 X 4KB X 4) / 3, which is 160 MB). That formula was widely and successfully used at DB2 data sharing sites, and it continued to work well even as the size of a GBP directory entry increased (with newer versions of coupling facility control code). Now that the size of a directory entry is about 400 bytes, "add 'em up and divide by 3" has become "add 'em up and multiply by 0.375." So, given that prior example of a 4-way data sharing group with BP3 sized at 30,000 buffers on each member, a GBP3 size that would virtually eliminate the possibility of directory entry reclaims (because it would give you, with the default 5:1 directory-to-data entry ratio, a number of directory entries about equal to the number of "slots" into which different pages could be cached in the local BP3 buffer pools and in GBP3) would be:

(30,000 X 4KB X 4) X 0.375 = 180 MB
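The rule of thumb is simple enough to do in your head, but here is a short Python sketch of it for anyone who wants to drop it into a script. The function name and parameters are hypothetical, and the sketch uses 1 MB = 1000 KB to match the round numbers used in this entry.

```python
def rule_of_thumb_gbp_mb(buffers_per_member, members, page_kb=4, factor=0.375):
    """GBP size in MB: combined local buffer pool size times a sizing factor.

    Assumes the default 5:1 directory-to-data-entry ratio that the factor is based on.
    """
    combined_local_mb = buffers_per_member * page_kb * members / 1000
    return combined_local_mb * factor


# 4-way group, BP3 with 30,000 4K buffers on each member
original_rule_mb = rule_of_thumb_gbp_mb(30_000, 4, factor=1 / 3)  # "divide by 3": about 160 MB
current_rule_mb = rule_of_thumb_gbp_mb(30_000, 4)                 # "multiply by 0.375": 180 MB
```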

Now, I've mentioned multiple times that the formula I derived was based on two assumptions: a local buffer pool with 4K-sized buffers, and a GBP with five directory entries for every one data entry. What about buffer pools with larger buffer sizes (e.g., 8K or 16K)? What about GBP directory-to-data-entry ratios (which can be changed via the -ALTER GROUPBUFFERPOOL command) other than the default 5:1? Regardless of the size of the buffers and the directory-to-data-entry ratio in effect for a GBP, the basic goals remain the same: avoid directory entry reclaims and avoid write failures due to lack of storage. Because accomplishing the first of these goals is likely to result in achievement of the second objective as well, you focus on sizing a GBP to eliminate directory entry reclaims. Directory entries won't be reclaimed if there is at least one directory entry for every "slot" in which a different page could be cached (i.e., you need to have a number of directory entries that is at least as great as the number of buffers in the local buffer pools PLUS the number of data entries in the GBP).
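That condition can also be solved directly rather than term by term. If N is the total number of local buffers, D the number of GBP directory entries, and r the directory-to-data-entry ratio, then D must be at least N + D/r, which rearranges to D >= N x r / (r - 1). Here is a hedged Python sketch of that shortcut; the function names are mine, it assumes an integer ratio greater than 1, and it treats the ratio as applying exactly, which is an idealization.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def min_directory_entries(total_local_buffers, ratio):
    """Smallest D such that D >= total_local_buffers + ceil(D / ratio)."""
    # Closed-form starting point: D >= N * ratio / (ratio - 1).
    d = ceil_div(total_local_buffers * ratio, ratio - 1)
    # Data entries come in whole units, so nudge upward until the condition holds.
    while d < total_local_buffers + ceil_div(d, ratio):
        d += 1
    return d


# The earlier example: two members, 30,000 buffers each, default 5:1 ratio
directory_entries = min_directory_entries(60_000, 5)
data_entries = ceil_div(directory_entries, 5)
```

For the 60,000-buffer example this gives 75,000 directory entries and 15,000 data entries, and 75,000 is exactly 60,000 local buffers plus 15,000 GBP data entries.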

To see how this GBP sizing approach can be used, suppose that you have a 2-way DB2 data sharing group, with a BP16K1 that has 10,000 buffers on each member. Suppose further that you have reason to want to use a 7:1 directory-to-data-entry ratio for GBP16K1 instead of the default 5:1 ratio. In that case, how should GBP16K1 be sized so as to prevent directory entry reclaims? Well, to begin with, you know that you'll need a directory entry in GBP16K1 for every local BP16K1 buffer. That's 20,000 directory entries. Given the 7:1 directory-to-data-entry ratio in effect for GBP16K1, along with the 20,000 GBP directory entries you'll have 2858 data entries (20,000 divided by 7, and rounded up to the nearest integer value). To cover those additional 2858 page "slots," you'll need another 2858 directory entries in the GBP. Again with the 7:1 directory-to-data-entry ratio in mind, the 2858 additional directory entries will come with another 409 data entries (2858 divided by 7, rounded up). You'll need another 409 directory entries to cover those 409 data entries, but that means an additional 59 data entries. The 59 directory entries needed to cover those data entries will bring along 9 more data entries, and the 9 directory entries needed to cover those data entries will mean 2 more data entries, for which you'll need 2 more directory entries. Add one more data entry to go along with those last 2 directory entries (2 divided by 7, rounded up), and you've taken the sequence far enough.

Now, just add up the directory entries you got with each iteration of the sequence, starting with the initial 20,000:

20,000 + 2858 + 409 + 59 + 9 + 2 = 23,337

Divide that figure by 7 (and round up to the nearest integer) to get a number of data entries for the GBP:

23,337 / 7 = 3333.86, or 3334 when rounded up

Now you can size GBP16K1 by multiplying the number of directory entries by 400 bytes and the number of data entries by 16KB:

(23,337 X 400 bytes) + (3334 X 16KB) = approximately 62.7 MB

I'd round that up to 63 MB for good measure. And there you have it: a GBP16K1 sized at 63 MB (given our use of a 7:1 directory-to-data-entry ratio for this GBP) should result in your seeing zero directory entry reclaims for the GBP (and zero write failures due to lack of storage, to boot). As mentioned previously, you can check the directory entry reclaim and write failure values using the output of the DB2 command (using GBP16K1 in this case) -DISPLAY GROUPBUFFERPOOL(GBP16K1) GDETAIL(*). Dealing with a different buffer size? A different GBP directory-to-data-entry ratio? Doesn't matter. You can use the approach laid out above for any buffer size and any GBP directory-to-data-entry ratio.
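If you would rather let a script grind through the iterations, here is a Python sketch that reproduces the worked example above. The function names are hypothetical. Two assumptions are worth calling out: the loop stops once the next term would be 1 (rounding up never reaches zero, so the sequence has to be cut off somewhere, and this is where the example above stops), and sizes are computed with 1 KB = 1000 bytes and 1 MB = 1,000,000 bytes, which is what yields 62.7 MB.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def directory_entry_terms(total_local_buffers, ratio):
    """Terms of the sequence: each batch of directory entries brings more data entries."""
    terms = []
    term = total_local_buffers
    while term > 1:  # assumption: stop where the worked example stops
        terms.append(term)
        term = ceil_div(term, ratio)
    return terms


def size_gbp_mb(total_local_buffers, ratio, page_kb, dir_entry_bytes=400):
    """Return (directory entries, data entries, GBP size in decimal MB)."""
    directory_entries = sum(directory_entry_terms(total_local_buffers, ratio))
    data_entries = ceil_div(directory_entries, ratio)
    size_bytes = directory_entries * dir_entry_bytes + data_entries * page_kb * 1000
    return directory_entries, data_entries, size_bytes / 1_000_000


# 2-way group, BP16K1 with 10,000 buffers per member, 7:1 ratio
terms = directory_entry_terms(20_000, 7)
dir_count, data_count, gbp_mb = size_gbp_mb(20_000, 7, page_kb=16)
print(terms, dir_count, data_count, round(gbp_mb, 1))
```

Swap in a different buffer count, ratio, or page size to size another GBP the same way.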

I hope that this information will be useful to you in right-sizing your group buffer pools, if you're running DB2 for z/OS in data sharing mode (if you're not, consider the use of this technology at your site -- you can't beat it for availability and scalability).

29 comments:

troycoleman, July 22, 2013 at 2:45 PM
Great post.. thank you for sharing.
Srini, August 13, 2013 at 3:45 AM
Thank you for sharing ..
Unknown, April 15, 2014 at 8:34 AM
Robert, this is great.. have a couple of questions though.

1. Is a data page read by a DB2 member registered in a directory entry in the GBP always, or only when it is GBP-dependent? I was thinking always.

2. What's the typical size of a directory entry? Is it 200 KB or 432 KB?
Sven Groth, May 14, 2014 at 2:32 AM
Thanks for the interesting information. Well, I can't follow all of the points. So you're expecting that, if each member has a local buffer pool, the group buffer pool in the coupling facility should account for the sum of all the matching local pools. So the fiction is that every page could be group buffer pool dependent. Sure, it could be, but in practice it never will be. Also, it depends on the applications that are running. In our installation, with up to 6 members in a data sharing group, I defined a trickle size.
Interesting for me was also the comparison between the display groupbufferpool command output and the output from the smf PM Report. In case we have a view on the 32K Groupbuffer you will see that SMF is thinking in 4096 KSize so dividing the figure for the directory ent. with 32 and multiply it with 4 (4KPage) will result in the figure for entries in the display command.
icecube, July 4, 2014 at 8:55 AM
Is it possible to use a directory-to-data ratio of 1:1? If yes, what would the formula be for calculating the size of the group buffer pool?
Don Treadway, September 24, 2014 at 1:02 PM
I'm starting to review our GBP sizing now, and am wondering how you determine the appropriate entry to page ratio. What factors would prompt you to change from the apparent 5:1 default to a 7:1 you use in one example?

Thanks a bunch for all this information, by the way!
Anonymous, November 6, 2014 at 9:05 AM
Thank you for sharing. Please share your mail to provide the RATIO details.
Unknown, January 21, 2015 at 1:46 PM
Hi Robert,
Thank you for your article. Interesting approach in calculating the GBP size.
However, I don't see how your approach reflects the write activity.

The formula for calculating the number of data entries, provided by IBM, is as follows:
Data Entries = U * D * R
where U is the degree of sharing (from 0.5 to 1),
D is the number of pages written per second for all members during peak activity,
and R is the average page residency time (in seconds).
Then DATA portion (MB) = Data Entries * Page size (KB) / 1024
This data portion is the largest component in calculating total GBP size (GBP Size = Data + Directory).

In this formula, the total size of the local buffer pools is not even considered.
So, for example, using a simplified formula (your approach is close to this) to calculate GBP size:
Local BP = 50,000 pages; Page size = 4 KB; Number of sharing members = 4; Medium sharing = 30%
GBP Size = (50,000 X 4 members) X 4 KB page size X 0.3 = 240 MB

However, if we consider write activity, and if this activity is pretty heavy - let's say 10,000 pages per second (I took this number from my installation - sometimes we have even higher numbers), the formula for calculating the DATA portion will look like this:

Data Entries = 0.7 (medium sharing) X 10,000 X 60 sec (default page residency time) = 420,000, and 420,000 X 4 / 1024 = 1,640 MB - and that's not even considering directory entries.

And this is a much bigger value than the one from the simplified formula.
I think write activity has a much bigger impact on GBP sizing, and should be taken into consideration.

Your comments would be appreciated .

Regards Ilya
Robert, August 11, 2015 at 7:07 PM
NOTE: I updated this entry on August 11, 2015, to reflect the fact that the current size of a group buffer pool directory entry is about 400 bytes. I had stated that the size of a GBP directory entry had grown to about 250 bytes from the original size of roughly 200 bytes. That was true for a while, but the GBP directory entry size continued to grow, so that it is now about 400 bytes. I adjusted calculations and formulas in the blog entry accordingly.

Robert
ndtdb2, November 24, 2016 at 5:32 AM
Hi Robert,
I compare the result from your formula with the calculated size by CFsizer.
First, CF Sizer doesn't take in account the directory to data ratio.
For a total vpsize of 865K pages, CF Sizer gives an initsize of 839Mb. And with your formula, i get a size of 206836 Kb, (with ratio of 9:1) , which is nearly 1/4 of the CFsizer provided size.
Can you comment on this difference (i've tried the default ratio of 5:1 and get 327666 with your formula which is still far from the CFsizer result)
Thank you very much
Duc
Paul B, November 14, 2018 at 7:19 AM
Hi Robert,
Is this formula still valid under DB2 11 and 12, zOS release 2.02? Have seen many old posts (older than this one) on CF sizing, but nothing recent. Need a reality check. Thanks, Paul
Sebastian, July 7, 2020 at 9:15 AM
Hi Robert,

I have an interesting problem here I wanted to share. We have two way datasharing and all 4KB tablespaces are assigned to BP1 and indexes to BP2 - both are 15GB ( 3700000 4KB buffers)

I have found that the directory-to-data ratio for the GBP2 buffer pool is 159, leaving too few data entries in GBP2. Hence, the GBP2 SYN READ HIT RATIO is about 44%.

CURRENT DIRECTORY TO DATA RATIO = 159

PENDING DIRECTORY TO DATA RATIO = 159

GBP SYN.READ(XI) HIT RATIO(%) 44.00

SYN.READ(XI)-DATA RETURNED 3220.1K

SYN.READ(XI)-NO DATA RETURN 4098.1K

STRNAME: DSNDBPG_GBP2

POLICY INFORMATION:

POLICY SIZE : 12000 M

POLICY INITSIZE: 10000 M

POLICY MINSIZE : 10000 M

FULLTHRESHOLD : 90

ALLOWAUTOALT : YES

When I display the GBP2 information, it shows me that there are 159 times as many directory entries as data entries.

DSNB759I # NUMBER OF DIRECTORY ENTRIES = 25358035

NUMBER OF DATA PAGES = 159484

What strikes me is that the number of entries really used in the structure is 17% of the total. That makes me think we don't need as many directory entries as were automatically derived by the CF.

STORAGE CONFIGURATION   ALLOCATED   MAXIMUM   %
ACTUAL SIZE:            10000 M     12000 M   83

SPACE USAGE   IN-USE    TOTAL      %    CHANGED   %
ENTRIES:      4419347   25358035   17   27024     0
ELEMENTS:     158588    159484     99   27024     16

I'm going to change the ratio to 10 and let the CF adjust it again if there are too few directory entries after the change.

-ALTER GROUPBUFFERPOOL (GBP2) RATIO (10)

Have you ever come across a similar issue?

Regards
Sebastian
Robert, July 7, 2020 at 8:19 PM
Hello, Sebastian

First of all, consider whether this situation is one about which you should be concerned. You indicate that the XI read hit ratio for GBP2 is 44%. That is pretty low, but do you care? That should depend, I think, on the rate of synchronous reads due to XI, in terms of the number of sync read XI requests you see per second for GBP2. If the rate of sync read XI requests (requests with data returned + requests with data not returned, divided by seconds in reporting interval) is less than 10 per second, I would not be concerned about a relatively low XI read hit ratio - it's just not making much of a difference in the system's performance.

If the rate of sync read XI requests is greater than 10 per second then yes, it would be good to bring that down; however, you should be careful about reducing the directory-to-data entry ratio based on the percentage of directory entries that appear to be in-use. That figure likely indicates the number of directory entries that are associated with pages of GBP-dependent objects that are currently cached in the local buffer pool of at least one member of the data sharing group. Other directory entries are ready to accommodate new page registrations, which can come in big surges.

If you really want to reduce the directory-to-data ratio for the GBP, it would be best, I think, to also increase the size of the GBP (if available CF LPAR memory would accommodate a size increase). In any case, try to avoid directory entry reclaim activity, as that is a drag on performance.

Robert
Sebastian, July 10, 2020 at 7:50 AM
Thanks Robert. The sync XI request rate is above 90/s. The directory entry utilisation in GBP2 hasn't exceeded 20%, so I'm going to bring the ratio down in a few steps and keep an eye on this.
Anonymous, March 12, 2024 at 8:28 AM
Hi Robert. Recently I was asked to increase the ratio on one of our group buffer pools from 5 to 100. I did that, but the change is still pending and has been for over a month now. Is there a way to manually force the change? We couldn't find anything in the documentation regarding this.
Anonymous, March 12, 2024 at 11:24 AM
Read the manual carefully, there is a section « When new values take … » in the command reference manual for ALTER GBP !
Anonymous, September 30, 2024 at 9:07 AM
Hi Robert, Any update to the formula with recent z16 hardware and latest CF levels ? Any change in the size of directory entry and hence to the formula?