Tuesday, June 26, 2018

Db2 for z/OS PBG Table Spaces: The Ins and Outs of Partition Addition

In an entry posted to this blog about three years ago, I advocated for putting some thought into the PBG versus PBR decision (partition-by-growth versus partition-by-range) for larger tables (for smaller tables, PBG is often the sensible choice, especially when the associated table space is likely to have only one partition). I pointed out in the aforementioned blog entry several advantages pertaining to PBR table spaces in a large-table context, but acknowledged as well the attractiveness of PBG table spaces from a DBA's perspective (easy to define, pretty easy to administer). In the entry I'm writing today, I want to shed some light on an aspect of PBG table spaces that is a) important, b) not often taken into account, and c) not very well understood by a lot of Db2 people. What am I talking about? Partition addition - in particular, the dynamic and automatic adding of a partition by Db2 to a PBG table space as needed to provide space for more table rows.

Now, you might be thinking, "I'm quite familiar with that mechanism. It's one of the features that I most like about PBG table spaces. When partition n of a PBG table space gets full, Db2 automatically adds partition n+1, to accommodate addition row-insert operations." The thing is, if that's your understanding of how PBG partition addition works then your understanding is a bit off the mark (aside from an exception I'll get to momentarily).

Here's the straight skinny: if a PBG table space has n partitions, Db2 will add partition n+1 not when partition n is full, but when the table space is full. Consider a PBG table space with 100 partitions. Partition 100 gets full. When will partition 101 be added? Partition 101 will be added when Db2 verifies that ALL of the table space's partitions are full: Db2 will check partitions prior to partition 100 - all 99 of them, if needs be - to see if space for a new row can be found. If no space is found in any of the table space's partitions, partition 101 will be added to hold a new row.

If receiving that bit of information has you felling as though you're about to break into a cold sweat, settle down. This doesn't necessarily mean that you've made a mistake in going the PBG route for a large table. Let's consider a couple of scenarios:

  • Table gets inserts, but no deletes. In this case, yes, Db2 will check all partitions of the table space before adding a new one to accommodate a new row, but that space-check will be really fast: with a quick look at the "end" of a partition (referring to the end of the data in the partition), Db2 can determine if there is space available for a new row.
  • Table has a continuously-ascending clustering key. With such a clustering key in effect (every new row has a clustering key value that is greater than that of any pre-existing row), a new row should always go to the end of the table. That can make APPEND YES a viable choice for the table, and when a table in a PBG table space is defined with APPEND YES then Db2 will NOT look in all of the table space's partitions before adding a new partition to accommodate a new row: if partition n is full then partition n+1 will be added, regardless of whether or not space for the row is available in a partition other than partition n (that's the exception to the check-all-partitions rule that I mentioned up in the second paragraph of this entry). Caveat: APPEND YES can lead to an insert "hot spot" at the end of the table, and that could be an issue in a high-volume insert situation. In that case, going with MEMBER CLUSTER for the PBG table space, in addition to APPEND YES for the table, could enable achievement of the desired level of insert throughput.

What if you have a situation that is not like either of the two scenarios described above? What if you have a PBG table space that has, or is likely to eventually have, a large number of partitions? Should you be anxious about the possibly negative performance impact of an all-partition search for space that would precede dynamic addition of a new partition to a PBG table space? I think maybe not. Below I've provided some points to ponder. One or more of them may assuage your worries.

  • Partition size can be a mitigating factor. You have, of course, a number of options with regard to the DSSIZE specification for a PBG table space (DSSIZE indicates the size of the table space's partitions). Why do I bring this up? Because, for a given volume of data rows, a larger DSSIZE means fewer partitions, and as a general rule the backward search for space in previous partitions that precedes the automatic adding of a new partition to accommodate more data will be accomplished more quickly if there are fewer larger partitions versus a greater number of smaller partitions. Now, before jumping to the conclusion that DSSIZE 256G (the maximum value for a PBG table space) is the way to go, consider that there can be challenges in managing larger data sets. That said, it could be that a specification of DSSIZE 32G or 64G could be a better choice than 2G or 4G for some of your larger PGB table spaces.
  • Online REORG can add a partition to a PBG table space before it's needed to accommodate new rows. You may already be aware that a REORG of an entire PBG table space can result in a new partition (or partitions) being added to the end of the table space - that will happen if the reestablishment of free space (e.g., as specified through the table space's PCTFREE value) via REORG requires additional space beyond that in the table space's existing partitions. What you might not know is that Db2 12 for z/OS added a nice enhancement pertaining to REORG of a partition (or partitions) of a PBG table space. The background: prior to Db2 12, if you executed a partition-level REORG for a PBG table space, and the data in a partition would not fit in the partition with free space reestablished by REORG, the utility execution would fail (that's why the ZPARM parameter REORG_IGNORE_FREESPACE was introduced - you could set that parameter's value to YES to cause Db2 to use 0 for PCTFREE and FREEPAGE for a partition-level REORG job). With Db2 12, if a partition of a PBG table space is REORGed and the partition's data won't fit in the partition with free space reestablished, Db2 will add a partition to the table space to hold the overflow rows - the utility job won't fail (and the REORG_IGNORE_FREESPACE parameter is no longer part of the ZPARM module).
  • You can "pre-allocate" some partitions at CREATE TABLESPACE time. Do you think that the NUMPARTS option of CREATE TABLESPACE is only applicable to PBR (partition-by-range) table spaces? If so, time to update your thinking. When you create a PBG table space, NUMPARTS (if specified) tells Db2 how many partitions to initially create for the table space (unless you also specified DEFINE NO in the CREATE TABLESPACE statement); so, with a few partitions available from the get-go, it could be a while before a space constraint would induce Db2 to automatically add another partition for a PBG table space.
  • You can add the "next" partition yourself, with a SQL statement. If you have a situation in which a PBG is approaching "full" status and you're concerned about an "exhaustive backwards partition space search" that would precede automatic addition of a new partition, you can preclude that space search by simply adding the next partition yourself. How would you do that? With an ALTER TABLE ADD PARTITION statement, of course. Oh, you were under the impression that that statement could only be used with a PBR table space? Join the club - I thought that myself, and for quite some time after it was no longer true (ALTER TABLE ADD PARTITION became applicable for PBG table spaces starting with Db2 10 for z/OS). Anyway, you could monitor (using Db2 real-time statistics information) the full-ness of a PBG (or of the last partition of a PBG - particularly in the case of a continuously-ascending partitioning key), and if it looks as though a new partition might soon be needed, add it yourself.

OK, to sum up: if you had assumed that a new partition got added to a PBG table space when the last partition could no longer accommodate a row insert, you need to set that assumption aside (except for the previously mentioned APPEND YES scenario) and understand that an exhaustive backwards search through preceding partitions - potentially through ALL preceding partitions - for row-insert space will occur before a new partition will be automatically added to the table space. Keep in mind that in some cases (especially when there are fewer larger partitions versus lots of smaller ones, and particularly when the table in question hasn't seen row-delete action), that space search will conclude quite rapidly. When the PBG table space has a large number of partitions, and when there have been DELETE as well as INSERT operations targeting the table in the table space, the exhaustive space search preceding automatic addition of a new partition could take longer to complete. In such situations, keep in mind that REORGs (of the entire table space, or - with Db2 12 - of one or more partitions of the table space) can result in added partitions that can hold new rows. Also remember that NUMPARTS can be used to "pre-allocate" partitions at CREATE TABLESPACE time, and that ALTER TABLE ADD PARTITION is a means by which you can add a new partition to a PBG table space at a time of your choosing.

Can PBG table spaces still provide ease-of-use benefits for DBAs, even for very large (i.e., multi-partition) tables? Sure they can. It's just that they are not quite so "set-it-and-forget-it" as you might have thought. With a clear-eyed understanding of how PBG partition addition works, you can take the steps needed to make PBG table spaces work well for you.