Friday, August 28, 2020

Db2 12 for z/OS RPN Table Spaces: New Possibilities for Range-Partitioned Tables

Sometimes, I assume that I've blogged about some Db2 for z/OS-related topic when in fact I have not. RPN table spaces is one such topic. In thinking about something to post this month, I briefly considered RPN table spaces and thought, "Nah. I've probably already blogged on that subject, given that it's a big deal." Lo and behold, I haven't blogged about RPN table spaces. I've spoken of them numerous times in presentations over the past few years, but written nothing about them in this blog (save for a brief mention in a 2017 response to a comment associated with an entry on universal table spaces). Time to rectify that situation.

RPN table spaces (I'll explain the acronym in a moment) were introduced with Db2 12 for z/OS (with function level 500 or higher activated). RPN is essentially a new type of universal range-partitioned table space. Before getting into the details, I'll provide a few examples of gripes and concerns that Db2 DBAs have regarding traditional universal partition-by-range (PBR) table spaces:

  • "I have a table in a PBR table space that is really big and getting bigger. I'm starting to sweat about only being able to put 16 TB of data in the table. Yeah, I could get up to 128 TB of data in the table if I went with a page size of 32K, but we had good reason to go with 4K pages for this table space."
  • "I don't like the fact that my choice for DSSIZE for a PBR table space impacts the maximum number of partitions for the table space. 256G is the DSSIZE we want for a particular PBR table space, but the limit on partitions for the table space would then be 512 - and that only if we go with 32K for the page size (if we go with 4K for the page size and 256G for DSSIZE, I can only have 64 partitions)."
  • "Ugh. I have a big table with 1000 partitions in a PBR table space. DSSIZE is 8G, and that has been just fine for 999 of the partitions. One partition is about to hit the 8G size limit. I'm going to have to go to 16G for DSSIZE for THE WHOLE TABLE SPACE, even though ONLY ONE PARTITION NEEDS THE LARGER DSSIZE VALUE."

The RPN table space feature of Db2 12 addresses all of those issues, and more. It does that by way of a very important change versus traditional PBR table spaces. This change pertains to the numbering of pages in a range-partitioned table space. In a traditional PBR table space, every page of the table space has a unique number. In an RPN table space, we number the pages in partition 1, and then for partition 2 we start over again with regard to numbering pages. For partition 3, same thing - start over again with page numbering. Same for partition 4, partition 5, and so on. What this means: if I have an RPN table space with 1000 partitions, the first page page in each and every partition has the same page number (referring to the number stored in the data page itself). You read that right: in the table space, there will be 1000 first-in-the-partition pages that all have the same page number. There will also be 1000 second-in-the-partition pages that all have the same page number (assuming that each partition has at least 2 pages). Does this sound like something that would lead to chaos? In fact there is no chaos because in an RPN table space it is still very much possible to uniquely identify a page. The twist, versus a traditional PBR table space, is that in an RPN table space a page is uniquely identified by the combination of its page number and its partition number. That page numbering scheme is called relative page numbering, and that gives us the acronym RPN (the type of page numbering used for a traditional PBR table space is called absolute page numbering).

OK, so what can I do with an RPN table space that I can't do with a traditional PBR table space? Plenty:

  • You can put up to 4 petabytes (that's 4096 terabytes) of data into one table in an RPN table space, and that's true for any page size you choose for the table space.
  • Row-capacity-wise, you can put up to 280 trillion rows (yes, that's "trillion," with a "t") into one table in an RPN table space, if you use 4K pages for the table space (if you go with 32K pages, you can put "only" 35 trillion rows into a table in an RPN table space).
  • With an RPN table space, you can have up to 4096 partitions of any allowable size - in other words, the limit on the number of table space partitions does not go down as DSSIZE goes up; AND, the maximum DSSIZE is 1024G (versus 256G for a traditional PBR table space); AND, DSSIZE for an RPN table space can be nG, with "n" being any integer between 1 and 1024 (in other words, "n" no longer has to be a power of 2, as is the case for a traditional PBR table space - if you want to go with a DSSIZE of, for example, 131G for an RPN table space, go for it).
  • Different partitions of one RPN table space can have different DSSIZE specifications. Consider the scenario I described previously, in which a DSSIZE of 8G is fine for all but one partition in a 1000-partition table space. Let's say the one partition about to hit the 8G size limit is partition 327, and a DBA wants to take DSSIZE for that partition to 17G. With an RPN table space (which makes 17G a valid value for DSSIZE, as noted above), the DBA alters only partition 327 to have DSSIZE 17G; and, get this: that ALTER is an immediate change - partition 327 can immediately get larger than its former 8G DSSIZE, without even a REORG of the partition required to put the DSSIZE change into effect (if the DBA had altered DSSIZE for the partition to a value smaller than the previous value, it would be a pending change and a REORG of the partition would be required to put the smaller DSSIZE value into effect).
  • The DSSIZE flexibility provided by an RPN table space extends to partitioned indexes defined on the table in the table space: DSSIZE for partitions of such an index can be nG, with "n" being any integer between 1 and 1024, and different DSSIZE values can be used for different partitions of an index. Speaking of indexes, note that an index on a table in an RPN table space will be slightly larger than an index defined on the same table in a traditional PBR table space. That is so because the size of a RID goes from 5 bytes to 7 bytes when data is in an RPN table space versus a traditional PBR table space.

That's some good stuff, eh? Wanna know how you can have RPN table spaces in your Db2 12 system? I'll tell you. First, for a new table space, in the CREATE TABLESPACE statement you can include the option PAGENUM RELATIVE. You can also create an RPN table space without even specifying PAGENUM RELATIVE. How? By setting the new (with Db2 12) ZPARM parameter PAGESET_PAGENUM to RELATIVE. Doing that will make RPN the default for a new universal PBR table space created in the system (whether PAGESET_PAGENUM is set to RELATIVE or to its default value of ABSOLUTE, that specification can be overridden at the individual table space level via use of the PAGENUM option of CREATE TABLESPACE).

How about an existing universal PBR table space of the traditional variety? Can it be changed to an RPN table space? Yep - just ALTER the table space with PAGENUM RELATIVE, and then do an online REORG of the table space to materialize the change (yes, you have to REORG the table space in its entirety to get to RPN, but the good news is that once the conversion is done you should have fewer situations that would necessitate a REORG of the whole table space - that's part of the RPN value proposition).

Can a "classic" range-partitioned table space be converted to RPN ("classic" referring to a table space that is not universal and is associated with a table that uses table-controlled partitioning versus index-controlled partitioning)? Yes. First, you ALTER the table space with a SEGSIZE specification to indicate that you want to convert it from non-universal to universal PBR (a value of 64 for SEGSIZE is generally appropriate). After that, issue a second ALTER for the table space, this time specifying PAGENUM RELATIVE. After both of those ALTER TABLESPACE statements have been executed, a single online REORG will make the table space both universal and RPN.

If I were administering a Db2 12 system, I think I'd want to make extensive use of relative page numbering for my range-partitioned table spaces, for the DSSIZE flexibility (and the ability to immediately increase a partition's maximum size) and for enhanced scalability. I encourage you to put this new Db2 feature to work for the benefit of your organization.

16 comments:

  1. Hi Rob,
    I'm very much interested to learn on the utlities in db2 z/os.I know its vast topic,but i think we have so many articles/pgs/posts for other topics in db2.but its really rare that everyone of us would have come across some detailed understanding on utilites.I request you to share your knowledge on db2 utilites,which will be very useful.

    ReplyDelete
    Replies
    1. That's certainly something I could do. As you noted, Db2 utilities is a broad topic. Are there particular aspects of the IBM Db2 utilities and/or usage of same that you'd like to see covered in a blog entry?

      Robert

      Delete
    2. As far as i know,i think it will be useful to cover-what each phases really do/how it really works,parms that is really required/will be useful to know,new features,things to be noted/take care when job abends in each phase

      Delete
    3. That would be too much to cover in this blog. When I write an entry that has to do with Db2 for z/OS utilities, I am typically addressing a particular aspect of a particular utility. Broader-scope information on IBM Db2 for z/OS utilities can be found online in the IBM Db2 for z/OS Knowledge Center (https://www.ibm.com/support/knowledgecenter/SSEPEK_12.0.0/ugref/src/tpc/db2z_introutilities.html) and in IBM "redbooks" such as "Db2 for z/OS Utilities in Practice" (download-able from http://www.redbooks.ibm.com/abstracts/redp5503.html?Open).

      Robert

      Delete
  2. The above estimation "Row-capacity-wise, you can put up to 280 trillion rows" is based on a TS without an NPI. An NPI with a unique key would limit the total number of rows in the TS to less than 1 trillion for a 4K page NPI and less than 8 trillion for a 32k page NPI.
    The NPI is really due for a redesign.

    ReplyDelete
    Replies
    1. True statement, as the size of an NPI is limited to 16 TB for an index with 4 KB-sized pages, and 128 TB for an index with 32 KB-sized pages. It will be interesting to see if the NPI design gets an overhaul in the future to expand capacity.

      Robert

      Delete
  3. After altering Tablespace to PAGENUM RELATIVE to Defer alter materialized by Part level inline Reorgs . Do all the subsequent reorg tablespace has to be part level Inline copy Reorg ?

    ReplyDelete
    Replies
    1. Not sure that I understand your question. If you alter a range-partitioned table space with PAGENUM RELATIVE, only a REORG of the entire table space will materialize that change - a partition-level REORG cannot be used to effect a change to relative page numbering from absolute page numbering.

      Robert

      Delete
  4. Hi Rob,I'm confused with difference between partitioned index & DPSI.By the definition & index statement both looks same to me.
    Both :
    1)Defined with PARTITIONED keyword
    2)has as many partitions as the number of partitions in the table space
    I'm finding for an example index statement to really know how this differs & used.
    Can you help me in understanding this

    ReplyDelete
    Replies
    1. A partitioned index defined on a table in a table space that uses table-controlled partitioning (either a universal partition-by-range table space or a "classic" partitioned table space of the type introduced with Db2 V8 for z/OS) will be either a partitioning partitioned index or a data-partitioned secondary index. If the index key is equal to, or begins with, the column(s) of the underlying table's partitioning key, it is a partitioning partitioned index; otherwise, is a data-partitioned secondary index. For example, if a table is partitioned on (C1, C2), a partitioned index on (C1, C2) is partitioning. So is a partitioned index on (C1, C2, C3). A partitioned index on (C4) is a DPSI. So is a partitioned index on (C2, C1), the key of which contains the columns of the table's partitioning key, but in the wrong order.

      A good explanation of partitioned indexes - both the partitioning and the data-partitioned secondary kind - can be found in sections 3.7 and 3.8 of the IBM redbook titled, "DB2 UDB for z/OS Version 8: Everything You Ever Wanted to Know, ... and More." That redbook can be downloaded from the Web page at http://www.redbooks.ibm.com/abstracts/sg246079.html?Open.

      Note: the above-referenced redbook states that a DPSI cannot be defined as UNIQUE. Since Db2 9 for z/OS, a DPSI can be defined as UNIQUE if the columns of its key are a super-set of the columns of the underlying table's partitioning key.

      Robert

      Delete
    2. Wow!!!that clarifies it .thankyou Rob

      Delete
    3. I believe partitioned partitioning index & partitioning index are different.partitioning index-must have same columns in same order as that of partitioning key defined in table.

      Delete
    4. Sorry typo.I meant to be partitioning partitioned index & partitioning index

      Delete
    5. A partitioned index is one that id defined with a PARTITIONED specification. A partitioning index is one whose key either is the underlying table's partitioning key, or whose key begins with the underlying table's partitioning key; thus, the only difference between a partitioned partitioning index and a non-partitioned partitioning index is the PARTITIONED specification. Frankly, I do not see why a partitioning index would not also be defined as PARTITIONED.

      Robert

      Delete