Sometimes, an important DB2 for z/OS feature can be out there for years, and yet not be fully appreciated by lots of people in the DB2 user community. Table-controlled partitioning is such a feature. In this blog entry I want to generally set the record straight about table-controlled partitioning, and in particular I want to make sure that mainframe DB2 folks understand a key aspect of this physical database design option that for some reason eludes many users to this day.
First, the basics: table-controlled partitioning refers to a means of range-partitioning the data in a table, so that, for example, sales records for retail store numbers 1-25 go into table space partition 1, records for store numbers 26-50 go into partition 2, records for store numbers 51-75 go into partition 3, and so on. Table-controlled partitioning is so called to distinguish it from index-controlled partitioning, which for years and years was the only way to assign a table's rows to table space partitions based on the value of a user-specified partitioning key: you created a partitioning index for a table, and in the definition of that index you associated table space partitions with ranges of the partitioning key.
That changed with DB2 Version 8 for z/OS, and this brings me to another point of clarification: some people get table-controlled partitioning and universal table spaces a little mixed up, in that they believe that table-controlled partitioning was introduced along with universal table spaces. If you've been under that impression, you're off by one DB2 release. DB2 V8 delivered table-controlled partitioning. DB2 9 delivered universal table spaces. There are two types of universal table space, and one of these, called a partition-by-range universal table space, is conceptually very similar to a DB2 V8 table-controlled partitioned table space (the other type of universal table space is known as partition-by-growth). Those who lump DB2 V8-introduced table-controlled partitioning in with DB2 9-introduced universal table spaces may be under the impression that you cannot convert an index-controlled partitioned table space to a table-controlled partitioned table space until you're running with DB2 10, because that's the release in which a smooth path to universal table spaces from other table space types was provided. In fact, in a DB2 V8 or DB2 9 (or DB2 10) environment you can easily convert an index-controlled partitioned table space to a table-controlled partitioned table space. How is that done? It's as simple as this: for the partitioning index of the index-controlled partitioned table space, issue an ALTER INDEX with a NOT CLUSTER. Bingo -- you've now got a table-controlled partitioned table space.
This simple means of converting an index-controlled partitioned table space to a table-controlled partitioned table space highlights the really key feature of table-controlled partitioning that -- as I mentioned up front -- has escaped the notice of many a DB2 person: data in a table-controlled partitioned table space can be partitioned by one key and clustered within partitions by another key. For a number of mainframe DB2-using organizations, the fact that the partitioning index of an index-controlled partitioned table space had to also be the table's clustering index was a pain. I worked in the early 2000s in the IT department of a financial services company, and I well recall that we had, in our DB2 for z/OS V7 database, a partitioned table space that presented us with a Hobson's choice: we could go with a key that would be good for partitioning (benefiting the performance of high-volume insert operations) but lousy for clustering (hurting query result set retrieval performance), or vice versa. When DB2 for z/OS V8 was announced, and with it the ability to specify data partitioning at the table level (versus the index level) and -- as a consequence -- to specify a clustering index with no dependence on the partitioning scheme, we jumped on that like a cat on a mouse. In my mind, this feature ranked right up there with 64-bit addressing as the MOST valuable capability provided via DB2 V8.
Obviously, not everyone in the DB2 user community shared this assessment of mine, as evidenced by the fact that I still find people who are under the impression that the partitioning key for a table-controlled partitioned table space has to be the table's clustering key. Again, NOT SO. Suppose you have a table for which it is important, for data load and offload purposes, that rows be partitioned by date (so, for example, one week's data goes into partition 1, the next week's into partition 2, and so on). For that same table, good query performance may depend on rows within the date-based partitions being clustered by account number. FINE. NO PROBLEM. Define the table with a specification of the date column as the partitioning key, and create an index, with the CLUSTER option specified, on the account number column. What if this table exists today in index-controlled partitioned form, with the partitioning/clustering index defined on the data column (good for load/offload, not good for result set retrieval)? In that case, you could first issue, as previously mentioned, an ALTER INDEX statement, with the NOT CLUSTER option, targeting the existing partitioning/clustering index on the table. Because the partitioning index on an index-controlled partitioned table space must also be the clustering index on the table, execution of this ALTER INDEX statement will have the effect of changing the table space's partitioning scheme from index-controlled to table-controlled (it will still be partitioned by the date column, but that partitioning information will now be tied to the table definition instead of being tied to an index's definition). Next, if you already have an index defined on the account number column, issue an ALTER INDEX statement, with the CLUSTER option specified, for that index (if an index on the account number column doesn't exist already, create one and include CLUSTER in the CREATE INDEX statement).
Here's another potential benefit you can get by going from index-controlled to table-controlled partitioning for a table space: you might be able to get rid of an index that is useful ONLY for partitioning purposes (i.e., it provides no query access path performance benefit, isn't defined on a foreign key, etc.). For the purpose of illustrating this point, consider an index-controlled partitioned table space that is partitioned on a column, called PARTITION_KEY, that is populated with a value by the program that inserts rows into the table. Not only will users never search on this column, it's probably a stinko clustering key, to boot. Issue the handy-dandy ALTER INDEX statement with the NOT CLUSTER option for this index, and alter some other index on the table to make it the clustering index. After that (and after checking that the index on the PARTITION_KEY column, now no longer needed for partitioning, is not needed for any other reason), drop the index on PARTITION_KEY. You'll save the disk space that had been occupied by the index, and insert and delete operations targeting the table will be more CPU-efficient (due to your having one less index on the table that would have to be maintained by DB2). Remember, the table space is still partitioned by the PARTITION_KEY column -- it's just that you no longer need an index to effect that partitioning scheme.
There you go. Nice capability, eh? Could you take advantage of it? Have you taken advantage of it? I hope so. Thanks for reading.