Robert's Db2 blog: The ONE Key Aspect of DB2 for z/OS Table-Controlled Partitioning that People Seem Most Likely to Forget

Tuesday, June 5, 2012

The ONE Key Aspect of DB2 for z/OS Table-Controlled Partitioning that People Seem Most Likely to Forget

Sometimes, an important DB2 for z/OS feature can be out there for years, and yet not be fully appreciated by lots of people in the DB2 user community. Table-controlled partitioning is such a feature. In this blog entry I want to generally set the record straight about table-controlled partitioning, and in particular I want to make sure that mainframe DB2 folks understand a key aspect of this physical database design option that for some reason eludes many users to this day.

First, the basics: table-controlled partitioning refers to a means of range-partitioning the data in a table, so that, for example, sales records for retail store numbers 1-25 go into table space partition 1, records for store numbers 26-50 go into partition 2, records for store numbers 51-75 go into partition 3, and so on. Table-controlled partitioning is so called to distinguish it from index-controlled partitioning, which for years and years was the only way to assign a table's rows to table space partitions based on the value of a user-specified partitioning key: you created a partitioning index for a table, and in the definition of that index you associated table space partitions with ranges of the partitioning key.

That changed with DB2 Version 8 for z/OS, and this brings me to another point of clarification: some people get table-controlled partitioning and universal table spaces a little mixed up, in that they believe that table-controlled partitioning was introduced along with universal table spaces. If you've been under that impression, you're off by one DB2 release. DB2 V8 delivered table-controlled partitioning. DB2 9 delivered universal table spaces. There are two types of universal table space, and one of these, called a partition-by-range universal table space, is conceptually very similar to a DB2 V8 table-controlled partitioned table space (the other type of universal table space is known as partition-by-growth). Those who lump DB2 V8-introduced table-controlled partitioning in with DB2 9-introduced universal table spaces may be under the impression that you cannot convert an index-controlled partitioned table space to a table-controlled partitioned table space until you're running with DB2 10, because that's the release in which a smooth path to universal table spaces from other table space types was provided. In fact, in a DB2 V8 or DB2 9 (or DB2 10) environment you can easily convert an index-controlled partitioned table space to a table-controlled partitioned table space. How is that done? It's as simple as this: for the partitioning index of the index-controlled partitioned table space, issue an ALTER INDEX with a NOT CLUSTER. Bingo -- you've now got a table-controlled partitioned table space.

This simple means of converting an index-controlled partitioned table space to a table-controlled partitioned table space highlights the really key feature of table-controlled partitioning that -- as I mentioned up front -- has escaped the notice of many a DB2 person: data in a table-controlled partitioned table space can be partitioned by one key and clustered within partitions by another key. For a number of mainframe DB2-using organizations, the fact that the partitioning index of an index-controlled partitioned table space had to also be the table's clustering index was a pain. I worked in the early 2000s in the IT department of a financial services company, and I well recall that we had, in our DB2 for z/OS V7 database, a partitioned table space that presented us with a Hobson's choice: we could go with a key that would be good for partitioning (benefiting the performance of high-volume insert operations) but lousy for clustering (hurting query result set retrieval performance), or vice versa. When DB2 for z/OS V8 was announced, and with it the ability to specify data partitioning at the table level (versus the index level) and -- as a consequence -- to specify a clustering index with no dependence on the partitioning scheme, we jumped on that like a cat on a mouse. In my mind, this feature ranked right up there with 64-bit addressing as the MOST valuable capability provided via DB2 V8.

Obviously, not everyone in the DB2 user community shared this assessment of mine, as evidenced by the fact that I still find people who are under the impression that the partitioning key for a table-controlled partitioned table space has to be the table's clustering key. Again, NOT SO. Suppose you have a table for which it is important, for data load and offload purposes, that rows be partitioned by date (so, for example, one week's data goes into partition 1, the next week's into partition 2, and so on). For that same table, good query performance may depend on rows within the date-based partitions being clustered by account number. FINE. NO PROBLEM. Define the table with a specification of the date column as the partitioning key, and create an index, with the CLUSTER option specified, on the account number column. What if this table exists today in index-controlled partitioned form, with the partitioning/clustering index defined on the data column (good for load/offload, not good for result set retrieval)? In that case, you could first issue, as previously mentioned, an ALTER INDEX statement, with the NOT CLUSTER option, targeting the existing partitioning/clustering index on the table. Because the partitioning index on an index-controlled partitioned table space must also be the clustering index on the table, execution of this ALTER INDEX statement will have the effect of changing the table space's partitioning scheme from index-controlled to table-controlled (it will still be partitioned by the date column, but that partitioning information will now be tied to the table definition instead of being tied to an index's definition). Next, if you already have an index defined on the account number column, issue an ALTER INDEX statement, with the CLUSTER option specified, for that index (if an index on the account number column doesn't exist already, create one and include CLUSTER in the CREATE INDEX statement).

Here's another potential benefit you can get by going from index-controlled to table-controlled partitioning for a table space: you might be able to get rid of an index that is useful ONLY for partitioning purposes (i.e., it provides no query access path performance benefit, isn't defined on a foreign key, etc.). For the purpose of illustrating this point, consider an index-controlled partitioned table space that is partitioned on a column, called PARTITION_KEY, that is populated with a value by the program that inserts rows into the table. Not only will users never search on this column, it's probably a stinko clustering key, to boot. Issue the handy-dandy ALTER INDEX statement with the NOT CLUSTER option for this index, and alter some other index on the table to make it the clustering index. After that (and after checking that the index on the PARTITION_KEY column, now no longer needed for partitioning, is not needed for any other reason), drop the index on PARTITION_KEY. You'll save the disk space that had been occupied by the index, and insert and delete operations targeting the table will be more CPU-efficient (due to your having one less index on the table that would have to be maintained by DB2). Remember, the table space is still partitioned by the PARTITION_KEY column -- it's just that you no longer need an index to effect that partitioning scheme.

There you go. Nice capability, eh? Could you take advantage of it? Have you taken advantage of it? I hope so. Thanks for reading.

18 comments:

AnonymousNovember 15, 2012 at 1:41 AM
Is there a simple way to back out a change from index-controlled to table-controlled partitioning, or would that require a table restore? We have a largeish table I'd like to try this on (358 Million rows, 2429K pages) but we're short on space for creating another copy to experiment with.
ReplyDelete
Replies
AnonymousJanuary 11, 2013 at 9:38 AM
I'm a big fan of table-based partitioning, but I've run into a snag. I have an exisiting index-based partitioned tablespace, with the partitioning/clustering index defined on (col1, col2, col3). My limitkey only specifies values for col1, while the other 2 columns are there only for clustering reasons. What I'd like to do is:
1) convert to table-based partitioning
2) partition the table using col1 ONLY
3) cluster on col2, col3
4) later on, define additional DPSIs using col1
If I try to simply convert to table-based partitioning, the PARTITION BY clause now reflects all 3 columns in the old index definition, but I really want it to just reflect col1 (same as the limitkey). This is a 5 billion row table, and I really don't want to drop/recreate, just to change the PARTITION clause. Is there any way to do this without the drop/recreate?

Randy
ReplyDelete
Replies
Brian BobergJanuary 11, 2013 at 12:08 PM
When you speak of table-controlled partitioning are you talking exclusively about “Partition by Growth” and “Partition by Range” partitioning or are there other types of partitioning schemes available on DB2 for z/OS? We are using DB2 for z/OS version 9 if that makes any difference. Thanks.
ReplyDelete
Replies
AnonymousNovember 8, 2013 at 7:34 AM
Is there a way to identify tables that are index partitioned, so we can begin to change them to table based partitioned?
ReplyDelete
Replies
AnonymousJune 23, 2014 at 11:55 AM
In your example of partitioning by date, you only have an index on account, so selecting by date only would scan all partitions in your date range, right? And even if you specified other columns (except account), DB2 would still scan the entire partition(s), because there is no good index to use?
ReplyDelete
Replies
UnknownMay 6, 2015 at 9:14 AM
Hi Bob,

Say i have the following table columns partitioned by table based on COL1.

Col1 INT
Col2 char(10)
Col3 Char(3)

I have the following partitioned index on table - Col2.

I have a select query - select * from table where col1 = 1 and col2 = 'AAAAAAAAAA'

Do i need to have col1 and col2 as part of index? As what i understand is db2 optimizer automatically knows which part to look at (even though there is no index on partitioned column as in above case)

Will it scan entire index for col2 or it db2 optimizer will know that it has to scan only partition 1 and then use index on col2?

ReplyDelete
Replies
JithApril 15, 2020 at 11:44 AM
Hi, We are in v11. Is there a way we can drop the partition from 256 to 150 when it is partition by range. It would be great if you can let me know the high-level steps if possible.
ReplyDelete
Replies
AnonymousApril 13, 2021 at 11:34 PM
Hi Robert,

Is there any way by which we can find out significant columns in a IX-controlled partitioned table.
ReplyDelete
Replies

Add comment

Note: Only a member of this blog may post a comment.