Wednesday, October 29, 2014

DB2 for z/OS: Thoughts on Table Space and Index Page Size Selection

The other day I got a note, from a DB2 for z/OS DBA, in which I was asked for my thoughts regarding the selection of page size for a table space. I typed out what I hoped would be a useful response, and found myself thinking, as I sometimes do, "Hey -- I could turn this into a blog entry." And so I have. I'll start here by offering some guidelines on page size selection for table spaces, and then I'll shift to the index perspective on the matter.

Table spaces

A table space (or index) gets its page size from the buffer pool to which it is assigned; so, a table space assigned to a buffer pool with buffers of 8K bytes in size will have pages of 8K bytes in size.

First, and obviously, the row length of a table is an important factor in selecting the page size for the associated table space. Row length determines the minimum page size that you can use for a table space. If the rows of a table are 7000 bytes in length, the pages of the associated table space will have to be at least 8K bytes in size, because a table row cannot span pages of a table space (an exception to this rule: a long LOB or XML value would span multiple pages in a LOB table space or an XML table space).

Once you've determined the minimum possible size for the pages of a table space, you can think about whether or not a page size larger than the minimum would be a good choice. Why might you want to go with a larger-than-minimum page size? One reason would be optimization of disk and server memory space utilization. Suppose that the row length for a table is 5000 bytes. The minimum page size for the associated table space would be 8K bytes; however, as rows can't span pages, that page size would waste quite a bit of space on disk and in memory (in each page you'd have 5K bytes occupied by a table row, and the remaining 3K bytes of space would be empty). In this case, a 16K page size would boost space utilization considerably: 15K bytes in each page could be occupied by 3 rows, leaving only 1K bytes of unused space). If you're wondering whether data compression would change this calculus, the answer is probably not -- that is to say, the right page size for an uncompressed table space is probably the right page size to use if you want to compress the table space; so, base page size selection on the uncompressed length of a table's rows.

Here's another reason to consider a page size that is larger than the minimum possible size for a table space (which, again, is determined by uncompressed row length): data access patterns. If data in a table space is typically accessed sequentially (i.e., if sequential scans are the rule for a table space, versus random single-page access), a larger page size could mean fewer GETPAGEs for those sequential scans, and that could boost CPU efficiency for applications that drive the sequential scans. Note that "sequential scan" doesn't necessarily mean a table space scan. It could be a matching index scan that would drive dynamic prefetch.

Just as access patterns for data in a table could lead you to select a larger-than-minimum page size for a table space, so they could push you in the other direction. When access to data in a table space is dominated by random, single-page reads, going with the minimum possible page size is probably the right move, so as to maximize buffer pool effectiveness -- in particular when SELECT statements targeting a table tend to have single-row result sets, you don't want to use an overly-large page size, because larger pages would bring into memory many rows that are not needed by a query, in addition to the one row that is needed (exception: if singleton SELECTs are executed via a "do loop" in a batch program, each time issued using a search value obtained from a file that is ordered in such a way that access to the target table is in fact sequential, dynamic prefetch would be expected to kick in and a larger page size could therefore be an efficiency-booster).

One more consideration: if you want to utilize LOB in-lining for a table (a DB2 10 for z/OS new-function mode feature), that could affect your table space page-size decision. Suppose a table has a LOB column, and the length of the table's rows without the LOB values would make 4K bytes a good page size for the associated base table space. If you determined that a high percentage of the table's LOB values could be completely in-lined with (for example), a 16K page, and if that inlining would deliver performance benefits for you, you might go with that 16K page instead of the minimum 4K page. You can read more about LOB inlining in an entry that I posted to this blog a couple of years ago.


Indexes

Prior to DB2 9 for z/OS, there was one choice for the page size of an index: 4K bytes. Starting with DB2 9 in new-function mode, the page size for an index could be 4K, 8K, 16K, or 32K bytes. Why might you go with a larger-than-4K page size for an index on a DB2 table? One reason would be to enable index compression -- something that requires the use of a greater-than-4K index page size (as noted in an entry I posted a few years ago to the blog that I maintained while working as an independent DB2 consultant).

Compression isn't the only reason to consider a larger-that-4K page size for an index. Another motivator could be a desire to reduce index page split activity. If an index is defined on a key whose values are not continuously-ascending (i.e., if key values for rows newly inserted into the underlying table are likely to be less than the key's current maximum value), throughput for high-volume insert operations could be constrained by index page split activity. [Unlike rows in a table, entries in an index MUST be physically ordered by key value, and that leads to index page splits when new key values are inserted into the "middle" of an index, as will be the case for an index defined on a non-continuously-ascending key -- if the page into which a new index entry must go is full, the page will be split, and a portion of the page's entries will be moved to a previously empty page in the index.] Index page splits can have a particularly significant impact on insert throughput in a DB2 data sharing environment, owing to an attendant increase in log write activity. When index pages are larger, index page split activity tends to decrease; so, if a primary concern is optimized throughput for high-volume insert operations, an index page size of 16K or 32K bytes could be preferred over the traditional 4K size.

Larger index page sizes can also improve the performance of index scans by reducing associated GETPAGE activity. On top of that, larger page sizes could reduce the number of levels for an index (referring to the root page level on top (logically speaking), the leaf page level on the bottom, and one or more non-leaf levels in-between in the B-tree index structure), and that would in turn result in reduced GETPAGE activity for index probes.  

On the other hand, if compression is not desired for an index, and if the primary performance concern is optimization of data retrieval involving index access, and if access to entries in an index tends to be random (versus sequential) in nature, a 4K page size is probably your best choice.

A note on changing an object's page size

If you implement a table space or an index with a certain page size and you later determine that a different page size would deliver better performance, you can make that change by way of ALTER TABLESPACE or ALTER INDEX, with a buffer pool specification that would reassign the object to a buffer pool with larger- or smaller-sized buffers, as desired. Starting with DB2 10 for z/OS running in new-function mode, a change to an object's page size is a pending DDL operation that can be non-disruptively accomplished with an ALTER (as mentioned above) followed by an online REORG of the target object. Note, however, that page-size changing via pending DDL is only possible if the table space being altered (or the underlying table space on which a to-be-altered index is defined) is of the universal variety -- one of several incentives to get to universal table spaces (prior to DB2 10, a table space could not be altered to have a different page size, and altering an index to change the page size would place the index in rebuild-pending status).

In conclusion...

Put some thought into your table space and index page-size decisions, and know that with universal table spaces (and indexes defined thereon), you can pretty easily change your mind down the road.

2 comments:

  1. Hi Robert,

    I study regularly all of your blogs. It helps me a lot. In my organisation, we still use v10. Can you please come with a blog for PRIQTY and SECQTY calculation for an Index?
    That would help me a lot

    Thanks in Advance

    Regards,
    Abinash

    ReplyDelete
    Replies
    1. If you are looking for information on how to calculate the disk space needed for an index, that is here in the IBM DB2 for z/OS Knowledge Center: http://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/admin/src/tpc/db2z_calcuateindexspace.html

      As for PRIQTY and SECQTY, my recommendation is to use the system default for PRIQTY, and let DB2 manage secondary space allocation. More information can be found in an entry I posted to the blog I maintained while working as an independent consultant, prior to my re-joining IBM in 2010: http://catterallconsulting.blogspot.com/2009/09/db2-managed-disk-space-allocation.html

      Robert

      Delete