Thursday, May 28, 2015

DB2 11 for z/OS: Setting Aside Space for Updates

A DB2 table space is well-organized when rows are physically located where they should be per the clustering index of the associated table. Because a high degree of "clustered-ness" is a desirable quality of a table space (mostly for performance reasons), a DB2 DBA will often take steps to preserve row sequencing between REORGs. One such step that is very commonly executed is the setting aside of space in a table space's pages to accommodate inserts of new rows (relevant for a table that is clustered by a key that is NOT continuously ascending -- in the case of a continuously-ascending clustering key, new rows will be inserted at the "end" of the table). This insert-accommodating space set-aside is accomplished by way of the PCTFREE option of CREATE and ALTER TABLESPACE.

PCTFREE has always been a nice tool in the DB2 DBA's physical database design tool kit, but it had a shortcoming: it didn't apply to UPDATE operations that caused a row's length to increase. If a row in a DB2 for z/OS table space becomes longer as a result of an UPDATE, and if that longer row will no longer fit into the page in which it had been stored, it will be moved to another page in the table space. Will the row's location, as indicated by the RID (row ID) for the row in entries of indexes defined on the table, be changed to reflect the update-driven move of the row to a new page? No. Index entries associated with the row will continue to point to the row's former location. When DB2, in using an index to build a query result set that will include the row that was moved because of a length-increasing UPDATE, arrives at the "original home" page to retrieve the row, it will find, in effect, a "We've moved!" sign, along with a pointer to the row's new page. And, if that moved row is again lengthened by an UPDATE and again is moved (if it became too long to go back into its "second home" page), another "We've moved!" sign will be posted, along with a pointer to the row's "third home" page. And so on.

These "We've moved!" signs related to update-induced row relocations are known, in DB2 parlance, as indirect references. Why are things done this way? Because updating a row's RID in a table's indexes every time a row-lengthening UPDATE caused a row move would add too much cost to UPDATE operations (the affected RIDs will be corrected with the next table space REORG). Obviously, rows with varying-length columns (such as VARCHAR columns) can be lengthened (and potentially relocated) as a result of UPDATE operations. So, too, can rows containing only fixed-length columns. How's that? Compression, that's how -- a row in a COMPRESS YES table space might compress differently (and with more length) with different column values resulting from an UPDATE.

Because indirect references are a drag on query performance (because of the hops to "new home" pages to which lengthened-beyond-fitting rows are relocated), one would like to reduce their occurrence. But how? There wasn't a good answer to this question until DB2 11 came along and provided the new PCTFREE FOR UPDATE option for CREATE and ALTER TABLESPACE. Here's how it works: if you set PCTFREE FOR UPDATE for a table space to, say, 10, then 10% of the space in the table space's pages (following a REORG or a LOAD REPLACE) will be set aside for the sole purpose of accommodating length-increasing UPDATEs. In other words, that space will NOT be available for newly inserted rows. Can you still set aside space for to-be-inserted rows in a table space's pages? Of course you can. If you wanted 15% of the space in pages of a table space to be set aside (again, following a REORG or a LOAD REPLACE) for new INSERTs, and 10% set aside for length-increasing UPDATEs, you'd specify the following:

PCTFREE 15 FOR UPDATE 10
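
To put that clause in context, here is a minimal sketch of a complete statement. The database and table space names (MYDB and MYTS) are hypothetical placeholders, so substitute your own. As noted above, the set-aside shows up in the pages after the next REORG or LOAD REPLACE.

```sql
-- Hypothetical names: MYDB is the database, MYTS is the table space.
-- Reserve 15% of each page for new INSERTs and a further 10%
-- exclusively for length-increasing UPDATEs.
ALTER TABLESPACE MYDB.MYTS
  PCTFREE 15 FOR UPDATE 10;
```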

Would it be appropriate to specify a PCTFREE FOR UPDATE value for each and every one of your DB2 tables? Probably not. An unneeded, non-zero PCTFREE FOR UPDATE value will result in wasted space in a table space's pages. It's best to use this DB2 enhancement for tables that are the targets of length-increasing UPDATE operations. Is it easy to identify table spaces that are associated with such tables? Sure it is -- just check the new (with DB2 11) UPDATESIZE column of the SYSIBM.SYSTABLESPACESTATS real-time statistics table in the DB2 catalog. The value in this column shows the extent to which a table space grew (or shrank) as a result of length-changing UPDATE operations since the most recent REORG or LOAD REPLACE. Speaking of real-time stats, here's an interesting option: if you specify PCTFREE FOR UPDATE -1 for a table space, the initial space set aside in the table space's pages for length-increasing UPDATEs will be 5%, and that amount will subsequently be adjusted by DB2 based on certain real-time statistics values. This is one of a growing number of examples of DB2 using its own real-time statistics to enhance efficiency of operation -- I'll be writing more about that soon.
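
One way to do that UPDATESIZE check is with a query along the following lines. Treat it as a sketch rather than a tuned monitoring query: the "greater than zero" filter and the 20-row cutoff are arbitrary choices made for illustration, and you may want to restrict the result to particular databases.

```sql
-- List the table spaces (or partitions) that have grown the most through
-- length-changing UPDATEs since the most recent REORG or LOAD REPLACE.
-- The "> 0" filter and the 20-row limit are illustrative assumptions.
SELECT DBNAME, NAME, PARTITION, UPDATESIZE
  FROM SYSIBM.SYSTABLESPACESTATS
 WHERE UPDATESIZE > 0
 ORDER BY UPDATESIZE DESC
 FETCH FIRST 20 ROWS ONLY;
```

Table spaces near the top of that list are the likeliest candidates for a non-zero PCTFREE FOR UPDATE value.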

One more thing: there is a new (with DB2 11) ZPARM, PCTFREE_UPD, that provides the default value for PCTFREE FOR UPDATE for ALTER and CREATE TABLESPACE statements. Setting PCTFREE_UPD to AUTO would make PCTFREE FOR UPDATE -1 (described above) the default.

How do you know if you've done good by specifying a non-zero PCTFREE FOR UPDATE value for a table space? Check to see that the incidence of indirect references (i.e., "We've moved!" signs) has decreased. Indirect references for table spaces can be tracked via the NEARINDREF and FARINDREF columns of the SYSIBM.SYSTABLEPART table, and the REORGNEARINDREF and REORGFARINDREF columns of SYSIBM.SYSTABLESPACESTATS. [An indirect reference is "near" if the associated row was moved to a page within 16 pages (or within SEGSIZE*2 pages, for a segmented table space -- and universal table spaces are segmented) of its previous "home" page; otherwise, the indirect reference is considered to be "far".]
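
A before-and-after comparison can be as simple as running a query like the one sketched below at comparable points in time (for example, the same number of days after each REORG). MYDB and MYTS are hypothetical names standing in for your own database and table space.

```sql
-- Indirect references accumulated since the most recent REORG or
-- LOAD REPLACE, per partition. MYDB and MYTS are hypothetical names.
SELECT DBNAME, NAME, PARTITION,
       REORGNEARINDREF, REORGFARINDREF
  FROM SYSIBM.SYSTABLESPACESTATS
 WHERE DBNAME = 'MYDB'
   AND NAME   = 'MYTS'
 ORDER BY PARTITION;
```

If the PCTFREE FOR UPDATE setting is doing its job, both counts should climb more slowly than they did before the change.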

By the way, PCTFREE FOR UPDATE is expected to be particularly useful for tables that have VARCHAR columns that are initially set to NULL and are later updated to have non-null values -- that is a scenario in which you'd expect rows to become significantly longer as a result of UPDATEs.

And there you have it. Nice feature, eh? When you get to DB2 11 (or if you're already there), take advantage of it.

4 comments:

  1. Thank you!!! It was very helpful!!!

  2. Good article, Robert. I know it's been a while since you posted it, but I'm a little confused by where you say "it (PCTFREE) didn't apply to UPDATE operations that caused a row's length to increase".

    Looking at this link for version 10:
    https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/perf/src/tpc/db2z_reservefreespacetable.html

    it seems to imply in several places that PCTFREE does reduce indirect references caused by size increase (see second bullet under About This Task, and second bullet under Procedure). Is the doc incorrect, or am I misunderstanding/misreading?

    Thanks for any clarification

    Replies
    1. Hey, Barry.

      I'd say this was a case of my not being as precise with my wording as I should have been. What I meant to convey was that, prior to PCTFREE FOR UPDATE, we didn't have a way of setting aside space in a Db2 for z/OS table space's pages that would be EXCLUSIVELY available for accommodating length-increasing updates. My bad. Thanks for seeking clarification and giving me an opportunity to crisp this up.

      Robert

    2. Yep, that makes sense. Thanks.
