Sunday, March 26, 2017

DB2 for z/OS: Running REORG to Reclaim Disk Space

Think of why you run the DB2 for z/OS REORG utility, and a number of reasons are likely to come quickly to mind: to restore row order per a table's clustering key; to reestablish free space (for inserts and/or for updates); to remove the AREO* status set for a table space following (for example) an ALTER TABLE ADD COLUMN operation; or to materialize a pending DDL change such as an enlargement of a table space's DSSIZE. How about disk space reclamation? If that REORG motivation has not previously occurred to you, perhaps it should.

Recently, a DBA at a large DB2 for z/OS site communicated to me the success that his team has had in reclaiming substantial amounts of disk space through online reorganization of certain table spaces. He also asked for a recommendation with regard to identifying table spaces for which a REORG could potentially deliver significantly reduced disk space consumption. In this blog entry, I'll describe the disk space reclamation scenario reported by the referenced DBA, I'll explain why there was space to be reclaimed in some of the table spaces administered by the DBA, and I'll provide the "reclamation indicator" metric that I suggested to the DBA as a means of identifying table spaces that could be reorganized in order to free up disk space.

First, the scenario. At the DBA's site, there are some tables, in segmented table spaces ("traditional" segmented table spaces, as opposed to universal table spaces, which also happen to be segmented), that have these key characteristics: they are clustered by a continuously-ascending key (so that "new" rows go to the "end" of the table), and the number of inserts into the table is roughly equaled by the number of rows that are deleted from the table over a period of time.

The DB2 DBA knew that for table spaces with the above-described characteristics, REORGs were not needed to maintain "clusteredness," because of the continuously-ascending clustering key that sent new rows to the end of the table (at least, clustering would remain in good shape until the table space reached its size limit -- more on this in a moment). For the same reason, free space for inserts in "interior" pages of the table space was not a concern. Still, with DB2 real-time statistics showing a very large number of inserts since the last REORG of a couple of these table spaces, the DBA determined that reorganizations might be in order. Online REORGs of the table spaces were executed, and the result was a freeing up of 64 GB of disk space: one table space went from 21 to 4 data sets of 2 GB apiece, and the other went from 17 data sets to 2 (a DB2 segmented table space is comprised of up to 32 data sets of 2 GB apiece, and that is why its size limit is 64 GB).

Why was there so much unused space in these table spaces? Because the continuously-ascending clustering key kept pushing the "end" of the table spaces "outward." Why would that happen? Why would DB2 grow these table spaces as a result of inserts, given the like number of row-delete operations that were targeting the associated tables? Shouldn't DB2 have been using the space freed up by deletes to accommodate new inserts, without growing the table space's size? Actually, DB2 was working as designed. It's true that, given a continuously-ascending clustering key and some deletes of older rows from the "front" of a table space, DB2 can "wrap" to the front and start inserting new rows in space cleared by deletes, but that will only happen if DB2 cannot extend the table space (i.e., if DB2 cannot make the table space larger). If DB2 can extend a segmented table space, it will in order to preserve a table's row-clustering order; so, in advance of hitting the 64 GB size limit for a segmented table space, DB2 would keep making the table space larger so that it could keep adding rows to the end of a table (assuming a continuously-ascending clustering key), and deletes of older rows would result in ever-larger amounts of available-but-unused space in the table space. That's why the disk footprint of the two table spaces became so much smaller following reorganization.

[It is important to keep in mind that, given a continuously-ascending clustering key and at least some row-delete operations, DB2 will insert new rows in the "front" of a segmented table space, using space freed up by DELETEs, if the table space cannot be made any larger (either because of reaching the 64 GB limit or as a result of running into a maximum-extents or a maximum-volumes situation). In that case, "wrapping to the front" for new inserts is better than failing the inserts.]

Having achieved this disk space reclamation success through REORGs, the aforementioned DBA wanted to add "potential for significant disk space reclamation" to the criteria used at his site for identifying table spaces that should be reorganized (a good proactice -- REORG table spaces when you have a good reason for doing so, not just "because it's been X amount of time since the last time this table space was REORGed"). How could he and his colleagues spot table spaces with large amounts of unused space? My recommendation: use for this purpose the ratio of disk space occupied by the table space to space in the table space occupied by rows in the table space. For the numerator, I'd use the SPACE value in the row for the table space in the SYSTABLESPACE catalog table. That value is updated when the STOSPACE utility is executed, so you would want to run STOSPACE on a regular basis (that should not be a big deal -- STOSPACE should be a very inexpensive utility to execute, CPU-wise). For the denominator, I would use the product of TOTALROWS from SYSTABLESPACESTATS (set by REORG and updated when INSERTs and DELETEs are executed) and AVGROWLEN in SYSTABLESPACE (updated by RUNSTATS, or by in-line statistics collected during REORG or LOAD). You can decide when that ratio would prompt you to run REORG to reclaim space. Would you do that when disk-space-to-row-space hits 2 (i.e., when the size of the table space is 2X the space occupied by rows)? When it hits 3? When it hits 4? One of those values might be reasonable for your environment.

One more thing: I have focused on traditional segmented table spaces in this blog entry because that is the table space type to which space reclamation via REORG is most relevant. For a range-partitioned table space, a given partition's size limit is determined by the DSSIZE specification, and the same is true for a partition-by-growth table space. Yes, you could see a partition-by-growth table space come to contain a high percentage of unused space given the combination of a continuously-ascending clustering key and a good deal of DELETE activity, but you could put a limit on that growth by way of a not-larger-than-needed MAXPARTITIONS value. With that said, even with range-partitioned and partition-by-growth table spaces you could see situations in which the ratio of table space size to space occupied by rows (the ratio described in the preceding paragraph) could get to be high enough to make a REORG attractive from a disk space reclamation perspective. And here there's some good news: starting with DB2 11 for z/OS, you can tell DB2 to drop partitions of a partition-by-growth table space made empty by a REORG or the entire table space (that functionality is enabled via the REORG_DROP_PBG_PARTS parameter in ZPARM).

So, add disk space reclamation to your reasons for running REORG (if you have not already done so), and consider using the ratio I've provided to look for candidate table spaces.

7 comments:

  1. Why TOTALROWS*AVGROWLEN, why not DATASIZE?

    Michael Harper, TD Bank

    ReplyDelete
    Replies
    1. Great question, Michael. The reason: as I understand it, DATASIZE is NOT updated as a result of SQL data-change operations (UPDATE, INSERT, DELETE). DATASIZE (in SYSIBM.SYSTABLESPACESTATS) is updated as a result of executing REORG or LOAD. That being the case, DATASIZE only tells you how much space was occupied by rows the last time the table space was REORGed or LOADed. If there have been a lot of INSERTs since that time, DATASZE under-reports the amount of space occupied by rows.


      The effect of utilities, and of SQL statements, on the DB2 real-time statistics fields is documented in the DB2 for z/OS "Managing Performance" manual. Also online in the DB2 for z/OS Knowledge Center, of course (searching there for DATASIZE will bring up, among the "hits," to section on utility and SQL impact on real-time stats fields).

      Robert

      Delete
  2. Hi Robert, excuse me if this is not the right blog post.
    Being in V12 we noticed that after migrating to v12 when inserting records,
    the allocated space of the tablespace increases considerably,
    also extending them and decreasing substantially the Pct of space used by ACTIVE tables.
    when we reorganize it returns to the original sizes.
    Before with v11 it did not happen.
    The tablespace is a UTS PBR

    Why could this behavior have changed?

    I attach the history
    Note that from the migration of db2 to v12 in 2018-11-23
    changes the increment of allocate space while increasing the number of rows
    The first reorg turned to RRF that's why the change in allocacion

    Part Date/Time of Update Space (KB) Rows Pct Ac Pct Dp Exts
    ---- ------------------- ----------- ----------- ------ ------ ------
    2 2019-02-10-20.29.11 1617504 384299 0 0 48
    2 2019-02-03-14.12.22 1357776 375146 0 0 46
    2 2019-01-27-18.18.54 1065456 368624 1 0 37
    2 2019-01-20-16.49.38 809088 363353 1 0 32
    2 2019-01-13-15.01.41 510720 357345 2 0 24
    2 2019-01-06-13.31.03 311136 347015 3 0 19
    2 2018-12-30-18.53.03 184464 339751 5 0 15
    2 2018-12-23-16.09.50 24192 334111 43 0 6
    2 2018-12-16-17.21.36 762048 326013 1 0 35
    2 2018-12-09-17.51.33 510720 317613 2 0 29
    2 2018-12-02-18.38.16 311136 306212 3 0 20
    2 2018-11-25-17.57.04 38304 301272 25 0 7
    2 2018-11-18-15.35.31 13104 295903 73 0 3
    2 2018-11-11-17.14.28 13104 287465 71 0 3
    2 2018-11-04-16.22.33 13104 277529 87 0 3
    2 2018-10-28-17.23.37 13104 269506 67 0 5
    2 2018-10-21-16.15.35 13104 263974 65 0 5
    2 2018-10-14-17.58.13 13104 258109 64 0 5
    2 2018-10-07-16.02.30 13104 248715 87 0 5
    2 2018-09-30-14.17.34 13104 238653 59 0 3
    2 2018-09-23-16.06.47 8736 232407 87 0 3
    2 2018-09-16-14.44.28 8736 226012 84 0 3


    Thanks,
    Elvio

    ReplyDelete
    Replies
    1. If the only thing that changed about the table and table space (besides changing from BRF to RRF) is the change in Db2 for z/OS Version (Version 11 to Version 12), I do not know why the size of the table space increased so dramatically. You might need to open a ticket with the IBM Support Center to get this matter resolved.

      Robert

      Delete

  3. Thanks Robert, I'll deal with the problem with Ibm Support Center.

    ReplyDelete
  4. Rob,I ran reorg which has list of indexes using listdef,the job failed as there were claimers .I saw the display claimers results in the sysprint.Does the reorg list the cliamers results,in which phase? I have never seen display db(*) claimers messages in any reorg job failures.Any idea?

    ReplyDelete
    Replies
    1. This was an enhancement for REORG that was introduced with Db2 12 for z/OS. The message should be issued as part of the SWITCH phase of an online REORG, if REORG cannot get the drain lock on the "original" data sets (referring to the data sets of the table space or partitions and the indexes processed by the utility) that is required before REORG can switch to the shadow data sets containing the reorganized data. For the drain to be obtained, all processes with claims on original data sets must release those claims by committing. Sometimes, a process with a claim on a data set related to an online REORG will not commit in a timely manner, and the online REORG fails because it can't get the drain lock required for the switch at the end of utility processing. Displaying, in utility output, the claimers that caused the drain to fail helps you to do something to keep that from happening in the future (for example, if application XYZ held a claim that caused the drain to fail, you can work with the application XYZ team to determine why the application didn't commit in a timely manner, and the team can then take action to increase the application's commit frequency).

      You can read more about this Db2 12 enhancement in section 11.3.6 of the IBM redbook titled, "Db2 12 for z/OS Technical Overview." That redbook can be downloaded from this URL: https://www.redbooks.ibm.com/abstracts/sg248383.html

      Robert

      Delete