Wednesday, August 25, 2010

DB2 9 for z/OS: an Important Enhancement of Partition-Level Online REORG Functionality

For a number of organizations, a key issue related to DB2 for z/OS Version 8 to Version 9 migration has been the elimination, in a DB2 9 environment, of the BUILD2 phase of partition-level online REORG. The concern with the elimination of BUILD2 had to do with the attendant restriction on the ability to execute online REORGs of multiple non-contiguous partitions of a tablespace in a parallel fashion. The good news that I want to highlight here is the removal of this restriction via the recently released fix for APAR PK87762. In the remainder of this post I'll explain what the BUILD2 phase of online REORG is (or was, from a DB2 9 perspective), the benefit of eliminating this utility phase, and the challenge that BUILD2 elimination posed for some DB2 for z/OS users prior to the availability of the aforementioned fix (the PTF is UK59095).

With Version 8 (and previous versions) of DB2 for z/OS, BUILD2 is the name of one of the phases of an online REORG TABLESPACE executed at the partition level (versus an online REORG of a partitioned tablespace in its entirety). During an online REORG of a tablespace partition, rows in the shadow data set for the partition are relocated relative to their position in the partition's active data set (this, of course, to restore clustering sequence for the partition). If there are any non-partitioned indexes on the tablespace (these are also known as NPIs), the index entries in those NPIs that point to rows in the partition being reorganized have to be updated to reflect the location of the rows in the shadow data set (which will be the active data set when the REORG job completes). BUILD2 is the utility phase in which these index entry corrections are made, and during that phase the logical partition of the NPI(s) that corresponds to the tablespace partition being reorganized is unavailable to application program. This means that INSERT and DELETE operations targeting the partition being reorganized cannot run during BUILD2 (nor can UPDATEs that change the value of a column in a key of an NPI, and SELECTs that would use the NPI to access data in the partition).

If BUILD2 completes in, say, 60 seconds or less, this may not be such a big deal. Even if BUILD2 runs for several minutes because of the number of rows in the partition being reorganized, the situation can be dealt with by causing BUILD2 to execute during a maintenance window. People do this by submitting the partition-level online REORG with a specification of MAXRO DEFER and LONGLOG CONTINUE, to put off execution of the REORG's final phases (including BUILD2). Upon entering the maintenance window, an -ALTER UTILITY command is issued to change MAXRO to a smallish number of seconds (e.g., 300), and REORG soon performs the final "catch-up" application of logged data changes to the shadow data sets, switches the shadow and active data sets, and completes BUILD2 processing.

What if you very rarely have a maintenance window? What if the partition being reorganized holds over 100 million rows, and BUILD2 runs too long to complete during a (for example) one-hour maintenance window? What then? For some DB2 Version 8 users, this meant that online reorganization of partitioned tablespaces could only be run at the tablespace level -- kind of a bummer, as one of the key benefits of tablespace partitioning is the ability to run utilities at the partition level.

So, BUILD2 elimination in the DB2 9 environment is a good thing, but the WAY in which BUILD2 was eliminated created, for some organizations, a new challenge. When a partition-level online REORG is run in a DB2 9 for z/OS system, any NPIs on the partitioned table will be reorganized in their entirety. That means that they have their own shadow data sets to which logged changes to the corresponding active data sets are applied. This makes BUILD2 unnecessary, but it also means that only one partition-level online REORG job can be run against the tablespace at any one time. That posed a problem for users who liked to get parallel reorganization of different partitions of a tablespace by running multiple REORG jobs targeting different partitions of a tablespace at the same time. Suppose, for example, that you want to run parallel online REORGs of partitions 2, 9, and 22 of a partitioned tablespace. In a DB2 Version 8 system, you'd run three separate online REORG jobs concurrently: one for each partition to be reorganized. In a DB2 9 environment, this approach does not work because the first of these jobs to get ahold of the tablespace's NPIs (to REORG these in their entirety) would block out the other jobs. You could not get around this restriction by submitting one online REORG job with a specification of PART 2, 9, 22 because this syntax was invalid. You could specify PART 2:22 and that would result in parallel online REORGs of partitions 2, 9, and 22, but also of all the other tablespace partitions in that range. There is a way, using an undocumented syntax, to run a REPAIR utility job that will flag certain partitions of a tablespace (e.g., 2, 9, and 22) as being in advisory REORG-pending status, after which an online REORG of the tablespace with the SCOPE PENDING option will result in the desired behavior (REORG only partitions 2, 9, and 22, in parallel); however, many DB2 users are understandably reluctant to go with this approach, and it is not one that I'd recommend.

At the IDUG North American Conference this past May, one of the IBM DB2 for z/OS developers mentioned that DB2 10 (now in beta release) would enable parallel online reorganization of non-contiguous partitions of a tablespace (i.e., partitions that aren't next to each other with regard to their sequence in a partitioned tablespace) with a single execution of REORG. That was great news for the future. What's great news NOW is the fix for the DB2 9 APAR I mentioned, PK87762. With this fix applied, DB2 9 REORG will now accept this syntax: PART(2, 9, 22). Not only that, but you can specify both individual partitions and a partition range in the same PART clause, as in PART(2, 9, 20:22). And there's more: now, when a LISTDEF used in an online REORG job references multiple individual partitions belonging to one tablespace, DB2 will execute REORG once for all of the tablespace's partitions referenced in the LISTDEF, versus running one REORG job for each partition in a serial fashion.

BUILD2 made partition-level online REORG not online-enough for some DB2 users, and I was pleased to see it go away with DB2 9. The (now former) inability to get parallel online REORGs of multiple non-contiguous partitions of a tablespace in a DB2 9 environment was seen by more than a few folks as being an irritatingly high price to pay for BUILD2 elimination. That problem has now been addressed. So, if you're planning a migration to DB2 9 and this issue has been a concern for you, cross it off your list. If you're already on DB2 9 and you've been pining for the days of unrestricted parallel partition-level online REORGs, get UK59095 on your system and pine no more.

6 comments:

  1. Hi Robert,
    I have got some doubt on REORG but i dont know where i can post in your blog so i am posting here as it is somewhat relevant.

    I ran REORG nealry for 50+ TS objects , But i did not notice that our DB2 storage group(Volumes) for INDEX components was in short of storage to REBUILD during REORG so the job failed with the below error message. ( Nearly while it was processing 26th among 50 Objects)

    DSNT500I 026 15:08:25.71 DSNUGBAC - RESOURCE UNAVAILABLE
    REASON 00D70014
    TYPE 00000220
    NAME DB82.DSNDBC.DBNAME.INDEXNAME.J0001.A001

    As the primary solution, I then identified Unused Indexes and dropped the same to reclaim the space and almost got 20 percent of the free space from the INDEX storage group. i then restarted the same REORG job with RESTART(CURRENT) phase but the job again failed with same error message as above.

    My question:
    I know that DB2 records entries for the utilties in SYSUTILX.
    If SO, SYSUTILX even records the Storagegroup volume names in specific for the datasets which it were trying to allocate during rebuild and not able to find another volumes which have space after my reclamation?

    My assertion is that , If i would have restarted the job with RESTART (UNLOAD), It would have run again from UNLOAD Phase of 26th object successfully .

    Please advise.

    ReplyDelete
    Replies
    1. I don't know that RESTART(UNLOAD) would have changed anything.

      DB2 wasn't able to build the shadow data set for the index in question. It may have been a space issue, or it may have been that there was enough space but it was so fragmented that DB2 hit the extent limit before running out of space for the data set.

      Certainly one solution would be to add volumes to the SMS storage group, if you have some disk volumes to spare.

      If the underlying table space is partitioned and you are reorganizing multiple partitions in parallel and the table space data sets are allocated in the same SMS storage group as the index data sets, you might get some disk space constraint relief by reorganizing fewer partitions in a single REORG job.

      Additionally, you could look at FREEPAGE and PCTFREE values for the index. Perhaps reducing the value of these parameters would enable DB2 to build the shadow data set in the available space.

      Robert

      Delete
  2. Thanks for the response. I have missed few points to clearly specify the issue.

    After the intial failure, I have reclaimed the space worth of one volume(around 3000 cyls), which was more than enough to rebuild the failed index.
    And I then restarted the job but again the job failed with the same error message.
    DSNT500I 026 15:08:25.71 DSNUGBAC - RESOURCE UNAVAILABLE
    REASON 00D70014
    TYPE 00000220
    NAME DB82.DSNDBC.DBNAME.INDEXNAME.J0001.A001

    To avoid the outage , I then terminated the existing UTIL ID and RERUN the job (Restarted the utility from the beginning) and it went through.

    My question:

    Why the job did not run with the existing utilid though i reclaimed the space? But it went through when i terminated and restarted it again from the first phase. What made it to fail when i restarted the job with the failed utild ID using RESTART(CURRENT)?
    Please advise

    ReplyDelete
    Replies
    1. What was your SHRLEVEL specification for this job? SHRLEVEL REFERENCE? SHRLEVEL CHANGE?

      Did you use a LISTDEF to REORG a list of table spaces?

      Does the index in question use STOGROUP-defined data sets?

      Robert

      Delete
    2. 1) Options were SHRLEVEL REFERENCE,
      2) Yes LISTDEF were used as below
      OPTIONS LISTDEFDD SYSLISTD TEMPLATEDD SYSTEMPL EVENT(ITEMERROR, HALT,WARNING, RC4)
      3)Yes they (Indexes) are defined to use specific set of storage groups.

      Delete
    3. OK, here is what I think might have happened in your situation:
      * The original failure (data set extend failure in building the shadow data set for a non-partitioned index) may well have occurred because of hitting the data set extent limit, versus running out of space.
      * When the utility failed but was not terminated, it retained the data sets with which it had been working, including the one associated with the non-partitioned index the utility was trying to build when it initially failed.
      * When you restarted the utility with RESTART(CURRENT), REORG got going again where it had left off, and again tried to extend the shadow data set for the non-partitioned index, and again ran into the data set extend failure error.
      * When you terminated the utility, the data set associated with the shadow for the non-partitioned index was deleted.
      * When you re-ran the REORG, the shadow data set for the big non-partitioned index was allocated anew, and this time it could be successfully expanded to the necessary size because of the space-freeing action you took (referring to your dropping of unneeded indexes with data sets in the same SMS storage group).

      Robert

      Delete