Monday, March 30, 2020

Db2 for z/OS Online LOAD REPLACE - Minimizing Impact on Transactions

A couple of years ago, IBM introduced an important enhancement for the Db2 for z/OS LOAD utility, via the fix for an APAR (the APARs are PI69095 for Db2 12, and PI67793 for Db2 11). This enhancement provided a capability that is popularly known as "online LOAD REPLACE," but is more formally referred to as LOAD REPLACE SHRLEVEL REFERENCE.  I'll provide an overview of this functionality in a moment, but I'll first get to the impetus for my writing this blog entry: recently I've worked with Db2 people at a pair of sites, helping them to answer this question: How can I execute an online LOAD REPLACE with minimal impact on transactions that concurrently access the target table? The answer to that question involves utility control options that will likely be recognized by people who are familiar with IBM Db2 online REORG processing.

OK, the scenario that the online LOAD REPLACE enhancement was designed to address: suppose you have a need to periodically "switch out" data in a table. In other words, you want the set of rows now in table T1 to be replaced by another set of rows. This might be done, for example, on a quarterly basis if table T1 contains data for the most recently completed quarter. You could of course use traditional LOAD REPLACE functionality to get the task done, but with that approach the table will be inaccessible to application programs for the duration of the LOAD REPLACE job, and it the amount of data you're putting into the table via LOAD REPLACE is large, and if several indexes are defined on the target table, that period of table inaccessibility could be unacceptably long.

How about using clone table functionality for this purpose? Yes, that could be an appropriate option. Clone table functionality, introduced with Db2 9 for z/OS, allows you to create a clone of a given table (via a DDL statement of the form ALTER TABLE... ADD CLONE). Once a clone has been created for table Tab1 (I'll call the clone Tab1_CLONE, though there is no need for "clone" to be in the object's name), you can put data in it any way you want - via LOAD or INSERT - with no impact whatsoever on access to Tab1. Once Tab1_CLONE contains the data for which you want the data now in Tab1 to be switched out, you issue the SQL statement EXCHANGE DATA for the target table, and as soon as Db2 can get a drain on that table (which should not take long if transactions accessing the table are committing frequently), the switch happens and the next time a program issues a SQL statement referencing table Tab1, that table name will resolve to the physical object that had been called Tab1_CLONE prior to execution of the EXCHANGE DATA statement (and the physical object that had been called Tab1 prior to execution of the EXCHANGE DATA statement will subsequently be called, in this example, Tab1_CLONE). To put this another way, from a transaction program's perspective table Tab1 contained a certain set of rows a few seconds ago, and now (following execution of the EXCHANGE DATA statement) Tab1 contains a different set of rows.

Problem solved, right? Not necessarily. You see, a lot of Db2 for z/OS DBAs don't much like clone tables, and I get that. The concern boils down to this: once a clone has been created for a certain table, there are a number of things you can't do with the table without first dropping the clone, necessitating a re-creation of the clone afterwards. A prime example of this hassle factor: what would be a simple ALTER of a table becomes a more convoluted process if said table has a clone (I wrote about such a scenario in an entry I posted to this blog some years ago). Trying to avoid clone-related complications, some DBA teams came up with data switch-out processes that involved two different tables and use of the SQL statement RENAME TABLE. The main problem with that approach, aside from complexity, is the fact that RENAME TABLE leads to invalidation of dependent packages. Ugh.

Enter online LOAD REPLACE functionality, which has been described as "clone table functionality without the hassles of clone tables." When you run LOAD for a table using the REPLACE option and SHRLEVEL REFERENCE, the table will be placed in read-only mode for application access (programs that access tables for which a data "switch-out" is periodically required are typically of the data-retrieval variety, anyway), and the input data for the utility will be loaded into "shadow" objects corresponding to the table, its table space and its indexes. When the shadow objects are all loaded, online LOAD REPLACE does what Db2 online REORG does - it goes through the SWITCH phase, after which references to what had been the original table (and table space and indexes) resolve to the "new" original objects (which, prior to the SWITCH phase of the utility, were the shadow objects). The SWITCH phase of online LOAD REPLACE generally completes very quickly.

So there you have it, right? An easy, quick (from an application program's perspective) way to switch out data in a table for other data, right? Well, in fact some contention issues can arise when an online LOAD REPLACE job is executed. Twice over the past few months I have had Db2 for z/OS people contact me with questions about application disruption related to execution of the LOAD utility with REPLACE and SHRLEVEL REFERENCE. Similar points came up in these discussions, and I want to share pertinent information with you in the remainder of this blog entry.

The problem the aforementioned Db2 people faced was this: when an online LOAD REPLACE job was executed, some transactions would time out. The "why" of those timeouts has to do with Db2 claims and drains. Every program process that accesses Db2 for z/OS data has to get a claim (read or write) on every table space or index it accesses. Claims acquired by an application process are released when the process commits. When a Db2 utility requires exclusive access to an object (as a LOAD REPLACE SHRLEVEL REFERENCE job does when it is time to do the data set switch between the "original" and shadow data sets), it requests a drain on that object. Once a drain has been requested, in-flight units of work associated with programs accessing the target object are allowed to run to a commit point, but new units of work that will access the object are prevented from acquiring the necessary claim, effectively holding up their execution. If these held-up transactions have to wait too long on a claim, they will time out. Will that happen because a Db2 lock timeout value is reached? Possibly, but more probably the concern is a timeout value enforced by an application server. In this era of transactions that can be initiated at any time by someone using a smartphone app, application-side timeout values can be quite stringent - 10 seconds, for example, is not unusual (if a transaction accesses a data server such as Db2 for z/OS, and the application server waits for 10 seconds without getting a response, the application server will time the transaction out).

Why might it take more than a second or two for a Db2 utility, such as an online LOAD REPLACE that is ready for its SWITCH phase, to get the drain it needs to proceed? Won't in-flight transactions reach a commit point very quickly, since they tend to be sub-second in terms of elapsed time? Sure, most - maybe very nearly all - of the in-flight transactions accessing a table space or index (or a partition of same, if the object is partitioned) on which a utility wants a drain will get to a commit point very quickly, but all it takes is one transaction that's taking its time before committing to cause other transactions to be held up long enough to be timed out. As long as any ONE claim on a Db2 page set or partition (table space or index) has not been released via commit, the drain requested by a utility such as LOAD REPLACE SHRLEVEL REFERENCE (at SWITCH phase time) will not be acquired. When a not-committing-any-time-soon process is in the picture, something's gotta give. Either the utility will give up it's drain request (or will fail because it waited too long to get the drain), or transactions held up because of the drain will time out.

Here is where, in the case of an online LOAD REPLACE, you have to answer a question: what is your priority? Is it getting data in the target table switched out by time X, no matter what? Or, if you have at least some wiggle room regarding the time at which the data switch-out for the table is to be accomplished (at least a few minutes, let's say), is the priority on getting the data switch-out done with minimal (preferably zero) application impact in the form of transaction time-outs? If the priority is getting the data switch-out done by time X, no matter what, the probability of at least a few transactions timing out is going to be greater. If, on the other hand, the data switch-out for the target table needs to be accomplished not more than 10 minutes (for example) after time X, and transaction time-outs must be minimized (ideally, eliminated), you can provide the online LOAD REPLACE job with some flexibility regarding drain acquisition when it's time for the SWITCH phase.

How is that flexibility implemented? By way of the same options available when you're executing a Db2 online REORG utility. I'm talking about the utility options DRAIN_WAIT, RETRY, and RETRY_DELAY:

  • DRAIN_WAIT - If transaction time-out limits are really stringent (and again, that is very often enforced at the application server level) and time-outs must be avoided if at all possible, you might want to go really low with this value - maybe something like 8 seconds. If you do that, and an online LOAD REPLACE requests a drain on a target object in order to perform a data set switch with the associated shadow object, and the drain has not been acquired within 8 seconds (perhaps because some transaction that can run long but is infrequently executed is accessing the object and hasn't committed), the utility will give up on trying to get the drain. The thinking here is that we may be able to avoid transaction time-outs if transactions will be held up by a utility-driven drain request for no more than 8 seconds. Will giving up on a drain request cause the utility to fail? No - not if the utility can try again (see the next item, below).
  • RETRY - If you make the drain wait limit really short, thereby increasing the chance that the drain won't be acquired within the specified wait time, use RETRY to allow the utility to try again for the drain a little later. A RETRY value of 20, for example, would give the online LOAD REPLACE job quite a few chances to acquire the drain needed for SWITCH processing. How long will the utility wait to try again for the drain, if at first it doesn't succeed? That's determined by the setting of the next item, below.
  • RETRY_DELAY - If the online LOAD REPLACE tries and fails to get the drain it needs to get the SWITCH phase done, trying again immediately for the drain might not make sense. Better to give the process that prevented drain acquisition a little time to complete and get out of the way; but, don't wait too long - we haven't got all day to get the job done. Maybe waiting 20 seconds between successive attempts to get the drain needed for utility completion would make sense (in that case you'd specify RETRY_DELAY 20).

OK, so online LOAD REPLACE is a very nice Db2 for z/OS utility enhancement. That said, it doesn't do what it does using pixie dust. When the shadow table space and indexes are loaded and ready for switching for the corresponding "original" objects, the online LOAD REPLACE job needs to have - briefly - exclusive access to the "original" objects that are perhaps being concurrently accessed by transactions. Exclusive access is achieved via a drain request, and that drain request will hold up incoming transactions that are going to access the target objects. Transactions can be timed out if they are held up too long. Minimizing transaction impact, then, can necessitate a really short drain wait time for the utility. That short drain wait time increases odds of drain acquisition failure, but let the utility try again for a the drain a few times, with a reasonable time between successive drain requests, and you can have your cake and eat it, too: switch-out of data in a table (even involving a whole lot of data) accomplished easily, with minimal - quite possibly zero - time-out impact on transactions concurrently accessing the target table.

160 comments:

  1. I have a question , should the online Load replace task allow new transactions to come into the target table, once the Switch phase is started .When I tried this online Load replace , a new SELECT transaction on the target table was allowed to start and run ,once the online Load utility was in Switch phase , I was expecting that Switch should not be allowing new transaction on target table

    ReplyDelete
    Replies
    1. Per my understanding, that should not have happened - the switch phase of a LOAD REPLACE SHRLEVEL REFERENCE should be preceded by a "drain all." You might want to open a case with IBM Support to investigate this situation.

      Robert

      Delete
    2. Hi Rob, I understood its function correctly now , actually,Though the switch phase had begun , Drain all was not acquired still ,mean time , another thread came into the target table , which the utility allowed .So at that time , Utility had not aquired exclusive access on the target table , so it works correctly . Thank you

      Delete
    3. Certainly - you're welcome.

      Robert

      Delete
  2. Hi Rob,
    Application team wants to match prod table rows as that of sandbox.when I did load of a table,it reported with rows mismatch ,upon checking I found that there are six parent tables to it and among that two tables are already loaded(in random order).let me know how to proceed so that there are no records mismatch

    ReplyDelete
    Replies
    1. If you want to copy production Db2 for z/OS data into a test/dev environment, and you want that data to be referentially complete (i.e., no "orphan" rows in tables that are dependent in RI relationships), you will of course need to copy data from the whole set of referentially-related tables. One way to do that pretty easily is with Db2 tools. If you want to copy referentially related table spaces (and associated indexes) from one Db2 system to another in their entirety, that can be easily and efficiently and quickly done using IBM's Db2 Cloning Tool for z/OS (see https://www.ibm.com/docs/en/db2-cloning-tool/3.2.0?topic=documentation-db2-cloning-tool-overview). If you want to copy a subset of table space rows (as opposed to entire table spaces), in a referentially complete way, from production to dev/test, that can be easily done using IBM Optim Test Data Management (see https://www.ibm.com/docs/en/iotdmfz/11.7?topic=overview-optim-test-data-management-solution-zos).

      If you want to do this on your own, without using tools such as those just mentioned, you could try this approach:
      1) Determine the set of referentially related objects, using the Db2 REPORT utility (see https://www.ibm.com/docs/en/db2-for-zos/12?topic=utilities-report).
      2) Do a redirected recovery of that set of objects (see https://community.ibm.com/community/user/datamanagement/blogs/kate-wheat1/2020/11/05/redirected-recovery-with-recover-u).
      3) Unload from the table spaces for which you did the redirected recovery, and load that data into table spaces in the test/dev environment.
      4) Create the needed indexes on the table spaces in the test/dev system.

      Robert

      Delete
  3. Hello ,One of my load replace job had message stating 30 rows deleted from this table as parent rows were not matching .looks like whichever rows not in sync with that of parent has been deleted right away in same job.Is it possible to prevent that deletion of rows from this table? and how do i collect the deleted rows/can I store the deleted rows for me to look back?

    ReplyDelete
    Replies
    1. Running LOAD with ENFORCE NO instead of the default ENFORCE CONSTRAINTS should prevent the row deletion you saw, but it will leave the target table space in check-pending status. If you stay with ENFORCE CONSTRAINTS, Db2 will place deleted rows in the data set named via the DISCARDDN option (see https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-syntax-options-control-statement).

      Robert

      Delete
    2. If I dint include discardddn explicitly in my job ,Will the job create some default discard dsn implicitly while running & place the deleted rows ? or only if we include discardddn explicitly in job,will i able to collect the deleted rows ?

      Delete
    3. Please refer to the online Db2 12 for z/OS documentation page previously cited (https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-syntax-options-control-statement). On that page, in the description of the DISCARDDN option, you will see this sentence: "If you omit the DISCARDDN option, the utility application program saves discarded records only if a SYSDISC DD statement is in the JCL input."

      The online Db2 for z/OS documentation is a very comprehensive informational resource.

      Robert

      Delete
  4. Hello Robert,
    I was running three loads in same job .1st step was load replace(with nocopypend option),2nd step load resume (without nocopypend) 3rd step-load resume (without nocopypend).After 2nd step,objects were placed in copypending.Next load resume ran on same object which was in copypend status.Isnt it restricted status? I'm surprised how load resume was allowed and many subsequent loads resume also ran fine after that.Any advise.

    ReplyDelete
    Replies
    1. Sorry about the delay in responding.

      My understanding is that COPY-pending status prevents changing data in a table space via SQL (e.g., INSERT, UPDATE, DELETE). The LOAD utility (unless executed with RESUME YES and SHRLEVEL CHANGE) does not use the SQL interface.

      Robert

      Delete
  5. Inspite of having log no replace nocopypend keyword in load card.The job still shows messages that it placed part 1 in cope pending state.Why that happens?
    LOAD DATA INDDN TBL LOG NO REPLACE NOCOPYPEND STATISTICS TABLE(ALL) INTO TABLE DB1.TB1
    message:
    TABLESPACE xxxx PARTITION 1 IS IN COPY PENDING

    ReplyDelete
    Replies
    1. Only two possibilities come to my mind:
      A) The object was already in copy-pending status prior to execution of the LOAD job (NOCOPYPEND does not change that),
      -or-
      B) The table space is defined with the NOT LOGGED attribute (in which case the NOCOPYPEND option would be ignored).

      If neither A nor B is true, I do not know why the object is in a copy-pending state. In that case, you might need to open a case with IBM Support.

      Robert

      Delete
    2. Right.I see that there was an load to this table(logged attribute) without nocopypend keyword which placed in copy pending state.Shouldn't this prevent from any further loads to be happening for same table without clearing copy pend ? I see another load replace with nocopypend was attempted for same table and that still dint remove the copy pend state as you said.

      Delete
    3. Refer to this page in the online Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-syntax-options-control-statement. On that page, look at the description of the LOG option for LOAD. In the description of LOG NO, you will see that when a table space is placed in COPY-pending status, "No table or partition in the table space can be updated by SQL until the restriction is removed." The operative phrase there is, "updated by SQL." COPY-pending status prevents updating of data via SQL, but does not prevent execution of LOAD.

      Robert

      Delete
  6. I get below when I run load using dsnuproc.UTILITY BATCH MEMORY EXECUTION ABENDED, REASON=X'00E40002'.
    load job:
    //JOBLIB DD DSN=DBXX.SDSNLOAD,DISP=SHR
    //*
    //PROCLIB JCLLIB ORDER=xxxxxxx
    //*
    //LOAD0 EXEC DSNUPROC,UID='LOAD1',UTPROC='',SYSTEM='Dxx'
    //SYSIN DD DISP=SHR,DSN=xxxxxxxxxxx
    // *
    Note:even if i add steplib and remove proclib step.Still I get this error.Any advise?

    ReplyDelete
    Replies
    1. Typo-I have tried adding steplib in above jcl (but dint work) and but not made any chnages to proclib.

      Delete
    2. I can't add anything to what is in the "response" information in the documentation on the 00E40002 reason code (https://www.ibm.com/docs/en/db2-for-zos/12?topic=codes-00e40002). If that information is not sufficient to resolve the problem, you might need to open a case with IBM Support.

      Robert

      Delete
  7. I get below error when I attempted to do load.
    DSNU398I DSNURWBF - UNEXPECTED PROCESSING ERROR, REASON=X'00E40318' ON TABLE
    DSNT500I DSNUGBAC - RESOURCE UNAVAILABLE
    REASON 00D70014
    TYPE 00000220
    DSNU017I DSNUGBAC - UTILITY DATA BASE SERVICES MEMORY EXECUTIon ABENDED, REASON=X'00E40347'
    CAUSE=X'00E40318'
    Per reason code-00D70014 it states an attempt to extend dataset failed.I tried to give more space in SYS* DD cards,but still it is failing .How can this be resolved?

    ReplyDelete
    Replies
    1. I do not know why the data set in question could not be extended. You will likely need to open a case with IBM Support to get this issue resolved.

      Robert

      Delete
  8. I have read manuals about override(identity) and ignorefiled(yes) in load card but still I'm not able to conceptually understand the use of it & when I need to use it?
    Any idea?

    ReplyDelete
    Replies
    1. Sometimes you want to load values into a column defined with GENERATED ALWAYS. The OVERRIDE option of LOAD causes the values you want to be loaded into the GENERATED ALWAYS column to be loaded, versus Db2 generating its own values for the column (as GENERATED ALWAYS indicates). Basically what you're overriding is the GENERATED ALWAYS characteristic of the column.

      There is a scenario for using IGNOREFIELDS on this page of the online Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-loading-tables-special-column-types-by-using-generated-statements.

      Robert

      Delete
  9. Robert,
    We have ri in table(parent table with three child table),should we give check data exception for a table listing all child table underneath? or should we give check data exception for child table with corresponding exception table for child table? to remove orpan/mismtached records with that of parent?

    I see both the below options are used in my shop involving ri.So I'm confused which is used for what?
    option 1:
    check data ts " parent tsname" for exception in " parent table name" use "exception table name of allchild tables" deletes yes;
    option 2:
    check data ts " child tsname" for exception in " child table name" use "exception table name of same table" delete yes;

    ReplyDelete
  10. My load job is failed with REASON 00C90099 TYPE 00000210
    .Per manual it states ,partition is full.I'm unloading from prod and loading into dev.PCTFREE & FREEPAGE is 5.The partition structure is same across system.Wondering why would it fail? And what can be the possile ways to resolve this ?

    ReplyDelete
    Replies
    1. The first possibility that comes to my mind is the one mentioned in the description of the 00C90099 reason code (see https://www.ibm.com/docs/en/db2-for-zos/12?topic=codes-00c90099): the source table sees INSERT activity, and the inserts executed since the last REORG or LOAD REPLACE of the source partition largely filled up that partition's free space. In that case, the LOAD into the target partition might exhaust available space by adding the free space specified (via PCTFREE and FREEPAGE) to the data being loaded. If this is indeed what happened, the LOAD of the target partition should succeed if the target table space is defined with 0 free space (i.e., PCTFREE 0 and FREEPAGE 0).

      Robert

      Delete
  11. Pctfree and freepage is same for both source & target.If that is the case,then inserts are accepted/accumulated in source according to free space.when same thing is loaded to target(including inserted records) why dont the freespace accept that(as that is what source has as well as target freespace is available to) ?

    ReplyDelete
    Replies
    1. Think about it. Let's say the source partition size is X (doesn't matter what X is). Suppose the partition is reorganized. After the REORG, the partition has some amount of free space (specified via PCTFREE and/or FREEPAGE). Imagine that inserts of new rows subsequent to the REORG fill up all of the partition's PCTFREE and/or FREEPAGE space, and not only that - the post-REORG inserts almost take space utilization in the partition up to the limit of X - say that 99% of the X amount of space is filled.

      Now, you unload the production partition and attempt to load the data into a test system partition that also has a size limit of X. You're going to need 99% of X just to hold all the data unloaded from the production partition, but Db2 is also going to try to reserve 5% free space in each data-holding page and and leave an empty page after every 5 data-holding pages. All told, providing that specified level of free space is going to require 0.99 * X (to hold the rows unloaded from the production LPAR) plus about 0.2 * X (for the free space requested via the PCTFREE and FREEPAGE specifications). That.s (0.99 + 0.2) * X, or 1.19 * X, and that takes you past the X partition size limit and the LOAD job fails. Either make the test system partition's DSSIZE larger or change the FREEPAGE and PCTFREE values for the test partition LOAD to 0. If you do that latter, you'll just need .99 * X for that data rows, and they will fit within the X partition size limit (just barely, but they will fit).

      Robert

      Delete
  12. My load job failed with jcl error .When I checked the job log,there was no sysprint dd/sysprint log.There was only jesmsglog,jesjcl,jesysmsg.In jesysmsg-I saw there was dataset not found messages(it was all sysrec,syspunch datasets).
    So probably load did not run as datasets dint exist.But why it was shown in jesysmsg rather than sysprint.Sysprint was not even printed.My thought was, it should have printed in sysprint.

    ReplyDelete
  13. I see a column termed as "TIME" in RC query tool .Is that start time of load or end time/completion of load ?HOw sgould we consider it?
    DATABASE OPERATION DATE TIME
    TS1 LOAD REPLACE 11-11-2023 22:58:25

    ReplyDelete
    Replies
    1. RC Query is not an IBM product. You'll have to ask Broadcom/CA (the vendor) about that.

      Robert

      Delete
  14. As you said,rename invalidates the packages.Other thing I noticed is-As per manual,syntax of rename is rename table source tablename to new table name.But when I execute rename statement without table keyword still it executes fine whereas rename index does not execute without index keyword,how is that possible for rename to execute without table keyword and how rename index strictly expects index keyword?

    ReplyDelete
    Replies
    1. First, look at the diagram under the heading, "Syntax," on the Db2 for z/OS online documentation page that describes the RENAME statement (that page is at https://www.ibm.com/docs/en/db2-for-zos/12?topic=statements-rename). You will see that TABLE is "above the line" in the syntax diagram, and INDEX is "below the line." Next, look at the information in the online documentation on how to read SQL statement syntax diagrams (https://www.ibm.com/docs/en/db2-for-zos/12?topic=zos-how-read-syntax-diagrams). You will see that when an item is above the line and another item in the same "stack" is below the line, the above-the-line item is the default specification; Thus TABLE is the default for the RENAME statement, and TABLE is assumed unless you specify INDEX instead.

      Robert

      Delete
    2. Wow! thanks for pointing out that.

      Delete
  15. One of distributed application reported us -SQLCODE=-904, SQLSTATE=57011, SQLERRMC=00C900A3;00000210;xxxxxxx.0000001, DRIVER=4.19.49"
    }
    I checked syscopy and db2 rc tools to know if there is any load happneed to all parent & child tables related to this table.But i dont see any load or alter to table.and no modify recovery too.
    It's very strange to see check pending.

    ReplyDelete
    Replies
    1. LOAD and ALTER TABLE are not the only actions that can cause a page set or partition to go into check-pending status (to my knowledge, MODIFY RECOVERY does not cause check-pending status to be set). The various actions that can cause a page set or partition to go into check-pending status are listed on this page of the online Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/12?topic=states-check-pending-status.

      Robert

      Delete
    2. Thanks for article rob.But none of listed options was performed on this table.Interesting.No alter,no load,no recover,no check lob,no reorg discard, etc .I checked syscopy to find the operations on this table as well the parent table(this table had few parent tables).only ic and quiesce was performed.But onething I noticed is,this parent tables in turn had few parent attached to it,but Im not sure if we need to check whole tree to see if above operations was all done.I just checked only the direct related parent table to this child table.Wondering what would have caused this? would an failed utlity (i think failed utlity will not be recorded in syscopy) cause this kind of error?

      Delete
    3. I don't see that a failed utility would put a partition or page set into check pending status.

      I suppose a "cascading check pending" situation could explain what you're seeing. If table T1 is a parent of T2, and T2 is a parent of T3, and T2's table space is in check-pending status, that is an indication that there might be "orphan rows" in T2. If that's true - if there are rows in T2 that should not be there - then that could mean that there are rows in T3 that should not be there.

      A case opened with IBM Support might help you determine how the partition in question got into check-pending status.

      Robert

      Delete
    4. Thanks for the quick response Rob!
      "I don't see that a failed utility would put a partition or page set into check pending status."-Lets says a load failed during enforce phase for some reason or during any phase .Do you mean to say that it will place in check ending state ?

      Delete
    5. Good point. Check "Table 1" on this page in the Db2 for z/OS online documentation: https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-termination. You will see that in the BUILD phase, the target page set or partition will be placed in check-pending status; thus, if the utility fails in that phase or any subsequent phase, the page set or partition will be in check-pending status. Resetting of check-pending status requires successful completion of the ENFORCE phase of LOAD.

      Robert

      Delete
    6. I was able to catch that.

      Delete
  16. I have a tablespace with "not logged attribute" put in "Icopy status" .Having to find the reason for icopy state,when I was checking syscopy ,I see the an load replace(log no) was run on the table.Would that be the cause of icopy?if so why it places in icopy and what's the implication of it?
    1)Only the tablespace and table is in icopy state.Ideally the index should also be in icopy right?.But here the index was not in icopy.clue?
    2)log keyword in load card and logged/not logged attribute in tablespace .which one will take precedence?
    3)If we include nocopypend keyword in load ,would an icopy be ignored?
    4)what is the prefernce/why do we prefer "not logged attribute" in tablespace.what is the use of it and disadvantage of it.

    Appreciate your help on this

    ReplyDelete
    Replies
    1. I will not be able to directly address your questions - there are too many of them, and they are too general in nature. The best I can do for you is point you to the Db2 for z/OS documentation on informational copy-pending status (https://www.ibm.com/docs/en/db2-for-zos/12?topic=states-informational-copy-pending-status), on the NOT LOGGED attribute of CREATE TABLESPACE (search on "logged" on this page: https://www.ibm.com/docs/en/db2-for-zos/12?topic=statements-create-tablespace), and on the NOCOPYPEND option of LOAD (search on "nocopypend" on this page: https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-syntax-options-control-statement).

      Regarding indexes, I can tell you that image copies are not applicable unless the index is defined with COPY YES, and the default value for an index is COPY NO (search on "copy" on this page: https://www.ibm.com/docs/en/db2-for-zos/12?topic=statements-create-index).

      Robert

      Delete
  17. Rob,
    I ran below check data for an tablespace(which has only 1 table) that was placed in check pending.

    //CHK EXEC PGM=DSNUTILB,PARM='DB2O'
    //UTPRINT DD SYSOUT=*
    //SYSPRINT DD SYSOUT=*
    //SYSIN DD *
    CHECK DATA TABLESPACE DB1.TS1 SCOPE PENDING

    Eventhough I have specified to run on entire tablespace.I'm interested to know why it says entire ts must be checked in output?

    DSNU050I 337 05:47:04.77 DSNUGUTC - CHECK DATA TABLESPACE DB1.TS1
    DSNU726I -DB2R 337 05:47:04.78 DSNUKINP - SCOPE PENDING IS ACTIVE, BUT THE ENTIRE
    TABLE SPACE 'DB1.TS1 ' MUST BE CHECKED

    ReplyDelete
    Replies
    1. I do not know why this message would be issued for a single-table table space. You might need to open a case with IBM Support to get clarification on the meaning of the DSNU726I message.

      Robert

      Delete
  18. What are all the utilities/dml/ddl can be run on a table/tablespaces placed in check pending.I see ic ,quiesce and modify recovery allowed for an ts (placed in check pending).I dont find an page which tells that ddl/dml/db2 utlities allowed on check ending ts.Appreciate for an help.

    ReplyDelete
    Replies
    1. I am not aware that such a page exists in the Db2 for z/OS documentation. Pretty much all that is stated in the doc is that access to data in a table space in check-pending status is "severely restricted." From a DML perspective, I suspect that access is zero. For DDL, I suppose that DROP would work - I don't know that anything else would. As for utilities, I don't know that anything other that REPAIR would work (aside, of course, from the CHECK DATA utility).

      Bottom line: if a table space is in check-pending status, in my mind it is effectively inaccessible from a SQL and a utility perspective. You need to act to get that table space out of check-pending status.

      Robert

      Delete
  19. Hi,I have only run check data with exceptable table for child table but I see few shops run check data with exception for parent table along with listing child tables.Which of the below format is correct?
    CHECK DATA TABLESPACE parent tablespace name
    FOR EXCEPTION
    IN parent table name USE exp1
    IN child 1 table name USE exp2
    IN child 2 table name USE exp3
    DELETE YES

    or

    CHECK DATA TABLESPACE parent tablespace name
    FOR EXCEPTION
    IN child 1 table name USE exp2
    DELETE YES

    ReplyDelete
    Replies
    1. Neither is "correct" as opposed to "incorrect" - they represent two options: you can run CHECK DATA on a single table space, or on several table spaces (which could be in the same set of referentially-related table spaces).

      Robert

      Delete
    2. In my first example,parent and child was on different tablespace.Several tablespaces in same check statment? can you me an example please .

      Delete
    3. Refer to Example 1 on this page of the online Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/12?topic=data-sample-check-control-statements.

      Robert

      Delete
    4. Thanks for the article.As you said,if this in the same set of referentially-related table spaces.Does that mean parent tablespaces or child tablespaces to be all listed in same check statement.As per example 1- looks like any tablespaces can be listed in same job.Is that all parent tablespaces and associated parent tables? or only child tablespaces(or child which is parent of another child) and asscoated child tables?

      Delete
    5. As far as I know you can specify any table spaces you want for an execution of CHECK DATA - doesn't matter if the tables in the specified table spaces are child or parent in the context of a referential constraint.

      Robert

      Delete
  20. Hello Rob,I'm doing an load for table which has index (defined with copy yes) so I guess thats why after load is done it automatically places index in informational copy pending state.How can we prevent this from happening? nocopypend keyword does not seem to be rempving informational copy pending.

    ReplyDelete
    Replies
    1. You mention that, with regard to an index defined with COPY YES going into informational copy-pending status via execution of LOAD for the associated table space, the NOCOPYPEND keyword does not seem to be removing informational copy-pending." Check this page in the online Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-syntax-options-control-statement. In the description of the NOCOPYPEND keyword on that documentation page you will see the following: "A NOCOPYPEND specification does not... change any informational COPY-pending (ICOPY) status for indexes." Here's what that means: when LOAD is executed for a table, an index on the table that is defined with COPY YES will be placed in informational copy-pending (ICOPY) status (as noted on this page of the online Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/12?topic=copy-preparing-recovery-by-using-utility). Specifying NOCOPYPEND for that LOAD job will not change this.

      You ask, "How can we prevent this from happening?" You can prevent this from happening by altering the index with a specification of COPY NO. If you tell Db2, by including COPY YES in an index's definition, that you want to be able to image-copy the index and use those backups for index-recovery purposes, Db2 will tell you, via ICOPY status for the index, when you would be well-advised to create a new image copy of the index (such as following execution of LOAD for the table on which the index is defined). That's the deal. If you are willing to rely on REBUILD INDEX to recover an index when necessary, issue an ALTER for the index with COPY NO, and you will no longer see ICOPY status for that index.

      Robert

      Delete
    2. Thanks for the response

      Delete
  21. I ran a load replace on a table(which is only child to a parent table) and noticed that it is placed in check pending.
    Child tables will be in check pending only when the parent tables data gets loaded,isnt'it ? how come the load to child itself place table in check pending? Im trying to understand the reason behind it

    ReplyDelete
    Replies
    1. Think about the objective of referential integrity: it is to ensure that there will be no "orphan" rows in a child table (an "orphan" row being a row with a foreign key value that has no matching value in the referenced key of the parent table). If new rows are loaded in a parent table and those new rows have no matching rows in the child table, so what? Those are just parent rows with no corresponding child rows - no violation of referential integrity there. On the other hand, if you load rows into a child table with ENFORCE NO specified for the LOAD job, how does Db2 know if you've loaded child table rows with no match in the parent table? Answer: it doesn't know that, so it places the table space in check-pending status so you can execute CHECK DATA to determine if the child-load operation violated referential integrity constraints.

      Refer to this page in the Db2 for z/OS online documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=states-check-pending-status. On that page, you'll see this sentence: "A table space or partition is placed in CHKP status in any of the following situations that might introduce table check violations:" One of the listed situations is this one: "The LOAD utility is run with ENFORCE NO, and check constraints are defined on the table." Could the phrase "check constraints are defined on the table" apply to a parent table? No. A referential integrity constraint is ALWAYS defined only on the child table of a parent-child table relationship.

      Robert

      Delete
    2. But I ran the load without any enforce keyword-So by default as per ibm manual "enforce constraints" is applied.If that is the case,then how come the table I loaded (which is the only child table) gets into check pending?

      Delete
    3. What did the utility control statement look like for this execution of LOAD?

      Robert

      Delete
    4. LOAD DATA INDDN SYSRECAA LOG NO NOCOPYPEND RESUME NO REPLACE
      SORTKEYS SORTDEVT SYSDA SORTNUM 10
      STATISTICS TABLE(ALL) INDEX(ALL) KEYCARD FREQVAL NUMCOLS 10 COUNT 15
      INTO TABLE TBLNAME

      Delete
    5. Hmm. Given that LOAD utility control statement, the default of ENFORCE CONSTRAINTS should have been in effect, and if that were the case then the LOAD job itself should not have (per my understanding) put the table into check-pending status. Is it possible that some other process (e.g., a LOAD REPLACE of a parent table that has this table as a child) caused the table on which you're focused to go into check-pending status? If that did not happen, I am at a loss here. You might need to open a case with IBM Support to get your question answered.

      Robert

      Delete
    6. Yes load did not happen to parent table.

      Delete
    7. I do not know why the child table space was put in check-pending status. If you open a case with IBM Support, those folks might be able to determine what happened.

      Robert

      Delete
  22. Hello,
    On running copy for set of tablespaces with explicit specification of shrlevel change,I get below error.How does the shrlevel change affect/relate to logging attribute of tablespace ?
    DSNUBAII - COPY SHRLEVEL CHANGE OF TABLESPACE xxxxx.xxxx DSNUM 1 IS NOT ALLOWED
    BECAUSE IT HAS A LOGGING ATTRIBUTE OF NOT LOGGED

    ReplyDelete
    Replies
    1. Plese see this page in the online Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=copy-syntax-options-control-statement. On that page, in the description of the SHRLEVEL CHANGE option, you will find the following: "SHRLEVEL CHANGE is not allowed for a table space that is defined as NOT LOGGED unless it is a LOB table space and its base table space has the LOGGED attribute."

      Robert

      Delete
    2. Thanks for pointing that out.I'm trying to understand what would be the concept/logic behind this "not logged attribute and shrlevel change" why does shrlevel change not allowed for not logged ts's?

      Delete
    3. Probably because an image copy made with SHRLEVEL CHANGE in effect could not be relied upon as input for a RECOVER job that would recover a NOT LOGGED table space to a point of consistency. A SHRLEVEL CHANGE image copy is likely to be "fuzzy" (i.e., inconsistent), owing to data changes that could be made while the image copy is being generated. A fuzzy image copy, when restored as part of a RECOVER operation, can be taken to a point of consistency via roll-forward application of logged data changes, but that process is not an option for a NOT LOGGED table space.

      Robert

      Delete
  23. I came across below phrase from IBM manual -https://www.ibm.com/docs/en/db2-for-zos/13?topic=copy-copies-lists-objects

    "If you use COPY with the SHRLEVEL(CHANGE) option:
    If you specify OPTIONS EVENT(ITEMERROR,SKIP):
    Each object in the list is placed in UTRW status and the read claim class is held only while the object is being copied"
    When an object is placed in rw status it allows both read and write already,why does it talk about read claim class separetely?

    ReplyDelete
    Replies
    1. Hey, Doug. The STATE of a Db2 for z/OS object and CLAIM activity related to that object are two different things. Let's say that a database object - for example, a table space - is in a read/write state. That means (as you probably know) that the object can be accessed in a data-read way or in a data-change way by an application process. Are there any claims on the object? Maybe, maybe not. Any time a process accesses a Db2 database object, that process has to get a claim on the object (a read claim if the object will only be accessed for read purposes, a write claim if the process will - or at least, might - change data in the object, perhaps by way of INSERT, UPDATE or DELETE processing). When a process commits, any claims it might have on any database objects are released, and any such claim that the process still needs (e.g., if it's a batch job that issues multiple commits while executing) will have to be re-acquired after that commit.

      What is the purpose of claims? Basically, this has to do with Db2 for z/OS drain processing, which is primarily used by Db2 utilities. Take online REORG, for example. Near the end of an online REORG, we can't switch from the "original" data sets to the shadow data sets (which, after the switch, will be the object's "new original" data sets) if some process is currently "on" one or more of the original data sets - in other words, we're not going to jerk the data set "chair" out from under some process while the process is sitting on that chair. REORG will request a drain lock on the original data sets. That means no new claims will be allowed on the data sets while REORG tries to get that drain. If there are any clains on the data sets, the drain request will have to wait for those claims to be released (via commit), and if the drain has to wait too long then the drain request will time out. If REORG gets the drain it requested, it knows that no process is currently "sitting" on the data sets, and it can swap the old original data sets for the shadow data sets (new original data sets) while processes are waiting to take their seats on the data sets, and when the data set switch is complete the waiting processes get the claims they've requested, and they can take their data set seats and everyone's happy and no one ended up with their backside on the floor because of an unorderly data set switcheroo.

      Hope this helps.

      Robert

      Delete
    2. So Probably it tells that when an object is in utrw state it will allow only for read claim (not for write claim) while it is doing an copy? with below quote:
      "Each object in the list is placed in UTRW status and the read claim class is held only while the object is being copied"

      Delete
    3. UTRW status basically means that a utility is operating on the object, and it is generally available for read and write access from other processes during that time, but there could be some restrictions owing to the fact that the utility is operating on the object (the long form of UTRW is "Utility restrictive state, read-write access allowed."

      The "read claim" in the part of the documentation you quoted refers, I believe, to the read claim of the COPY utility itself: COPY will have a read claim on an object that is being copied, but that read claim will be released when COPY has completed the backup of the object in question.

      Robert

      Delete
    4. Out of curiosity asking this.Will an copy utility only require an read claim to copy an object.I was in an thought that any utility will acquire/require both read and write claim on object that will be accessed/processed.

      Delete
    5. See Table 1 on this page of the Db2 for z/OS online documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=copy-concurrency-compatibility. You will see that when COPY is executed with SHRLEVEL CHANGE (typically the case), COPY gets a read claim on the object being copied - not a write claim.

      Robert

      Delete
  24. I was reading through syscopy catalog table to find out if running repair on object would be recorded in syscopy?.But I see that repair is not listed in ictype but under field TTYPE(which talks about some flashcopy).What is this TTYPE and STYPE? does this mean running repair would make an flashcopy?

    ReplyDelete
    Replies
    1. STYPE and TTYPE are just names of SYSCOPY columns that contain values that in turn indicate various actions taken regarding Db2 for z/OS database objects.

      Yes, the SYSCOPY documentation suggests that REPAIR can lead to execution of a FlashCopy action, but I cannot think of a scenatrio in which that would happen (i.e., I do not know of a situation in which execution of REPAIR would cause a FlashCopy copy of a data set to be taken).

      Robert

      Delete
    2. I don't find repair and check data under ictype column but under different columns like ttype and stype.So does that mean syscopy does not record check data and repair utilities,as it is not under ictype?

      Delete
    3. Please refer to this page in the online Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=tables-syscopy. You will see that both CHECK DATA and REPAIR can be executed in ways that result in a SYSCOPY row being inserted with a value in the ICTYPE column.

      Robert

      Delete
    4. In order to test this ,I ran check data log no on a tablespace ,for which there was no record in syscopy.
      CHECK DATA LOG NO TABLESPACE xxx.xxx
      SCOPE ALL
      SORTDEVT SYSDA SORTNUM 03
      Could this be related db2 versions?.We are on catalog level 509 and function lebel 510.

      Delete
    5. Probably, a SYSCOPY record is not written unless CHECK DATA is executed with DELETE YES LOG NO. In that case, the target tabe space will be placed in COPY-pending status, and records in the log in the range of A to B, where A is the log point at which CHECK DATA started, and B is the log point at which CHECK DATA completed processing, will be available for recovery of data in the table space. This is explained in the description of DELETE YES LOG NO found on this page of the online Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=data-syntax-options-check-control-statement.

      Note: any time you want to see if something has changed between Db2 12 and Db2 13, after reviewing the Db2 13 information on a given page in the online documentation, go to the upper-left part of the page, where you see "change version," and choose 12 from the pull-down menu instead of 13.

      Robert

      Delete
    6. I tested an tablespace with check data log no delete yes as you said ,still syscopy record is not written.Not sure why syscopy entry is not written when the catalog table states option D is for check data log no(neither "check data log no" nor "check data log no delete yes"
      CHECK DATA LOG NO
      TABLESPACE DB2DB.DSNWETS
      FOR EXCEPTION DELETE YES
      IN DB2DB.TASKS USE DB2DB.DSN_TASKSX

      Delete
    7. I believe that a CHECK DATA DELETE YES LOG NO job has to actually delete rows from a table in order to cause a SYSCOPY row to be inserted. If your CHECK DATA DELETE YES LOG NO job resulted in deletion of one or more rows, and you did not get an associated row in the SYSCOPY table, I cannot explain that - you might need to open a case with IBM Support to determine what happened in that case.

      Robert

      Delete
  25. I'm doing copy on tablespace level with shrlevel reference.I had a question "whether copy on tablespace takes care of copy on indexes as well" like REORG.From IBM site-it states"Alternatively, specify the DSNDB01.SYSUTILX, DSNDB06.SYSTSCPY, or DSNDB01.SYSLGRNX table space with indexes over the table space that were defined with the COPY YES attribute"-does this mean it copies index as well along with tablespace?

    ReplyDelete
    Replies
    1. To get an image copy of an index, you have to execute the COPY utility explicitly for the index in question (and, as you have suggested, this is only relevant for an index defined with COPY YES).

      Robert

      Delete
    2. I noticed something with copy yes index.Few days after employee table creation- check data was run on this table as parent table(customer table) was load replaced and eventually check pending on child.so the syscopy entry of child table is only "CREATE" at this moment.But the check data job sysprint has messages " index is in icopy pending state" .As far my understanding ,ICOPY will only be placed for an table due to reorg/load.
      why would ICOPY be placed on an empty table index which is not loaded/reorg'd?

      Delete
    3. It may be that CHECK DATA was executed with DELETE YES and LOG NO. Per the information on the Db2 for z/OS documentation page at https://www.ibm.com/docs/en/db2-for-zos/13?topic=data-syntax-options-check-control-statement (see the description of the LOG NO option that is below the description of the DELETE option), "If any rows are deleted, CHECK DATA places the table space in COPY-pending status and any indexes with the COPY YES attribute in informational COPY-pending status."

      Robert

      Delete
  26. Hello,My reorg job failed stating tablespace left in copy pending status.Reorg did not have log keyword (considering default is yes),table space is defined with logged attribute.Also after the tablespace is created ,the first job that was submitted is reorg .So not sure why tablespace is placed in copy pending ? I also noticed that there is no index for this object.Not sure if that is related to copy pending.

    ReplyDelete
    Replies
    1. When the REORG job failed, what Db2 error message was received?

      What did the REORG utility control statement look like?

      Robert

      Delete
  27. Reorg control statement:

    TEMP COPY1 DSN " "
    UNIT TAPE RETPD 7 VOLCNT(7) DISP(NEW,CATLG,DELETE)

    REORG TABLESPACE DB1.TS1
    COPYDDN COPY1
    UNLDDN UND1
    SORTDEVT SYSDA SORTNUM 03
    UNLOAD CONTINUE
    LONGLOG
    SHRLEVEL CHANGE
    MAXRO 200
    DRAIN_WAIT 30 RETRY 4 RETRY_DELAY 20
    DRAIN DELAY 1200

    ERROR MESSAGE:
    DSNU400I COPY PROCESSED FOR TABLESPACE DB1.TS1 NUMBER OF PAGES =50760 PERCENT OF CHANGED PAGES=100 ELAPSED TIME =00.00.08
    TABLESPACE DB1.TS1 IS COPY PENDING
    UTILITY IS TERMINATED RC=8

    ReplyDelete
    Replies
    1. Hmm. The DSNU400I message just reports some statistics related to the image copy taken for the target table space. If REORG failed then I would have expected more diagnostic information from Db2 for z/OS related to why the REORG failed.

      You may need to open a case with IBM Support to find out what happened. If an online REORG fails, I would not expect the target object to be in any kind of pending status, because the REORG failure would have left the "original" data sets of the object as they were (since online REORG operates on shadow data sets). All I can think of is that the table space was in COPY-pending status prior to execution of the REORG, and that status was not reset by the online REORG because (apparently) the online REORG failed to run to completion. In that case the inline copy generated by REORG would not reset COPY-pending status because the inline copy would apply to the shadow data sets, and those would not become the "new original" data sets unless the online REORG ran successfully to completion.

      Again, that's speculation on my part - IBM Support might be better able to provide you with a definitive answer as to what happened in this case.

      Robert

      Delete
    2. I missed to include one more dsn message(DSNU381I) apart from DSNU400I that appeared in the log.Repasting the complete sysprint message :

      Reorg control statement:

      TEMP COPY1 DSN " "
      UNIT TAPE RETPD 7 VOLCNT(7) DISP(NEW,CATLG,DELETE)

      REORG TABLESPACE DB1.TS1
      COPYDDN COPY1
      UNLDDN UND1
      SORTDEVT SYSDA SORTNUM 03
      UNLOAD CONTINUE
      LONGLOG
      SHRLEVEL CHANGE
      MAXRO 200
      DRAIN_WAIT 30 RETRY 4 RETRY_DELAY 20
      DRAIN DELAY 1200

      ERROR MESSAGE:
      DSNU400I COPY PROCESSED FOR TABLESPACE DB1.TS1 NUMBER OF PAGES =50760 PERCENT OF CHANGED PAGES=100 ELAPSED TIME =00.00.08
      DSNU381I TABLESPACE DB1.TS1 IS COPY PENDING
      UTILITY IS TERMINATED RC=8

      Delete
    3. DSNU381I simply informs you that a table space is in COPY-pending status - it doesn't tell us why the REORG failed with a return code of 8.

      Robert

      Delete
    4. Right.It does not tell the reason for Rc 8.But still reading the list of options for DSNU381I,I'm thinking of the possibilities,Please find my inline questions :
      1)A LOAD or REORG utility with the LOG(NO) option left the associated table space in an unrecoverable state-If an load with log no "would have failed "earlier my reorg attempt to that object wouldnt it be recorded in syscopy? In this case I ran reorg (but without log keyword ,,default is yes,so definitely my failed reorg job would have not placed the object in copy pending rather i guess copy pending would have been placed way before reorg attempt)
      2)An image copy failed because broken pages were detected or an image copy was terminated-wouldnt be recorded in syscopy?
      3)A partial recovery finished successfully-wouldnt be recorded in syscopy?
      4)A MODIFY RECOVERY left the associated table space in an unrecoverable state-wouldnt be recorded in syscopy?

      Delete
    5. 1) If a LOAD or a REORG with LOG NO failed, I would not expect the target table space to be left in COPY-pending status - I would expect a LOAD or REORG executed with LOG NO to leave the target table space in COPY-pending status if the LOAD or REORG job succeeded.

      2) I do not know if a failed COPY would be recorded in SYSCOPY.

      3) Not sure what you mean by a "partial recovery," but I would expect a successful RECOVER to be recorded in SYSCOPY.

      4) As I noted in a previous response: on the page in the Db2 for z/OS documentation at https://www.ibm.com/docs/en/db2-for-zos/13?topic=utilities-modify-recovery, under the heading, "SYSIBM.SYSCOPY records," you will see the following: "In most cases, MODIFY RECOVERY inserts a row into SYSIBM.SYSCOPY to record the RBA or LRSN of the most recently deleted SYSCOPY or SYSLGRNX record." I would expect an entry in SYSCOPY for a MODIFY RECOVERY that removed image copy-related rows for a table space from SYSCOPY.

      Robert

      Delete
  28. I ran modify recovery with delete date (*) which placed a tablespace in copy pending.When I ran reorg on same tablespace (with shrlevel change/shrlevel none) it went fine .Per your suggestion ,if an tablespace is in copy pending before reorg ,it will not reset the status.How come it went fine ?

    ReplyDelete
    Replies
    1. That REORG you ran reset COPY-pending status for the table space because it (the REORG) ran successfully to completion. In your previous comment (assuming thet you're the same "Anonymous" that posted the previous comment), you referenced a REORG job that failed. I would not expect REORG to reset COPY-pending status for a table space if the REORG job does not run successfully to completion.

      Robert

      Delete
    2. Gotcha.Since the table space was placed in copy pending On investigating what caused copy pending,the only syscopy entry was "create" of that tablespace.I was thinking if an modify recovery would have removed entries(causing copy pend).If that is the case,I assume modify recovery would still be recorded in there,not just "create " entry .Is it possible for modify recovery entry itself would get deleted from syscopy?.
      Lets say,If modify recovery(MD1) is run to remove entries older than 15 days.Any modify recovery ran within 15 days would be removed.What about the one ran(MD1) to remove them ,will that be removed in any case?

      Delete
    3. Check this page in the Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=utilities-modify-recovery. On that page, under the heading, "SYSIBM.SYSCOPY records," you will see the following: "In most cases, MODIFY RECOVERY inserts a row into SYSIBM.SYSCOPY to record the RBA or LRSN of the most recently deleted SYSCOPY or SYSLGRNX record." You will also see, in that same section of the page, that there are exceptions to that rule. If there are no SYSCOPY rows for the table space in question other than the row for CREATE, my guess is that the table space was in COPY-pending status because it had never been image-copied. If you know that the table space had been image-copied, you might need to work with IBM Support via a case to see what happened.

      Robert

      Delete
    4. You are right.The only syscopy entry was "create" on that table space no other activity/utility was run on that table space.So you mean that when no other activity/process happens after create,so automatically object will be placed in copy pending ? I was not able to find this info out there in any manual.So I was not sure .

      Delete
    5. I can certainly see how you'd get that impression from my previous response - sorry about that.

      NO, a table space will NOT go into COPY-pending status just because you created it and maybe put some data in it (thank goodness - what a hassle it would be if that were the case).

      Aside from actions related to certain utilities, about the only thing I can think of that would put a table space in COPY-pending status would be altering it from NOT LOGGED to LOGGED. If that doesn't apply in your case, I do not know why the table space was placed in COPY-pending status. As mentioned previously, you might be able to find that out via a case opened with IBM Support.

      Robert

      Delete
    6. I guess you meant -'just because you created it maybe "NOT" put some data in it" is that right'? instead of -' just because you created it and maybe put some data in it'.Is that right ?

      Delete
    7. I meant what I said. If you created a table, and it went into COPY-pending status every time you put new data into it or changed data in it using (for example) INSERT or UPDATE statements, that would be terrible.

      Robert

      Delete
  29. Hello,I see below context in IBM for load utility under "replace option"
    https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-syntax-options-control-statement
    You cannot use REPLACE with the PART integer REPLACE option of INTO TABLE; you must either replace an entire table space by using the REPLACE option or replace a single partition by using the PART integer REPLACE option of INTO TABLE.
    My question is "it states cannot use replace with part int replace at first place and then it also states can replace single part with part INT replace".I'm confused

    ReplyDelete
    Replies
    1. It just means that you cannot specify REPLACE both outside of AND inside of the INTO TABLE part of a LOAD utility control statement. You specify REPLACE either outside of, or inside, the INTO TABLE part of the utility control statement. If you want to replace all the data in a table space, you specify REPLACE outside of the INTO TABLE part of the utility control statement. If you want to replace data in a given partition of a partitioned table, you specify REPLACE (along with the appropriate PART specification) within the INTO TABLE part of the utility control statement. You can see examples of both uses of REPLACE on the documentation page at https://www.ibm.com/docs/en/db2-for-zos/13?topic=load-sample-control-statements.

      Robert

      Delete
  30. Hello,when I submitted an load job with shrlevel reference indexdefer NPI and indexdefer ALL the job went fine.But as IBM manual ,Indexdefer option(ALL/NPI) of load utility states that -"ALL/NPI is valid only if SHRLEVEL NONE is also specified".
    So how come shrlevel reference worked when manual indexdefer is valid only with shrlevel none .
    https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-syntax-options-control-statement

    ReplyDelete
  31. Load card:
    LOAD DATA INDDN SYSREC LOG NO REPLACE NOCOPYPEND SHRLEVEL REFERENCE INDEXDEFER ALL INTO TABLE EMP01.EMPLOYEE.
    this ran fine where this should have per manual stating (indexdefer all/npi is valid with shrlevel none is also specified)

    ReplyDelete
    Replies
    1. Are any indexes defined on the EMP01.EMPLOYEE table? If so, were they built as a result of this execution of LOAD REPLACE with SHRLEVEL REFERENCE?

      Robert

      Delete
    2. Yes there are indexes built on table.I saw these message related index.

      "index build deferred for index EMPIX1 "
      after the reload phase
      "build phase statistics and build phase complete messages"

      Delete
    3. OK, if in fact the indexes on EMP01.EMPLOYEE were NOT built as part of the LOAD REPLACE job, and were in REBUILD-pending status (RBDP) following execution of that job, this would indicate that the documentation is in error regarding the use of INDEXDEFER together with SHRLEVEL REFERENCE. In that case, you could inform the IBM Db2 for z/OS documentation team of the situation as described on this page of the online documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=documentation-how-send-comments.

      Robert

      Delete
    4. Did you expect the indexes to be built due to SHRLEVEL REFERENCE keyword in load card? what was the reason behind index built question ?

      Delete
    5. I'm just saying that the documentation, as you pointed out, states that INDEXDEFER ALL is not valid for a SHRLEVEL REFERENCE REORG. If the indexes were in fact built via the execution of the SHRLEVEL REFERENCE REORG, that would suggest that a better way of phrasing the situation in the documentation is that INDEXDEFER ALL is "ignored" (versus "invalid") for a SHRLEVEL REFFERENCE REORG. If, on the other hand, the indexes were not built via execution of the SHRLEVEL REFERENCE REORG (that is, if the indexes were in RBPD status after completion of the SHRLEVEL REFERENCE REORG), the documentation appears to be incorrect - it would appear in that case that INDEXDEFER ALL is indeed valid for a SHRLEVEL REFERENCE REORG.

      Robert

      Delete
  32. With below load card ,I got an error stating "invalid operand numrecs for keyword load"
    Load data INDDN sysrec replace into table "tablename" shrlevel reference numrecs 121365
    Looks like the NUMRECS keyword cannot be specified as the last specification of load card.
    But I do not see this condition specified under NUMRECS option.
    https://www.ibm.com/docs/en/db2-for-zos/12?topic=load-syntax-options-control-statement

    ReplyDelete
    Replies
    1. The problem could be the SHRLEVEL specification - that's a table space specification and you have it in the table-specification part of your LOAD utility control statement.

      Robert

      Delete
    2. I see NUMRECS specified under INTO -TABLE -SPEC

      Delete
    3. My point is not that NUMRECS is in the wrong place in your LOAD utility contral statement; rather, my point is that SHRLEVEL REFERENCE appears to be in the wrong place in the utility control statement. SHRLEVEL is a table space-related specification, but you have it in the INTO TABLE part of the utility control statement. That could be the reason you received the utility control statement syntax error.

      Robert

      Delete
    4. Understood.We mention/specify table names in load cards.Do we have table space specification for load card ? I only see table specification in the syntax diagram .

      Delete
    5. Look again at the syntax diagram for the LOAD utility (https://www.ibm.com/docs/en/db2-for-zos/13?topic=load-syntax-options-control-statement). Everything you see "along the syntax line" prior to the "INTO-TABLE-spec" is a table space-related specification. That includes the SHRLEVEL specification, which itself is part of the ""resume-spec".

      Robert

      Delete
    6. I was not being very technically precise when I said that LOAD utility options that precede INTO TABLE in the utility syntax diagram are "table space-related." Actually, the options that precede INTO TABLE either relate to the table space associated with the target table (e.g., SHRLEVEL or KEEPDICTIONARY), or to the operation of the utility for a given job (e.g., LOG YES or SORTKEYS or ENFORCE CONSTRAINTS).

      Robert

      Delete
  33. What kind of lock does an "Load log no replace" take on an table/tablespace?
    I see an load job failed with reason code OOC200EA.
    I only see an select of two columns with an single where condition run on same table at same time?
    Will an select cause an load utility to fail?

    ReplyDelete
    Replies
    1. LOAD requires exclusive access to the table space being loaded (or the partition being loaded, if it is a partition-level LOAD), unless SHRLEVEL REFERENCE or SHRLEVEL CHANGE is specified in the utility control statement (these SHRLEVEL options are described on this page in the Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=load-syntax-options-control-statement).

      The 00C200EA reason code (the first 2 characters are 00, not OO) received when the LOAD failed indicates that LOAD could not acquire the required drain lock on the target table space (or partition). A drain lock request will fail if there are any claims on the target table space (or partition) that are not released (by a commit on the part of the claimer) before the drain lock request times out. Any process that accesses a Db2 for z/OS table using SQL will have a read or a write claim on objects accessed (table spaces and/or partitions and maybe indexes), and those claims are held until the SQL-issuing process commits. More information on drains and claims can be found in this blog entry that I wrote while working as an independent consultant, prior to rejoining IBM in 2010: https://catterallconsulting.blogspot.com/2009/03/claims-and-drains-on-mainframe.html.

      Robert

      Delete
    2. When select sql is issued/executing on a table ,it holds "S" lock.For the load to get an exclusive lock,even read lock/"s" lock is not allowed/cannot run?

      Delete
    3. Actually, when a process issues a SELECT that targets a certain table, the lock on the associated table space that will be acquired by the process will almost certainly be an IS lock, as opposed to an S lock.

      That said, this is more about drains and claims than it is about locks. Refer to this page in the Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=load-concurrency-compatibility. In table 1 on that page, you will see that when LOAD is executed (unless executed with SHRLEVEL CHANGE), it will request a DA drain - that means "drain all," and THAT means (as explained in the legend for table 1) "no concurrent SQL access;" so, if LOAD can successfully drain all claimers, there will be no SQL access to data in the table space while LOAD is executing. If the LOAD job cannot get the requested "drain all" for the table space (because one or more processes accessing the table space has not committed in a timely manner, as commit is required for release of claims) within the timeout interval, the LOAD job will fail.

      Robert

      Delete
  34. Hello,I executed an Load with resume no log no format spanned yes enforce constraints
    The load failed with error "TABLESPACE IS NOT EMPTY"
    Resume no is considered as replace .For an replace to happen it can be either empty or non empty tablespace.It shoudn't be an issue i believe.
    Even for RESUME YES still,it will load to an empty as well non empty .Isn't .
    Why would it fail due to "NOT EMPTY" ?

    ReplyDelete
    Replies
    1. RESUME NO is NOT equivalent to REPLACE. Refer to the information about options for the LOAD utility on this page in the Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=load-syntax-options-control-statement. On that page, you will see the following in the description of the RESUME NO option (I have capitalized some words for emphasis): "Loads records INTO AN EMPTY TABLE SPACE. If the table space is not empty, and you have not used REPLACE, a message is issued and the utility job step terminates with a job step condition code of 8." As indicated in this excerpt from the documentation, use of REPLACE is required if RESUME NO is specified and the target table space is not empty.

      Robert

      Delete
  35. Hello,
    I have an requirement to refresh LOB table from Production. Per spanned format, I have unloaded LOB table successfully.
    What will happen to base table after I refresh/load the LOB table in target ?
    Like how we refresh parent and child table in order to make it sync, should I refresh base table as well along with LOB table ?

    ReplyDelete
    Replies
    1. Yes - I would think that you would want to unload the non-LOB and the LOB values from the table in production, and then LOAD that data (non-LOB and LOB values) into the table in the non-production environment. When you unload non-LOB and LOB data from a table using the UNLOAD utility's report for the spanned record format, a given row's non-LOB values in the unload data set will be followed by that row's LOB value or values, and then you'll have the next row's non-LOB values followed by the row's LOB value or values, and so on.

      Robert

      Delete
    2. Just to reclarify, so you are saying that I need to unload both LOB and base table with spanned record format from production and then load both base table as well as LOB table in non prod regions.
      But our aim is to refresh LOB table ,wouldn't that suffice to load LOB table alone .Wouldn't it align along/refresh the pointers of base table (as it just points lob table not store lob data)

      Delete
    3. I am saying that if it were me, I would unload the base table and LOB table space values from production and use those values to load the base table and the associated LOB table space(s) in the non-prod environment. Why not do that? The CPU cost difference between doing that and just LOAD REPLACE-ing the LOB table space should be relatively small, and this approach would ensure consistency between the base table and the associated LOB table space in the non-prod environment. If you are pretty certain that the base table in non-prod contains exactly the same data as the base table in prod, you could try just LOAD REPLACE-in the LOB table space in the non-prod environment. After doing that I'd want to run the CHECK LOB utility for the LOB table space and the CHECK DATA utility for the base table space.

      Robert

      Delete
  36. Hello,

    I'm trying to unload an PBR table for partitions 11:15 from image copy dataset.
    Below is my unload card.
    TEMPLATE UNLDS DSN'E000.&DB..&TS.REC.P&PA.' UNIT TAPE DISP (NEW,CATLG,DELETE)
    UNLOAD TABLESPACE A1.EMP PART 10:15
    FROMCOPY E375.DB1.TS1.D235
    UNLDDN UNLDS
    PUNCHDDN PUNCHDS
    The job only takes unload for one partition(i.e partition 11) and I see below message in log
    DSNU1252I PARTITION PARALLELISM IS NOT ACTIVATED AND THE PARTITION VARIABLE IN THE TEMPLATE DSN WAS REPLACED BY '00000' FOR TABLESPACE TS1
    How can I unload all specified partitions (11:15) in same job using Image copy dataset?

    ReplyDelete
    Replies
    1. With regard to the DSNU1252I message, please refer to the description of the message in the Db2 for z/OS documentation, at https://www.ibm.com/docs/en/db2-for-zos/13?topic=messages-dsnu1252i. You will see there that "partition parallelism is not supported... when FROMCOPY... is specified." You will also see that, when this message is issued, "For the indicated table space, a single output data set is allocated, and the unloaded data is placed in that data set." In other words, you are going to get a single data set for the unloaded data, and that makes a partition designation for that data set meaningless.

      As for why only one partition was unloaded, it may be that the table space has been image-copied at the partition level (quite common), and the specified image copy data set corresponds to only one partition. If you need to unload from multiple partition-level image copy data sets, you use the FROMCOPYDDN option to concatenate those data sets (see the description of FROMCOPYDDN on this page of the Db2 for z/OS documentation: https://www.ibm.com/docs/en/db2-for-zos/13?topic=unload-syntax-options-control-statement. See also https://www.ibm.com/docs/en/db2-for-zos/13?topic=unload-unloading-data-from-image-copy-data-sets, and on that page do a search on "partition" to find relevant information.

      Robert

      Delete
    2. The image copy was taken on tablespace level-E375.DB1.TS1.D235.In order to unload rest of the partitions ,same image copy dataset was used .
      But just that ,eventhough we used that image copy for doing unload of partition 11:15 still,only partition 11 was unloaded .
      the unload dataset that was created out of this job is E000.DB1.TS1.REC.P0011

      Delete
    3. Assuming the input image copy data set is a full image copy, and assuming that partitions 12-15 were non-empty at the time the image copy was taken, I do not know why you are only getting data from partition 11 unloaded. You might need to open a case with IBM Support to find out what is going on here.

      Robert

      Delete
    4. Interestingly,per DSNU message
      AND THE PARTITION VARIABLE IN THE TEMPLATE DSN WAS REPLACED BY '00000' FOR TABLESPACE TS1
      but the sysrec dataset that was created was-E000.DB1.TS1.REC.P0011 and did not have any 00000 in it (per message).
      Am I missing something here?

      Delete
    5. If there are data records for partitions 12, 13, 14 and 15 in the image copy data set from which you have unloaded data, I do not know why this discrepancy has occurred. If it turns out that there are no data records for partitions 12, 13, 14 and 15 in the image copy data set from which you loaded data, it is conceivable that Db2 treated this as a single-partition unload and dismissed the template-related issue concerning multiple partitions.

      Robert

      Delete
    6. Even though this is treated as single-partition unload. Lets say ,it has data records only for partition 11 .
      I have my template card to have partition number .
      'E000.&DB..&TS.REC.P&PA.' Unload dataset created was -E000.DB1.TS1.REC.P0011.
      DSNU message states:
      AND THE PARTITION VARIABLE IN THE TEMPLATE DSN WAS REPLACED BY '00000' .
      We see only 'P0011' in dataset created by this job not '00000'.
      Is this '00000' meant for a different context?

      Delete
    7. Two possibilities occur to me:

      1) The message text is in error, meaning that you have apparently encountered a Db2 for z/OS code bug.

      2) The DSNU1252I message was generated because you had a partition-number qualifier in the name of the output data set (per your template specification), but this is not compatible with an unload of multiple partitions from an image copy data set. I mean to say that this error could have been generated before the UNLOAD utility job ever examined the provided image copy data set. If (and this is a big "if") the image copy data set only held data for one partition, it may be that the previously-generated DSNU1252I message became irrelevant because the provided image copy data set did not in fact hold data for more than one partition.

      If it's not one of these two things, I'm at a loss as to why you're seeing what you're seeing.

      Robert

      Delete
    8. I have an question on FROMCOPYDDN option in unload.
      https://www.ibm.com/docs/en/db2-for-zos/12?topic=unload-unloading-data-from-image-copy-data-sets
      Is this option applicable when:
      1)one tablespace has different image copy datasets. Do we have different image copy datasets created for same tablespace? I have never seen that,for all partition we have same dataset name.
      Under what circumstances we see different datasets?
      2)I have different tablespaces(each tablespaces will have unique dataset names) listed under LISTDEF like
      LISTDEF INCLUDE TABLESPACE A1.B1,A2.B2,A3.B3

      How to design my template for FROMCOPYDDN to pick corresponding image copy datasets for each tablespace?

      Delete
    9. You can certainly create image copies of a partitioned table space at the partition level (see the description of the DSNUM option of copy on the documentation page at https://www.ibm.com/docs/en/db2-for-zos/13?topic=copy-syntax-options-control-statement), and FROMCOPYDDN is intended primarily for use when you have partition-level image copies of partitions of a single partitioned table space, and you want to unload data from the multiple partition-level image copy data sets associated with a single table space into a single output data set.

      The TEMPLATE is for the output data set. If you use multiple partition-level image copies (related to different partitions of a single table space), the data unloaded from those data sets - data sets related to the FROMCOPYDDN specification - will always be unloaded to a single unload data set.

      Robert

      Delete
  37. Hello,
    I see write compression dictionary log status in syscopy after load to an table .
    When I checked for reasons for this status ,I see multiple reason like :
    compress yes attribute in tablespace, data capture changes specified in table, not having keepdictionary in load card.

    The table I'm working has both "compress yes and data capture changes" .Not sure what could have been the reason for this "WRITE.COMP.DICT.LOG".

    ReplyDelete
    Replies
    1. When a Db2 for z/OS table space is defined with COMPRESS YES, Db2 writes data-change records associated with the table space to the log in compressed format. When the table in the table space is a data replication sources (as indicated by DATA CAPTURE CHANGES), the replication tool used will get data-change information for the table from the Db2 log. These records will have to be decompressed before they are provided to the replication tool for propagation to the replication target. By default, a LOAD REPLACE will build a new compression dictionary, based on the new data being loaded into the table. When this happens, Db2 will write a copy of the old compression dictionary to the log if the table being loaded is defined with DATA CAPTURE CHANGES. Why? So that, if necessary, records associated with the table space that were written to the log when the old compression dictionary was in effect can be decompressed (for replication purposes) using that old compression dictionary.

      Robert

      Delete
  38. Which one will cause new compression dictionary to be built ,is it due to "compress yes" attribute or "data capture changes" or due to "load replace" itself.
    Also, you have mentioned ,old dictionary that were written in log can be decompressed for replication purpose.
    How to do that ? Does DB2 take care of that internally or should we do that by any mechanism

    ReplyDelete
    Replies
    1. Why is the old compression dictionary written to the log? This is done as part of a LOAD REPLACE when the target table space is defined with COMPRESS YES and the target table is defined with DATA CAPTURE CHANGES - so, it's related to all three things you mentioned.

      Use of an old compression dictionary, written to the Db2 log, to decompress log records that were compressed by that old dictionary, is handled by Db2 for z/OS. Automatic.

      Robert

      Delete
  39. Interesting ! I did an testing by creating a sample table(objects structure same as table where I observed this write compress dict log status) defined with "COMPRESS YES" in tablespace structure and "DATA CAPTURE CHANGES" in table structure and did an load replace .Below is load card:
    LOAD DATA INDDN SYSREC01 LOG NO REPLACE NOCOPYPEND
    STATISTICE TABLE(ALL) INDEX(ALL) KEYCARD FREQVAL NUMCOLS 6 COUNT 25
    INTO TABLE A.B
    Surprisingly ,I do not see "write compression dict log" in syscopy.
    Structure is all same between them ,why DB2 is printing this status in syscopy?

    ReplyDelete
    Replies
    1. If the new table you created did not contain any data prior to the LOAD REPLACE, it did not have an associated compression dictionary. If the table space does not have a compression dictionary then there will be no write of the compression dictionary to the log when the LOAD REPLACE executes.

      Robert

      Delete
    2. You are right ,I tested this. The entry "write compression dictionary log" gets written only for LOAD REPLACE and not for load resume no/yes. Even though there is data /compression dictionary prior to it .So not sure why this happens only for replace not for resume? Any idea ?

      Also ,I was in an assumption that "compression dictionary log" entry will be recorded in Syscopy only when there is changes/updates in log, Looks like its not so. Even if we execute "load replace" multiple times within a span of seconds (and there is no occurrence of inserts/deletes/updates between those executions) still it writes compression dictionary log status in syscopy.

      Delete
    3. The compression dictionary for a table space (or a partition) will be written to the log if the dictionary is rebuilt. LOAD with REPLACE will, by default, rebuild a compression dictionary for a table space or a partition. I believe that when there is already data in a table space and a compression dictionary exists, a LOAD with RESUME YES will retain the dictionary. If you want to run a LOAD REPLACE without rebuilding an existing compression dictionary, specify KEEPDICTIONARY in the utility control statement.

      Robert

      Delete