Saturday, February 11, 2012

Got LOBs? Get DB2 10 for z/OS (Part 2)

So, last week I posted a blog entry describing one of the two really important (in my opinion) DB2 10 for z/OS enhancements related to LOB (large object) data management -- that being the ability to "in-line" a portion (or even all) of a LOB column's data values in a base table space, alongside the associated table's non-LOB data values (this as opposed to having to store all of every LOB value in a LOB table space that -- while logically transparent -- is physically distinct from a table's base table space). This week, I'll cover the other of my two favorite DB2 10 LOB-related enhancements: support for the variable-blocked spanned record format for the SYSREC data set of LOAD and UNLOAD utility jobs. I'll also provide some information about other improvements in LOB data management delivered with DB2 10. [And before going further, I'll give props here to Jeffrey Berger, a member of IBM's DB2 for z/OS development team, who's helped me to build up my knowledge of LOBs.]

At first glance, being able to use the variable-blocked spanned (VBS) record format for the LOAD and UNLOAD SYSREC data set may not seem like a big deal. A little background, then: SYSREC is the default DD name for the data set into which data is unloaded from a table via the UNLOAD utility, and it's the default DD name for the data set that holds records that are loaded into a table space through the LOAD utility. Prior to DB2 10, the LOAD or UNLOAD SYSREC data set had to use the variable-blocked record format (RECFM=VB). That's fine in most cases, but often a problem when LOB data is involved. Why? Because the maximum record length for RECFM=VB is 32,760, and as we all know a LOB data value can exceed 32 KB in length. How, then, were LOB values handled with respect to UNLOAD output and LOAD input? Here's how: when a table with a LOB column is unloaded with the UNLOAD utility, the non-LOB data goes into the one SYSREC data set, but each individual LOB data value goes into a separate file (the same is true, in reverse, for the LOAD utility: individual LOB data values are loaded from separate files). The name of the file into which a LOB value is placed (for UNLOAD) or from which it is retrieved (for LOAD) goes into what's called a file reference variable, or FRV, and it's the FRV values that are found in the SYSREC data set, along with the non-LOB data processed by UNLOAD and LOAD.

With regard to the file type used for LOB data UNLOAD and LOAD in a pre-DB2 10 environment, you have choices, but these come with associated trade-offs. One option is to unload LOB data values into (or load LOB data from) members of a PDS or PDSE. The goodness in that alternative is the fact that most DB2 for z/OS people are very familiar with these data set types. The negatives are related to scalability (i.e., accommodating a large amount of LOB data). Specifically:
  • A PDS or PDSE is limited to one disk volume.
  • The size of a PDS is limited to 65,536 tracks.
  • A PDSE can have no more than 524,236 members.

LOB data can also be unloaded to, or loaded from, HFS files (referring to a file system available via the UNIX System Services component of z/OS, also known as USS). This USS file system doesn't have the above-noted space limitations of PDS and PDSE data sets, and it delivers a performance advantage, to boot; however, a lot of mainframe DB2 people are not very familiar with HFS, so there can be a learning curve to deal with when this file system is used for DB2 LOB data. On top of that, for UNLOAD an FRV cannot refer to a sequential file (DSORG=PS), which is the type used for data sets on tape (this is true whether HFS files or members of a PDS or PDSE are used to hold unloaded LOB values). Bummer.

Enter DB2 10, and these limitations and hassles are removed because the variable-blocked spanned record format (RECFM=VBS) is supported for SYSREC data sets (use of such a data set is indicated via the new SPANNED option of the LOAD and UNLOAD utility control statements). Thanks to this enhancement,
  • ...a table's LOB and non-LOB data can be unloaded to (or loaded from) the SAME data set! LOB values no longer have to be in separate files!
  • ...the UNLOAD output file (or LOAD input file) can be on tape -- not just disk!
  • ...the UNLOAD output file (or LOAD input file) can span multiple volumes (thus overcoming a PDS/PDSE restriction)!

Those exclamation points may come across as a little juvenile, but I really am psyched about this DB2 10 feature. Add to the benefits listed above a MAJOR performance boost, especially for smaller LOBs: an IBM test of an UNLOAD of 16,000-byte LOB values showed an 80% reduction in elapsed time when a VBS SYSREC data set was used versus HFS file reference variables, and a 99% elapsed time improvement for a VBS SYSREC data set as compared to PDSE FRVs.

Truly, if you are storing LOB values in a DB2 for z/OS database, or thinking of doing that, DB2 10 is the release for you. As if LOB in-lining (written about, as previously noted, in an entry I posted to this blog last week) and VBS SYSREC data sets weren't enough to convince you of that, consider these additional LOB-related goodies delivered with DB2 10:
  • LOAD REPLACE of data in a table containing LOB values is substantially faster with DB2 10 than with prior DB2 releases, thanks to the use of a write operation called format write for LOB table spaces (LOAD previously used format writes only for non-LOB table spaces).
  • INSERT of data from a DRDA client into a table with a LOB column can take considerably less time when LOB values are large, because DB2 10's distributed data facility (DDF) does not have to materialize the entire to-be-inserted LOB value before passing to to the database services address space (DDF will materialize up to 2 MB of an incoming LOB value before starting to pass the value to the database manager, and it will continue passing chunks of the value until it's all been received from the client). In addition to reducing elapsed time for network-driven inserts of larger LOBs, this feature can reduce CPU time and virtual storage consumption associated with such insert operations.
  • DB2 10 supports online REORG of a LOB table space with SHRLEVEL CHANGE, and it will reclaim disk space (if less is needed, as would be the case if some LOB values had been remove via DELETE operations). Previously, SHRLEVEL CHANGE could not be specified for a REORG of a LOB table space, and the utility would not reclaim space occupied by the LOB table space.
  • The new AUX YES option of REORG will cause the utility to reorganize LOB table spaces associated with partitions of a partitioned table space, as these base table space partitions are reorganized. Additionally, when a partitioned table space holding a table with LOB data is reorganized in a DB2 10 environment, REORG can move rows from one partition of the base table space to another -- something not done in prior releases of DB2. This functionality enables REBALANCE to be specified for a REORG of a range-partitioned table space holding a table that contains LOB data, and it also comes into play when a partition-by-growth table space associated with a LOB-containing table is reorganized (in that case, rows could be moved from one partition to another to reestablish the table's clustering sequence).
  • DEFINE NO can be specified on the creation of a LOB table space, so that the table space's data set(s) will not be created until data is inserted into the table space (this option was previously ignored when specified for a LOB table space). DEFINE NO can be particularly useful when DB2 objects are being defined as part of the installation of a packaged software application that will not be immediately put to use.

All good stuff. Get to DB2 10 for z/OS (if you're not there already), and reap these LOB-related benefits. If you're not yet storing LOBs in DB2 tables, get to DB2 10 and re-think that situation. More than ever, it's LOB time in mainframe DB2 land.

34 comments:

  1. Another advantage of VBS vs. VB. When using UNLOAD with VB to move LOB data from one environment to another (say Dev to PAT) as you mentioned above the DSNAME of the LOB unload file is recorded in the SYSREC unload (in lieu of te LOB column value). This can be a problem if the HLQ is not valid on the target environment. Using VBS eliminates this.

    Michael Harper/TD Bank

    ReplyDelete
    Replies
    1. Good point, Michael -- I had not thought of that.

      Robert

      Delete
  2. Question: Does TEMPLATE support VBS? http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=%2Fcom.ibm.db2z10.doc.sqlref%2Fsrc%2Ftpc%2Fdb2z_refs2builtinsessionvars.htm seems to indicate "no".

    Michael Harper/TD Bank

    ReplyDelete
    Replies
    1. Whoops, wrong URL - http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=%2Fcom.ibm.db2z10.doc.ugref%2Fsrc%2Ftpc%2Fdb2z_utl_template.htm

      Delete
    2. Michael, with regard to the question of whether or not VBS is supported when one uses a template for the unload data set of an UNLOAD utility execution, you indicate that the DB2 10 Utility Guide and Reference "seems to indicate 'no'. In looking over that information, I don't reach the same conclusion. Could you point me to the specific phrase in the DB2 10 utility doc that "seems to indicate 'no'" with respect to VBS support when a template is used for the unload data set of an UNLOAD job?

      Robert

      Delete
    3. If you go to the syntax diagram the valid values for RECFM are listed as F, FB, V, VB - no VBS. Sorry I cannot actually verify this as I won't have access to a V10 NFM environment for another month or 2 - they are still migrating CM across all the subsystems.

      Michael Harper/TD Bank

      Delete
  3. Shouldn't be a problem for you, Michael. First, I believe that RECFM only needs to be specified for a TEMPLATE if you will be using BatchPipes or a z/OS Unix System Services file. Second, if you specify SPANNED YES on the UNLOAD utility control statement then UNLOAD will ignore the RECFM attribute of the unload data set.

    Robert

    ReplyDelete
  4. UNLOAD TABLESPACE TESTDB1.CLOBBASE SPANNED YES

    INVALID OPERAND 'SPANNED' FOR KEYWORD 'UNLOAD'

    As soon as I used SPANNED, it would not process at all :(

    Frustrated as smaller LOB's work fine without SPANNED.

    ReplyDelete
    Replies
    1. Is the DB2 for z/OS subsystem in question at the DB2 10 level, running in new-function mode? If so, what does the complete utility control statement look like, and what, if any, message (of the form "DSNUnnnnI") was issued when the job was submitted?

      Delete
    2. CATALOG LEVEL(101) MODE(CM9 )

      I thought about this last night, as I was assured it was V10 but I believe this is NOT new-function mode, but Compatibility or Conversion mode - from V9

      Thanks for the tip, looks like I won't be spanning any time soon.

      Peter.

      Delete
    3. You've got it, Peter. CM9 does indeed indicate that the DB2 10 for z/OS subsystem is running in conversion mode, having been migrated from a DB2 9 environment. UNLOAD and LOAD support for the variable-blocked spanned record format for the SYSREC data set (i.e., the unload-to data set for UNLOAD, or the load-from data set for LOAD) is available in DB2 10 for z/OS when the subsystem is running in new-function mode (NFM).

      Robert

      Delete
  5. A DBA at a site that recently went to new-function mode in a DB2 10 for z/OS environment experimented with the new SPANNED YES option of the UNLOAD utility. This keyword is required (as noted in the blog entry above) if you want to put the unloaded LOB and non-LOB data from a table into a single output data set (versus having to unload each LOB value into a separate HFS file or PDS member, as was formerly the case). The thing is, just including SPANNED YES in the UNLOAD utility control statement is not sufficient to get the desired behavior. My DBA friend found that out, and suggested that I highlight this information (the fact that SPANNED YES alone is not sufficient for unloading LOB and non-LOB data from a table into a single output data set is noted in the DB2 10 documentation, but it's something that one could overlook).

    So, here's the scoop: to unload LOB and non-LOB data from a table into a single output data set, specify SPANNED YES in the UNLOAD utility control statement. ADDITIONALLY, provide a field specification list in the control statement (i.e., a FROM TABLE statement with a list, in parentheses, of the columns of the table that are to be unloaded). ADDITIONALLY, in that field specification list, make sure that any LOB columns in the table (and/or XML columns, if applicable) are listed at the END of the field specification list, regardless of their position in the logical ordering of the table's columns (the logical order being the order in the CREATE TABLE statement). ADDITIONALLY, do NOT specify length and POSITION for LOB columns (and XML columns, if applicable).

    So, with SPANNED YES, a field specification list, LOB columns at the end of the field specification list, and no length and POSITION specifications for LOB columns in the field specification list, you should be good to go.

    ReplyDelete
  6. Nothing like a good example that works (in v10):

    //SYSIN DD *
    UNLOAD TABLESPACE CASTERD2.BOTLDATS
    FROM TABLE DCWF003P.BOTLDATA
    (
    REPORTID VARCHAR(17),
    COUNTER VARCHAR(4),
    DATE_ADDED DATE EXTRNAL,
    REPORT BLOB
    )
    SPANNED YES
    SHRLEVEL CHANGE;

    Just make sure your lob colum(s) are LAST on the list.

    ReplyDelete
    Replies
    1. Thanks for the example, Dale. I imagine folks will indeed find it to be helpful.

      Robert

      Delete
  7. Hi All, quick question, what's the size limit of an unloaded record with VBS? I'm getting an error trying to unload a table with 2 CLOB columns:

    "DSNU1231I -DB2U 182 16:46:23.89 DSNUULVA - LENGTH OF OUTPUT RECORD IS TOO LONG FOR TABLE NACS03.NABRLGTB"

    I Googled around and read the IBM doc but can't seem to find what I'm doing wrong.

    Thanks for any help!

    ReplyDelete
    Replies
    1. Mike, what does the utility control statement look like for that UNLOAD job that's giving you the error? Also, what are the definitions of the two CLOB columns in the table?

      Robert

      Delete
  8. Has anyone been able to successfully transfer a VBS SPANNED dataset to another system via a Managed File Transfer software product? For big shops that don't have shared DASD or shared Tape, this is a roadblock we haven't been able to get past.

    ReplyDelete
    Replies
    1. Did you run into a problem in trying to transfer such a file in that way? If so, what was the nature of the problem?

      Robert

      Delete
    2. We most certainly have run into that problem. Everything outside of DB2 gets a 002 abend when trying to utilize that type of file.

      Delete
    3. Could you provide a bit more detail? Is the "002" to which you refer an S002 abend? If so, was there anything else, like a reason code, that was part of the abend message? What process triggered the abend? A LOAD job using, for SYSREC, the output of an UNLOAD with SPANNED YES that was sent from one system to another via file transfer? Or, did the file transfer operation itself trigger the abend? And, what was used to accomplish the file transfer?

      If you'd like, you can send this information in an e-mail to rfcatterall@gmail.com.

      Robert

      Delete
    4. We have a file created out of an UNLOAD with SPANNED YES that we are trying to transfer from system A to system B. Any attempt to transfer the file results in a 002-04 abend in the file transfer software.

      The ABEND occurs when the logical record size exceeds 32K. The records appear to be of valid construction, the RDW, SDW and the segment control codes are all in order.

      What methods are others using to transfer these types of files between systems?

      Delete
    5. I believe that IBM Sterling Connect:Direct for z/OS would be able to transmit a file of this nature (the documentation for this product can be accessed online at http://www-01.ibm.com/support/knowledgecenter/SSFGBN/cd_zos_welcome.html).

      Another option would be to use DFSMSdss to create a dump of the VBS-format file and transmit it to another system. You would have to terse the backup file before transmitting it, because the format of the output dump data set would be RECFM=U (undefined record format). Per a colleague of mine in DFSMSdss development, here are the steps involved (I'll refer to your original file as "file.vbs"):

      1) Dump the file.vbs (preferably via logical data set dump) to file.backup.
      2) Terse file.backup -> file.backup.trs.
      3) Transmit file.backup.trs.
      4) Unterse file.backup.trs -> file.backup.
      5) Restore file.vbs from file.backup.

      Robert

      Delete
    6. Robert,
      Thank you for your thoughts and time on this issue. We'll see if we can make some progress.

      Delete
  9. Hi Robert,

    Thank you for all the terrific articles over the years.

    Can someone please post a working sample of an LOB Reorg using IBM or BMC Reorg.
    My syntax keeps failing after the Unload
    We are V10 NFM.

    Thanks for the help



    ReplyDelete
    Replies
    1. If it's an IBM REORG that's failing, what does your utility control statement look like? What error message are you getting with the failure?

      Robert

      Delete
  10. Hi All,

    one question from my side, I am trying to UNLOAD one CLOB column data, i tried by placing RECFM as VBS for SYSREC and i can do that succesfully.And one more case where I need to pass delimiter in my job so that the same data can be uploaded to Oracle database. But right now when I pass delimiter I am getting error like lenth of output record is too long. Can we able to pass Delimiter with RECFM as VBS?



    I

    ReplyDelete
    Replies
    1. Hi Nani, you got any solution for unloading lob tb in delimited format?

      Delete
  11. Hi Robert, I can't unload my lob table with spanned yes in a delimited format (I need to load in luw) as delimited is not compatible with spanned yes. Any other option do I have to unload the lob's in a delimited format.

    ReplyDelete
    Replies
    1. I am not aware of an option that would allow you to unload LOB data in a delimited format (an exception to that rule, I believe, pertains to smaller LOBs - those being LOB values small enough, size-wise, so that the total record length does not exceed 32 KB). I believe that you will need to unload the LOB values into separate files; then, on the DB2 for LUW side, you could use LOAD with the lobsinfile option.

      Robert

      Delete
  12. this blog entry was very helpful and the comment discussion really helped me out. thanks Rob and community! :)

    -Terry

    ReplyDelete
  13. Hi,

    Can this LOB Unload file in VBS format be used in a system outside Mainframe to load to a Hadoop database?

    ReplyDelete
    Replies
    1. I don't see why not (and apologies for the delay in responding). The process that accomplishes the load would of course need to support variable-block spanned records (i.e., multiple physical records across which a single very large data value is written).

      You might also consider using a data replication tool to populate a Hadoop target with data (including LOB data) sources from a Db2 for z/OS system. IBM InfoSphere Data Replication supports Hadoop targets (and supports replication of Db2 LOB values). More information is available at https://www.ibm.com/products/infosphere-data-replication.

      Robert

      Delete
  14. Thanks for all the insights Rob, Could you please let us know how to browse/view the dataset with RECFM=VBS

    ReplyDelete
    Replies
    1. I'm afraid I'm not going to be a help to you regarding this question. I know how to view LOB data that is stored in a Db2 for z/OS table, but I can't tell you off the top of my head how to view LOB data outside of Db2, when it is in a file with VBS format. I know that VBS data sets are not supported for viewing via ISPF. If the LOB values are actually not too long, it is conceivable that a VBS data set holding LOB values could be converted to VB format, and then you should be able to view the contents. Other than that, I don't have any suggestions.

      Robert

      Delete