At first glance, being able to use the variable-blocked spanned (VBS) record format for the LOAD and UNLOAD SYSREC data set may not seem like a big deal. A little background, then: SYSREC is the default DD name for the data set into which data is unloaded from a table via the UNLOAD utility, and it's the default DD name for the data set that holds records that are loaded into a table space through the LOAD utility. Prior to DB2 10, the LOAD or UNLOAD SYSREC data set had to use the variable-blocked record format (RECFM=VB). That's fine in most cases, but often a problem when LOB data is involved. Why? Because the maximum record length for RECFM=VB is 32,760, and as we all know a LOB data value can exceed 32 KB in length. How, then, were LOB values handled with respect to UNLOAD output and LOAD input? Here's how: when a table with a LOB column is unloaded with the UNLOAD utility, the non-LOB data goes into the one SYSREC data set, but each individual LOB data value goes into a separate file (the same is true, in reverse, for the LOAD utility: individual LOB data values are loaded from separate files). The name of the file into which a LOB value is placed (for UNLOAD) or from which it is retrieved (for LOAD) goes into what's called a file reference variable, or FRV, and it's the FRV values that are found in the SYSREC data set, along with the non-LOB data processed by UNLOAD and LOAD.
With regard to the file type used for LOB data UNLOAD and LOAD in a pre-DB2 10 environment, you have choices, but these come with associated trade-offs. One option is to unload LOB data values into (or load LOB data from) members of a PDS or PDSE. The goodness in that alternative is the fact that most DB2 for z/OS people are very familiar with these data set types. The negatives are related to scalability (i.e., accommodating a large amount of LOB data). Specifically:
- A PDS or PDSE is limited to one disk volume.
- The size of a PDS is limited to 65,536 tracks.
- A PDSE can have no more than 524,236 members.
LOB data can also be unloaded to, or loaded from, HFS files (referring to a file system available via the UNIX System Services component of z/OS, also known as USS). This USS file system doesn't have the above-noted space limitations of PDS and PDSE data sets, and it delivers a performance advantage, to boot; however, a lot of mainframe DB2 people are not very familiar with HFS, so there can be a learning curve to deal with when this file system is used for DB2 LOB data. On top of that, for UNLOAD an FRV cannot refer to a sequential file (DSORG=PS), which is the type used for data sets on tape (this is true whether HFS files or members of a PDS or PDSE are used to hold unloaded LOB values). Bummer.
Enter DB2 10, and these limitations and hassles are removed because the variable-blocked spanned record format (RECFM=VBS) is supported for SYSREC data sets (use of such a data set is indicated via the new SPANNED option of the LOAD and UNLOAD utility control statements). Thanks to this enhancement,
- ...a table's LOB and non-LOB data can be unloaded to (or loaded from) the SAME data set! LOB values no longer have to be in separate files!
- ...the UNLOAD output file (or LOAD input file) can be on tape -- not just disk!
- ...the UNLOAD output file (or LOAD input file) can span multiple volumes (thus overcoming a PDS/PDSE restriction)!
Those exclamation points may come across as a little juvenile, but I really am psyched about this DB2 10 feature. Add to the benefits listed above a MAJOR performance boost, especially for smaller LOBs: an IBM test of an UNLOAD of 16,000-byte LOB values showed an 80% reduction in elapsed time when a VBS SYSREC data set was used versus HFS file reference variables, and a 99% elapsed time improvement for a VBS SYSREC data set as compared to PDSE FRVs.
Truly, if you are storing LOB values in a DB2 for z/OS database, or thinking of doing that, DB2 10 is the release for you. As if LOB in-lining (written about, as previously noted, in an entry I posted to this blog last week) and VBS SYSREC data sets weren't enough to convince you of that, consider these additional LOB-related goodies delivered with DB2 10:
- LOAD REPLACE of data in a table containing LOB values is substantially faster with DB2 10 than with prior DB2 releases, thanks to the use of a write operation called format write for LOB table spaces (LOAD previously used format writes only for non-LOB table spaces).
- INSERT of data from a DRDA client into a table with a LOB column can take considerably less time when LOB values are large, because DB2 10's distributed data facility (DDF) does not have to materialize the entire to-be-inserted LOB value before passing to to the database services address space (DDF will materialize up to 2 MB of an incoming LOB value before starting to pass the value to the database manager, and it will continue passing chunks of the value until it's all been received from the client). In addition to reducing elapsed time for network-driven inserts of larger LOBs, this feature can reduce CPU time and virtual storage consumption associated with such insert operations.
- DB2 10 supports online REORG of a LOB table space with SHRLEVEL CHANGE, and it will reclaim disk space (if less is needed, as would be the case if some LOB values had been remove via DELETE operations). Previously, SHRLEVEL CHANGE could not be specified for a REORG of a LOB table space, and the utility would not reclaim space occupied by the LOB table space.
- The new AUX YES option of REORG will cause the utility to reorganize LOB table spaces associated with partitions of a partitioned table space, as these base table space partitions are reorganized. Additionally, when a partitioned table space holding a table with LOB data is reorganized in a DB2 10 environment, REORG can move rows from one partition of the base table space to another -- something not done in prior releases of DB2. This functionality enables REBALANCE to be specified for a REORG of a range-partitioned table space holding a table that contains LOB data, and it also comes into play when a partition-by-growth table space associated with a LOB-containing table is reorganized (in that case, rows could be moved from one partition to another to reestablish the table's clustering sequence).
- DEFINE NO can be specified on the creation of a LOB table space, so that the table space's data set(s) will not be created until data is inserted into the table space (this option was previously ignored when specified for a LOB table space). DEFINE NO can be particularly useful when DB2 objects are being defined as part of the installation of a packaged software application that will not be immediately put to use.
All good stuff. Get to DB2 10 for z/OS (if you're not there already), and reap these LOB-related benefits. If you're not yet storing LOBs in DB2 tables, get to DB2 10 and re-think that situation. More than ever, it's LOB time in mainframe DB2 land.