Here's how this got started: Jorn, an IBM System z Information Management Technical Specialist, had delivered presentations on the DB2 Analytics Accelerator for z/OS to a number of organizations. In so doing, he would describe how the copying of DB2 tables into the Analytics Accelerator was performance-optimized in a couple of ways: 1) data is unloaded from the source DB2 tables in internal format, saving the CPU cycles that would otherwise be consumed in converting the data to external format; and 2) the unloaded DB2 data is transferred to the Analytics Accelerator by way of z/OS UNIX System Services (USS) pipes. [Pipes, also known as FIFO (first-in, first-out) files, are commonly used in UNIX environments. One process can read from a pipe as another process is writing data to the pipe, enabling a data load operation to run simultaneously with the unload operation that provides its input.] Following one of these presentations, someone asked Jorn a question: could data be moved from one DB2 for z/OS table to another in a similar fashion, with no data conversion and without the use of an intermediary data set that would force serialization of the load and unload tasks?
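To make the pipe mechanics concrete: the same FIFO behavior can be demonstrated on any POSIX system (z/OS USS included). The sketch below is purely illustrative -- it is not one of Jorn's jobs, and the path is made up -- but it shows a writer and a reader moving records through a named pipe concurrently, with no intermediate data set landing on disk.

```python
# Illustrative only: a named-pipe (FIFO) round trip, analogous to
# UNLOAD writing while LOAD reads. Path and record contents are made up.
import os
import tempfile
import threading

fifo = os.path.join(tempfile.mkdtemp(), "unload.pipe")
os.mkfifo(fifo)  # create the FIFO -- the USS analogue of DSNTYPE=PIPE

def unload_side():
    # "UNLOAD": writes records to the back of the pipe
    with open(fifo, "w") as pipe:
        for i in range(5):
            pipe.write(f"record-{i}\n")

writer = threading.Thread(target=unload_side)
writer.start()

# "LOAD": reads records from the front of the pipe while the writer
# is still running; nothing is ever materialized on disk.
with open(fifo) as pipe:
    records = pipe.read().splitlines()

writer.join()
os.remove(fifo)
# records now holds ["record-0", "record-1", "record-2", "record-3", "record-4"]
```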
In considering this question, Jorn thought first about the cross-loader function of the DB2 LOAD utility. That could be used to get data from one table to another without the need for an "in-between" data set that would first be the output of an UNLOAD and then the input to a LOAD. The cross-loader, however, can't be used with the FORMAT INTERNAL option, so while it would address the "no intermediate data set" stipulation, it would leave the "no data conversion" requirement unsatisfied.
It then occurred to Jorn that one could utilize, for a table-to-table data move in a DB2 for z/OS context, the same technique employed for copies of data into a DB2 Analytics Accelerator: combine the FORMAT INTERNAL option with a transference of data through a USS pipe. Jorn went looking for examples of this approach, and when he didn't find any that precisely fit the bill, he created his own UNLOAD and LOAD jobs to show how FORMAT INTERNAL and USS pipes can be used to avoid data conversion and the serializing effect of a traditional "in-between" data set. He ran these jobs on a DB2 subsystem that he uses for testing purposes, and they worked as expected. The jobs were submitted at the same time. The UNLOAD process waited for the LOAD process to open the pipe for reading, whereupon it commenced writing unloaded records to the "back" of the pipe, while the LOAD process read records from the "front" of the pipe. This is all done in memory -- there is no physical I/O involved.
Here is Jorn's UNLOAD job:
//UNLD1 EXEC DSNUPROC,SYSTEM=DB2A,
//SYSPUNCH DD DSN=ABC1234.DB2A.CNTL.DSN8D91A.DSN8S91E.PTALL,
//SYSABC DD PATH='/tmp/unload.pipe1',DSNTYPE=PIPE,
//SYSTSPRT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
UNLOAD TABLESPACE DSN06697.TEST UNLDDN(UD)
And here is the control statement for the complementary LOAD job:
//DSNUPROC.SYSIN DD *
LOAD DATA INDDN D3RYKD6Z LOG NO REPLACE
INTO TABLE ABC1234.TESTKOPI
In addition to unloading data from one DB2 for z/OS table and loading the data into another table, Jorn successfully tested some variations on the technique:
- He unloaded data from a table to a USS pipe, and sent that pipe to another system via FTP.
- He sent a file to a USS pipe via FTP, and loaded data into a table from the pipe.
In exploring the use of USS pipes in your DB2 for z/OS environment, you might find the following sources of additional information to be useful:
- DB2 for z/OS APAR PK70269. The fix for this APAR introduced DB2 for z/OS TEMPLATE support for USS files (there are associated PTFs for DB2 for z/OS versions 8 and 9 -- the functionality is part of the DB2 10 base code). The text of this APAR is quite informative.
- The IBM "red book" titled, "DB2 9 for z/OS: Using the Utilities Suite." Section 3.2.8 of this document explains the use of TEMPLATE with USS pipes. Section 7.17 covers unloading and loading data using USS pipes.
- The DB2 for z/OS Utility Guide and Reference contains supporting information in the section on TEMPLATE. Refer to the DB2 9 or the DB2 10 manual, depending on the DB2 release you're running.
We often think of DB2 for z/OS utilities as workhorses, and they are, but they are workhorses that are constantly being enhanced with new functionality, one example of which I've written about here (with, again, a tip of the hat to Jorn Thyssen). Consider how the combination of FORMAT INTERNAL and USS pipes could enhance the performance of UNLOAD and LOAD operations at your site.
Robert, how big was the test table, and how did the UNLOAD/LOAD/pipes scenario compare with Crossloader, from a CPU and elapsed time standpoint?
Also, can you comment on how WLM handles the pacing of the UNLOAD and LOAD so that the USS pipe flows optimally, rather than having periodic surges? How does it compare with MVS BatchPipes?
I don't have the particulars concerning the tests that my colleague Jorn ran. I suspect that the table in question was relatively small - Jorn was interested in getting the JCL and the utility control statements right.
Jorn didn't test the USS pipe scenario versus the cross-loader, because he was interested in using FORMAT INTERNAL, and that's an option that is incompatible with the cross-loader.
I don't have information on optimizing the flow of data through a USS pipe, nor do I have comparative information as it pertains to BatchPipes. If you end up doing some comparative performance testing yourself, I certainly would be interested in knowing of the results.
Robert, thanks for your answers. Nothing in the pipeline, so to speak, for me to test the scenarios in the short term.
Something to keep in mind is that if Crossloader can exploit an access path other than a table space scan to extract 10-20% or less of the rows (based on the V8 Performance Topics Redbook) then it is probably a very strong contender. And if your filtering requires more sophisticated predicates than the ones supported by UNLOAD WHEN, then Crossloader is probably the way to go.
Robert, above, when I referred to the V8 Performance Topics Redbook, I was referring to the 2 test cases where DSNTIAUL with MRF beat both UNLOAD and HPU. I was making the assumption that since both DSNTIAUL and Crossloader open a DB2 cursor and can get similar access paths, it might mean that the DSNTIAUL tests might be indicators of what might be seen with Crossloader.
No doubt, the cross-loader functionality of LOAD can be an excellent performer, particularly when the SELECT that qualifies rows to be loaded from one table into another filters out a large percentage of the source table's rows. In testing UNLOAD/LOAD using USS pipes, my colleague was looking at the situation in which all of a source table's rows are loaded into a target table.
From a tech bulletin I wrote a while back for my site:
2. UNLOAD utility vs. DSNTIAUL sample program:
The DSNTIAUL sample program was introduced with the first release of DB2. The UNLOAD utility was not introduced until DB2 V7. Many existing processes developed prior to V7 use DSNTIAUL. The following documents the advantages (and disadvantages) of the UNLOAD utility.
Faster and cheaper (CPU) than DSNTIAUL.
In the test, V8 DSNTIAUL was approximately 4x more efficient than the V7 version. However, the UNLOAD utility was approximately 9x more efficient than V7 DSNTIAUL and 2x more efficient than the V8 version. Please note that “mileage may vary” due to I/O placement and configuration.
UNLOAD can operate against a single partition or a group of partitions.
UNLOAD can unload data from a non-CONCURRENT COPY image copy.
UNLOAD can unload in delimited format.
UNLOAD can run when table is in ACCESS(UT) mode.
UNLOAD can run SHRLEVEL CHANGE ISO CS, SHRLEVEL CHANGE ISO UR, or SHRLEVEL REFERENCE to meet the currency needs of the application.
UNLOAD can prefetch up to 64 4K pages, versus DSNTIAUL’s maximum of 32 4K pages.
For large unloads, UNLOAD has the ability to restart at the last internal commit point and not re-unload the whole table. DSNTIAUL does not have this facility.
UNLOAD is not a sample program. Sample programs, such as DSNTIAUL, can be modified or removed by IBM with little notification to customers.
The physical order of the output rows matches the physical order in the DB2 table. There is no guarantee that the output is in clustering sequence. To guarantee this, an external sort will have to be performed. With DSNTIAUL, an “ORDER BY” clause can be used to ensure proper order.
UNLOAD output file, SYSREC, is format VB, while DSNTIAUL’s format is FB. If there is post-extract processing on the output (e.g. via a COBOL or SAS program) this has to be taken into account.
UNLOAD cannot process rows from a JOIN that is either explicitly stated via SQL or implicitly via a VIEW. DSNTIAUL can, by using the ‘SQL’ execution parameter.
UNLOAD does not support the auto-incrementing of SYSREC’s data definition name (SYSREC01, SYSREC02,…) unless one uses DB2 TEMPLATEs.
UNLOAD’s use of the WHEN parameter is not capable of utilizing an index to optimize the associated access in support of applying the necessary predicates/filtering. As such, DSNTIAUL can outperform DB2 Unload in cases involving extremely high filtering against large table (tablespaces).
Michael, thank you for posting this information. I have some observations.
1. You wrote, "UNLOAD can operate against a single partition or a group of partitions." This is true, but DSNTIAUL, via the right leading partitioning key predicates, can use a limited partition scan, with the same result, but without the user having to figure out the mapping between logical partition number and physical partition number for the data they want.
2. You wrote, "UNLOAD is not a sample program. Sample programs, such as DSNTIAUL, can be modified or removed by IBM with little notification to customers." This is true, but DSNTIAUL is a sample program, so customers get the source code, which they are free to copy and modify at will, so what IBM does subsequently poses no threat. Though, of course, it is nice when IBM does things like having DSNTIAUL support MRF, which customers were free to use or not.
3. You wrote, "UNLOAD can prefetch up to 64 4K pages, versus DSNTIAUL’s maximum of 32 4K pages." This is true in many circumstances, but the maximum sequential prefetch size for SQL, including DSNTIAUL, can be more than 32 4K pages for very large buffer pools (I don't remember the size).
4. You wrote, "UNLOAD’s use of the WHEN parameter is not capable of utilizing an index . . ."
This is true, but in addition, UNLOAD WHEN only supports a restricted set of simple predicates, which may not meet the user's filtering requirements regardless of access path.
As mentioned in a previous post, the V8 Performance Topics Redbook showed DSNTIAUL MRF beating HPU and UNLOAD in 2 test cases, and suggested that at about 10-20%, or less, of rows extracted, DSNTIAUL could win.
I agree with the points you raise; I dumbed down the document a bit as its audience was my site's programmers. As to point 4, I sort of said it in a reversed way: "As such, DSNTIAUL can outperform DB2 Unload in cases involving extremely high filtering against large table (tablespaces)."
Michael Harper, TD Bank
Michael, I understand. It is often very difficult to write for that audience. It is a difficult struggle to balance length and simplicity against considerations and exceptions.
forgot to sign.... Michael Harper, TD Bank
Robert, your example has LRECL=107 (the row length) on the SYSABC DD, while the templates have an LRECL of 49. I am puzzled by this. I have a table with a row length of 68, and if I harden the FORMAT INTERNAL output to an MVS file, the output has an LRECL of 86 (VB).
Michael Harper, TD Bank
I'll run this by Jorn, and I'll let you know of his response.
Lab measurements show
– 85% CPU & elapsed time reduction on UNLOAD
– 77% elapsed time, 56% CPU reduction on LOAD
using FORMAT INTERNAL
(taken from Haakon Roberts' presentations)
I only experimented with small tables in order to build some working examples I passed on to my customer, so I haven't tested if performance is as good as we claim.
Good catch regarding the LRECL.
What I've found so far is that it looks like the data set will have the same LRECL as if you had unloaded in normal external format, that is,
LRECL = 6 bytes + size of fields in external format
For example, if your table had timestamp(6) fields it will add 26 to LRECL instead of the expected 10 bytes.
I'll have to contact the lab to find out why that is the case.
Update from the lab:
FORMAT INTERNAL does not take into account the shorter length of internal format for, e.g., timestamps.
The first 6 bytes of internal format are a 6-byte data prefix, whereas the first bytes of normal format are a 2-byte OBID. So FORMAT INTERNAL will have an LRECL four bytes longer.
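Putting the lab's explanation into arithmetic: if the LRECL really is a 6-byte prefix plus the sum of the columns' external-format lengths, it could be sketched as below. This is my reading of the comments above, not something from the DB2 documentation, so treat it as an assumption:

```python
def format_internal_lrecl(external_field_lengths):
    """Assumed LRECL of a FORMAT INTERNAL record: a 6-byte data prefix
    plus each field's EXTERNAL-format length (even though the internal
    representation of, e.g., a timestamp is shorter)."""
    return 6 + sum(external_field_lengths)

# A TIMESTAMP(6) column is 26 bytes in external format but only about
# 10 bytes internally; under this reading the LRECL still counts 26.
print(format_internal_lrecl([26]))  # prints 32, not 16
```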
What column types did your LRECL 68 vs 86 example have?
DB2 internal row len is 60 bytes (68 = rec len); includes 1 timestamp and 2 dec(5,1) fields - the rest are char, int or smallint.
DCB of UNLOAD FORMAT INTERNAL sysrec file - as set by UNLOAD command not by JCL - is VB 86 .
When I browse sysrec and turn COLS on data only goes to column 66 - 6 byte header+60 byte row.
Question is "What LRECL do I use on the PIPE JCL, and the PIPE TEMPLATE?"
Example above has LRECL 107 on PIPE JCL, *BUT* LRECL(00000049) on TEMPLATE. This is confusing.
For yucks sake I used 107/49 in a test. Made me popular with sysprog - ABND=04E-00C200F7.
I understand the LRECL 107 on the PIPE JCL as it is the DB2 record length of DSN8S91E, so for my test I should use 68.
But why LRECL(00000049) in the template?
Michael Harper, TD Bank
I am sorry, the 107/49 looks like a typo. The LRECL should be the same throughout the JCL.
Hi Jorn and Robert,
In the comments it is stated that you 'only experimented with small tables in order to build some working examples'. I wondered:
- What you meant by 'small' tables (in terms of KB and records)?
- Whether you have done further analysis on the performance?
I only experimented with very small tables (row length ~ 100 bytes, #rows < 1,000), and unfortunately I have not had any time to do further performance analysis for larger tables.
Works for IBM Denmark. Views are personal.
Here is a simple working JCL.
You need to pre-allocate a USS named pipe (UNIXFILETYPE=FIFO). I allocated '/u/prems/TMP/MYPIPE' in ISHELL. (I like ISHELL since it is in ISPF.)
Submit the LOAD job first, followed by the UNLOAD job.
//PREMUNLD JOB (PLS,81038),CLASS=A,MSGCLASS=H,MSGLEVEL=(1,1),
//UNLD1 EXEC DSNUPROC,SYSTEM=DSNB,
//SYSTSPRT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
UNLOAD TABLESPACE STUD00D2.DSN8S10P
FROM TABLE STUD00.PROJACT
//PREMLOAD JOB (PLS,81038),CLASS=A,MSGCLASS=H,MSGLEVEL=(1,1),
//LOAD1 EXEC DSNUPROC,SYSTEM=DSNB,
//SYSTSPRT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
LOAD DATA INDDN LD LOG NO RESUME NO NOCOPYPEND REPLACE
INTO TABLE PREMS.PROJACT01
Content of /u/prems/TMP directory in ISHELL:
_ Dir .
_ Dir ..
_ FIFO MYPIPE
_ File UNLOAD.PIPE1
Note the filetype of MYPIPE is FIFO.
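The suggested submission order lines up with FIFO open semantics: a process that opens a FIFO for writing blocks until some process opens it for reading. A small illustrative sketch of that blocking behavior (not tied to the jobs above; the path is made up):

```python
# Illustrative only: the write-side open of a FIFO blocks until a
# reader shows up, which is why the "LOAD first" order works cleanly.
import os
import tempfile
import threading
import time

fifo = os.path.join(tempfile.mkdtemp(), "load.pipe")
os.mkfifo(fifo)

events = []

def writer():
    # open() for write blocks here until the FIFO is opened for reading
    with open(fifo, "w") as f:
        events.append("writer-open")
        f.write("record\n")

t = threading.Thread(target=writer)
t.start()
time.sleep(0.3)          # writer is still blocked in open() during this window
assert events == []      # nothing written yet: no reader has opened the pipe
with open(fifo) as f:    # opening the read end releases the blocked writer
    data = f.read()
t.join()
os.remove(fifo)
```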
P S Prem
Works for IBM Singapore. Views are personal.
Thanks very much for sharing this example.
I am loading a table using the cross-loader function between two systems in DB2 for z/OS. Both tables have GENERATED ALWAYS AS IDENTITY columns defined. I would like to know what the value of a GENERATED ALWAYS column will be in the target table: will it be the same as in the source table, or different?
Sorry about the delay in responding.
If the column in the target table is defined with GENERATED ALWAYS then new values will be generated when rows are inserted into the table, regardless of the column's value in the source table. If the target table's column is defined with GENERATED BY DEFAULT, the supplied value (the one from the source table) will be accepted and used.
Hope this helps.