Friday, April 22, 2011

You DO Let DB2 for z/OS Allocate Utility Sort Work Data Sets, Don't You?

Some of my blog posts are about DB2 things that are very new (my previous entry described the high-performance DBATs introduced with DB2 10 for z/OS), and some cover DB2 stuff that's been out there for a while. This entry is an example of the latter. Dynamic allocation of the sort work data sets used in the execution of mainframe DB2 utilities is something that goes back at least to DB2 Version 8, and a lot if DBAs have taken advantage of this feature to improve the reliability and performance of their utility jobs. At the same time, it seems that there are a number of DB2 for z/OS people who are kind of confused about dynamic allocation of utility sort work data sets, and are uncertain as to how they can leverage this product capability. Seeing as how dynamic allocation of the data sets in question (and not just dynamic allocation, but DB2-directed allocation -- more on that distinction momentarily) is highly recommended by IBM, my aim today is to clear things up for folks who could use some clarification. By the way, I want to give props to Christian Michel, one of the utilities developers at the IBM lab in Boeblingen, Germany, who helped me get my arms around the topic.

So, a little background: a number of DB2 utilities -- including LOAD, REORG, REBUILD INDEX, and RUNSTATS -- use DFSORT (an IBM product that's a feature of z/OS) to handle required sorting of data records (typically index keys). DFSORT of course needs some disk space to do this work, and some time ago you had to allocate those data sets via DD statements in the JCL of your utility jobs. SORTWKnn is the DD name that generally comes to mind first when one thinks of these data sets, but there are others: SW01WKnn, DATAWKnn, DA01WKnn, etc.). The problem with the JCL-directed allocation of these data sets was that you could have a utility job fail if the space available in the DFSORT temporary data sets were insufficient, or the job might not perform optimally if the wrong number of data sets were allocated (this could impede parallelization of sort operations).

To address these challenges, IBM provided new keywords for DFSORT-using utilities (again, I'm not sure if this was introduced with DB2 V8 or a prior release) that would enable DFSORT to dynamically allocate the temporary data sets that it needed to do the sorting required by a utility job. The new keywords were SORTDEVT, which indicates the device type to be used for data sets dynamically allocated by DFSORT, and SORTNUM, which can be used to specify the number of temporary data sets to be allocated by DFSORT for each sort operation performed in the execution of a DB2 utility (the value of SORTDEVT is usually set to a so-called "esoteric," which is a z/OS installation-defined group of devices, and the common specification is SYSDA, because that means magnetic disk storage and it's an esoteric that is almost always defined on a z/OS system).

Dynamic allocation of utility sort work data sets by DFSORT was a step in the right direction, but still the situation was less than ideal (SORTNUM might be set to a sub-optimal value, potentially impeding utility sort parallelization, or the estimate of space needed for sort work data sets might be substantially off due to inaccurate statistics in the DB2 catalog). To address these challenges, DB2 utilities were enhanced via APARs PK45916 (for DB2 Version 8) and PK41899 (for DB2 9) to enable them to dynamically allocate sort work data sets before invoking DFSORT (the enhancements delivered via these APARs are part of the base functionality of DB2 10). In doing this, the utilities would optimize the number of sort work data sets dynamically allocated for a job (thereby removing the need for a user-specified SORTNUM value). On top of that, sort work space requirements would be more accurately estimated utilizing information in the real-time statistics tables.

That's all good (REAL good), but some people were (and still are) uncertain as to what had to be done to realize the benefits of these important DB2 utility enhancements. That, as much as anything, is what I want to address in this entry. Here's the deal:
  • Make sure that the value of the DB2 ZPARM parameter UTSORTAL is set to YES (note that YES is the default setting for UTSORTAL in a DB2 10 system).
  • Remove SORTNUM specifications from your utility control statements, or set the DB2 ZPARM parameter IGNSORTN to YES to have DB2 ignore any SORTNUM specifications found in utility control statements.
  • Remove ALL of the sort-related DD statements from the JCL of your DB2 utility jobs -- that's anything with a WKnn string in it (SORTWKnn, SW01WKnn, DATAWKnn, DA01WKnn, STATWKnn, and ST01WKnn), along with UTPRINnn and DTPRINnn (exception: keep the UTPRINT and SORTDIAG DD statements in your JCL).
  • Ensure that real-time statistics information is available to the DB2 utilities. If you are still on DB2 V8 or using DB2 9 in Conversion Mode, you need to create the real-time statistics tables if they have not already been defined on your system (starting with DB2 9 in New Function Mode, the real-time statistics tables are part of the DB2 catalog, and you don't have to be concerned with creating them). If you need to set up the real-time statistics tables in your DB2 environment, refer to Appendix I in the DB2 Version 8 Administration Guide (available in PDF form at this url:
Another thing: leave SORTDEVT in your utility control statements, or include a SORTDEVT specification in those statements (referring to the utilities for which SORTDEVT is a valid option: CHECK DATA, CHECK INDEX, CHECK LOB, LOAD, REBUILD INDEX, REORG INDEX, REORG TABLESPACE, and RUNSTATS). You need this for dynamic allocation of sort work data sets, regardless of whether this dynamic allocation is done by DB2 (preferred) or DFSORT.

Finally: some people who have been accustomed to allocating sort work data sets via JCL DD statements might be concerned about loss of control over placement of these data sets in the disk subsystem if they go the dynamic allocation route. After all, SYSDA (again, the typical SORTDEVT specification) is usually very generic ("allocate these on magnetic disk devices"). What do people do to direct dynamically allocated sort work data sets to a particular set of disk volumes? My understanding is that the primary means to this end is the utilization of DFSMS Automatic Class Selection (ACS) routines. These routines would look for the DD names of the sort work data sets being dynamically allocated (as mentioned previously, these contain the string WKnn) and would direct those data sets to an SMS storage group that would be comprised of the volumes you want to be used for the "sort pool." Another option would be to define an esoteric that would include the sort pool volumes and then to use the name of this esoteric instead of SYSDA as the value of the SORTDEVT utility control card option.

OK, so all of this is old news to people who have been taking advantage of DB2-directed dynamic allocation of sort work data sets since the functionality was introduced back in 2008. I'm interested in reaching people who are still allocating DB2 utility sort work data sets the old way. If you're in that group, I'm telling you: the new way (let DB2 allocate the data sets) is the better way -- better for performance, and better for reliability. Give it a shot, if you haven't already. 

1 comment:

  1. Great clarification for newbies (like me)