Monday, September 30, 2013

DB2 for z/OS Hash-Organized Data: an Interesting Performance Story

A lot of DB2 for z/OS people know about the hash mode of organizing data in a table -- something introduced with DB2 10 (I've blogged about this topic multiple times, most recently in an entry I posted a few months ago). The basics of hash-organization of data (this as opposed to traditional cluster-based data organization) are pretty straightforward: you select for a table a hash key (which could be comprised of a single column or a concatenation of multiple columns, as long as each key value is unique), and you tell DB2 to hash-organize the data in the table using the designated key (this can be done with a CREATE TABLE or an ALTER TABLE statement -- in the latter case a subsequent online REORG of the associated table space changes the mode of data organization from cluster-based to hash-based). Thereafter, when a row is inserted into the table, DB2 will determine the target page for the row by running the value of the hash key for the row through a hashing algorithm. The really good part, performance-wise, comes when a row is retrieved from the table. If a SELECT statement includes an "equals" predicate that references the table's unique hash key, DB2 can run that hash key through the aforementioned hashing algorithm to identify the page into which the row was assigned on insert, and voila -- the row is returned to the requesting application process with a single GETPAGE (possibly some rows in a hash-organized table will be placed in the overflow area of the table space at insert if the hash-identified target page is full, but the percentage of such rows in the table should be small if the table space was properly sized). In contrast, if DB2 were using a unique index to access a row then multiple GETPAGE operations would be necessary (e.g., three GETPAGEs for a three-level index, followed by one more GETPAGE for the table space page identified via the row ID found in the index leaf page containing the key value referenced in an "equals" predicate in a query). Because GETPAGEs are a key determinant of the CPU cost of executing a SQL statement, the cost of accessing a row in a hash-organized table will, in some cases, be less than the cost of accessing the same row in a cluster-organized table.

When will a hash-organized table be tops in terms of efficiency of data access? Your initial response to that question might be, "A hash-organized table wins the CPU efficiency game when a query retrieves a single row qualified by an 'equals' predicate that references the table's hash key." That answer will generally be right, but it won't always be right, as illustrated by an interesting situation recently brought to my attention by a DB2 for z/OS DBA. This DBA informed me of a test he ran in a DB2 10 new-function mode environment. The test involved a program that issued a query that targeted a hash-organized table and contained an "equals" predicate referencing the table's hash key. The DBA ran the same program using the same data, with the only difference being cluster-based organization of the data in test B instead of the hash organization used in test A. Guess what? The program's elapsed and CPU times were lower when the target table was cluster-organized. Huh? How could that be?

The key to what at first appears to be a strange result is this: the program executed by the DBA issued the singleton SELECT with the hash key-referencing "equals" predicate in a loop, with the value plugged into the "equals" predicate picked up, for each execution of the query, from a file -- a very common batch scenario in the mainframe DB2 world. On top of that, the file of key values was sorted in a way that pretty much lined up with the clustering sequence of the target table (referring to the case in which the program was executed with a cluster-organized table). This is also very common in DB2 for z/OS systems. With the input file of key values sorted in this way, execution of the program with a cluster-organized table resulted in a sequential page access pattern: each successive table space page accessed was "ahead of" (with respect to physical order) and "nearby" (generally speaking, within 16 pages) of the previously accessed page. Through the mechanism known as sequential detection, DB2 recognized this sequential data access pattern and activated dynamic sequential prefetch for the program, significantly reducing elapsed time, and saving CPU time, as well, relative to the random synchronous read activity attending the execution of the program with the hash-organized table. Additionally, repeated execution of the singleton SELECT with the cluster-organized table allowed DB2 to utilize index look-aside to dramatically reduce index-related GETPAGE activity, thereby largely negating the GETPAGE minimization advantage that would otherwise be seen with access to a hash-organized table (note that, for a batch program that issues many COMMITs, the effectiveness of sequential detection and index look-aside is maximized when the associated package(s) is bound with RELEASE(DEALLOCATE)). Thus, in this case, two old DB2 performance features (I believe that both sequential detection and index look-aside were introduced with DB2 V2.3, in the early 1990s) trumped one very new one (hash-organized data).

Now, this isn't the end of the story. The DBA ran the test again, this time with the input file of key values sorted in a random fashion that didn't line up at all with the sequencing of rows in the cluster-organized table. The results of this second test were the reverse of what was seen the first time around: the performance of the program was better when it executed with the hash-organized table. No surprise there. With sequential detection and index look-aside out of the picture, the one-GETPAGE-per-row aspect of hash-organized data access beat out the many top-to-bottom index probes and random reads of table and index pages that came with access to the cluster-organized table.

So, in weighing whether or not a particular table should be hash- or cluster-organized, do not check to see only that the table is primarily accessed by singleton SELECTs that qualify rows with an "equals" predicate referencing what would be the table's hash key. Consider, as well, where those singleton SELECTs come from. Are they mostly issued by online transactions that retrieve just one row (or a very small number of rows) from the table with each execution, or are they chiefly issued in do-loop fashion by batch programs that pull lots of rows from the table, using files of input keys that can be sorted to match (or highly correlate with) what is (or would be) the table's clustering key? If the latter holds true, the oldie-but-goodie batch performance optimization features, sequential detection and index look-aside, might provide even greater CPU efficiency (and lower elapsed times) that you could get with a hash-organized table. If the singleton SELECTs are, successively speaking, more random in terms of data access, hash-organizing the data could be your best choice for optimal performance. What's true now has been true as long as I've been working with DB2 for z/OS (a long time): good physical database design decisions proceed from knowledge of your data and of how that data is accessed. 'Nuff said.

Wednesday, September 25, 2013

DB2 10 for z/OS: Take Advantage of "Native" SQL User-Defined Functions

Lots and lots of mainframe DB2 people know about the native SQL procedure technology delivered with DB2 9 for z/OS. That was, in my opinion, the most important enhancement of this aspect of DB2 functionality since stored procedures were introduced in the mid-1990s with DB2 for z/OS Version 4 (I've blogged multiple times about native SQL procedures, starting with an entry I posted a few years ago to the blog I maintained while working as an independent DB2 consultant). Many DB2-using organizations around the world are reaping the benefits of native SQL procedure utilization, but plenty of folks have overlooked a related enhancement provided by DB2 10 for z/OS: "native" SQL user-defined functions (UDFs). I have "native" in quotes because officially these UDFs are not known by that designation -- they are simply referred to in the DB2 10 SQL Reference as SQL scalar UDFs and SQL table UDFs; however, they share with native SQL procedures several characteristics that make them a great choice for packaging server-side data manipulation logic, so calling DB2 10 SQL UDFs "native" appeals to me. In this entry, I'll explain why native SQL UDFs (not gonna put quotes around native from here on out) are not only an advance in terms of programming productivity, but also, in some cases, a way to improved application performance versus traditional, "external" UDFs.

First, a little review. UDFs, like stored procedures, provide (as noted) a means of packaging data access logic on the DB2 for z/OS server. One of the main differences between UDFs and stored procedures is the way in which they are invoked. Stored procedures, of course, are invoked via a CALL statement, like this one:


UDFs, on the other hand, are invoked as a result of appearing in an SQL DML statement -- usually a SELECT statement. For example, if I have a simple UDF, called KM_MILES, that converts kilometers to miles, it could be executed by way of a statement like the one below:


This difference in the means of invocation is one reason why a person might choose to package some data manipulation logic in the form of a UDF as opposed to a stored procedure -- sometimes it's preferable to invoke the packaged functionality via a SELECT (or other SQL DML statement), versus a CALL statement (and sometimes the reverse is true, and a stored procedure will be favored over a UDF).

Now, one of the great things about native SQL procedures is that they significantly expanded the population of people who could develop stored procedures in a DB2 for z/OS environment: if you could write SQL, you could develop a native SQL procedure -- you didn't have to know COBOL or Java or one of the other languages used to develop traditional external DB2 stored procedures (external SQL procedures, introduced with DB2 Version 7 for z/OS, similarly expanded the stored procedure development population, but external SQL procedures execute as C language stored procedures, and many organizations prefer the "in-DB2" nature of native SQL procedures). Development-oriented DBAs, in particular, got heavily involved with stored procedure development at numerous sites, thanks to native SQL procedure technology. In the same way, native SQL UDFs extended DB2 for z/OS UDF development capabilities to people who know SQL but don't know COBOL or Java or C or one of the other external UDF languages.

Here, I have to make an acknowledgement. It is true that UDFs could be written in SQL prior to DB2 10 for z/OS; however, there were significant restrictions placed on such UDFs. Basically, in a pre-DB2 10 system the functionality of a SQL UDF was limited to what you could code in the UDF's RETURN statement. And the SQL expression in that RETURN statement couldn't reference a column name. And, it couldn't contain a scalar fullselect (i.e., you couldn't generate the returned value via a SELECT statement). Also, data-changing SQL statements (INSERT, UPDATE, DELETE) were not allowed (you could specify READS SQL DATA in the CREATE FUNCTION statement, but MODIFIES SQL DATA was not an option). If you wanted to declare variables and assign values to same in your SQL UDF, you were out of luck. Logic control SQL statements, such as GOTO, IF, and WHILE? Nope. Oh, and SQL table UDFs, which return a set of rows versus a single value? Forget about it (or, as my friends up New York City way might say, "fuhgeddaboudit").

With DB2 10 (in new-function mode), those restrictions went away. A DB2 10 SQL UDF can call a stored procedure, consume a result set generated by a called stored procedure, declare variables and assign values to those variables, loop through a set of statements, provide condition handlers, and more (and, DB2 10 allows for the coding of row-set-returning table UDFs in SQL). So DB2 10 not only expands the population of people who can develop UDFs (because UDFs can be written in SQL), it also enables SQL UDF developers to code routines that can do all kinds of useful things -- the possibilities have been hugely extended versus the DB2 9 situation. Programming productivity is further enhanced by the fact that DB2 10 SQL UDFs (like native SQL procedures) do not have the external-to-DB2 artifacts and processes associated with external routines -- there is no need for precompile, compile, and linkedit steps; no object libraries; no load libraries; no WLM execution environment (though you can specify a WLM environment to be used for running a SQL UDF in debug mode).

And what about the performance angle? There are several benefits here. For a client-server workload involving access to DB2 via the DDF address space from DRDA requesters, there's the fact that a SQL UDF runs under the task of the process invoking the UDF, as opposed to running under a TCB in a WLM-managed address space. When the application process is a DRDA requester, the task in the z/OS system is an enclave SRB in the DDF address space, and that means substantial zIIP offload for SQL UDF-related processing. Also related to DDF-using applications: the package associated with a SQL UDF can be bound with RELEASE(DEALLOCATE), providing the CPU efficiency boost associated with high-performance DBATs when the UDF is invoked through a SQL statement issued by a DRDA requester.

Another performance benefit has to do with the elimination of "thread-task switching" when SQL UDFs are utilized instead of external UDFs. Here's what I mean by "thread-task switching": when an external UDF is invoked, the DB2 thread of the application process has to be switched from the task of the application process to the task of the external UDF. When the UDF has completed its processing, the DB2 thread is switched back to the task of the UDF-invoking application process. When the UDF is a SQL UDF, there is no "other" task involved, because (as mentioned) the SQL UDF runs under the task of the invoking application process (and runs in the DB2 DBM1 address space); thus, there is no need to switch a DB2 thread from one task to another and back. If a UDF is invoked once in the course of the invoking SQL statement's execution, the lack of thread-task switching in the SQL UDF case may not be noticeable. But what if the UDF is invoked LOTS of times in the course of an SQL statement's execution? I heard recently of an interesting real-world situation of this nature. A DB2 for z/OS DBA told me of a SELECT statement at his site that included a COBOL-coded external UDF in the select-list and in the WHERE clause of the statement. One execution of this statement could drive over 100,000 executions of the external UDF. Statement elapsed time could go to several minutes. Much of that time was UDF time, and the vast majority of UDF elapsed time was shown by a monitor to be TCB wait time. In response to this performance issue, the DBA replaced the COBOL external UDF with a functionally equivalent SQL UDF (the organization is running DB2 10 for z/OS in new-function mode). The result? UDF-related TCB wait time was eliminated, and response time for the query went way down.

Believe me, folks, the ability to code functionally rich, native SQL UDFs is one of the real gems provided with DB2 10 for z/OS. It's a gem that has not received a lot of attention (understandable, given the many "wow" features of DB2 10, such as temporal data support, hash-organized tables, and LOB inlining), but it's one that you shouldn't overlook. Consider how you could put this technology to work in your shop.