Wednesday, May 15, 2013

DB2 for z/OS Performance Tuning: the Rising Tide and the Home Run

I've been working with DB2 for z/OS for about 25 years. During most of that time, I've been actively engaged in performance tuning work -- specifically, performance tuning as it pertains to the CPU and elapsed time associated with application access to data in DB2-managed databases. I've learned that the best results, in terms of efficiency and throughput gains, are achieved via a two-pronged approach that I conceptualize as a rising tide and home runs. In this blog entry, I'll explain these concepts and try to provide you with a road map for achieving your own DB2 for z/OS performance tuning successes.

The rising tide

Here in the USA (and perhaps in some other countries), there's an old saying: "A rising tide lifts all boats." The DB2 tuning actions that I think of as being in the "rising tide" category are those that positively affect the performance of all SQL statements (or at least many statements) that execute in the target DB2 subsystem. Examples of these actions include the following (and I'm providing here a few links to blog posts and articles of mine that provide additional information):

"Rising tide" DB2 performance tuning actions are attractive for several reasons. Chief among these:
  1. They tend to have a wide-ranging impact on the performance of DB2 data access operations. That's the "lifts all boats" effect to which I alluded above. In taking one or more of these actions, you can expect to see improved performance (particularly in terms of CPU efficiency) for lots of the SQL statements that execute in your DB2 subsystem -- not just one or two statements.
  2. They're often pretty easy to implement. Want to page-fix buffer pool BP2 (for example)? OK -- issue an -ALTER BUFFERPOOL(BP2) PGFIX(YES) command (the change will take effect when the pool is next deallocated and reallocated -- usually as a result of "bouncing" the DB2 subsystem). Want to increase the size of your DB2 dynamic statement cache? Change the value of the EDMSTMTC parameter in your ZPARM module. Want to boost CICS-DB2 thread reuse? Change PROTECTNUM to a non-zero value in the DB2ENTRY resources associated with your most frequently executed CICS transactions. In these and other cases, no database design changes are needed, and no application code changes are required.
  3. They can be implemented quickly. This is directly related to item 2, above. When a tuning action requires a change to an SQL statement issued by a program, that change is not going to go into production overnight. Even if it's a pretty simple change, a change request has to be generated and approved, the change has to be assigned to a programmer, it has to be tested, etc. All this could take weeks. Because "rising tide" tuning actions generally don't necessitate modification of application code, they can often be put into effect in relatively short order.

All good; and yet, there is a pretty significant "yeah, but" associated with most "rising tide" DB2 performance tuning actions (as in, "Yeah, that's nice, but..."). The big "yeah, but" to which I'm referring is the typically modest magnitude of the performance gain that usually results from implementation of a "rising tide" tuning step. Very often, what you'd expect to see is a reduction in CPU consumption of maybe a few percentage points for SQL statements impacted by a "rising tide" change. Basically, you're improving a lot of DB2-access operations by a little bit each. Saving a little bit of CPU for a lot of SQL statements is a good thing, of course, but along with the "lots of little" improvements, you want some big scores. That's where "home run" tuning comes into play.

The home run: narrow focus, big payback

Concentrate your tuning efforts on a single SQL statement, with all performance-improving options on the table, and you might see the CPU time (and, quite possibly, the elapsed time) of that statement go down by 90% or more. I've seen that happen time and time again over the past couple of decades, with query elapsed and CPU times slashed as a result of statement-focused tuning actions. Examples of such actions include:
  • Defining a new index on a table to provide a better-performing access path for an expensive query.
  • Using Data Studio Statistics Advisor recommendations to update DB2 catalog statistics via RUNSTATS to reflect the non-uniform distribution of duplicate values in a column of a table, thereby enabling DB2 to choose a much better access path for a query.
  • Adding a non-result-set-altering predicate to a query to provide DB2 with more information about the characteristics of qualifying rows, enabling selection of a much superior access path for the query.
  • Changing a "between" predicate in a query to an in-list predicate, and changing another predicate in the same query to be indexable (by removing an arithmetic column expression in the predicate), resulting in an increase in the number of index key columns on which DB2 can match in processing the predicates, and a much better-performing access path.
  • Changing a join query with OR'ed predicates referencing columns in two different tables (WHERE TABLE_A.COL1 = 'XYZ' OR TABLE_B.COL2 = 123) to a union of two selects without such OR'ed predicates, leading to vastly improved pre-join row filtering.
  • Creating an index on a column expression to make a formerly non-indexable predicate indexable and stage 1.

In these and other cases, I've observed that chances of success are improved when:
  1. You do a good job of target selection. Because statement-level tuning work might improve performance only for the one query on which you're focused, choose your target query wisely. Make sure that the query you've selected for tuning is one for which a big performance improvement would be truly valuable for your organization.
  2. You don't develop tunnel vision at the outset of a statement-level tuning effort. By this I mean that you don't want to settle prematurely on a particular technique for reducing the query's CPU and/or elapsed time, as doing so could blind you to other means of accomplishing your goal. Consider all available tuning options. What worked for one query may or may not work for another.
  3. You get the right people involved. Not only DBAs and application developers, but business-knowledgeable people and data domain experts, as well. Different people will view a query that's up for tuning in different ways, and that can spur creative and effective thinking in terms of identifying actions that boost query performance.

One other thing: working to reduce CPU and elapsed time for one particular query might in fact improve the performance of other similar queries. For example, an index on an expression created to cut the response time of one query might similarly shorten run times for other queries that also have predicates with the column expression in question. Think of that -- getting better performance for several queries as a result of an action taken to drive down the CPU cost of a single query -- as hitting a home run with men on base (sticking with the baseball analogy).

It's not an either/or thing

By all means, you should avoid getting into arguments over the relative value of the "rising tide" versus the "home run" approach to DB2 for z/OS performance tuning. One approach produces low-cost (particularly in terms of required effort) wins that have wide-ranging -- if generally modest -- cost-reducing effects on query execution. The other approach is typically more labor-intensive and can require more time for implementation, but often delivers big performance improvement results for queries singled out for tuning. What you want is to use both approaches in an ongoing, systematic way: make your query execution environment as conducive to good performance as possible, and look for ways to make your most-costly queries much less costly.

Pursuing better DB2 application performance on these two fronts brings to my mind one more baseball-related image. The San Francisco Giants play their home games in a baseball stadium that's right on the water. Hitters who can drive the ball to right field with some power can land a ball in San Francisco Bay. That's where the rising tide meets the home runs, and that's where you want to be with respect to tuning the performance of your mainframe DB2 system.