Monday, April 21, 2014

DB2 for z/OS: the Importance of Historical Performance Data

Just within the past couple of weeks, I've received questions of a similar nature from two different DB2 for z/OS people. In each message, a DBA described a change that he had made in a production DB2 environment (in one case a size increase for several buffer pools, and in the other a change from an external stored procedure to a native SQL procedure). The questions asked pertained to assessing the impact of the change on system and/or application performance.

I've been fielding questions of this nature for a long time. In every such instance, my first question is pretty much the same: do you have accounting and statistics trace information for the DB2 subsystem in question from before the time of the change? If the answer to that question is, "No," I know that assessing the impact of the DB2 system or application change will be less easy -- maybe a lot less easy -- than it otherwise would have been. In this blog entry, I hope to convey to you the importance of having historical DB2 for z/OS performance information on hand (ideally, information that is based on DB2 accounting and statistics trace information), and to provide you with some related guidance and recommendations.

Why you need "before" performance data

In a nutshell: because the value of your "after" data (meaning, from after the change that you made and want to assess) is considerably diminished if there's no "before" data that can be used for comparative purposes. This is true not only for situations in which you want to determine the performance impact of an alteration of a DB2 setting (e.g., a change to PGFIX(YES) for a buffer pool), or of some aspect of a database object's physical design (e.g., going to LOCKSIZE ROW for a table space) or of DB2-accessing application code (e.g., going to multi-row INSERT for a program) -- it's also applicable to situations in which something's gone wrong, performance-wise (perhaps a transaction's response time has quadrupled, or a batch job is running past a time specified in an SLA), and you need to get to the root of the problem. Online monitoring is great for seeing things from a "now" perspective, but sometimes you need the "then" view of things, too, to avoid going down dead-end streets in your pursuit of a solution.

Your historical data raw materials

First, you need to have DB2 writing trace records to data sets on disk -- not just to the in-memory buffers that are used by performance monitors. In my experience, DB2 trace records are most often written to SMF data sets (GTF is another option). As for the trace classes to have active, I'd go with accounting trace classes 1, 2, and 3, and statistics trace classes 1, 3, 4, 5, and 6 (class 5 is applicable to a DB2 data sharing system). The combined overhead of running all these trace classes should be less than 5%. There are organizations that like to have DB2 accounting trace classes 7 and 8 on, too, to provide package-level information. I agree that package-level figures can be very useful, but at some sites there is concern about the volume of trace records generated via accounting trace classes 7 and 8. If that's a concern at your shop, you could leave classes 7 and 8 off most of the time, and activate them on demand when you need package-level accounting data. Here's another idea: leave accounting trace classes 7 and 8 on, and use the new (with DB2 10) ZPARM parameter SMFCOMP to activate compression of DB2-generated SMF trace records. You're likely to get a disk space reduction of 60-80% for these compressed SMF records, and the CPU overhead associated with the compression work is likely to be very small (like 1% or less). Before compressing SMF trace records, verify that your DB2 monitor can handle SMF records in compressed form (IBM's OMEGAMON DB2 monitor can); if your monitor can't, you can use DB2 sample program DSNTSMFD, provided via APAR PM27872, to decompress DB2 SMF trace records that have been compressed.
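If the traces aren't already started automatically via the SMFACCT and SMFSTAT parameters in ZPARM, they can be started with commands along the lines of this minimal sketch (the classes shown match the recommendations above, and DEST(SMF) sends the records to SMF; SMFCOMP, by contrast, is a ZPARM setting, so compression of the SMF records is turned on through a ZPARM update rather than a command):

  -START TRACE(ACCTG) CLASS(1,2,3) DEST(SMF)
  -START TRACE(STAT) CLASS(1,3,4,5,6) DEST(SMF)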

Options for keeping historical performance data in a space-efficient way

With DB2 trace records being written to SMF data sets, you then need to decide how long to keep the data around. This is going to have something to do with the volume of trace data produced on a daily basis (workload dependent -- lots more DB2 trace data will be generated at some sites versus others), and the amount of space available for the data in your disk subsystem. Personally, I'd like to have at least 60 days of DB2 accounting and statistics trace data available to me -- 30 days at a minimum. What if you'd like to keep a rolling 60 days of DB2 trace data but don't have the space needed to hold that much in its raw SMF form (even with SMF data compression turned on)? In that case, you have at least a couple of options, which I'll briefly describe below.

Option 1: print to disk. Have a batch process kick off once each night to generate (by way of your DB2 monitor's batch reporting function) a standard set of accounting and statistics reports, and "print" those reports to disk. At a DB2-using site at which I worked in the early- and mid-2000s, we had such reports written to a set of GDGs, with each GDG having 60 data sets (thus, an automatically maintained rolling 60 days of reports). If you go with the "print to disk" approach, I'd recommend generating, each night (using the previous day's SMF records), the following reports (a rough sketch of the monitor control statements appears after this list):
  • At least one statistics long report for each production DB2 subsystem. That report should have FROM and TO times that will capture 24 hours of system activity (midnight-to-midnight), and the data should be summarized at 1-hour intervals (so, one report would consist of twenty-four 1-hour subreports). If you are running DB2 in data sharing mode, in addition to member-level statistics long reports generate one group-level statistics long report for each production data sharing group.
  • At least one accounting long report for each production DB2 subsystem. If you just generate one report, I'd go with a report that captures 24 hours of data (as with the previously mentioned statistics long report, this should be midnight-to-midnight, with data summarized at 1-hour intervals), and I'd have the data ordered by connection type (i.e., aggregated at the connection type level, so that I could see, for a given time period, the CICS-DB2 activity, the DRDA activity, the call attach activity, etc.). If you want to slice and dice things a little more finely, consider generating at least a couple more accounting long reports each night: 1) for your CICS-DB2 and batch-DB2 activity, consider a report that includes only the CICS and call attach connection types (and/or TSO attach, depending on how your batch jobs connect to DB2), and that orders information by correlation name (batch job names and CICS transaction names are correlation names in DB2 accounting lingo), so that you'll see detailed accounting data for each DB2-accessing batch job and CICS transaction; and 2) for your DDF activity, generate an accounting detail report that includes only the DRDA connection type, with data ordered by primary authorization ID (that should then show you information at the DDF-using application level if, as is typically the case, each such application has its own ID with which it connects to the DB2 server).
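For what it's worth, here is a rough sketch of what the nightly report control statements might look like. This is loosely patterned on the DB2 PM-heritage command language used by IBM's OMEGAMON DB2 monitor -- treat every keyword here as a placeholder, and go by your own monitor's documentation for the actual syntax and for things like the summarization interval and the FROM/TO time specifications:

  STATISTICS
    REPORT
      LAYOUT(LONG)
  ACCOUNTING
    REPORT
      LAYOUT(LONG)
      ORDER(CONNTYPE)
  ACCOUNTING
    REPORT
      LAYOUT(LONG)
      ORDER(CORRNAME)
      INCLUDE(CONNTYPE(CICS))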

Option 2: a performance database. Here, you'd use two functions of your monitor (if your DB2 monitor has these functions): a "reduce" function (probably, to filter and/or summarize information contained in the raw SMF trace records) and a "save" or "file" function. The "save" or "file" function would write data to a file, and you could subsequently use that file to load the DB2 tables that would comprise your performance database. If you go this route, you'll want to follow the directions provided in your monitor's documentation to create and populate the tables in the performance database. The performance database approach might be more involved than the print-to-disk option (in terms of set-up, that is), but once you get the prep work done and the data loaded, you'll have a lot of flexibility in terms of how you view and report on your historical DB2 performance data.
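To give you a taste of that flexibility, below is a minimal sketch of the kind of query you might run against such a performance database. The table and column names here are entirely hypothetical -- your monitor's documentation defines the actual tables -- but the idea, trending a metric such as average in-DB2 (class 2) CPU time per commit for DRDA work over a couple of months, is the point:

  SELECT RUN_DATE,
         SUM(CL2_CPU_SEC) / NULLIF(SUM(COMMIT_COUNT), 0) AS AVG_IN_DB2_CPU_PER_COMMIT
  FROM   PERFDB.ACCT_DAILY_SUMMARY
  WHERE  CONNTYPE = 'DRDA'
    AND  RUN_DATE BETWEEN '2014-02-01' AND '2014-04-20'
  GROUP BY RUN_DATE
  ORDER BY RUN_DATE;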

Information is power

When you have historical DB2 performance data at your fingertips, you'll be able to much more effectively point to the effects of your DB2 tuning changes, and say, "Our team did that." What was the impact of a buffer pool change on DB2 DBM1 address space CPU time? How did thread reuse increase when you set up some CICS-DB2 protected entry threads, and how did average in-DB2 CPU time change for transactions that use those protected threads to execute DB2 packages bound with RELEASE(DEALLOCATE)? How did the change to row-level locking for a particular set of table spaces affect deadlock and timeout totals, and how was per-transaction in-DB2 CPU time impacted? How did an increase in the value of the SRTPOOL parameter in ZPARM affect activity in the buffer pools used for the work file table spaces? How did an increase in the page size of a set of indexes affect the throughput of a certain batch job? How did leveraging of DB2 10 high-performance DBATs impact per-transaction in-DB2 CPU times for a DDF-using application? How did a change from an external stored procedure to a native SQL procedure affect zIIP offload for a DDF-using application? How did migration to DB2 11 from DB2 10 affect average in-DB2 CPU times for your applications? How did the archive and purge of "old and cold" data in DB2 tables affect GETPAGE counts in the system?

And so on, and so on. Yes, your DB2 monitor's "what's happening now" capability makes it a powerful tool, but combine that with some historical data, and watch that power grow exponentially. The cost is low, and the return is high. If you're not there, get busy.

Friday, March 28, 2014

DB2 for z/OS People: Wearing Your "Applications" Hat Versus Your "Systems" Hat

Not long ago, a colleague of mine sent to me an e-mail message that he'd received from a person who supports DB2 for z/OS at a US-based company. In this message the IT professional wrote of some DB2-accessing application programs that were running into lock contention issues with page-level locking (the default) in effect. He knew that going to row-level locking for the target tables (accomplished by altering the associated table spaces with LOCKSIZE(ROW)) would likely resolve the locking conflicts that were negatively impacting throughput and response time for the aforementioned applications; however, he was reluctant to pursue this physical database design change because it would be a DB2 response to what he regarded as an application problem.

This person, in a well-meaning way, was wearing his "systems" hat. Wrong choice of figurative headwear, I'd say. It might have been the right hat to wear 10 or 20 years ago, but times have changed. These days, it is more important than ever for a mainframe DB2 person to reach first for his or her "applications" hat. Hereinafter I'll endeavor to explain the reasoning behind this contention of mine.

First, I have a "micro" response regarding the particular matter of page- versus row-level locking, and I believe that this will serve as a perfect lead-in to my take on the larger, "macro" issue which has to do with striking the proper balance between system- and application-focused thinking.

Row- versus page-level locking -- think about it

IBM invented relational database technology, and DB2 for z/OS was, in essence, the first commercial-grade, enterprise-level relational DBMS on the market (it was called DB2 for MVS back then). Page-level was initially the finest locking granularity available in a mainframe DB2 environment, and for years that worked just fine for all kinds of DB2-using organizations. In cases where lock contention problems arose with page-level locking in effect, programmers were often ready and willing to make application code changes that eliminated or at least minimized the occurrence of timeouts and/or deadlocks. On occasion, when an application code change was not feasible, a DB2 DBA would reduce effective lock granularity by limiting the number of rows that could be stored in a table space's pages (this via the MAXROWS option of ALTER TABLESPACE). The marching orders given to most DB2 for z/OS DBAs were, "Focus on making the DB2 workload as CPU-efficient as possible." DBAs heeded these orders, and so routinely wore their "systems" hat.
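For reference, both the MAXROWS approach just mentioned and the LOCKSIZE ROW change described at the top of this entry are simple ALTER statements. The database and table space names below are made up, and note that a MAXROWS change generally takes full effect only after a subsequent REORG of the table space:

  ALTER TABLESPACE DBPROD01.TSORDERS LOCKSIZE ROW;
  ALTER TABLESPACE DBPROD01.TSORDERS MAXROWS 1;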

Time went by, and while DB2 for MVS (and then OS/390 and then z/OS) continued to grow in terms of installations and of data volumes managed and transaction volumes processed, relational database management software for distributed systems servers made the scene and eventually became quite popular. For these DBMSs, row-level was (and still is) the default locking granularity utilized. Page-level locking continued to be (and still is) the default for DB2 for z/OS, but a new index structure introduced with DB2 Version 4 made row-level locking an option for the mainframe platform.

Relational database technology evolved, and so did the ways in which people used it. An important development was the rise of vendor-supplied -- versus in-house-written -- application software. Companies producing this software understandably sought to develop code that would be suitable for use with a wide range of database management systems. That meant designing with row-level locking in mind -- the default in most of the relational database world.

The rise of vendor-supplied applications, of course, didn't mean the end of in-house application development. Organizations continued to build their own apps to address unique demands and opportunities in their respective markets, but in this sphere there was a pronounced shift towards multi-tiered, client-server applications and away from more monolithic application architectures. People developing these applications favored the use of data access interfaces, such as JDBC and ODBC, that worked across many different relational DBMSs. Developing in such a DBMS-agnostic way, programmers quite naturally gravitated towards coding with the most widely-used database locking granularity -- row-level -- in mind.

Fast-forward, then, to the present day. You have row-level-locking-using vendor-supplied applications, and you have in-house developers writing code with an implicit assumption of row-level locking granularity, and you have people who want to pair these purchased and in-house-written applications with DB2 for z/OS (which a friend of mine likes to call the "super-server"), and you have DB2 DBAs and systems programmers looking sternly out from under their "systems" hats and telling these people, "Row-level locking doesn't deliver optimal CPU efficiency. Come back to me after you've changed that application to work well with page-level locking."

EXCUSE ME? There are people coming to you, bearing applications that will drive workload growth on the platform you support, and you're giving them the cold shoulder because the preferred (and maybe required, absent code changes) locking mode might cause DB2-related CPU consumption to be a little higher than it otherwise would be? I have a different response to recommend. How about, "Row-level locking? No problem. Of course we can do that. We've been able to do that for about 20 years now. Works like a champ. Let's talk about setting up a testing environment for the application, and let's start putting a production implementation plan together."

On a macro level: technology enables applications

DB2 for z/OS features all kinds of great technology, and that story gets better with each new release of the software, but guess what? That advanced technology is valuable to an organization to the extent that it facilitates and accelerates the development of applications that make the organization money (enabling new revenue streams, upping business volumes from existing clients, growing the customer base, etc.) or reduce costs (better inventory management, reduced customer churn, reduction of fraud-related losses, etc.), or both; thus, your mindset as a DB2 support person should be about rolling out the red carpet to application developers (and to those interested in storing the data related to vendor-supplied applications in a DB2 for z/OS database). You should aim to make DB2 a preferred DBMS for your organization's programmers, and you don't get there by making those folks do things differently versus other data-serving platforms with which they work.

When you wear your "applications" hat, you think "accommodate" more than "challenge" when working with programmers. Supporting row-level locking is just one example of such accommodation. Another example would be creating an index on a column expression in order to make an otherwise non-indexable predicate indexable (and stage 1). Maybe you utilize concentrate-statements-with-literals functionality (introduced with DB2 10) to have DB2 parameterize dynamic SQL statements that were coded with literal values, when application-side statement parameterization is not feasible. Maybe you create an "instead of" trigger to allow programmers to code data-changing SQL statements targeting what would otherwise be a read-only view. Maybe you create a user-defined function that replicates the capability of a function that was useful to a development team when they worked with some other DBMS. Maybe you define a check constraint for a table to help ensure that only certain data values can be inserted into a column -- this so that programmers won't have to code that data domain-restricting logic themselves. There are many, many more ways in which you can use DB2 functionality to make life easier for application developers -- I've just scratched the surface here. Put on your "applications" hat and get creative and be proactive -- go to programmers with ideas, instead of just waiting for them to bring problems and challenges to you.
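To make a couple of those accommodations concrete, here is a minimal sketch -- all object names are made up, and check the index-on-expression rules in the SQL Reference for your DB2 release regarding which expressions are allowed:

  -- An index on an expression, so that a predicate such as
  -- WHERE UPPER(LASTNAME) = 'SMITH' can be indexable:
  CREATE INDEX DBA01.XCUST_UPLNAME
    ON DBA01.CUSTOMER (UPPER(LASTNAME));

  -- A check constraint that keeps domain-enforcement logic out of application code:
  ALTER TABLE DBA01.CUST_ORDER
    ADD CONSTRAINT ORDER_STATUS_CK
    CHECK (ORDER_STATUS IN ('NEW', 'PICKED', 'SHIPPED', 'CLOSED'));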

Now, reaching first for your "applications" hat doesn't mean throwing your "systems" hat away. Enhancing the CPU efficiency of the DB2 for z/OS workload at your site is a very worthwhile pursuit. That said, you should think about actions that could reduce per-transaction CPU consumption without requiring application code changes. Tuning steps of that nature include page-fixing buffer pools, combining persistent threads with RELEASE(DEALLOCATE) packages, and, in some cases, hash-organizing data in a table. Shoot, just rebinding packages when you go to a new release of DB2 can deliver significant CPU savings for your organization.

DB2 is a tool. When people approach you about using that tool to enable a new in-house-developed application, or to support a vendor-supplied application, you should be delighted, not bothered. Such requests give you the opportunity to show what DB2 for z/OS can do as a data server. Consider DB2 features and functions while wearing your "applications" hat, and you'll be that much more effective in leveraging DB2 technology for the benefit of developers, and to provide an ideal data-serving system for vendor-supplied applications. DB2 for z/OS really is the "super server" -- tops in availability, scalability, and security. Don't stand between it and your organization's applications folks. Be a bridge, not a barrier.

Saturday, March 1, 2014

DB2 for z/OS SQL PL Routines: REBIND PACKAGE vs. ALTER REGENERATE

I have long been a big fan of DB2 for z/OS native SQL procedures, but I also recognized early on that putting this technology to use would require DB2 DBAs to learn some new things. Much of this newness has to do with the fact that for a native SQL procedure (and for its more recently arrived cousin, the "native" SQL scalar function, also known as a compiled SQL scalar function or a non-inline SQL scalar function), there is no external-to-DB2 executable -- no object module, no load module. There is just the SQL PL routine's package (SQL PL is the language in which native SQL procedures and non-inline SQL scalar functions are written). The package is the executable, period. That being the case, managing these packages is a different game versus managing a package that is tied to (for example) the external-to-DB2 load module of a COBOL stored procedure program. Think about making a package "active" in a DB2 for z/OS system. In the context of a load module-tied package, that's familiar territory: you bind the package into the target DB2 environment, and it doesn't get "activated" (i.e., it's not used) until the associated load module (batch job, CICS transaction, external stored procedure, whatever) is executed in that environment. But what if the package itself is the sole executable (the case, as previously mentioned, for a native SQL procedure or non-inline SQL scalar function)? What then? Well, that's what the DEPLOY option of BIND PACKAGE, and the ACTIVATE VERSION option of ALTER PROCEDURE and ALTER FUNCTION, are for, as I pointed out in a blog entry (written while I was working as an independent consultant) that I posted back in 2009. In the blog entry I'm writing right now, I want to provide you with information about another new wrinkle associated with SQL PL routines: "full" versus "partial" package bind.

"Say what?" you might be thinking. "You don't 'partially' bind a package. You bind the whole thing." That WAS true before SQL PL routines came along (and with "SQL routines" I am NOT referring to external SQL procedures, which are written in SQL PL but become C language programs with embedded SQL DML statements in the course of being prepared for execution). When the package in question is related to a native SQL procedure or a non-inline SQL scalar user-defined function (UDF), a REBIND PACKAGE operation in fact rebinds only part of the package. To understand why this is so, consider that the package for a SQL PL routine has a section that corresponds to the SQL "control" statements in the routine (i.e., the logic flow control statements such as IF, WHILE, ITERATE, and LOOP), and another section that pertains to the SQL DML statements (e.g., SELECT, INSERT, UPDATE, DELETE) in the native SQL procedure or non-inline SQL scalar UDF. Generally speaking, a REBIND PACKAGE is executed so as to drive reoptimization for SQL DML statements. If, say, a new index is added to a table to provide a better-performing access path for a static SQL statement embedded in a transaction program, DB2 will not use the new index in executing the statement unless the package of which the statement is a part is rebound. But what if the package is, essentially, the compiled form of a native SQL procedure or a non-inline SQL scalar UDF? In that case, if you want to drive reoptimization of DML statements included in the SQL procedure or UDF, you don't necessarily want the control statements in the SQL PL routine reworked, and so REBIND PACKAGE for a SQL PL routine leaves the control-statement section of the package alone and rebinds only the DML section.

What if you DO want a SQL PL routine's package to be rebound in its entirety? That can be done by way of an ALTER PROCEDURE (or ALTER FUNCTION) statement with the REGENERATE option specified. The effect of that statement's execution will be a rebinding of the whole of the package -- the SQL control statements as well as the DML statements. Why might you want to use ALTER PROCEDURE (or ALTER FUNCTION) with REGENERATE versus REBIND PACKAGE for the package associated with a SQL PL routine? One reason would be to rebind the control statements of a routine so as to get the advantage of a SQL PL performance enhancement (an example of such an enhancement would be the reduced path length of IF statement execution delivered with DB2 10 for z/OS). Here's another reason to use ALTER PROCEDURE (or FUNCTION) with REGENERATE: in a DB2 10 (or later) environment, it gets the bulk of the control section of the package, as well as the DML section, copied into above-the-bar virtual storage in the DB2 database services address space (DBM1) when the package is executed (i.e., when it's allocated to a thread). If a native SQL procedure created in a DB2 9 system is not either regenerated (via ALTER PROCEDURE with REGENERATE) or recreated in a DB2 10 (or later) system, the control section of the procedure's package will occupy more below-the-bar virtual storage when allocated to a thread.

Note that the REBIND PACKAGE option APREUSE, which tells DB2 to reuse the existing access path (when possible) for SQL DML statements when a package is rebound, is not an option that can be specified in an ALTER PROCEDURE (or FUNCTION) statement; so, when you execute ALTER PROCEDURE (or FUNCTION) with REGENERATE, access path changes for SQL DML statements in the procedure or function are a possibility.

Now you know (if you didn't already): with respect to a package associated with a native SQL procedure or a non-inline SQL scalar UDF, use REBIND PACKAGE when all you want to do is rebind the part of the package pertaining to the SQL DML statements in the procedure or function. To rebind all of the package's statements -- control statements as well as DML statements -- use ALTER PROCEDURE or ALTER FUNCTION with the REGENERATE option.
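Put in terms of the actual statements (the names here are placeholders -- for a native SQL procedure, the package's collection is the procedure's schema and the package name matches the procedure name):

  REBIND PACKAGE(MYSCHEMA.MYPROC) APREUSE(ERROR)

That rebinds only the SQL DML section of the routine's package, with a request that existing access paths be reused where possible. Versus:

  ALTER PROCEDURE MYSCHEMA.MYPROC REGENERATE ACTIVE VERSION;

That rebinds the whole package -- control statements and DML statements -- and, as noted above, APREUSE is not available on this path, so DML access paths could change.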

Tuesday, February 11, 2014

DB2 11 for z/OS Bind Concurrency - Even Better Than You Think*

For the asterisk at the end of this blog entry's title, you could substitute, "If what you think is based on what you've likely read or heard about this feature up till now." In the space below, I'll provide some background information on DB2 11's bind concurrency enhancement, and then I'll give you the bonus insight that you might see as the "cherry on top" -- the important scenario that is addressed by DB2 11 bind concurrency but which has received little (if any) coverage in documents and presentations that I've seen to date.

OK, background: first, the term "bind concurrency" is kind of a nickname for the DB2 11 feature of interest here. A longer and more comprehensively descriptive name for the enhancement could be worded as follows: "the capability through which BIND, REBIND, DDL, and utility operations can execute successfully in a DB2 11 new-function mode environment when they would otherwise have been blocked by RELEASE(DEALLOCATE) packages executed via persistent threads (or blocked by another circumstance)." That's a mouthful. You can see why people use the short descriptor, "bind concurrency." That "other circumstance" in the parenthetical phrase at the end of that long name has to do with the "cherry on top" aspect of bind concurrency to which I referred previously, and which I'll describe momentarily.

About RELEASE(DEALLOCATE) and persistent threads: this is a combination that has long been utilized at DB2 sites to improve the CPU efficiency of DB2-accessing applications. It's something about which I've blogged a number of times over the years, most recently in an entry posted here just a couple of weeks ago. A persistent thread (i.e., one that persists through commits) can stick around for quite some time, and therein lies a rub that can be encountered when a package bound with RELEASE(DEALLOCATE) is executed by way of such a thread: the package is considered to be in-use as long as the thread exists, because it won't be "released" until thread deallocation time (versus being released at each commit). This potentially lengthy and continuous (for the life of the thread) "use" of the package (not to be confused with "execution" of the package -- a package, once executed through a persistent thread, retains "in use" status until thread deallocation, even if the thread won't be deallocated for a while after package execution has completed) can interfere with a variety of database administration operations. For example, a package cannot be rebound while it is in use. A DDL operation that would cause a dependent package to become invalidated, such as adding a partition to a range-partitioned table, will not succeed while the package is in use, because a package can't be invalidated while it's in use. Similarly, a utility operation that would cause a package to be invalidated, such as an online REORG executed for the purpose of materializing a pending DDL action, will not succeed if the package is in use.

So, RELEASE(DEALLOCATE) combined with persistent threads is a CPU efficiency booster, but from a DB2 DBA's perspective it can gum up the system in some respects. This flip side of RELEASE(DEALLOCATE) plus persistent threads was addressed nicely for high-performance DBATs (a type of persistent thread introduced with DB2 10 for z/OS) through the introduction of the PKGREL keyword of the -MODIFY DDF command: when you need a "window" in which to perform bind/rebind, DDL, or utility operations that might be blocked by high-performance DBATs, you issue the command -MODIFY DDF PKGREL(COMMIT) and that effectively turns off high-performance DBATs until further notice (that "further notice" is indicated via issuance of the command -MODIFY DDF PKGREL(BNDOPT), which re-enables high-performance DBAT functionality).
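In other words, the DBA's "window" looks like this (these are standard DB2 commands):

  -MODIFY DDF PKGREL(COMMIT)
     (do the bind/rebind, DDL, and/or utility work)
  -MODIFY DDF PKGREL(BNDOPT)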

That's nice for high-performance DBATs, but what if RELEASE(DEALLOCATE) packages are used with other persistent thread types, such as CICS-DB2 protected entry threads or long-running batch jobs (a batch job's thread doesn't go away until the job completes, regardless of the number of commits issued in the course of the job's execution)? That's where DB2 11 bind concurrency comes in. DB2 11 for z/OS introduces a new ZPARM parameter, PKGREL_COMMIT, that can be set to YES (the default) or NO. In a DB2 11 new-function mode environment, PKGREL_COMMIT YES has this effect: when a BIND/REBIND, DDL, or utility operation would be blocked because of a RELEASE(DEALLOCATE) package tied to a persistent thread, a flag is set for that package and any persistent thread to which the package is allocated will automatically switch to RELEASE(COMMIT) behavior for the package at the next commit (or rollback). Voila - the package is released when associated persistent threads commit, and the BIND/REBIND, DDL, or utility operation that would otherwise have hit a package-related concurrency wall can instead execute to completion. Pretty slick, eh?

There are a few caveats I should mention: PKGREL_COMMIT YES does not free up a RELEASE(DEALLOCATE) package at the next commit point for a persistent thread when 1) the package has a WITH HOLD cursor that is open at the time of the next commit, or 2) the package was bound with KEEPDYNAMIC(YES), or 3) the next commit (or rollback) is issued by a stored procedure.

Now for the promised cherry on top: in looking at documentation and/or presentations that describe DB2 11 bind concurrency, you may find yourself thinking:

"Nice feature, but it appears to be applicable only to situations involving RELEASE(DEALLOCATE) packages and persistent threads. In my DB2 system, we've run into package-related bind, DDL, and utility concurrency problems that have nothing to do with RELEASE(DEALLOCATE) and persistent threads. These cases involve packages executed by high-volume transactions. We have trouble sometimes getting bind, DDL, and utility operations to go through to successful completion. These operations get blocked by an in-use package. Even though the package is bound with RELEASE(COMMIT) and is not executed by way of persistent threads, it is in-use for long stretches of time because there is always at least one transaction using it at any given time. In other words, executions of some of our high-volume transactions overlap so that the use count for associated packages rarely goes to zero. When are we going to get some relief from that problem?"

Guess what? You have gotten relief from that problem, and the relief is in the form of the DB2 11 bind concurrency feature about which I've been writing in this blog entry. See, even though the write-up -- in documents and presentations -- about DB2 11's bind concurrency feature always (as far as I've seen) mentions RELEASE(DEALLOCATE) and persistent threads, that's NOT the only scenario to which bind concurrency is applicable. It also addresses bind, DDL and utility blockage situations caused by in-use RELEASE(COMMIT) packages that stay in-use for long stretches of time because of overlapping executions of related transactions. Bind concurrency functionality does this by providing something akin to a drain locking mechanism for packages (drain locking, introduced way back in the nineties, pertains to table spaces and indexes and partitions of same, and is used by DB2 utilities and also some commands and SQL statements). When PKGREL_COMMIT YES is in effect and an in-use package needs to go out-of-use to allow a bind or DDL or utility operation to complete, the package in question is flagged by DB2 and any transaction that would use the package will hold off on using the package as long as the flag remains in place; thus, as in-flight transactions associated with a flagged package hit commit points and other transactions hold off on using the package, the package's use count goes to zero and the bind or DDL or utility operation that is waiting to proceed gets the green light and gets done. After that, normal application processing resumes.

And there you have it: DB2 11 bind concurrency is even better than advertised. If you're not running DB2 11 yet, this gives you one more reason to get going with migration planning.

Saturday, January 25, 2014

DB2 10 (and beyond) for z/OS: Being Smart About More RELEASE(DEALLOCATE)

By now, a lot of you probably know that a really important action to take, once your DB2 subsystem is at the Version 10 (conversion mode) level, is to rebind your plans and packages. Taking this action delivers several benefits:
  • It tends to improve the CPU efficiency of package execution. Obviously, you could see performance gains related to positive access path changes, as the optimizer gets smarter with each new release of DB2; however, even if access paths associated with a package don't change, CPU consumption reduction is quite likely to be achieved through a DB2 10 rebind. Why? Because the package code so generated -- including the compiled, executable form of static SQL statements -- is more CPU-efficient than the pre-DB2 10 version of same.
  • It causes package table virtual storage to be allocated from an agent local pool in the DB2 DBM1 address space, versus the EDM pool. The package table is the space into which a package is copied from the skeleton package table when it is executed (there will be a copy of a package in the package table for every thread to which the package is concurrently allocated). Because package table space no longer comes from the EDM pool after a package has been bound or rebound in a DB2 10 system, a latch formerly needed to serialize acquisition and release of EDM pool space for package table use is no longer needed for that purpose, and that's good for application throughput (latch reduction and elimination is a significant factor in the DB2 10 scalability story).
  • It causes almost all thread-related virtual storage to go above the 2GB "bar" in DBM1.

That last item is noteworthy for two reasons: 1) it means that a DB2 10 subsystem can support a much higher number of concurrently active threads versus a DB2 8 or 9 environment, and 2) it means that with DB2 10 you have a lot more virtual storage "head room" than before to accommodate expanded use of the RELEASE(DEALLOCATE) option of BIND and REBIND PACKAGE. The rest of this blog entry is focused on the latter of these two points.

Binding a package with RELEASE(DEALLOCATE) can enhance the performance of associated applications, but you do NOT want to bind all of your packages in this manner; rather, you should be selective in looking to increase the number of RELEASE(DEALLOCATE) packages in a DB2 system. Below are some guidelines for your consideration. 

First, divide your search for candidate packages (i.e., packages currently bound with RELEASE(COMMIT) that could be advantageously rebound with RELEASE(DEALLOCATE)) into two categories: batch and online.

BATCH

Binding batch-executed packages with RELEASE(DEALLOCATE) can boost the performance of jobs that issue frequent commits in two ways: 1) as is the case for online transactions that reuse threads (more on this below), CPU consumption is reduced because the cost of releasing and then re-acquiring the same table space locks and package table elements at each commit is eliminated; and 2) additional CPU cycles can be saved through more effective leveraging of index lookaside (resulting in fewer GETPAGEs for index access) and sequential detection (which triggers dynamic prefetch). When RELEASE(COMMIT) is in effect, the execution efficiency benefits of index lookaside and sequential detection are diminished because the information tracked by DB2 in relation to these two performance features is reset at each commit, versus being retained across commits for packages bound with RELEASE(DEALLOCATE).

So, in reviewing your DB2-accessing batch programs, look for jobs that issue a lot of commits. If you use DB2 monitor-generated accounting reports as input to your analysis, note that information on commit frequency is available at the correlation name level (for batch applications, the job name is the DB2 correlation name -- you can request that your monitor generate accounting reports with data ordered by correlation name). One batch job might involve execution of several DB2-accessing programs and associated packages, so once you find a batch job that issues a lot of commits you might need to identify the related packages and then see if particular packages are the ones that drive a lot of commit activity (application developers might be able to help with that). Those packages, if currently bound with RELEASE(COMMIT), could potentially be rebound with RELEASE(DEALLOCATE) to good effect.
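If you want a starting point for that review, a simple catalog query can list candidate packages. The collection name below is made up, and verify the RELEASE column values against the catalog documentation for your DB2 release (as I recall, 'C' indicates RELEASE(COMMIT) and 'A' indicates RELEASE(DEALLOCATE)):

  SELECT COLLID, NAME, RELEASE
  FROM   SYSIBM.SYSPACKAGE
  WHERE  COLLID = 'BATCH_COLL'
    AND  RELEASE = 'C'
  ORDER BY COLLID, NAME;

A package identified this way would then be rebound with something along the lines of REBIND PACKAGE(BATCH_COLL.NIGHTLY_PGM) RELEASE(DEALLOCATE).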

Considerations: Packages bound with RELEASE(DEALLOCATE) can get in the way of some package bind/rebind, DDL, and utility operations, so if you want to use RELEASE(DEALLOCATE) for batch packages you'll need to see when the associated jobs run (when a program with a RELEASE(DEALLOCATE) package is executed, the package will stay allocated to the batch job's thread until the job runs to completion) and make sure that you can, if needed, schedule package bind/rebind and DDL and utility operations (those that would affect RELEASE(DEALLOCATE) packages, or objects on which the RELEASE(DEALLOCATE) packages are dependent) around the batch job's scheduled run time (with DB2 11, there is relief in this area: a package's RELEASE status can be dynamically, automatically, and temporarily changed from DEALLOCATE to COMMIT to enable a bind or rebind or DDL or utility operation that would otherwise be blocked to proceed). Also, if you want to bind a batch-executed package with RELEASE(DEALLOCATE), check to see if it is executed through multiple different batch jobs versus just one -- again, you might need to schedule package bind/rebind and DDL and utility operations around the execution times of these jobs.

You'll probably want to ensure that RELEASE(DEALLOCATE) is not used for batch packages that issue LOCK TABLE statements or for batch packages that trigger lock escalation -- that could potentially lead to lock contention problems, because those exclusive table space locks would be retained until thread deallocation with RELEASE(DEALLOCATE) in effect (retention of table space-level locks for the duration of a thread is generally not an issue if the locks are of the intent variety, versus exclusive).


ONLINE -- CICS

First, look for CICS-DB2 packages associated with frequently executed transactions. As in analyzing a DB2-accessing batch workload, you could be helped in reviewing CICS-DB2 activity by looking at DB2 monitor-generated accounting reports that exclude non-CICS work and in which data is grouped at the correlation name level (that would be, for CICS programs, the CICS transaction name). Once the most frequently executed transactions are identified, you can determine the DB2 packages related to those transactions. In addition to frequency of execution, you should also look for transactions and packages with relatively low in-DB2 times, because it is for quick in-and-out transactions that the cost of repeatedly releasing and re-acquiring table space locks and package table elements is proportionately higher (versus transactions with larger in-DB2 CPU times). Also, keep in mind that RELEASE(DEALLOCATE) for CICS-DB2 packages won't affect performance if the threads through which the packages are executed are not reused. Typically, CICS-DB2 thread re-use is driven through the specification of some protected entry threads for transactions that are frequently executed (i.e., by making PROTECTNUM greater than 0 in the CICS DB2ENTRY resources associated with these transactions).

Considerations: As previously noted, DB2 11 provides relief for the problem of getting bind/rebind and DDL and utility operations to complete successfully when RELEASE(DEALLOCATE) packages and persistent threads (such as CICS-DB2 protected entry threads) are in the picture, but in a DB2 10 environment you'll need to think about whether this could be an issue for you (there are sites at which the execution volume of some CICS-DB2 transactions is sufficient to keep protected entry threads around for days). Give some thought to the following: if you're going into a time period during which you need to accomplish some bind/rebind and/or DDL and/or utility actions, and you feel that these actions might be blocked by RELEASE(DEALLOCATE) packages executed via CICS-DB2 protected entry threads, use CICS RDO (resource definition online) to dynamically take PROTECTNUM to 0 for related DB2ENTRY resources. This could have the effect of making RELEASE(DEALLOCATE) CICS-DB2 packages behave as though bound with RELEASE(COMMIT), if it were to cause the level of CICS-DB2 thread re-use to drop to none. Do your bind/rebind and/or DDL and/or utility work, then dynamically take PROTECTNUM values back to where they had been in order to get thread re-use going again.
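Here is a rough illustration of both the set-up of protected entry threads and the temporary "turn it off" move -- the resource, group, transaction, and plan names are made up, and the exact RDO and CEMT keywords should be checked against the documentation for your CICS release:

  CEDA DEFINE DB2ENTRY(PAYENT1) GROUP(DB2RDO1)
       TRANSID(PAY1) PLAN(PAYPLAN)
       THREADLIMIT(20) PROTECTNUM(10)

  CEMT SET DB2ENTRY(PAYENT1) PROTECTNUM(0)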

ONLINE -- DDF

As with CICS-DB2 packages, frequency of execution plus relatively low in-DB2 CPU times are the key attributes for which you're searching. Candidates could be packages associated with stored procedures that are frequently executed through DDF. Also candidates are the IBM Data Server Driver and/or DB2 Connect packages, but you almost certainly DO NOT want RELEASE(DEALLOCATE) to be in effect for all applications that use these packages. To provide some control in this regard, bind the Data Server Driver and/or DB2 Connect packages into the default NULLID collection with RELEASE(COMMIT), and bind them with RELEASE(DEALLOCATE) into a collection with a different name. That way, you can make selective use of the RELEASE(DEALLOCATE) packages (and, therefore, make selective use of high-performance DBATs, which enable thread re-use for network-attached DB2-accessing applications) by pointing an application to the NULLID collection or the other collection via a data source property on the client side.

Considerations: If you need to shut down high-performance DBATs for a time to get some bind/rebind, DDL, or utility stuff done, do that by issuing the DB2 command -MODIFY DDF PKGREL(COMMIT). Then, when you're ready to re-enable high-performance DBATs, do so by issuing -MODIFY DDF PKGREL(BNDOPT).

Also, note that use of high-performance DBATs will decrease the number of DBATs in the DBAT pool, so you'll likely want to increase the value of MAXDBAT in ZPARM to help ensure that you continue to have a reasonable number of threads in your DBAT pool.


The bottom line

DB2 10 for z/OS gives you room, virtual storage-wise, to expand your use of the RELEASE(DEALLOCATE) package bind option. Taking this action can deliver performance dividends, but you'll want to be smart about it. Find the packages that would be best candidates for binding or rebinding with RELEASE(DEALLOCATE), and take steps to help ensure that a larger number of RELEASE(DEALLOCATE) packages won't lead to contention problems for bind/rebind, DDL, and utility operations. Oh, and measure the in-DB2 time for affected programs before and after you change packages to RELEASE(DEALLOCATE), so you'll have documented proof that the move paid off in the form of reduced in-DB2 times.

Friday, January 17, 2014

DB2 for z/OS: Answering Some Questions About Adding and Dropping Table Columns

From the very beginning, you've been able to add a column to an existing DB2 for z/OS table by way of the SQL statement ALTER TABLE with the ADD COLUMN option. In fact, relative ease of logical database design modification had a lot to do with DB2 becoming a very widely used DBMS. Of course, there are times when what you want to do is remove a column from an existing table. That proved to be a tougher nut to crack, functionality-wise, but we finally got ALTER TABLE DROP COLUMN with DB2 11 for z/OS. I'm not going to try to tell you everything there is to know about ALTER TABLE DROP COLUMN in this blog entry (you can find a very good write-up on this new feature in section 4.7 of the "red book" titled, IBM DB2 11 for z/OS Technical Overview). Instead, what I want to do here is address a couple of questions that could come to mind as you learn more about DROP COLUMN support.

Question: ALTER TABLE DROP COLUMN is a pending change. Is ALTER TABLE ADD COLUMN also a pending change in a DB2 11 environment?

Short answer: No

Longer answer: One of the really great enhancements delivered with DB2 10 for z/OS is something called pending DDL. Again, I won't try in this space to provide you with a pile of relevant information (check out section 4.1.3 of the IBM redbook titled, DB2 10 for z/OS Technical Overview), but here are two of the key aspects of pending DDL:
  • It enabled non-disruptive implementation of a number of DB2 database design changes. Like what? Like changing a table space's page size, or DSSIZE, or SEGSIZE, or type (e.g., traditional segmented to universal partition-by-growth, or "classic" table-controlled partitioned to universal partition-by-range); or, with DB2 11, dropping a column from a table. How is "non-disruptive" pulled off? Simple: After the ALTER is executed, the target table space is placed in a new (with DB2 10) and non-restrictive advisory REORG-pending state labeled AREOR. A subsequent online REORG (SHRLEVEL REFERENCE or SHRLEVEL CHANGE) of the entire table space (not a partition-level REORG) materializes the outstanding pending changes associated with the table space.
  • It enabled a "never mind" action with respect to the aforementioned database design changes. Suppose you act to change a table space's SEGSIZE via ALTER TABLESPACE, and then decide that you'd rather stick with the current SEGSIZE. Is that a problem? Not at all, especially if the online REORG that would materialize the pending change in segment size for the table space has not yet been executed. In that case, you'd just issue an ALTER TABLESPACE statement for the object in question, with the new (starting with DB2 10) DROP PENDING CHANGES option. Do that, and the pending changes -- which had been recorded by DB2 in the new (with DB2 10) SYSPENDINGDDL catalog table -- are deleted, the table space's AREOR status is removed, and you're back where you started. (A short sketch of this sequence appears after the note below.)

[Note that with the exception of a change that results in the conversion of a non-universal table space to a universal table space (such as specifying MAXPARTITIONS in an ALTER TABLESPACE statement to convert a simple or traditional segmented table space to a universal partition-by-growth table space), pending changes can be made only to universal table spaces.]
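To make that "never mind" sequence concrete, here is a minimal sketch (object names made up):

  ALTER TABLESPACE DBPROD01.TSORDERS SEGSIZE 64;
     (the change is recorded as a pending change, and the table space goes into AREOR status)

  ALTER TABLESPACE DBPROD01.TSORDERS DROP PENDING CHANGES;
     (changed your mind? this deletes the pending change and clears the AREOR status)

  REORG TABLESPACE DBPROD01.TSORDERS SHRLEVEL CHANGE
     (or, if you do want the change, a table space-level online REORG materializes it)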

So, in reading about ALTER TABLE DROP COLUMN for DB2 11, you could find yourself thinking, "Ooh -- if ALTER TABLE DROP COLUMN is a pending change, was ALTER TABLE ADD COLUMN also made a pending change in a DB2 11 environment?" As I indicated above, the answer to that question is, "No." ALTER TABLE ADD COLUMN remains in a DB2 11 system what it's always been: an immediate change. That doesn't mean that the column in question is immediately added to the table in a physical sense (that's accomplished by way of a subsequent REORG). It means that the table's definition in the DB2 catalog is immediately changed to reflect the addition of the new column.

This answer to the first question leads to the second question.

Question: OK, so ALTER TABLE ADD COLUMN is not a pending change. Even so, could the new ALTER TABLE DROP COLUMN functionality provided by DB2 11 be used to "undo" an ALTER TABLE ADD COLUMN action, before the column has been physically added to the table?

Short answer: Yes, but...

Longer answer: Yes, but you'll still need to execute an online REORG of the table space. You might think, "Why is that? I haven't yet executed a REORG to physically add the column named in the ALTER TABLE ADD COLUMN statement; therefore, the column is only logically there. Can't it just be logically removed?" The online REORG of the table space is required because ALTER TABLE DROP COLUMN is a pending change, and pending changes don't get acted upon until an online REORG is subsequently executed for the associated table space. This online REORG to put the ALTER TABLE DROP COLUMN into effect won't change the table in a physical sense -- it will just ensure that the logically added column (reflecting the ALTER TABLE ADD COLUMN action that you now want to un-do) is not physically added to the table.

Here's a sweetener for you: if you run an online REORG just to put pending changes into effect for a table space, and you don't want to incur the cost of re-establishing clustering sequence for table rows, DB2 11 provides a CPU-saving feature. Specify, in the utility control statement, SORTDATA NO (previously not permitted for a SHRLEVEL CHANGE online REORG) together with the new RECLUSTER NO option, and data WILL NOT be unloaded by way of the table's clustering index and WILL NOT be sorted in clustering sequence.
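In utility control statement terms, that looks something like this (the table space name is made up):

  REORG TABLESPACE DBPROD01.TSORDERS
    SHRLEVEL CHANGE
    SORTDATA NO RECLUSTER NO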

There is a lot of great stuff in DB2 11. This blog entry has highlighted a bit of that great stuff (and some of the great stuff introduced with DB2 10). More to come, so come back by.

Sunday, December 29, 2013

DB2 for z/OS: Want to use High-Performance DBATs? Check your MAXDBAT Value

Of the features introduced with DB2 10 for z/OS, high-performance DBATs is one of my favorites. It enabled (finally) DDF-using applications to get the CPU efficiency benefit that comes from combining thread reuse with the RELEASE(DEALLOCATE) package bind option -- a performance tuning action that has long been leveraged for CICS-DB2 workloads. Implementing high-performance DBATs is pretty easy: in a DB2 10 (or 11) environment, when a package bound with RELEASE(DEALLOCATE) is executed by way of a DBAT (i.e., a database access thread -- the kind used for DRDA requesters that connect to DB2 via the distributed data facility), that thread becomes a high-performance DBAT (if it isn't one already). Before jumping into this, however, you should consider some things that are impacted by the use of high-performance DBATs. One of those things is the DBAT pool. That's where the MAXDBAT parameter of ZPARM comes in, and that's what this blog entry is about.

The value of MAXDBAT determines the maximum number of DBATs that can be concurrently active for a DB2 subsystem. The default value is 200, and at many sites that value, or one that's a little larger, has effectively supported a much greater number of DB2 client-server application connections (the default value for CONDBAT in ZPARM -- the maximum number of connections through DDF to a DB2 subsystem -- is 10,000). How so? Well, if your system is set up to allow for inactive connections (CMTSTAT = INACTIVE has been the default in ZPARM since DB2 V8), when a DDF transaction completes the associated connection will go into an inactive state (a very low-overhead transition, as is the transition back to the active state) and the DBAT used for the transaction will go into the DBAT pool, ready to service another transaction. That can happen because a "regular" DBAT is only associated with a particular DB2 connection while it is being used to execute a request from said connection. Because it is common for only a small percentage of DDF connections to a DB2 subsystem to be active (i.e., associated with in-flight transactions) at any given moment, a large ratio of connections to DBATs has historically been no problem at all.

Bring high-performance DBATs into the picture, and things change. In particular, a high-performance DBAT, once instantiated, will remain dedicated to the connection through which it was instantiated until it's been reused by 200 units of work (at which point it will be terminated, so as to free up resources allocated to the thread). That high-performance DBAT, therefore, will NOT go into the DBAT pool when a transaction using the thread completes. When a request associated with another connection comes in (i.e., from a connection other than the one through which the high-performance DBAT was instantiated), the high-performance DBAT won't be available to service that request. Some other DBAT will have to be used, and guess what? If that DBAT isn't a high-performance DBAT, it will become one if the package associated with the incoming request (and that could be a DB2 Connect or IBM Data Server Driver package) was bound with RELEASE(DEALLOCATE). The DBAT pool thus becomes progressively smaller as high-performance DBATs are instantiated. Know what else happens? The number of active DBATs goes up -- maybe sharply. Why? Because a "regular" DBAT is active only while it is being used to execute a DDF transaction. A high-performance DBAT, on the other hand, is considered to be active as long as it exists -- that will be 200 units of work, as mentioned previously, and when a high-performance DBAT is waiting to be reused, it's an active DBAT.

This last point -- about the number of active DBATs potentially rising sharply when high-performance DBATs are utilized -- is illustrated by some information I recently received from a DB2 professional. At this person's shop, high-performance DBATs were "turned on" for a DB2 subsystem (the PKGREL option of the -MODIFY DDF command can be used as a "switch," telling DB2 to either honor RELEASE(DEALLOCATE) for packages executed via DBATs -- thereby enabling instantiation of high-performance DBATs -- or not), and the number of active DBATs for the subsystem went from the usual 60 or so to about 700. Because the MAXDBAT value for the DB2 subsystem was already at 750, these folks didn't run out of DBATs, but the pool of "regular" DBATs got pretty small. In response to the big increase in active DBATs seen when high-performance DBAT functionality was enabled, the MAXDBAT value for the DB2 system in question was increased to 2000. Was this OK? Yes: When packages are bound or rebound in a DB2 10 for z/OS environment, almost all thread-related virtual storage goes above the 2 GB "bar" in the DBM1 address space, and that allows for a 5- to 10-times increase in the number of threads that can be concurrently active for the DB2 subsystem.

So, if you're thinking about using high-performance DBATs (and you should), check your subsystem's MAXDBAT value, and consider making that value substantially larger than it is now. Additionally, take steps to enable selective use of high-performance DBATs by your network-attached, DB2-accessing applications. For programs that contain embedded SQL statements and, therefore, have their own packages (e.g., DB2 stored procedures -- both external and native), use RELEASE(DEALLOCATE) for the most frequently executed of these packages. For the packages associated with DB2 Connect and/or the IBM Data Server Driver, use two collections: The default NULLID collection, into which you'd bind the DB2 Connect and/or IBM Data Server Driver packages with RELEASE(COMMIT), and another collection (named as you want) into which you'd bind these packages with RELEASE(DEALLOCATE). Then, by way of a data source or connection string specification on the client side, direct DDF-using applications to NULLID or the other collection name, depending on whether or not you want high-performance DBATs to be used for a given application.
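A minimal sketch of that two-collection approach follows. The alternate collection name is made up, and the NULLID package name shown is just an example -- you'd bind copies of whichever driver packages your applications actually use. On the client side, the IBM Data Server Driver for JDBC exposes the collection choice through the currentPackageSet data source property (the CLI/ODBC and .NET drivers have equivalent keywords):

  BIND PACKAGE(COLLHIPRF) COPY(NULLID.SYSSH200) ACTION(REPLACE) RELEASE(DEALLOCATE)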

To keep an eye on DBAT usage for a DB2 subsystem, periodically issue the command -DISPLAY DDF DETAIL. In the output of that command you'll see a field, labeled QUEDBAT, that shows the number of times (since the DB2 subsystem was last started) that requests were delayed because the MAXDBAT limit had been reached. If the value of this field is non-zero, consider increasing MAXDBAT for the subsystem. You might also want to look at the value of the field DSCDBAT in the output of the -DISPLAY DDF DETAIL command. This value shows you the current number of DBATs in the pool for the subsystem. As I've pointed out, maintaining the "depth" of the DBAT pool as high-performance DBAT functionality is put to use might require increasing MAXDBAT for your DB2 subsystem.

DDF activity can also be tracked by way of your DB2 monitor. I particularly like to use a DB2 monitor-generated Statistics Long Report to see if the connection limit for a DB2 subsystem (specified via the CONDBAT parameter in ZPARM) is sufficiently high. In the section of the report under the heading "Global DDF Activity," I'll check the value of the field labeled CONN REJECTED-MAX CONNECTED (or something similar -- fields in reports generated by different DB2 monitors might be labeled somewhat differently). A non-zero value in this field is an indication that the CONDBAT limit has been hit, and in that case you'd probably want to set CONDBAT to a larger number to allow more connections to the DB2 subsystem.

So there you go. Using high-performance DBATs can improve the CPU efficiency of your DB2 for z/OS client-server workload, but if you do leverage high-performance DBAT functionality then you might need to boost the DBAT limit for your DB2 subsystem in order to maintain the depth of your DBAT pool, because as high-performance DBATs increase in number, pooled DBATs decrease in number (unless you've upped your MAXDBAT value to compensate for this effect). Boosting MAXDBAT in a DB2 10 (or 11) environment is OK, as thread-related virtual storage in such an environment is almost entirely above the 2 GB "bar" in the DBM1 address space (assuming that packages have been bound or rebound with DB2 at the Version 10 or 11 level). Of course, you need real storage to back virtual storage, so if you increase the MAXDBAT value keep an eye on the z/OS LPAR's demand paging rate and make sure that this doesn't get out of hand (if the demand paging rate is in the low single digits or less per second, it's not out of hand).