Monday, April 30, 2018

Db2 for z/OS Buffer Pools - Time to Give AUTOSIZE(YES) a Try?

Db2 9 for z/OS became generally available back in 2007. In the years since, most of the features introduced with Db2 9 have been broadly put to use. An interesting exception is the auto-sizing of buffer pools, implemented via the AUTOSIZE(YES) option of the -ALTER BUFFERPOOL command. Based on my experience, it seems that few Db2 for z/OS sites have gone with buffer pool auto-sizing. I find that a bit perplexing. In the Db2 for Linux/UNIX/Windows user community, a feature called self-tuning memory (which applies to buffer pools and more) appears to be quite popular. Why the seeming reluctance to leverage buffer pool auto-sizing in Db2 for z/OS systems?

I'm thinking that we may be seeing a situation not unlike the one that existed back in the 90s and early 2000s vis-a-vis SMS-management of Db2 for z/OS data sets (SMS is short for System-Managed Storage, a capability provided by the z/OS operating system). Db2 management of table space and index data sets, enabled through a specification of USING STOGROUP (instead of USING VCAT) in a CREATE TABLESPACE or CREATE INDEX statement (or in an ALTER of an existing object), had become commonplace, but many Db2 DBAs and systems programmers were reluctant to take the additional step of letting SMS take care of data set placement within a disk subsystem. Folks were fine with Db2 issuing the Access Method Services commands that were needed to create data sets (what you get with USING STOGROUP), but letting the system handle data set placement was another matter. The thought of having VOLUMES('*') in a CREATE STOGROUP statement, which would hand data set placement responsibility over to SMS, made lots of Db2 systems people uneasy - they just didn't trust z/OS to get that done right; so, folks would create STOGROUPs that were associated with particular disk volumes (e.g., CREATE STOGROUP SGPRD01 VOLUMES(ABC123), CREATE STOGROUP SGPRD02 VOLUMES(XYZ139), etc.), and they would then assign individual Db2 data sets to this STOGROUP or that one - in effect, hand-placing the data sets within a disk subsystem - so as to ensure that the data sets in question went on different disk volumes. "I want to make sure that partition 1 of partitioned table space TS1 doesn't go on the same volume as partition 2 of that table space," they'd say; or, "I want to make sure that index IX1 goes on a volume other than the one holding the table on which IX1 is defined."
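
To make the contrast concrete, here is a minimal sketch of the two approaches in DDL form (the STOGROUP names, volume serials, and VCAT high-level qualifier are illustrative, not from any real system):

-- Hand-placement approach: one STOGROUP per disk volume
CREATE STOGROUP SGPRD01 VOLUMES(ABC123) VCAT DB2PRD;
CREATE STOGROUP SGPRD02 VOLUMES(XYZ139) VCAT DB2PRD;

-- SMS-managed approach: VOLUMES('*') hands data set placement to SMS
CREATE STOGROUP SGSMS01 VOLUMES('*') VCAT DB2PRD;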

This very hands-on approach to Db2 data set placement started to become a major burden on DBAs and systems programmers as Db2 databases mushroomed in size and the number of data sets associated with Db2 databases shot up. Something had to give, and so some Db2 systems people - not without trepidation - decided to let SMS do a job that had become too big to handle via the old hand-placement approach. I had recently started work in the IT department of a Db2 for z/OS-using organization, back in 2000 (between my two IBM stints), when our lead z/OS systems programmer told me that she was going to go with SMS-management of Db2 data sets in production. She just hoped that the performance penalty wouldn't be too severe. Guess what? At our site, and at other sites, SMS management of Db2 data sets worked just fine. Performance didn't suffer, and tons of Db2 administrator time got freed up for use in higher-value activities. Within a short time, SMS management of Db2 data sets became the norm. Key to that development: the old approach (hand-placement of Db2 data sets within disk subsystems) became increasingly untenable, and technology advances made old worries concerning the new approach (SMS management of Db2 data sets) largely moot (z/OS and IBM Z and disk storage improvements made contention at the spinning-disk level much less of a concern than it had been in the past).

Today, a Db2 administration practice that ought to be reexamined is micro-management of buffer pools. Historically, people have sliced and diced buffer pool memory usage in a very fine-grained way. With ongoing growth in the size and number of Db2 subsystems that individual organizations use, and with Db2 for z/OS administration teams being, typically, small in terms of headcount, the time needed to effectively manage Db2 buffer pool sizing is, more and more, just not there. Turning that task over to Db2 (and the z/OS Workload Manager) is an increasingly attractive alternative. At the same time, technology changes have made buffer pool micro-management less worthy of major time commitment: IBM Z memory resources are getting to be huge by recent standards, and as z/OS LPAR real storage sizes soar, super-fine tuning of the use of that space for Db2 buffer pools becomes less and less of an imperative. Not only that, but many a Db2 administrator's mind-set, regarding appropriate buffer pool sizing, has fallen behind the reality of the current hardware situation. What I mean by that: a 10-GB buffer pool configuration (aggregate size of all pools allocated for a subsystem) may seem large in a Db2 administrator's eyes, given historical standards, but in an LPAR with 100+ GB of memory (pretty commonplace these days), that 10 GB looks dinky. It's like z/OS is saying to these people, "Hey, there are a lot of gigabytes in this system that are just lying around, doing nothing. I can see them. Give me and Db2 the keys to the buffer pool sizing car, and we'll put those sleeping gigabytes to work, and application performance will be the better for it. And don't worry about my overloading my own memory resource - I'm not going to do that. I know how to manage what I've got. Give me a chance."

In what I think might have been conceived as an enticement aimed at Db2 people who look askance at system-managed buffer pool sizing, Db2 11 provided "floor" (VPSIZEMIN) and "ceiling" (VPSIZEMAX) options for the -ALTER BUFFERPOOL command: "OK, Db2, I'll turn on auto-sizing for BP10, but I'm not going to let you make the pool smaller than X or larger than Y." I'd say, give that a shot. You can do that for several of your buffer pools (even all of them), or just one. Say you have a z/OS LPAR with lots and lots of memory (when organizations upgrade a mainframe, they often load it with buckets of additional real storage, due in no small part to attractive memory pricing), and a demand paging rate that doesn't vary from zero, even during the busiest processing times of the day (the demand paging rate for an LPAR is available, among other places, in an RMF CPU Summary Report). Say you have in this system a Db2 for z/OS buffer pool (I'll stick with BP10) that is sized at 100,000 buffers. Consider issuing a command of this nature:

-ALTER BUFFERPOOL(BP10) VPSIZE(100000) 
  VPSIZEMIN(75000) VPSIZEMAX(150000)
  ...
  AUTOSIZE(YES)

Let the workload go for a while, and periodically issue -DISPLAY BUFFERPOOL(BP10) DETAIL to see what's happening with the pool. Even better, have "before" and "after" activity data for the pool, for the same hour of the same day (e.g., 10-11 AM on a Thursday), from either a Db2 monitor-generated statistics report - long or the output of -DISPLAY BUFFERPOOL(BP10) DETAIL (if you go with the command for activity data, and you want data from 10-11 AM, issue the command at 10 AM and again at 11 AM, and you'll have activity data for that 1-hour period in the output of the second issuance of the command). My expectation would be that things will be fine, and some performance metrics might move in a positive direction. And, the system's demand paging rate will likely still be zero. Next, maybe open up auto-sizing for other pools in your system, again with floor and ceiling values with which you're comfortable. And, go back to a pool for which you previously activated auto-sizing, and, if the system-adjusted pool size is larger than its initial size, try another -ALTER BUFFERPOOL with a larger VPSIZE and larger VPSIZEMIN and VPSIZEMAX values, to let the system take the pool size further in the direction that the system has determined is best for overall performance. Over time, you may well become not only comfortable, but happy, with Db2 and z/OS handling sizing for all of a subsystem's buffer pools.
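
If you go the command route, the "before and after" sequence is simply the same command issued twice, an hour apart (BP10 and the 10-11 AM window are just the example values used above):

-DISPLAY BUFFERPOOL(BP10) DETAIL     <- issued at 10 AM (establishes the start of the interval)
-DISPLAY BUFFERPOOL(BP10) DETAIL     <- issued at 11 AM (output reflects activity since the 10 AM command)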

And, just in case you hear a little voice in your head, saying, "Careful, buddy. You spend a pretty good bit of time on buffer pool sizing. If you let the system handle that then you'll be sitting around twiddling your thumbs, and that's not good," tell that voice to shut up. Did you run out of things to do when you let SMS manage Db2 data set placement? Or when you no longer had to manage index page locking because Db2 stopped doing that kind of locking? Or when you could quit managing the package table part of the EDM pool because Db2 quit using EDM pool space for allocating packages to threads? Or when you let Db2 manage secondary disk space allocation for table space and index data sets with MGEXTSZ=YES in ZPARM and Db2's "sliding scale" algorithm? Of course not. In each and every case in which Db2 and/or z/OS took on tasks you used to do (or Db2 stopped doing things that you used to have to work to manage), you ended up staying plenty busy, with a different mix - a higher value mix - of work. System-managed buffer pool sizing offers another opportunity to move your work mix up the value chain. I think the time's right for more people to take that step. Maybe it's your time.

Monday, April 2, 2018

Another Db2 for z/OS Blog You Should Check Out

If you're reading this, it's probably because you've found blogs to be a useful source of technical Db2 for z/OS information. I want to direct your attention to one of the newer kids on the Db2 blog block: Db2 for z/OS News from the Lab.

What makes the Db2 for z/OS News from the Lab blog particularly noteworthy? It's written by people who are part of the IBM Db2 for z/OS development organization - people who call the IBM Silicon Valley Lab home base. These folks, needless to say, know Db2 for z/OS very well. Additionally, these people interact with Db2 for z/OS users all the time, and those interactions provide inspiration for blog entries that can clarify and highlight various facets of the Db2 whole.

So, get wise and get the inside scoop by visiting the Db2 for z/OS News from the Lab blog. Even better, become part of the blog's community - information on doing that can be found in one of the blog's entries. Read and learn!

Wednesday, March 28, 2018

Db2 for z/OS: CACHEPAC is Another ZPARM You Should Check

Sometimes, the default value of a Db2 ZPARM parameter is changed to reflect new realities of Db2 workloads and of systems on which Db2 for z/OS runs. When that happens, a Db2-using organization may stick with the old default value for the ZPARM, and going that route can have a negative impact on the performance of the Db2 system. Case in point: CACHEPAC, the subject of this blog entry.

I've blogged before about ZPARMs that are regularly set to sub-optimal values. I've also posted entries (part 1, part 2, and part 3) on boosting Db2's use of memory so as to improve CPU efficiency for a Db2 workload. CACHEPAC is related to both of those topics.

What is the purpose of CACHEPAC? It specifies the amount of virtual storage, in the Db2 database services address space (aka DBM1), that will be used to cache package authorization information (like, what authorization IDs have the EXECUTE privilege on package XYZ). Is caching that package authorization information important? Yes. Why? Volume - as in, volume of requests for package execution.

Get yourself a statistics report - long generated by your Db2 monitor (depending on the monitor product used, that might be called a statistics detail report), and check the section of the report that contains figures for EDM pool-related activity (alternatively, look at an online display of EDM pool activity provided by your Db2 monitor). Look for a field with a label like PT REQUESTS. PT is short for package table, and this field shows the frequency of requests to execute packages (or sections of packages). For a production Db2 subsystem, you might see hundreds, or even thousands, of requests to execute package sections per second. Having to check the SYSIBM.SYSPACKAUTH table in the catalog for each and every one of those package execution requests would be a drag on performance, for sure; thus, the in-memory package authorization cache specified via CACHEPAC in ZPARM.
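
There's no formal sizing formula tied to these numbers, but if you want a rough, back-of-the-envelope sense of how much package authorization information your subsystem carries, a simple catalog query (my suggestion, not anything from the Db2 documentation) can provide a data point:

-- Rough gauge of package authorization volume (illustrative, not an official sizing method)
SELECT COUNT(*) AS PKG_AUTH_ROWS,
       COUNT(DISTINCT GRANTEE) AS DISTINCT_GRANTEES
  FROM SYSIBM.SYSPACKAUTH;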

It was with Db2 10 for z/OS that the default value for CACHEPAC changed, and changed pretty dramatically. In a Db2 9 environment, the default value for CACHEPAC was 100 KB. With Db2 10, the CACHEPAC default value changed to 5 MB - 50 times larger than before. Why? I suspect the change was made by the Db2 for z/OS development team for several reasons. For one thing, Db2 systems have ever-larger numbers of users and applications (that's more authorization IDs needing package-related privileges), and ever-larger numbers of packages (further upping space needs for package authorization information). Additionally, z/OS LPAR memory resources are getting bigger all the time (hundreds of GB of memory for one z/OS LPAR is not unusual these days), and with more available memory "real estate," why not use more of it for package authorization information caching, to enhance system performance?

Along with Db2 10's boost of the CACHEPAC default value, the maximum value for the parameter was increased - to 10 MB, from 5 MB.

OK, so it's not uncommon to see a Db2-using organization stick with the old default value for a ZPARM when a new default comes along. This might be due, in some cases, to inertia as much as anything. Trouble is, a 100 KB package authorization cache may prove to be WAY smaller than it should be for a Db2 workload that's large by today's standards. I saw a situation of that nature quite recently, in looking over a Db2 monitor statistics detail report sent to me by the Db2 team at a company with which I was working. The report showed that the package authorization cache for a Db2 system that was having occasional performance problems was severely undersized. How could I see that? Easy. In such a report, find a section with a heading like AUTHORIZATION MANAGEMENT. In that section of the report, find the field labeled something like AUTH UNSUCC-PKG-CACHE. That is where you'll see how often a package authorization check could NOT be successfully performed using the package authorization cache. What would you LIKE to see here? Maybe a few package authorization checks per second (in the single digits, or less, per second) that could not be serviced using information in the package authorization cache. What did I see in that report I just referenced? 1098 package authorization requests per second that were not successfully performed via the package authorization cache. Why was this so? It was so because, on a busy Db2 system doing a lot of work for applications, the package authorization cache size, per CACHEPAC in ZPARM, was too small - probably way too small.

So, here's what you should do: use your Db2 monitor to generate a statistics report - long (or, if you can't do that, check an online display of subsystem statistics provided by the Db2 monitor), and find in that report the AUTHORIZATION MANAGEMENT section. In that section, find the field labeled something like AUTH UNSUCC-PKG-CACHE (the same field mentioned above). What's the frequency of package authorization checks NOT accomplished by way of the package authorization cache? If that number is in the low single digits (or less) per second, good. If it's up in the double digits per second or higher, the value of CACHEPAC in ZPARM should be increased. If the frequency of package authorization requests not successfully performed using the package authorization cache is really high, like north of 100 per second, CACHEPAC should probably be substantially increased.

What is the value of CACHEPAC for your production Db2 subsystems? If it is less than the default (since Db2 10) of 5 MB, it should probably be increased to 5 MB. If it is at 5 MB and you still see a lot of package authorization requests NOT successfully performed using the package authorization cache, CACHEPAC should probably have a value greater than 5 MB (remember, that value can go up to 10 MB).
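
As for actually changing the value: CACHEPAC is a parameter of the DSN6SPRM macro, assembled by way of the DSNTIJUZ job (check your own DSNTIJUZ for the exact layout - what follows is just a schematic reminder of the keyword and its units, which are bytes):

CACHEPAC=10485760     (10485760 bytes = 10 MB, the maximum; 5242880 = the 5 MB default)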

Do you have the memory resources on your z/OS LPAR to support an increase in the CACHEPAC value? Probably (if you can't use a few more MB of memory for Db2 without putting the z/OS LPAR's real storage resource under an undesirably high level of pressure, the LPAR likely has a too-small memory allocation). How can you tell if the LPAR's memory is stressed? Check (or have a z/OS systems programmer check) the LPAR's demand paging rate in an RMF (or equivalent z/OS monitor) CPU Summary Report. If the LPAR's demand paging rate is zero, memory is not overly stressed and you should be able to use more for the Db2 package authorization cache (if a CACHEPAC size increase is seen as being needed). I'd also be comfortable with a CACHEPAC size increase if the associated LPAR's demand paging rate were a very small non-zero value (i.e., less than 1 per second). If the LPAR's demand paging rate is 2-3 per second or higher, I'd like to see more memory allocated to the LPAR before substantially increasing CACHEPAC (or, if more memory can't be provided for the LPAR, consider reducing some of Db2's memory use until the demand paging rate is less than 1 per second, and THEN make CACHEPAC larger if it is too small and failing to satisfy a lot of package authorization requests per second).

And that's that. When most all package authorization checks can be successfully accomplished using information in the Db2 package authorization cache, it's a good thing. If you're not there for one or more of your production Db2 subsystems, try to get there.

Monday, February 26, 2018

Db2 for z/OS: DDF, zIIP Engines, and SMT2

SQL statements executed by way of the Db2 for z/OS distributed data facility (DDF) are not, of course, the only workload that uses zIIP MIPS, but in my experience this does tend to be the main driver of zIIP utilization in a z/OS system on which Db2 runs. I posted an entry to this blog, a few years ago, on the importance of avoiding zIIP engine contention. In that blog entry, I described how to calculate the "zIIP spill-over ratio" for a Db2 DDF workload (the percentage of zIIP-eligible work that ends up being processed by general-purpose engines), and pointed out that this ratio should be less than 5% for optimal performance and CPU cost-efficiency (less than 1% is outstanding). That rule of thumb raises the question, "How 'hot' can I run my zIIP engine(s) while avoiding a higher-than-desired zIIP spill-over ratio?" The answer to that question, as pointed out in the blog entry, depends on the number of zIIP engines available to the z/OS LPAR in which the Db2 subsystem of interest is running: the more zIIP engines an LPAR has, the higher the rate at which they can be utilized without overly-high zIIP spill-over becoming an issue. In the entry I'm writing now, I want to supplement the blog entry just referenced with information pertaining to an IBM Z server processor technology called simultaneous multithreading, or SMT (or SMT2, as explained below).
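
To make the ratio concrete, here is the calculation in a nutshell, with made-up numbers purely for illustration:

zIIP spill-over ratio = (CPU time of zIIP-eligible work that ran on general-purpose engines)
                        / (total CPU time of zIIP-eligible work)

Example: 3 CPU seconds on general-purpose engines out of 100 zIIP-eligible CPU seconds
         3 / 100 = 3% (within the less-than-5% guideline, though not in "outstanding" territory)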

SMT2, introduced with the IBM z13 line of mainframe servers and available as well with the newer z14, refers to a technology whereby two processes (thus the 2 in SMT2) can be active at the same time on one "engine" (i.e., one core). You can find more information about SMT2 in an entry on the z13 that I posted to this blog a few years ago.

One thing that SMT2 affects is the look of an IBM RMF CPU Activity Report (that report, by the way, is very useful when it comes to understanding the z/OS environment in which a Db2 subsystem operates). One section of an RMF CPU Activity Report shows the average utilization (usually over a 15-minute interval of time) of the engines available to a z/OS LPAR. For a system running on an IBM Z server with SMT2 enabled, that section of the report might look something like this:


---CPU---    ---------------- TIME % ----------------
NUM  TYPE    ONLINE    LPAR BUSY    MVS BUSY   PARKED
 0    CP     100.00     4.28         4.15        0.00
 1    CP     100.00     1.41         1.42        0.00
TOTAL/AVERAGE           2.84         2.79           
 2    IIP    100.00    76.08        75.23        0.00
                                    71.13        0.00
 6    IIP    100.00    67.62        66.22        0.00
                                    64.94        0.00

TOTAL/AVERAGE          71.85        69.38

Here are some things to notice about the above report snippet: first, look at the numbers in the "MVS BUSY" column. See how there are two such numbers for each physical zIIP engine (labeled as type "IIP")? Know what that means? It means that SMT2 is active for those zIIP engines. Could you see two MVS BUSY numbers for one general-purpose engine (labeled as type "CP")? No. Why? Because SMT2 can be activated for zIIP engines and also for IFL engines (engines dedicated to Linux systems running on an IBM Z server), but not for general-purpose engines (more on "can be activated" in a moment).

A second thing that can look different in an RMF CPU Activity Report, when SMT2 is active for a system's zIIP engines, is the section, just below the above-referenced CPU utilization section, that shows information about "in-and-ready" tasks and how often these tasks had to wait for dispatch because all of the LPAR's engines were busy. That section of the report might look like this (this snippet and the CPU utilization snippet shown above are from the same RMF report):


-----------------------DISTRIBUTION OF IN-READY WORK UNIT QUEUE
 NUMBER OF              0    10   20   30   40   50   60   70
 WORK UNITS     (%)     |....|....|....|....|....|....|....|...
                                            
<=  N          40.3     >>>>>>>>>>>>>>>>>>>>>
 =  N +   1     5.5     >>>
 =  N +   2    52.3     >>>>>>>>>>>>>>>>>>>>>>>>>>>
 =  N +   3     0.9     >
<=  N +   5     0.4     >
<=  N +  10     0.0
<=  N +  15     0.0
<=  N +  20     0.0
<=  N +  30     0.1     >
<=  N +  40     0.0
<=  N +  60     0.0
<=  N +  80     0.0
<=  N + 100     0.0
<=  N + 120     0.0
<=  N + 150     0.0
>   N + 150     0.0

N = NUMBER OF PROCESSORS ONLINE (6.0 ON AVG)

See that last line, the one that tells you the value of "N"? Know what "N" is? It's the number of engines available to the LPAR for which the report was run (if you see a non-integer value for N, such as 8.5, it means that the LPAR has part of an engine, or parts of several engines, available to it -- that's indicated by non-zero values in the PARKED column of the CPU utilization section of the report referenced previously). Note that in this case, N = 6, even though the LPAR has, as we saw earlier, four physical engines -- two general-purpose engines and two zIIP engines. N = 6 because two tasks can be dispatched to each of the two zIIP engines at one time (so, a total of 6 tasks can be executing at one time: two on the two general-purpose engines and four on the two zIIP engines, each of which can handle two tasks simultaneously).

Know what else can look different with SMT2 activated for your zIIPs, versus SMT2 not being activated for those engines? The zIIP spill-over ratio. Here's what I mean by that: suppose you had a Db2 for z/OS subsystem, with a significant DDF workload, running on a z13 or z14 mainframe for which SMT2 had not been activated for the zIIP engines. Suppose that on that z/OS system, with X number of zIIP engines, the zIIP spill-over ratio was Y. If you subsequently activated SMT2 for the zIIP engines, what would be the effect on the zIIP spill-over ratio for the Db2 DDF workload? I think it's pretty safe to say that the zIIP spill-over ratio would be less than Y (i.e., less than what it had been with SMT2 not activated for the zIIP engines).

How substantial is the effect of activating SMT2 for zIIP engines? Is it like turning two zIIP engines into four zIIP engines? No. Why? Because the speed at which a given task is executed is somewhat reduced when that task is dispatched to an engine for which SMT2 has been activated, versus the same engine with SMT2 not activated. Generally speaking, when SMT2 has been activated for a zIIP engine, that engine is capable of processing about 40% more work than before; so, enabling SMT2 for two zIIP engines is roughly akin to turning them into 2 X 1.4 = 2.8 engines, capacity-wise.

Now, I keep referring to SMT2 "being activated" for a zIIP engine. That tells you that a zIIP (on a z13 or z14 server) can run in either "uni-thread" or SMT2 mode. Is there any reason NOT to activate SMT2 for a system's zIIP engines? About the only scenario possibly favoring "uni-thread" mode for zIIPs that comes to my mind is one involving a zIIP-eligible process that is single-threaded and batch-like in nature, and which requires the fastest possible single-thread zIIP performance. For a workload that is primarily transactional in nature (true for most DDF workloads I've seen), I would lean strongly towards activating SMT2 for the zIIP engines of the associated z/OS LPAR. The expected result should be greater throughput and less in the way of zIIP-eligible work spilling over to general-purpose engines.
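
In case you're wondering what "activating SMT2" involves on the z/OS side: it's a systems programmer action, not a Db2 one. As I recall (verify with your z/OS systems programmer and the z/OS documentation - the specifics below are from memory, not something covered in this entry), it boils down to a couple of parmlib-level specifications:

LOADxx:   PROCVIEW CORE       (prerequisite: z/OS manages processors at the core level)
IEAOPTxx: MT_ZIIP_MODE=2      (two threads per zIIP core; MT_ZIIP_MODE=1 is "uni-thread" mode)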

Sunday, January 21, 2018

Db2 12 for z/OS and Insert Algorithm 2: Faster, More-Efficient INSERTs than Ever Before

Near the end of the blog entry I most recently posted prior to this one, I mentioned that a specification of MEMBER CLUSTER for a Db2 for z/OS universal table space could help maximize INSERT throughput and efficiency, in situations where the importance of that objective outweighs that of keeping data rows physically ordered by the target table's clustering key. I also noted that MEMBER CLUSTER sets you up to take advantage of a Db2 12 for z/OS enhancement called (rather cryptically) insert algorithm 2.

So, what's insert algorithm 2? I'll tell you, but before getting to the "what" of this Db2 12 feature, I want to spend a little time on the "why."

I recall seeing, way back in the mid 1980s, not long after the introduction of what was then called Db2 for MVS (the fruit of an IBM research effort that led to the invention of relational database technology), an IBM "red book" about the software product. This document cautioned users against thinking of Db2 as a database management system suitable for high-volume batch and online transactional workloads: Db2 Version 1 Release 1 was intended to support business intelligence-type queries, and was not designed to provide the performance organizations generally required for operational, "run the business" applications.

Well, that advice was widely ignored in the user community. Relational technology and the SQL data-interface language (also invented by IBM) delivered such an advance in programmer productivity versus traditional data stores that companies jumped on Db2 as a foundation for operational applications, in spite of the initial performance challenges. IBM responded to this development by providing, in successive versions and releases of Db2, features and functions that vastly boosted the ability of the DBMS to, among other things, ingest large numbers of new data records in compressed time frames. What were the bottlenecks holding back INSERT throughput and efficiency? Log processing? Index maintenance? Lock contention? Latch contention? Instruction path length? Client-server network management? Smart people at the IBM Silicon Valley Lab kept flattening these INSERT speed bumps, until organizations were routinely able to drive thousands of inserts per second into Db2 tables. And still, in some cases, that wasn't fast enough.

So we come to the Db2 11 for z/OS time frame, and the question is, how do you take Db2 INSERT throughput from really high to super-duper high? You (if you're in the Db2 development organization) do what you've always done: you identify what's holding throughput back in the current technology environment. It was determined that in a Db2 11 system, a key INSERT bottleneck involved the search for space, in pages of a target table, to hold incoming rows. How do you address that? Here, a page was taken from the Db2 playbook of long ago: asynchronous, anticipatory processing is the ticket to success.

Long ago, read I/O was seen as a major drag on Db2 performance, were it to be always handled in the traditional, synchronous way (i.e., read a single page from disk into memory on-demand, at the time a program needed the contents of the page). The solution was something that, today, we can scarcely imagine Db2 being without: prefetch. Through prefetch, Db2 reads batches of table space and/or index pages into memory, in anticipation that the pages will be requested by a process that seems likely to scan a significant number of pages. Because prefetch reads are anticipatory in nature, application elapsed time is reduced: when a prefetch-driving application process requests a page, it is likely that the page has already been read into a Db2 buffer pool in memory, so there is no need to suspend the application's task in order to perform a single-page, on-demand read. Because prefetch reads are asynchronous in nature, application elapsed time is further reduced: prefetch reads are executed by way of Db2 tasks (specifically, by preemptable SRBs in the Db2 database services address space), so application tasks can remain "on task" with regard to processing data (versus having to bring data into memory).

Apply those same concepts - anticipatory, and asynchronous - to an insert-intensive application process, and - presto! - you have a good understanding of the essential nature of insert algorithm 2.

The "how" of enabling insert algorithm 2 is largely table space-based - it can be used only for inserts into universal table spaces that are defined with the MEMBER CLUSTER option (that option can be specified in a CREATE TABLESPACE or an ALTER TABLESPACE statement). What is additionally required is telling Db2 that insert algorithm 2 is to be used for universal, MEMBER CLUSTER table spaces. That can be done at a Db2 subsystem level (at function level V12R1M500 or above) by way of a new (with Db2 12) ZPARM parameter, called DEFAULT_INSERT_ALGORITHM. When the value of that parameter is set to 2 (the default value), insert algorithm 2 will be automatically used for universal table spaces defined with MEMBER CLUSTER (unless that behavior is overridden at the table space level - more on that to come). When DEFAULT_INSERT_ALGORITHM is set to 1, insert processing will be done in the traditional way for universal MEMBER CLUSTER table spaces, unless insert algorithm 2 has been explicitly enabled for a given table space (again, table space-level control over insert algorithm 2 will be covered momentarily). When DEFAULT_INSERT_ALGORITHM is set to 0, insert algorithm 2 functionality is disabled for the subsystem - it cannot be used for any table space.

OK, table space-level control: if the ZPARM parameter DEFAULT_INSERT_ALGORITHM is set to 2 (meaning, insert algorithm 2 is used by default for universal MEMBER CLUSTER table spaces), you can de-activate the functionality (i.e., go with traditional insert processing) for an individual table space by specifying INSERT ALGORITHM 1 for the table space (either at CREATE TABLESPACE time, or via ALTER for an existing MEMBER CLUSTER universal table space). Similarly, if the ZPARM is set to 1 (traditional insert processing will be in effect, by default, for MEMBER CLUSTER universal table spaces), and you want to override that value for a particular MEMBER CLUSTER universal table space (i.e., you want insert algorithm 2 to be in effect for that table space), you can specify INSERT ALGORITHM 2 for the table space (again, either in a CREATE TABLESPACE or an ALTER TABLESPACE statement). Note that a value of 0 has a different meaning for the table space specification versus the ZPARM. For the DEFAULT_INSERT_ALGORITHM parameter in ZPARM, a value of 0 means that insert algorithm 2 functionality is disabled for the Db2 subsystem. For a MEMBER CLUSTER universal table space, INSERT ALGORITHM 0 (the default value for CREATE TABLESPACE, and a specification that is valid only for CREATE TABLESPACE) means that the insert algorithm that will be in effect for the table space is the one specified via DEFAULT_INSERT_ALGORITHM in ZPARM.
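
Here is a small sketch of the table space-level controls just described (the database and table space names are made up, and both statements assume universal table spaces defined with MEMBER CLUSTER):

-- Use insert algorithm 2 for this table space, even if the ZPARM default is 1
ALTER TABLESPACE DBAPP1.TSORDERS INSERT ALGORITHM 2;

-- Use traditional insert processing for this table space, even if the ZPARM default is 2
ALTER TABLESPACE DBAPP1.TSAUDIT INSERT ALGORITHM 1;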

Let's say that insert algorithm 2 is in effect for a MEMBER CLUSTER universal table space, and an insert process targeting the table in that table space gets going. What happens? Here's what happens: Db2 provides a "pipe" (an in-memory area) that will be used to supply the insert process (or processes - several can use one pipe) with pages, belonging to the partition receiving new rows, that have space to hold new rows (if several partitions of a table space are receiving new rows, each partition will have its own fast-insert pipe). If row-inserting processes are taking pages from the pipe, what's putting pages into the pipe? That would be a Db2 task that puts insert-ready pages into the pipe, in an asynchronous manner (that is, in a way that does not interrupt the tasks of inserting processes, as a Db2 prefetch task does not interfere with the tasks of page-accessing programs).

What this auto-filled page pipe approach does is remove, for an inserting process, the need to find pages of a partition that have space enough to hold new rows. Even though the space search mechanism used for traditional insert processing is quite efficient (especially for segmented table spaces, thanks to the relatively precise "space available" information maintained in space map pages - and all universal table spaces are segmented), insert throughput can be expected to increase when an inserting process can simply pull "insert space available" pages from a fast-insert pipe versus having to locate such pages itself.

How much will insert throughput increase when insert algorithm 2 versus algorithm 1 (traditional insert processing) is used, all other things being equal? You probably know the answer to that question: it depends. In some cases, insert throughput might increase by a few percentage points. In other cases, the throughput improvement could be substantially greater (some tests have shown a throughput increase in the vicinity of 25%, and even larger increases have been observed in other test scenarios). CPU efficiency gains can also be quite varied - in some cases, in-Db2 CPU time (aka "class 2" CPU time) may decrease only slightly for an inserting process, while in other cases in-Db2 CPU efficiency might improve by 10%, or closer to 25%. Why the variance in throughput and CPU savings for inserting processes? One factor is indexes: generally speaking, the fewer the number of indexes defined on a target table, the greater the performance improvement delivered by insert algorithm 2 will be (the feature is used for inserts of rows into table pages, not for the insert of entries into index pages). The other main factor is the presence of bottlenecks other than row-accommodating-page-search that might "bubble to the surface" and constrain the performance of an insert process, when insert algorithm 2 is in effect. For example, consider a case in which inserts are coming from network-attached processes (i.e., DDF-using applications) that are running on servers that are physically separate from the Db2 subsystem, versus the same processes running in Linux systems on the IBM Z server that also houses the Db2 for z/OS subsystem. In the latter case, the greater network throughput that can be available thanks to IBM HiperSocket technology (memory-to-memory communications links) might result in a comparatively larger performance improvement with insert algorithm 2 in effect, because a more-pronounced network-related bottleneck might constrain the fast-insert improvement achieved with insert processes running on "boxes" that are physically separated from the Db2 server (to put this another way: when insert algorithm 2 removes one insert performance bottleneck, another bottleneck could come into play).

So, your mileage may vary, as they say.

How about the memory impact of insert algorithm 2? Those page-supplying pipes occupy some space, don't they? Yes, but a fast-insert pipe should take up less space than a data-compression dictionary (remember that every partition of every compressed table space has an associated compression dictionary), and at many sites there will be quite a few more COMPRESS YES table spaces than table spaces for which insert algorithm 2 is in effect (recall that insert algorithm 2 is only applicable to universal table spaces defined with MEMBER CLUSTER). Translation: if the real storage resource of a z/OS system were under a good bit of pressure (i.e., if the system's demand paging rate were in the mid-single-digits per second or higher), I might want to add memory to that system before making extensive use of insert algorithm 2; otherwise, you are likely good to go from a memory perspective.

IBM keeps boosting the performance of Z servers. Taking full advantage of these hardware enhancements depends in large part on software design, and Db2 12's insert algorithm 2 is part of that story. With the new insert algorithm in effect, insert rates of hundreds of thousands of rows per second have been observed in tests of Db2 12 systems (in some tests involving tables on which no indexes are defined, insert rates exceeding 1 million per second have been achieved). Does that kind of throughput require a substantial amount of CPU processing capacity? Yes. The insert-algorithm 2 value proposition is this: if you have Z horsepower to apply to an insert throughput challenge, Db2 12 lets you put it to work like never before.

Sunday, December 31, 2017

Db2 for z/OS: Opportunities for CPU Efficiency When Data Clustering is not a Priority

Not long ago, a veteran Db2 for z/OS DBA contacted me with an interesting question. He had, in the system he supported, a table with a clustering index, and he wanted to know how he could get Db2 to act, with regard to INSERT operations and also with regard to executions of the REORG utility, as though the table had no clustering index. In other words, the DBA had a table for which INSERT and REORG CPU efficiency mattered more than having rows physically ordered per a clustering key. That's understandable. There are certainly situations in which the "locality of reference" provided by a clustering index is not of significant value.

You might think to yourself, "This is not a question that's hard to answer - just issue an ALTER for the table's clustering index, with a NOT CLUSTER specification. Problem solved." Eh, not so fast. ALTER INDEX with NOT CLUSTER is a useful means of changing a table's explicit clustering index from IX1 (or whatever it's called) to IX2 (in that case, an ALTER INDEX IX1 NOT CLUSTER would be followed by an ALTER INDEX IX2 CLUSTER). ALTER INDEX with NOT CLUSTER is not a good way to cause a table to not have a clustering index, because if a table has any index then it will always have a clustering index.
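
In other words, the statement pair that ALTER INDEX ... NOT CLUSTER is really meant for looks like this (index names as in the example just given):

ALTER INDEX IX1 NOT CLUSTER;   -- IX1 is no longer the explicit clustering index
ALTER INDEX IX2 CLUSTER;       -- IX2 becomes the table's explicit clustering index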

"Hold on, there," some might say. "If IX1 is the clustering index for table T1, and I do an ALTER of IX1 with NOT CLUSTER and do not subsequently alter another index on T1 with a CLUSTER specification, table T1 has no clustering index, right?" Wrong. Given that scenario, T1 does not have an explicit clustering index. It has a clustering index of the implicit variety. And what index is that? Index IX1. "But wait," you might counter, "I just altered IX1 with NOT CLUSTER. How can IX1 still be the table's clustering index?" Because the rules of the Db2 for z/OS game are as follows: when the index that had been table T1's explicit clustering index is altered with NOT CLUSTER, that index will continue to be T1's clustering index - albeit in an implicit sense - until such time as another index on T1 is made the table's explicit clustering index by being altered with a CLUSTER specification."

"OK," says you, "The DBA should drop and re-create the table's indexes, without giving any of them the CLUSTER designation. Presto: a table with no clustering index." Wrong again (with all due respect), says I. The table still has an implicit clustering index. Which index will that be? The first one created for the table (so, if IX2 and IX3 and IX1 are created, in that order, on table T1, and CLUSTER was not specified for any of those indexes, IX2 will be T1's implicit clustering index).

"So, if a Db2 for z/OS table has any indexes at all, it will always have a clustering index?" Yes. I said that a few paragraphs ago, and I meant it. I said it a few years ago in an entry in this blog.

How, then, do you get Db2 to act as though there were no clustering index on T1, with respect to INSERT and/or REORG processing, if clustering for T1 is not a priority for you? I'll first address the REORG part of that question, then the INSERT part.

Until quite recently, you COULDN'T tell Db2 to ignore a table's clustering index when REORGing the associated table space - that index was going to be in the picture, one way or another. All you could do was tell Db2 how the clustering index would affect REORG processing: either the table's data would be unloaded by REORG in the order prescribed by the table's clustering index, or it would be sorted in clustering sequence after having been unloaded in "as-is" sequence (the latter approach is the utility's default - the former is used when REORG is executed with the SORTDATA NO option specified). Db2 11 for z/OS delivered an enhancement whereby REORG can be executed with no regard for a table's clustering index. That enhancement will be in effect if REORG is run with SORTDATA NO in combination with the new (with Db2 11) option RECLUSTER NO. With SORTDATA NO and RECLUSTER NO specified, REORG will run as quickly and efficiently as possible when re-clustering is not the motivation for executing REORG. And when might data re-clustering NOT be a motivation for running REORG? How about when REORG is executed with the DISCARD option to remove a lot of rows from a table in a very CPU-efficient manner (as was the case for the DBA I mentioned in the opening part of this blog entry)? How about when REORG is run to materialize a pending DDL change for a table space, such as a change in page or segment size (and each release of Db2 for z/OS adds things to the list of what can be non-disruptively changed for a table or table space or index via pending DDL - changes effected with an ALTER followed by an online REORG)?
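
Here's what a REORG that skips re-clustering could look like as a utility control statement (the object name and the SHRLEVEL choice are illustrative):

REORG TABLESPACE DBAPP1.TSORDERS
  SORTDATA NO RECLUSTER NO
  SHRLEVEL CHANGE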

OK, so REORG can be executed in a way that does not take a table's clustering key into account. How about INSERTs? Can they be executed without regard to a table's clustering index? The answer to that question is definitely, "Yes." How is that done? Pretty easy: specify APPEND YES when you create the table (or alter an existing table with APPEND YES). On top of that, if the table is in a universal table space, MEMBER CLUSTER YES should be in effect (through either specification of MEMBER CLUSTER YES in the CREATE TABLESPACE statement, or with MEMBER CLUSTER YES specified for an ALTER of an existing table space).
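
A quick sketch of those two changes for an existing table and its universal table space (names are made up; and note that, if I remember correctly, the MEMBER CLUSTER alteration is a pending change that is materialized by a subsequent online REORG):

ALTER TABLE ORDERS_LOG APPEND YES;
ALTER TABLESPACE DBAPP1.TSORDLOG MEMBER CLUSTER YES;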

And here's an added bonus associated with MEMBER CLUSTER YES: it sets you up to exploit the new, ultra-high-efficiency and ultra-high-throughput insert algorithm introduced with Db2 12 for z/OS - an enhancement known as "Insert Algorithm 2." How does that work? I'll let you know in the next entry I write for this blog, which I expect to post in January (and I mean January 2018 - happy new year, folks).

In the spirit of "tell 'em what you told 'em," here's the quick reiteration of the main points made in this blog entry:

  • Any Db2 for z/OS table with at least one index has a clustering index, period.
  • If you want to run REORG as efficiently as you can, and re-clustering data in a table is not important for you, run REORG with SORTDATA NO and RECLUSTER NO.
  • If you want to maximize the efficiency of INSERT operations, and row-clustering is not important for you, use APPEND YES for the table in question (and MEMBER CLUSTER YES for the associated table space, if that table space is of the universal variety).
On to 2018!