Friday, December 28, 2018

Db2 for z/OS Partitioned Table Spaces: How the Number of Partitions can Impact Performance

This past summer, I delivered a couple of presentations at a meeting of a regional Db2 users group in the USA. During the lunch break, I shared a table with a member of the Db2 for z/OS support team at a large financial institution. He told me an interesting story. At his site, the Db2 for z/OS team had decided that universal partition-by-range (PBR) would be a better table space type for a large table currently housed in a partition-by-growth table space (PBG). To prove out this assessment, the team created a copy of the table of interest in a PBR table space, and ran some batch jobs that targeted the two variants of the table - the same jobs (some that changed data, others that only read data), accessing data that was identical in a logical sense (i.e., same data records in the tables in the two different table spaces). The performance characteristics of the PBG-accessing and the PBR-accessing jobs were compared.

The findings? The PBG-accessing jobs showed consistently better elapsed and CPU times versus the PBR-accessing jobs. That was not a result I would have expected. I've blogged before about reasons for favoring PBR over PBG for large Db2 tables, and some of those reasons are performance-related. Would I have been surprised by comparative results showing roughly equivalent performance for PBG-accessing and PBR-accessing jobs? Maybe not. But better performance for the PBG-accessing jobs? That had me scratching my head.

I asked the Db2 person at the user group meeting to send me Db2 performance monitor accounting "long" reports capturing activity for the PBR-accessing and PBG-accessing jobs, and he did that not long after getting back to his workplace. Analysis was focused on the package-level accounting data for a particular package that accessed only the table with the PBR and PBG variants (other packages associated with the batch jobs used for comparison accessed the PBR-and-PBG-variant table and several other related tables as well). Sure enough, for the "this table only" package the CPU time was about 6% higher for the PBR-housed table versus the PBG-housed table. Not a huge difference, but statistically significant and, as previously mentioned, unexpectedly favoring the PBG table space.

I started asking questions, looking for a possible explanation for the better performance result seen for the PBG-accessing package:

  • Was the SEGSIZE specification the same for both table spaces (a larger SEGSIZE value can result in better CPU-efficiency for large-scale page scans)? Yes, same SEGSIZE for both table spaces.
  • Was the PBR table space in fact universal, as opposed to being a non-universal table-controlled partitioned table space (I wanted to make sure that we were comparing universal to universal)? Yes, the PBR table space was indeed of the universal variety.
  • Were the PBG-housed and PBR-housed tables clustered by the same key (clustering affects locality of reference for set-level query and update and delete operations)? Yes, both variants of the table had the same clustering key.
  • Were there indexes defined on the same keys of both tables (obviously, indexes have a lot to do with query access paths, and they affect the CPU cost of insert and delete and some update operations)? For both the PBG-housed and the PBR-housed table, there was only one index, and it was defined on the same key.
  • Were the PCTFREE and FREEPAGE specifications the same for the one index on the two table variants (a significant difference here could affect index GETPAGE counts)? Yes, for the one index defined on the same key of both table variants, the PCTFREE and FREEPAGE values were identical.
  • Was the one index on the PBR-housed table a partitioned index or a non-partitioned index (NPI)? It was a partitioned index, and by the way the PBR table space had 318 partitions (me, on getting that piece of information: "Hmmmm").

OK, so what prompted my "Hmmmm" on learning the number of the PBR table space's partitions? That number got me thinking, "318 is a pretty substantial number of data sets. Depending on the number of data sets associated with the PBG table space, that could be an important differentiating factor between the two." Indeed, it turned out that the PBG table space had far fewer data sets versus its PBR cousin. How could that influence CPU time for an application process? In this case, a clue came from a number in the aforementioned Db2 monitor accounting reports: the commit count for the batch process that accessed the table of interest: 204,004 over the 4-hour, 50-minute reporting interval (a little over 700 per minute). What does that have to do with the job's CPU time? Well, when the RELEASE specification for a package is COMMIT (the default), any and all "parent" locks acquired in executing the package will be released every time the application process driving the package's execution issues a commit (the package itself is also released from the application process's thread at each commit). We tend to think of "parent" locks as being table space-level locks, but in fact when a table space is partitioned the parent locks are at the partition level. If the application process's SQL statements are such that a large number of partitions are accessed in each unit of work, that means a lot of partition-level locks are acquired within each unit of work and released at commit time. That parent lock acquisition and release cost can be yet a bit higher in a data sharing environment (and such was the case in the situation I'm describing), because a significant percentage of global locks tend to be of the parent type.

I relayed to the Db2 support person my thinking about the impact that the number of PBR versus PBG partitions might have on the application process's CPU time. He got together with the larger Db2 team at his site, and they decided to repartition the PBR table space in a way that dropped the number of partitions from 318 to 58, and voila - the application process's CPU time dropped to a level that was several percentage points below that seen for the PBG table space. Success!

Now, it's important to note that in this particular situation, a re-partitioning of the PBR table space was not a problematic change. Suppose you have a situation like the one I've described here, and re-partitioning your PBR table space is not a viable option? In that case, the in-Db2 CPU cost of an application process (likely a batch job) that accesses a large number of a table space's partitions within each unit of work could be reduced (especially if the job issues a large number of commits, resulting in a large number of units of work) via a rebinding of the associated Db2 package(s) with RELEASE(DEALLOCATE). That package bind option would cause the partition-level locks acquired as the batch job progresses to be retained until thread deallocation time (i.e., until end-of-job). That, in turn, would eliminate the overhead of releasing many partition-level locks at each commit (which locks would likely be re-acquired within the next unit of work), thereby reducing CPU time (I posted an entry on RELEASE(DEALLOCATE) considerations to this blog a few years ago - the part under the heading "Batch" is most relevant to the blog entry you're reading now). [Note: table space- and partition-level locks are almost always of the "intent" variety. Such locks are non-exclusive in nature, so retaining them across commits should not be a concern from a concurrency perspective.]

The bottom-line message here is NOT that you should always go for fewer rather than more partitions for a partitioned table space; rather, it is that the number of a table space's partitions can be a factor that affects CPU efficiency for some applications. A type of application for which this can matter is one that a) uses RELEASE(COMMIT) packages, b) is characterized by many units of work in the life of a thread, and c) accesses a large percentage of a table space's partitions in a typical unit of work. For such an application, CPU efficiency could potentially be improved by going with a smaller rather than a larger number of partitions for the target table space. When significant partition reduction is not an option for a table space, CPU time for an application process such as the one described in this blog entry could potentially be reduced by rebinding the application's Db2 packages with the RELEASE(DEALLOCATE) option.

Friday, November 30, 2018

Db2 for z/OS Global Variables: What is a "Session"?

Has this ever happened to you? You're looking at the Db2 for z/OS SQL Reference (or in the Db2 for z/OS Knowledge Center on the Web), and you see something like this (and the quote below is from the description of the CREATE VARIABLE statement in the SQL Reference, with highlighting added by me):

Global variables have a session scope. Although they are available for use to all sessions that are active at the current server, the value of the global variable is private for each session.

You think to yourself, "OK. That's good to know." And then, "But wait - what's a 'session'?" You scour the Db2 documentation for the definition of "session", and you come up empty. You're left scratching your head. "Is a 'session' a thread? A transaction? Something else?" I am writing today to remove this particular cause of head-scratching (your scalp can thank me later). I will start by giving you the short answer, and then I'll elaborate.

Short answer: the meaning of "session", in a Db2 for z/OS context, depends on the nature of an application's connection to a Db2 server. For "local-to-Db2" applications (meaning, applications that execute in the same z/OS LPAR as the target Db2 system, and are not network-connected to Db2 - examples include CICS transactions and batch jobs), "session" equates to "thread" (referring to the Db2 thread used by the application). For DRDA requesters (applications that access Db2 for z/OS via the Db2 DDF address space, using the DRDA protocol by way of an IBM driver such as the IBM Data Server Driver or Db2 Connect), "session" equates to the logical connection an application has established with the Db2 server. For a REST client (an application interacting with a Db2 system via the REST interface to the DDF address space), "session" equates to transaction.

You may wonder, "Why these differences between local-to-Db2, DRDA requesters and REST clients, with regard to the Db2 meaning of 'session'?" Read on for explanatory information.

Local-to-Db2 applications

Let's start with an easy case: a Db2-accessing batch job (and I'm not talking about a Java batch job that could use a type 4 JDBC driver and therefore be network-connected to Db2, from Db2's perspective). This job will get a Db2 thread at the first issuance of a SQL statement, and that thread will persist until end-of-job. The job might issue multiple COMMITs, but it's always the same thread that's being used for the life of the job. If the batch job involves use of a Db2 global variable, the initial value of the global variable will be its default value (for a user-created Db2 global variable, the default value will be null unless a different default was specified when the variable was created). If the job sets the value of the global variable to X, that will be the global variable's value for the duration of the job, unless it is subsequently changed by the job to some other value. If another, concurrently executing batch job sets the same global variable to value Y, the other batch job sees Y in the global variable, not X, because the two batch jobs are associated with two different Db2 sessions. Pretty simple.

How about a CICS or an IMS transaction? Still simple, absent thread reuse (I'll get to the thread reuse situation momentarily). The transaction, like a batch job, gets a Db2 thread at the first issuance of a SQL statement, and the thread persists until end-of-transaction. If the transaction involves use of a Db2 global variable, the global variable will initially have its default value. If the transaction sets the global variable to some value, that value will persist, unless subsequently changed by the transaction, until the transaction completes processing and its Db2 thread is deallocated. If two concurrently executing transactions set the value of the same Db2 global variable to X and Y, respectively, the one transaction will see the value X in the global variable and the other transaction will see the value Y in the global variable, because the two transactions are associated with two different Db2 sessions. Again, pretty simple.

Things get a little more interesting in the case of thread reuse by CICS or IMS transactions. Thread reuse is good for transaction CPU efficiency (especially when paired with Db2 packages bound with RELEASE(DEALLOCATE)), but it changes the effective meaning of "session" in a Db2 for z/OS context. Why? Because absent thread reuse, there is a one-to-one correspondence between transactions and Db2 threads. In a transactional thread reuse situation, there is a many-to-one relationship between transactions and threads - in other words, multiple transactions reuse the same CICS-Db2 or IMS-Db2 thread (that's the point of thread reuse). What does this mean for a given transaction program? Well, let's say that one transaction sets the value of Db2 global variable GVAR to X. The transaction completes, and another transaction reuses the same Db2 thread. With regard to that second transaction, what will be the initial value of the Db2 global variable GVAR? Will it be GVAR's default value? NO - it will be X, the value placed in GVAR by the transaction that previously used the thread. Why is this so? Because the scope of a Db2 global variable is a session, and for a local-to-Db2 application, "session" equates to "thread", and in a thread reuse situation multiple transactions will use a particular thread. The moral to this story: if you have a transaction program that puts sensitive information in a Db2 global variable, and this is (or could be) a thread reuse situation, you'd better make sure that you re-set the global variable to its default value before end-of-transaction; otherwise, a transaction that subsequently uses the same thread will be able to see the sensitive information your transaction placed in the global variable.

DRDA requesters

This situation gets really interesting, because the whole application-thread relationship is quite a bit different versus the CICS-Db2 or IMS-Db2 case. A DRDA client application will establish a connection to a Db2 for z/OS server, and very typically the application will stay connected to the Db2 system for a considerable period of time (and it may establish several connections to the Db2 server). Getting more specific, it is the "logical connection" that matters here - the connection that the application perceives that it has with the Db2 server. I say "logical connection" because physical connectivity to the Db2 server involves things like "transports" provided by the IBM Data Server Driver (or Db2 Connect) - something that makes connection pooling possible through the servicing of n logical connections to a Db2 server with fewer-than-n physical connections (connection pooling is good for client-server application scalability and efficiency).

How about threads? Well, a given logical connection between a DRDA client application will likely involve use of any number of Db2 threads. How so? Consider the "regular" DBAT case (DBATs - database access threads - are the kind used for applications accessing Db2 via the DDF address space, and by "regular" I mean DBATs that are not of the high-performance variety). A DRDA client application has established a logical connection to a Db2 server, and a transaction uses that connection. To service the transaction, Db2 takes the application's logical connection from the inactive state (its state when not being used by a transaction) to the active state, and assigns to the now-active connection a DBAT from the DBAT pool. The transaction completes, the DBAT is separated from the connection and goes back to the DBAT pool, and the connection goes back to the inactive state until it's needed for another transaction.

OK, how about the high-performance DBAT case? A high-performance DBAT is instantiated when a Db2 package bound with RELEASE(DEALLOCATE) is allocated to a "regular" DBAT for execution (in other words, the "regular" DBAT becomes a high-performance DBAT in that situation). Once instantiated, the high-performance DBAT will NOT go back into the DBAT pool at end-of-transaction; instead, it will remain dedicated to the connection through which it was instantiated (a connection identified with an application server IP address and an ephemeral port), and will be reused, up to 200 times, by transactions associated with the connection (thus, high-performance DBATs deliver performance benefits similar to those provided by CICS-Db2 protected entry threads and threads between IMS wait-for-input (WFI) and pseudo-WFI regions and a Db2 system).

In a Db2 data sharing system, in which something called Sysplex workload balancing is active (and it's active by default with the IBM Data Server Driver and Db2 Connect), the situation gets even more interesting: a DRDA client application establishes a connection to the Db2 data sharing group (at least it should, versus connecting to an individual group member), and different transactions associated with the same logical connection to the Db2 system end up executing on different members of the data sharing group, using DBATs (regular and/or high-performance) provided by the individual member Db2 subsystems.

I've provided, above, "behind the curtain" information on the workings of Db2 for z/OS as a DRDA server, and now I'll tell you that all this DBAT stuff is essentially irrelevant to the meaning of "session" in a DRDA requester context. The stuff that Db2 for z/OS does with regular and high-performance DBATs, and with active and inactive connections, is invisible to a DRDA client application. It's kind of like the difference between logical and physical database design: a Db2 table looks the same to an application program, regardless of whether it's in a traditional segmented or a universal partition-by-growth or a range-partitioned (universal or otherwise) table space. That's Db2 plumbing. Because a given logical connection to a Db2 server is what it is from the DRDA client application perspective, regardless of the number and/or type of DBATs used to service the transactions associated with the connection, it absolutely would not make sense to equate "session" with "thread" in this case. If a DRDA transaction using a logical connection to a Db2 server sets the value of Db2 global variable GVAR to X, that value will remain X (unless subsequently changed) for the life of the logical connection. In that sense, the situation is similar to the one described above for CICS-Db2 or IMS-Db2 thread reuse: if you do not want the value placed in a global variable by a DRDA transaction to be seen by a subsequent transaction associated with the same logical connection to the Db2 system, reset the value of the global variable before end-of-transaction.

If two transactions associated with two different logical connections between a DRDA client application (or applications) and a Db2 system place the values X and Y, respectively, in the same Db2 global variable, the one transaction will see X in the global variable and the other will see Y, because the scope of a global variable is a session, and two different logical connections to a Db2 server system are two different sessions.

REST clients

This one's pretty easy: for a client transaction using Db2's native REST interface, "session" equates to transaction. Period. Why so comparatively simple? Because REST is a so-called "stateless" model. Using the REST interface, each and every interaction with Db2 gets a "logical fresh start" on the Db2 side, and that means, among other things, automatic resetting of any global variable set by a REST-using transaction back to its default value at end-of-transaction. If Db2 global variable GVAR is set to X by a REST-using transaction, a subsequent REST-using transaction will not see X in GVAR at transaction start time - it will see GVAR's default value. If two different, concurrently executing REST-using transactions put X and Y, respectively, in the same Db2 global variable, the one transaction will see X in the global variable and the other will see Y, because the scope of a global variable is a session, and each and every REST-using transaction is associated with a different Db2 session.

In summary...

Keep the simple answer to the Db2 session question in mind:
  • For local-to-Db2, non-DDF-using applications: session = thread
  • For DRDA client applications: session = logical connection to Db2 server
  • For REST clients: session = transaction
I hope that this information will be helpful to you, and that you can take "Db2 session" off of your head-scratcher list.

Wednesday, October 31, 2018

Db2 for z/OS Buffer Pools: Clearing the Air Regarding PREFETCH DISABLED - NO READ ENGINE

Has this ever happened to you? You're looking at the output of a Db2 for z/OS -DISPLAY BUFFERPOOL(ACTIVE) DETAIL command, or maybe at a statistics report produced by your Db2 monitor, or maybe your Db2 monitor's online display of a subsystem's buffer pool activity, and you see for one of your buffer pools something like this:

PREF.DISABLED-NO READ ENG   350.00

And you think, "Uh-oh. Prefetch is a good thing, so disabled prefetch must be a bad thing. And it happened 350 times for BP2 in one hour! AHHHHH! CODE RED! WE'RE OUT OF READ ENGINES!"

Hey - take a few deep breaths, and calm down. Chances are, this is not a big deal. In this blog entry, I'll explain why.


First: what does it mean?

OK, a lot of people know that prefetch operations - sequential, list, and dynamic - are handled by Db2 tasks. That is to say, the CPU  cost of processing prefetch requests is charged to Db2 (specifically, to the Db2 database services address space, also known as DBM1), as opposed to being charged to the application process on behalf of which Db2 is executing the prefetch operations. We sometimes refer to Db2's prefetch-handling tasks as "prefetch read engines." They are represented in a z/OS system by what are known as preempt-able SRBs (service request blocks - a type of z/OS control block), and that's what makes the work that they do zIIP-eligible (100% zIIP-eligible, in fact - a major reason why, in most systems, the bulk of CPU time charged to DBM1 is zIIP engine time).

There are - surprise! - a finite number of prefetch read engines associated with a given Db2 subsystem. That number is 600 in a Db2 11 environment, and 900 in a Db2 12 system. If all 600 Db2 11 prefetch read engines (or all 900 Db2 12 engines) are busy handling prefetch requests, and another prefetch request comes along, that prefetch request will be abandoned and the PREFETCH DISABLED - NO READ ENGINE counter will be incremented (externalized via the QBSTREE field of trace record IFCID 0002, written when Db2 statistics trace class 1 is active, and also part of the output of the Db2 command -DISPLAY BUFFERPOOL with DETAIL). And what does it mean when a prefetch request is abandoned because there was not a prefetch read engine available to process the request? It could mean that pages that would have been brought into memory via the abandoned prefetch request will subsequently be read into a buffer pool by way of a synchronous (i.e., single-page, on-demand) read operation, but that is not necessarily the case. Read on.


Second: do you care?

One thing to know up front: if you see a non-zero value for PREFETCH DISABLED - NO READ ENGINE for one or more of your Db2 buffer pools, you're not alone. This situation is not super-common, but neither is it highly unusual. In fact, we've been seeing more incidences lately of non-zero values for PREFETCH DISABLED - NO READ ENGINE. Why? Several reasons. One is the trend of Db2 data sharing group member consolidation. As IBM Z servers have become more powerful and as Db2's vertical scalability has been enhanced, organizations running Db2 in data sharing mode on Parallel Sysplex clusters have found that they can support an application workload with (for example) 4 members in a data sharing group versus 6. When work that had been spread across n data sharing members now runs on a smaller number of members, that can mean more concurrently active prefetch read requests for a given member, and that can lead in some cases to prefetch read requests exceeding the number of prefetch read engines available on a subsystem. Other factors driving increased levels of prefetch read activity include Db2 query optimizer changes that drive, in particular, more list prefetch-related access path selection; the I/O parallelism for index updates that Db2 10 introduced; and row-level sequential detection, also introduced with Db2 10, which enables Db2 to continue to use dynamic prefetch even as a table space's data rows become somewhat disorganized as a result of data-change activity.

So, if you see non-zero numbers for PREFETCH DISABLED - NO READ ENGINE for one or more of your buffer pools, should you try to do something about it? Maybe, maybe not. First, check a related counter: PREFETCH DISABLED - NO BUFFER. If that value is also non-zero, it's likely that the buffer pool in question is too small, and/or that VPSEQT (the virtual pool sequential threshold) has been changed from its default value of 80 to a too-small percentage of the pool's buffers. Respond to that situation by enlarging the pool, if the z/OS system's real storage resource is not constrained (system memory is not constrained if the z/OS LPAR's demand paging rate - available, among other places, in an IBM RMF CPU summary report - is zero or a very small non-zero value, like less than 1 per second), and/or by taking the pool's VPSEQT value from a too-small number (if applicable) to something closer to 80. Also consider enlarging the buffer pool if the pool's total read I/O rate is in the hundreds or more per second. How can a larger buffer pool reduce incidences of PREFETCH DISABLED - NO READ ENGINE? Here's how: a larger buffer pool that caches more pages makes it more likely that a prefetch request will NOT drive an associated prefetch read I/O operation. Examine information for your buffer pools, using a Db2 monitor or the -DISPLAY BUFFERPOOL(ACTIVE) DETAIL command. Look at (for example) the dynamic prefetch numbers for the various buffer pools. You will likely see, for at least one of your pools, a situation in which the number of dynamic prefetch requests is larger - perhaps considerably larger - than the associated number of dynamic prefetch reads. How can that be? Simple: let's say that there's a dynamic prefetch request to read 32 pages of a table space or index into memory. What if all 32 of those pages are ALREADY IN MEMORY? In that case, the dynamic prefetch request will not result in a dynamic prefetch read operation. Now, the dynamic prefetch request will still occupy a prefetch read engine, but the request will tie up the prefetch read engine for a much shorter time if it does not involve a prefetch read I/O operation. When prefetch read engines are freed up more quickly thanks to prefetch requests that do not drive prefetch reads, it becomes less likely that all of a Db2 subsystem's prefetch read engines will be busy at the same time.

Now, let's say that you have a buffer pool that is large enough to have a relatively low total read I/O rate, and you see for that pool no occurrences of PREFETCH DISABLED - NO BUFFER, but you still see a non-zero value for PREFETCH DISABLED - NO READ ENGINE. Is that a cause for concern? Quite possibly not. To understand this, consider a scenario: some application process is accessing a table space or index in a sequential fashion. A dynamic prefetch request (for example) is initiated by Db2 on behalf of the application process, but all of the subsystem's prefetch read engines are busy handling other requests. The dynamic prefetch request is abandoned. That will lead to synchronous read requests, right? Not necessarily. Suppose the aforementioned application process (or another process accessing the same database object) needs a page that would have been read into memory via the dynamic prefetch request that was abandoned due to a "no read engine" situation. Will a synchronous read operation be required for that page access? NO, if that page is already in the buffer pool. If the requested page is indeed already in memory, there was essentially NO penalty associated with the abandoned dynamic prefetch request - only the very small amount of CPU consumption associated with initiating and subsequently abandoning the prefetch request (so small that I think it would be hard to measure). Here's something you can check, using your Db2 monitor or the output of -DISPLAY BUFFERPOOL(ACTIVE) DETAIL: for a buffer pool that shows, at least sometimes, some occurrences of PREFETCH DISABLED - NO READ ENGINE, look at the total number of synchronous read I/Os, and look also at the total number of synchronous read I/Os that are labeled SYNCHRONOUS READS - SEQUENTIAL. What is a "sequential synchronous read?" It's a synchronous read I/O driven by a process that is accessing data in a sequential fashion. Why might there be synchronous reads for a process accessing data in a sequential manner? One reason: a data row (for example) that is to be FETCHed by a process is way out of place, in a clustering sense. That being the case, the row is not in the quantity of table space pages recently prefetched into memory on behalf of the application process. When the process needs the out-of-place row, the associated page will be synchronously read into the buffer pool, and that action will be recorded as a SYNCHRONOUS READ - SEQUENTIAL. Another reason: an application process accessing data in a sequential fashion has to drive a synchronous read I/O because a page that would have been prefetched into memory was not because the prefetch request was abandoned due to NO READ ENGINE.

If you see PREFETCH DISABLED - NO READ ENGINE go from zero to non-zero for a pool, or if that number goes from n to something considerably larger than n, and sequential synchronous reads do not increase as a percentage of total synchronous reads for the buffer pool, it is likely that the incidences of PREFETCH DISABLED - NO READ ENGINE did not materially impact application performance. In other words, it is likely that pages related to prefetch requests that were abandoned due to NO READ ENGINE were found to be already in memory when subsequently needed by application processes. In that case, occurrences of PREFETCH DISABLED - NO READ ENGINE are really nothing about which you should be concerned.

One more thing: if you have a buffer pool that is already big enough to have a relatively low total read I/O rate, and the pool has nothing in the way of PREFETCH DISABLED - NO BUFFER, and you see for this pool some occurrences of PREFETCH DISABLED - NO READ ENGINE, and you'd really like to try to take that to zero, you could look at the possibility of taking some objects in the pool that are a) not humongous and b) are frequently accessed via prefetch, and reassigning them to a pool defined with PGSTEAL(NONE). If the PGSTEAL(NONE) buffer pool is large enough to hold all pages of all objects assigned to the pool (and that's the objective for a PGSTEAL(NONE) pool), there should be few, if any, prefetch requests associated with the pool, and that decrease in prefetch request activity might reduce the incidence of PREFETCH DISABLED - NO READ ENGINE you've seen for other pools. Something to consider.

So, don't freak out if you see some occurrences of PREFETCH DISABLED - NO READ ENGINE for a buffer pool. It may be having little - if any - impact on application performance. If you do think that non-zero values for PREFETCH DISABLED - NO READ ENGINE are leading to increased synchronous read activity (check for increases in sequential synchronous reads), consider enlarging the buffer pool in question, and maybe consider moving some frequently-accessed, small- to medium-sized objects to one or more PGSTEAL(NONE) pools. No need to panic, in any case. 

Friday, September 21, 2018

The Two Paths to the Db2 for z/OS Distributed Data Facility

It's been almost two years since the general availability date of Db2 12 for z/OS and, by way of that new version, the availability of Db2's native REST interface (retrofitted to Db2 11 for z/OS via APARs PI66828 and PI70477). While the REST interface to Db2 is relatively new, the foundation on which it was built is not: Db2's native REST interface is an extension of the Db2 distributed data facility, also known as DDF. DDF has been around for more than 25 years, providing data services for high-volume, operational, mission-critical applications deployed by organizations in all kinds of industries, all over the world. Given the industrial-strength underpinnings of Db2's REST interface, it's important that this interface be seen as, essentially, a second path into DDF for applications that access Db2 for z/OS via TCP/IP connections. That's the focus of this blog entry

The two paths into DDF

If the REST interface is one path to DDF, what is the other? Why, the SQL path, of course. Hearing that, a Db2 for z/OS DBA or systems programmer might ask, "Oh, you mean the DRDA path?" Yes, that's technically what I mean, but I think that it would be a good idea to use the words "SQL path" instead of "DRDA path," because the latter term is not relevant to a lot of application developers. DRDA - distributed relational database architecture - is Db2's distributed database protocol, but a client-side developer doesn't see DRDA. The developer sees SQL statements, perhaps in JDBC or ODBC form. Those JDBC or ODBC statements are processed by the IBM Data Server Driver (or Db2 Connect), and are in DRDA form when they get to Db2 for z/OS. So, if an application with a TCP/IP connection to a Db2 for z/OS system wants to access Db2-managed data, it can do that by issuing SQL statements (which, again, might be in the form of JDBC or ODBC statements) or by issuing REST calls.

On the topic of the SQL path to DDF, for a long time that path involved going through a Db2 Connect "gateway" server en route to the target Db2 for z/OS system. Over the past several years, we (IBM) have been recommending to Db2 for z/OS-using organizations that they move away from Db2 Connect gateway servers in favor of the IBM Data Server Driver, which runs on the same server as an application program that is utilizing the SQL path (over TCP/IP) to Db2. The IBM Data Server Driver is lighter weight than Db2 Connect, it has the functionality you want (things like connection pooling, Sysplex workload balancing, etc.), and it delivers performance and system management benefits (both largely owing to the fact that the "hop" formerly required to a Db2 Connect gateway server is eliminated - you go straight in to the Db2 for z/OS system from the application server). Note that the IBM Data Server Driver is not licensed as such: your entitlement to use the IBM Data Server Driver is through your Db2 Connect license. If you have, for example, a Db2 Connect Unlimited Edition for System z license for a Db2 for z/OS system, you are entitled to deploy the IBM Data Server Driver in an unlimited fashion for applications that target said Db2 for z/OS system.

As there are two paths to DDF - SQL and REST - there are also two ways a client program can access Db2's native REST interface: the program could directly access that interface, or get to it through z/OS Connect (when Db2's REST interface is accessed by a client program through z/OS Connect, we say that Db2 for z/OS is acting as a REST provider for z/OS Connect). Given that a client program using the REST path to DDF will be getting to Db2's REST interface regardless of whether or not z/OS Connect is in the picture, you might wonder why your organization would want z/OS Connect to be in the picture. z/OS Connect does, in fact, add a good bit of value when client-side programmers want to go the RESTful route to access z/OS-based data services. Some of the benefits delivered by z/OS Connect:

  • Client-side programmers can code more-intuitive REST calls. When going directly to Db2's native REST interface, you have to use the HTTP verb POST in your REST calls. That doesn't limit what you can do, Db2-wise, in response to a REST call, but a client-side developer might want to use GET in a REST call if the requested service is of a data-retrieval nature, or PUT if the call will cause some data values to be persisted at the data server. With z/OS Connect in the picture, all of the HTTP verbs are available for use in REST calls.
  • Easier creation of RESTful services from SQL statements. Db2's REST interface enables invocation of a single static SQL statement (which could be a data manipulation statement such as SELECT, INSERT, UPDATE or DELETE; or a CALL to a Db2 stored procedure) by way of a REST call. Db2 enables creation of a RESTful service through the Db2 command BIND SERVICE, which can be issued from a batch job, or through DB2ServiceManager, a Db2-provided RESTful service that creates RESTful services from SQL statements. That command and that service certainly work, but creating RESTful services from static SQL statements is even easier using the GUI tooling that comes with z/OS Connect.
  • Swagger-based service description. Client-side developers like to be able to "discover" the RESTful services that are available from a host system. The RESTful service-creation mechanisms provided by Db2 for z/OS (the aforementioned BIND SERVICE command and the DB2ServiceManager RERSTful service) allow a service-creator to provide a description of a service in the form of a comment (e.g., "This service returns information about a customer based on a provided customer number"). That's helpful, but with z/OS Connect, service information can be provided to client-side developers in Swagger format - Swagger being an industry-standard specification for describing RESTful services.
  • More-comprehensive capabilities around the management, monitoring, securing, and auditing of RESTful services. Db2 for z/OS provides functionality that addresses all of these areas. z/OS Connect enriches and enhances that functionality.
  • A single entry point for REST-enablement of all kinds of z/OS-based programmatic assets. With z/OS Connect, not only can you REST-enable Db2 SQL statements (which, again, could be stored procedure calls) - you can also REST-enable CICS transactions, IMS transactions, WebSphere Application Server for z/OS transactions, and batch jobs.


Which path to DDF? The SQL path, or the REST path?

Keep in mind that this is not an either/or choice for a Db2 for z/OS system. I expect that, going forward, at many sites you'll see a mix of SQL and REST access to Db2 systems. The question is more relevant, then, in the context of appropriateness/attractiveness for a given application that will access Db2-managed data via TCP/IP connections. Here are some factors that might cause you to lean towards the SQL path:

  • You have client-side developers with a lot of SQL coding experience and expertise, and you want to leverage that capability. In particular, lots and lots of client-side developers know JDBC and/or ODBC (and/or ADO.NET) very well.
  • You want to take advantage of scalability and/or workload management capabilities provided by the IBM Data Server Driver (or Db2 Connect). These capabilities include connection pooling and Sysplex workload balancing functionality.
  • You want client-side programs to be able to dynamically form SQL statements that are then sent to Db2 for z/OS for execution. Oftentimes, SQL statements are hard-coded in client-side programs; sometimes, that is not the case. While it would be technically possible to dynamically form a SQL statement and then pass it as input to a REST-enabled Db2 stored procedure for preparation and execution, issuing dynamically formed SQL statements targeting a Db2 for z/OS system is more easily done via the SQL path to DDF.
  • You need or want client-side control over the scope of transactions. If you want a client-side program to be able to do something like issue SQL | issue SQL | issue SQL | commit, the SQL path is the way to go. When using the REST path, every interaction with the server is, from Db2's perspective, a separate unit of work. That's OK for some applications, not OK for others.


How about factors that might cause you to favor use of the REST path to DDF? Among those could be:

  • There is no need for Db2 client code on the application requester side. In some cases, it may not be desireable or feasible to have the IBM Data Server Driver (or Db2 Connect) installed on the client-side application server.
  • There is no need for client-side programmers to know anything about the particulars of a back-end data server. When a client-side programmer codes SQL statements, he or she knows that the back-end data server is a relational database management system (or something that at least looks like a relational DBMS). Maybe the client-side developers for a given application project don't have much in the way of SQL skills; or, maybe they do have SQL skills, but they prefer the high level of back-end system abstraction provided by the REST architectural style (in other words, they like the data-as-a-service programming model).


Next time a development team is getting ready to build a new application that will access Db2 for z/OS-managed data via TCP/IP connections, or when an existing Db2 for z/OS-based application is going to be reengineered along client-server lines, keep in mind that two paths to the Db2 distributed data facility from such applications are available: the SQL path and the REST path. Work with the development team and determine which path would be best for the project at hand.

Wednesday, August 29, 2018

How Big is Big? (2018 Update - Db2 for z/OS Buffer Pools and DDF Activity)

Almost 5 years ago, I posted to this blog an entry on the question, "How big is big?" in a Db2 for z/OS context. Two areas that I covered in that blog entry are buffer pool configuration size and DDF transaction volume. Quite a lot has changed since October 2013, and it's time for a Db2 "How big is big?" update. In particular, I want to pass on some more-current information regarding buffer pool sizing and DDF activity.

How big is big? (buffer pools)

Back in 2013, when I posted the aforementioned blog entry, the largest buffer pool configuration I'd seen (i.e., the aggregate size of all buffer pools allocated for a single Db2 for z/OS subsystem) was 46 GB. That Db2 subsystem ran in a z/OS LPAR with 180 GB of real storage. Fast-forward to August 2018, and my, how z/OS LPAR memory resources - and exploitation of same via large buffer pools - have grown. The biggest buffer pool configuration for a single Db2 subsystem that I've seen to date? How about 879 GB, in a z/OS LPAR that has 1104 GB (almost 1.1 TB) of central storage. A single pool in that configuration has 66,500,000 buffers of 4 KB each - that's over 253 GB of space in one pool. Does that humongous amount of buffer pool space - over 600 GB of which is page-fixed in memory - put undue pressure on the z/OS LPAR's real storage? No. The demand paging rate for that system (a busy data server, processing about 14,000 SQL statements per second during peak times) is a big fat zero. That's because the 225 GB of memory not used for Db2 buffer pools is plenty for the other real storage requirements in the LPAR.

What does the organization with the great big Db2 buffer pool configuration get in return for using lots and lots of memory for data and index page caching? It gets tremendous suppression of disk subsystem read I/Os: I saw that for an hour during which the volume of data access activity on the system was really high, the highest total read I/O rate for any of the buffer pools was 48 per second (very low). During that peak-busy time, two of the other buffer pools had total read I/O rates of 15 and 13 per second (very, very low), five pools had total read I/O rates between 2 and 5 per second (super-low), and the other nine active pools had total read I/O rates of less than 1 per second, or even 0 (super-duper low). And what are the payoffs from tremendous suppression of disk read I/Os? CPU savings (every I/O - synch or asynch - consumes CPU time) and improved transaction and batch job elapsed times (in Db2 monitor accounting-long reports, wait time related to synchronous and asynchronous database reads - the latter is labeled "wait for other read" - becomes a very small percentage of in-Db2 elapsed time).

z/OS LPAR memory sizes are getting larger all the time. If you've got it, use it - and using it for big (maybe really, really big) buffer pools can be a great move on your part. In doing that, don't forget to leverage fixed-in-memory buffer pools, large page frames, and maybe "pinning" pools (the latter are used to cache associated database objects in memory in their entirety, and should be defined with PGSTEAL(NONE)).


How big is big? (DDF transaction volume)

In the above-cited "How big is big" blog entry, I noted that the highest DDF transaction rate I'd seen for a single Db2 subsystem (average over a 1-hour period) was 786 per second. A few months ago, I saw data from a Db2 subsystem that was processing 3057 DDF transactions per second (again, that's the average over a 1-hour period) - almost 4 times the highest DDF transaction rate I'd seen back in 2013. [It's easy to calculate a DDF transaction rate: in a Db2 monitor accounting-long report with data grouped by connection type, in the section on the DRDA connection type, divide the commit count by the number of seconds in the reporting interval, and there's your transaction rate.]

I have seen that an ever-growing percentage of overall Db2 for z/OS workloads - and a really big percentage of new Db2-accessing application workloads - involve Db2 access via the distributed data facility. This, combined with the increasing processing capacity delivered by new generations of IBM Z servers, plus DDF-related performance features such as high-performance DBATs and the SMT2 mode in which zIIP engines can operate, add up to substantial growth in DDF transaction volumes at many Db2 for z/OS sites (the organization with the DDF transaction rate in excess of 3000 per second for a Db2 subsystem runs the zIIP engines in the associated z/OS LPAR in SMT2 mode). DDF transaction rates are likely to get a further boost as companies take advantage of Db2's built-in REST interface, since that interface provides a second path to the Db2 distributed data facility (the other path being the SQL path which could also be called the DRDA path - more on that in the next entry I'll post to this blog).


Big is likely to get bigger

These upward trends, regarding Db2 buffer pool configuration sizes and DDF transaction volumes, are two that I like to see. The former reflects growing recognition that large IBM Z server real storage resources can be very effectively leveraged to turbocharge Db2 application performance, and the latter shows that Db2 for z/OS is an increasingly popular choice as the data server for modern, multi-tiered, client-server applications that access data through standard interfaces (e.g., JDBC, ODBC, ADO.NET, REST). How big will Db2 buffer pool configurations get in the years to come? How high will DDF transaction rates go? We'll see. My take is that big - even really big - is going to get a lot bigger still.

Friday, July 27, 2018

Db2 for z/OS: Using the Profile Tables to Direct DDF Applications to Particular Package Collections

I've posted several entries to this blog on the topic of high-performance DBATs (for example, one from a few years ago covered a MAXDBAT implication related to high-performance DBAT usage). You may well be aware that a "regular" DBAT (a DBAT being a DDF thread) becomes a high-performance DBAT when a package bound with RELEASE(DEALLOCATE) is allocated to the DBAT for execution. How do you get a RELEASE(DEALLOCATE) package allocated to a DBAT? Well, for a DDF-using application that calls Db2 stored procedures, it's easy: bind the packages of frequently-executed stored procedures with RELEASE(DEALLOCATE) and, boom - you get high-performance DBATs when those stored procedures are called. The same goes for packages associated with DDF-using applications that issue static SQL statements from the client side (for example, Java programs that use SQLJ instead of JDBC): bind those packages (the ones that are executed frequently) with RELEASE(DEALLOCATE) and you get high-performance DBATs when the corresponding programs are executed.

Here's the thing, though: there are plenty of DDF-using applications that don't involve execution of static SQL in any form: not in stored procedures, not issued from client-side programs. Can these applications use high-performance DBATs? Sure, they can. Those applications use packages. Which packages? The IBM Data Server Driver packages, of course (these are also referred to as Db2 Connect packages - the Data Server Driver is the more modern version of the Db2 client software). The Data Server Driver packages are bound, by default, into a collection called NULLID. Could they be bound into NULLID with RELEASE(DEALLOCATE)? Yes, but that would NOT be a good idea. Why? Because that would cause all DDF work, by default, to run by way of high-performance DBATs. The optimal set-up is to limit high-performance DBAT usage to your higher-volume DDF-using applications (especially those characterized by transactions that have a low average in-Db2 CPU time - maybe just a few milliseconds). How could that be done, if the NULLID packages are bound with RELEASE(COMMIT)? Easy: you direct the higher-volume DDF applications to a collection, other than NULLID, into which the IBM Data Server Driver packages were bound with RELEASE(DEALLOCATE). And how is that accomplished? Well, for a while the aim was achieved by changing a client-side data source property to point to the alternate Data Server Driver package collection (the one associated with RELEASE(DEALLOCATE) packages) instead of the default NULLID collection. Actually getting the client-side change done? There's the rub: you had to rely on an application server administrator or a client-side programmer to do the deed, and those people were already busy and your change request might not be acted upon - or at least, might not be acted upon in the near future. Oh, for a way to do that Data Server Driver package collection redirect from the Db2 side!

In fact, you can direct a DDF-using application to a collection other than NULLID by way of a  Db2-side action, and that Db2-side capability is the crux of this blog entry.

The solution: you cause the value of a special register to be set automatically for a given DDF application. The special register of interest here is CURRENT PACKAGE PATH, and the means of accomplishing, on the Db2 for z/OS server side, the automatic setting of the special register for a particular DDF-using application is a Db2 profile and an associated profile attribute. I'm talking about the Db2 profile tables, SYSIBM.DSN_PROFILE_TABLE and SYSIBM.PROFILE_ATTRIBUTES. First, the profile: by way of a row inserted into SYSIBM.DSN_PROFILE_TABLE, you identify the DDF-using application for which you want high-performance DBATs to be used. That could be done in several ways, such as through the authorization ID the application uses in connecting to the Db2 for z/OS system. Alternatively, you could identify the application using the IP address (or addresses) of the server(s) on which the client application runs. There are other application identifier options, as well - check them out in the Db2 for z/OS Knowledge Center on the Web. Note that wild cards can be used with DSN_PROFILE_TABLE application-identifier values, so that one profile could cover a set of application server IP addresses, or a set of application-utilized authorization IDs that begin with (for example) the characters CRM (so, the value CRM* in the AUTHID column of DSN_PROFILE_TABLE would apply to authorization IDs CRM01, CRM02, CRM03, etc.).

With the profile row entered into DSN_PROFILE_TABLE, you're ready to set up the attribute that will cause the application of interest to use the IBM Data Server Driver (or Db2 Connect) packages in collection HIPRFCOLL (if that's what you name the collection with RELEASE(DEALLOCATE) packages) instead of those in the NULLID collection. To do that, you insert a row into SYSIBM.DSN_PROFILE_ATTRIBUTES with a value in the PROFILEID column that matches the PROFILEID value in the DSN_PROFILE_TABLE row for the application with which you're working. In that DSN_PROFILE_ATTRIBUTES row, place the value SPECIAL_REGISTER in the KEYWORDS column, and set the value of the ATTRIBUTES1 column to this character sting (again using HIPRFCOLL as the name of the collection into which Data Server Driver packages were bound with RELEASE(DEALLOCATE)):

SET CURRENT PACKAGE PATH = HIPRFCOLL

And you're done: the application associated with the profile will use the IBM Data Server Driver (or Db2 Connect) packages in the HIPRFCOLL collection, and so will utilize high-performance DBATs because the packages in HIPRFCOLL are bound with RELEASE(DEALLOCATE). You can read more about setting special registers via profiles in the Db2 for z/OS Knowledge Center.

Now, I've focused on alternate collections (alternate relative to NULLID, that is) as a means of getting high-performance DBAT functionality for DDF applications that don't otherwise use packages because there is a lot of interest in high-performance DBATs these days. Think, though, about other ways in which NULLID-alternative collections could be useful in your DDF application environment. What if you want certain DDF applications to run with ARCHIVESENSITIVE(NO) behavior, instead of the default ARCHIVESENSITIVE(YES)? What if you want certain DDF applications to run with an isolation level other than the default CURSOR STABILITY? What if you want to use statement concentration (i.e., the automatic parameterization by Db2 of queries that contain literal values in predicates) for certain DDF applications (starting with Db2 12, that can be "turned on" by way of the package bind option CONCENTRATESTMT)? By using profiles to direct DDF applications to Data Server Driver (or Db2 Connect) collections holding packages bound with the desired options and specifications, you can get the application behavior you want, without placing on client-side folks (application server administrators or application developers) the burden of using particular data source properties or coding SET statements for various Db2 special registers (and keep in mind that we're talking here about DDF applications that do not rely completely on Db2 stored procedures, as stored procedures provide you with a great deal of server-side control over application behavior, including package bind specifications).

For a long time, we've been accustomed to NULLID being the one and only collection for the IBM Data Server Driver / Db2 Connect packages, but with many DDF-using applications being of the type that use only (or at least primarily) those packages, and with a growing desire and need to establish behaviors for those applications that can be most readily implemented through Db2-side direction of the applications to collections with packages bound with particular option specifications, it will become more and more commonplace for Db2-using organizations to have multiple collections for the Data Server Driver / Db2 Connect packages, with direction of applications to collections accomplished via Db2 profiles as described above. Remember: in all likelihood it would be best for the packages in your NULLID collection to be bound with standard default specifications, so that applications for which specialized functionality is desired can be selectively directed to the Data Server Driver / Db2 Connect collections containing the packages bound with the options and specifications that can enable the desired functionality and behavior. Using the Db2 profile tables to set the CURRENT PACKAGE PATH special register is a great way to implement server-side direction of DDF applications to Data Server Driver / Db2 Connect package collections. Give this some thought and give it a try if you think it would be useful in your Db2 for z/OS environment.

Tuesday, June 26, 2018

Db2 for z/OS PBG Table Spaces: The Ins and Outs of Partition Addition

In an entry posted to this blog about three years ago, I advocated for putting some thought into the PBG versus PBR decision (partition-by-growth versus partition-by-range) for larger tables (for smaller tables, PBG is often the sensible choice, especially when the associated table space is likely to have only one partition). I pointed out in the aforementioned blog entry several advantages pertaining to PBR table spaces in a large-table context, but acknowledged as well the attractiveness of PBG table spaces from a DBA's perspective (easy to define, pretty easy to administer). In the entry I'm writing today, I want to shed some light on an aspect of PBG table spaces that is a) important, b) not often taken into account, and c) not very well understood by a lot of Db2 people. What am I talking about? Partition addition - in particular, the dynamic and automatic adding of a partition by Db2 to a PBG table space as needed to provide space for more table rows.

Now, you might be thinking, "I'm quite familiar with that mechanism. It's one of the features that I most like about PBG table spaces. When partition n of a PBG table space gets full, Db2 automatically adds partition n+1, to accommodate addition row-insert operations." The thing is, if that's your understanding of how PBG partition addition works then your understanding is a bit off the mark (aside from an exception I'll get to momentarily).

Here's the straight skinny: if a PBG table space has n partitions, Db2 will add partition n+1 not when partition n is full, but when the table space is full. Consider a PBG table space with 100 partitions. Partition 100 gets full. When will partition 101 be added? Partition 101 will be added when Db2 verifies that ALL of the table space's partitions are full: Db2 will check partitions prior to partition 100 - all 99 of them, if needs be - to see if space for a new row can be found. If no space is found in any of the table space's partitions, partition 101 will be added to hold a new row.

If receiving that bit of information has you felling as though you're about to break into a cold sweat, settle down. This doesn't necessarily mean that you've made a mistake in going the PBG route for a large table. Let's consider a couple of scenarios:

  • Table gets inserts, but no deletes. In this case, yes, Db2 will check all partitions of the table space before adding a new one to accommodate a new row, but that space-check will be really fast: with a quick look at the "end" of a partition (referring to the end of the data in the partition), Db2 can determine if there is space available for a new row.
  • Table has a continuously-ascending clustering key. With such a clustering key in effect (every new row has a clustering key value that is greater than that of any pre-existing row), a new row should always go to the end of the table. That can make APPEND YES a viable choice for the table, and when a table in a PBG table space is defined with APPEND YES then Db2 will NOT look in all of the table space's partitions before adding a new partition to accommodate a new row: if partition n is full then partition n+1 will be added, regardless of whether or not space for the row is available in a partition other than partition n (that's the exception to the check-all-partitions rule that I mentioned up in the second paragraph of this entry). Caveat: APPEND YES can lead to an insert "hot spot" at the end of the table, and that could be an issue in a high-volume insert situation. In that case, going with MEMBER CLUSTER for the PBG table space, in addition to APPEND YES for the table, could enable achievement of the desired level of insert throughput.

What if you have a situation that is not like either of the two scenarios described above? What if you have a PBG table space that has, or is likely to eventually have, a large number of partitions? Should you be anxious about the possibly negative performance impact of an all-partition search for space that would precede dynamic addition of a new partition to a PBG table space? I think maybe not. Below I've provided some points to ponder. One or more of them may assuage your worries.

  • Partition size can be a mitigating factor. You have, of course, a number of options with regard to the DSSIZE specification for a PBG table space (DSSIZE indicates the size of the table space's partitions). Why do I bring this up? Because, for a given volume of data rows, a larger DSSIZE means fewer partitions, and as a general rule the backward search for space in previous partitions that precedes the automatic adding of a new partition to accommodate more data will be accomplished more quickly if there are fewer larger partitions versus a greater number of smaller partitions. Now, before jumping to the conclusion that DSSIZE 256G (the maximum value for a PBG table space) is the way to go, consider that there can be challenges in managing larger data sets. That said, it could be that a specification of DSSIZE 32G or 64G could be a better choice than 2G or 4G for some of your larger PGB table spaces.
  • Online REORG can add a partition to a PBG table space before it's needed to accommodate new rows. You may already be aware that a REORG of an entire PBG table space can result in a new partition (or partitions) being added to the end of the table space - that will happen if the reestablishment of free space (e.g., as specified through the table space's PCTFREE value) via REORG requires additional space beyond that in the table space's existing partitions. What you might not know is that Db2 12 for z/OS added a nice enhancement pertaining to REORG of a partition (or partitions) of a PBG table space. The background: prior to Db2 12, if you executed a partition-level REORG for a PBG table space, and the data in a partition would not fit in the partition with free space reestablished by REORG, the utility execution would fail (that's why the ZPARM parameter REORG_IGNORE_FREESPACE was introduced - you could set that parameter's value to YES to cause Db2 to use 0 for PCTFREE and FREEPAGE for a partition-level REORG job). With Db2 12, if a partition of a PBG table space is REORGed and the partition's data won't fit in the partition with free space reestablished, Db2 will add a partition to the table space to hold the overflow rows - the utility job won't fail (and the REORG_IGNORE_FREESPACE parameter is no longer part of the ZPARM module).
  • You can "pre-allocate" some partitions at CREATE TABLESPACE time. Do you think that the NUMPARTS option of CREATE TABLESPACE is only applicable to PBR (partition-by-range) table spaces? If so, time to update your thinking. When you create a PBG table space, NUMPARTS (if specified) tells Db2 how many partitions to initially create for the table space (unless you also specified DEFINE NO in the CREATE TABLESPACE statement); so, with a few partitions available from the get-go, it could be a while before a space constraint would induce Db2 to automatically add another partition for a PBG table space.
  • You can add the "next" partition yourself, with a SQL statement. If you have a situation in which a PBG is approaching "full" status and you're concerned about an "exhaustive backwards partition space search" that would precede automatic addition of a new partition, you can preclude that space search by simply adding the next partition yourself. How would you do that? With an ALTER TABLE ADD PARTITION statement, of course. Oh, you were under the impression that that statement could only be used with a PBR table space? Join the club - I thought that myself, and for quite some time after it was no longer true (ALTER TABLE ADD PARTITION became applicable for PBG table spaces starting with Db2 10 for z/OS). Anyway, you could monitor (using Db2 real-time statistics information) the full-ness of a PBG (or of the last partition of a PBG - particularly in the case of a continuously-ascending partitioning key), and if it looks as though a new partition might soon be needed, add it yourself.

OK, to sum up: if you had assumed that a new partition got added to a PBG table space when the last partition could no longer accommodate a row insert, you need to set that assumption aside (except for the previously mentioned APPEND YES scenario) and understand that an exhaustive backwards search through preceding partitions - potentially through ALL preceding partitions - for row-insert space will occur before a new partition will be automatically added to the table space. Keep in mind that in some cases (especially when there are fewer larger partitions versus lots of smaller ones, and particularly when the table in question hasn't seen row-delete action), that space search will conclude quite rapidly. When the PBG table space has a large number of partitions, and when there have been DELETE as well as INSERT operations targeting the table in the table space, the exhaustive space search preceding automatic addition of a new partition could take longer to complete. In such situations, keep in mind that REORGs (of the entire table space, or - with Db2 12 - of one or more partitions of the table space) can result in added partitions that can hold new rows. Also remember that NUMPARTS can be used to "pre-allocate" partitions at CREATE TABLESPACE time, and that ALTER TABLE ADD PARTITION is a means by which you can add a new partition to a PBG table space at a time of your choosing.

Can PBG table spaces still provide ease-of-use benefits for DBAs, even for very large (i.e., multi-partition) tables? Sure they can. It's just that they are not quite so "set-it-and-forget-it" as you might have thought. With a clear-eyed understanding of how PBG partition addition works, you can take the steps needed to make PBG table spaces work well for you.

Thursday, May 24, 2018

Db2 for z/OS Buffer Pools: Index Root Pages and the Data Management Threshold

Sometimes, a change in the way Db2 for z/OS manages its resources is introduced, and the change "flies under the radar" (i.e, it escapes the notice of most Db2 users) because it is relatively small in terms of scope and impact and is not widely highlighted in the product documentation, and it delivers the intended benefit (e.g., enhances CPU efficiency for data access) and no one thinks about it -- BUT, it may have a consequence that pops up in a surprising way, causing, initially, some confusion on the part of a Db2 administrator before the "what" and "why" of the perplexing situation is understood. In this blog entry, I'll describe a scenario of this nature. It has to do with index root pages and something called the buffer pool data management threshold.

A Db2 for z/OS consultant working in Europe brought this matter to my attention. This consultant's client had, in a test environment, a Db2 for z/OS buffer pool that kept hitting the data management threshold, and the consultant could NOT figure out why this was happening.

Some background: the data management threshold relates to Db2 buffer pools -- it's hit when 95% of the buffers in a pool are non-stealable, with "non-stealable" meaning "in-use or updated-and-not-externalized (i.e., externalized to disk or to a group buffer pool in a Db2 data sharing system)." You don't want the data management threshold to be reached for a buffer pool. Why? Because when DMTH (the acronym that refers to the threshold) is hit for a buffer pool, Db2 will issue a GETPAGE for every row retrieved from a given page in the pool; so, if 20 rows are to be retrieved from a page in a buffer pool that has hit DMTH, Db2 will issue 20 GETPAGE requests instead of one. That, in turn, will drive up overhead for access to data cached in the pool, as GETPAGEs are a major factor with regard to the CPU cost of SQL statement execution.

So, why did the pool in question regularly hitting DMTH perplex the consultant? Because the pool, though not real big, was not tiny (10,000 4K buffers), and the deferred write thresholds were at their default values (30 for DWQT, aka the horizontal deferred write threshold, and 5 for VDWQT, aka the vertical deferred write threshold), and the pool wasn't all that active (about 200 GETPAGEs per second, and just a few read I/O operations per second). With the pool's deferred write thresholds being at their default values and the pool not being particularly busy and not being particularly small, no way should DMTH be getting hit due to a large percentage of the pool's buffers being in an updated-but-not-externalized state. THAT would have to mean that DMTH was being hit because a very large percentage of the pool's buffers were in an "in-use" state. But how could that be, given the relatively non-busy state of the pool (200 GETPAGEs per second is nothing -- I've seen plenty of pools with thousands, or tens of thousands, of GETPAGEs per second).

Digging a little further for answers, the consultant used a Db2 monitor to get a more detailed view of the usage of the buffer pool for objects assigned to it. What he saw left him even more puzzled: the pool had about 10,000 indexes assigned to it (keep in mind, this is a test environment we're talking about), and a little over 9000 of those indexes each had a single "currently in-use" page in the pool. That number of in-use pages explained why DMTH was often being hit for the buffer pool (as I mentioned, buffers holding in-use pages are non-stealable, just as are buffers holding updated-but-not-externalized pages), but how could so many pages be currently in-use for a fairly low-activity pool, and why would almost all of the currently in-use pages be the ONLY in-use page for a given index? The consultant scoured the Db2 documentation in search of an explanation, and that's where he found the key that unlocked the mystery -- not in the Administration Guide, not in Managing Performance, but in the What's New manual for Db2 10 for z/OS. There, in an unexpected location (under the heading, "Reduction in need for explicit REORG"), he saw these words (color highlighting added by me):
DB2 10 provides several performance enhancements that reduce the need to reorganize indexes frequently, resulting in a reduction in CPU time and synchronous I/O waits. Compared to DB2 9, in DB2 10 you might see increased activity in deferred writes because more buffers are in use by buffer pools to improve performance. For example, activity might increase for index root pages that are pinned in a buffer pool when the index page set or index partition is opened.
(You can check this out for yourself here.)

And there you have it. This is one example -- there are plenty more -- of the Db2 for z/OS development team effecting changes that aim to leverage larger memory resources (increasingly common for z/OS systems) for enhanced performance. Index root pages are accessed a lot, and memory is becoming more plentiful on IBM Z servers, so why not "pin" those pages in the associated buffer pools at index open time, to guarantee that they'll be there when we need to read them?

OK, for a buffer pool analyzed by the aforementioned Db2 consultant, the pinned index root pages caused a DMTH problem. Does that mean that this is a "Watch out!" kind of situation in a general sense? I would say, "No." The problem described herein was encountered in a test environment, with a buffer pool that is on the small size (as mentioned, 10,000 4K buffers) and which has assigned to it a very large number of indexes (as mentioned, about 10,000). In my experience, I have NEVER encountered a situation in which a buffer pool in a production Db2 system hit DMTH due to index root pages being fixed in the pool. In production environments, buffer pools are often quite large (sometimes hundreds of thousands -- or even several million -- buffers in a single pool), and they often have assigned to them something less than 10,000 objects (and keep in mind that fixed-in-the-pool root pages are only relevant to indexes -- not table spaces). No, the reason I've written this blog entry isn't to warn you about a problem situation that is quite likely to occur in a production Db2 system. I've written this piece because you, like the consultant to whom I have referred, could conceivably encounter this buffer pool DMTH issue (probably, if at all, in a test or development environment), and if you do then having the information I've provided could save you some problem identification time and associated worry and consternation.

If in fact you encounter the index-related buffer pool DMTH situation described in this blog entry (not likely), and if you want to make it go away, the best move would probably be to add buffers to the pool in question. If the system is memory-constrained so that you can't do that, consider taking one or more of these steps:
  • Reassign some of the objects assigned to the buffer pool to a different buffer pool.
  • Lower the DSMAX value for the subsystem, so that some index data sets will periodically be closed before the buffer pool fills up with "pinned" index root pages.
  • In a data sharing environment, have the indexes defined with CLOSE YES so that the data sets will be closed when they have gone a while without being accessed from a given Db2 member.

Monday, April 30, 2018

Db2 for z/OS Buffer Pools - Time to Give AUTOSIZE(YES) a Try?

Db2 9 for z/OS became generally available back in 2007. In the years since, most of the features introduced with Db2 9 have been broadly put to use. An interesting exception is the auto-sizing of buffer pools, implemented via the AUTOSIZE(YES) option of the -ALTER BUFFERPOOL command. Based on my experience, it seems that few Db2 for z/OS sites have gone with buffer pool auto-sizing. I find that a bit perplexing. In the Db2 for Linux/UNIX/Windows user community, a feature called self-tuning memory (which applies to buffer pools and more) appears to be quite popular. Why the seeming reluctance to leverage buffer pool auto-sizing in Db2 for z/OS systems?

I'm thinking that we may be seeing a situation not unlike the one that existed back in the 90s and early 2000s vis-a-vis SMS-management of Db2 for z/OS data sets (SMS is short for System-Managed Storage, a capability provided by the z/OS operating system). Db2 management of table space and index data sets, enabled through a specification of USING STOGROUP (instead of USING VCAT) in a CREATE TABLESPACE or CREATE INDEX statement (or in an ALTER of an existing object), had become commonplace, but many Db2 DBAs and systems programmers were reluctant to take the additional step of letting SMS take care of data set placement within a disk subsystem. Folks were fine with Db2 issuing the Access Method Services commands that were needed to create data sets (what you get with USING STOGROUP), but letting the system handle data set placement was another matter. The thought of having VOLUMES('*') in a CREATE STOGROUP statement, which would hand data set placement responsibility over to SMS, made lots of Db2 systems people uneasy - they just didn't trust z/OS to get that done right; so, folks would create STOGROUPs that were associated with particular disk volumes (e.g., CREATE STOGROUP SGPRD01 VOLUMES(ABC123), CREATE STOGROUP SGPRD02 VOLUMES(XYZ139), etc.), and they would then assign individual Db2 data sets to this STOGROUP or that one - in effect, hand-placing the data sets within a disk subsystem - so as to ensure that the data sets in question went on different disk volumes. "I want to make sure that partition 1 of partitioned table space TS1 doesn't go on the same volume as partition 2 of that table space," they'd say; or, "I want to make sure that index IX1 goes on a volume other than the one holding the table on which IX1 is defined."

This very hands-on approach to Db2 data set placement started to become a major burden on DBAs and systems programmers as Db2 databases mushroomed in size and the number of data sets associated with Db2 databases shot up. Something had to give, and so some Db2 systems people - not without trepidation - decided to let SMS do a job that had become too big to handle via the old hand-placement approach. I had recently started work in the IT department of a Db2 for z/OS-using organization, back in 2000 (between my two IBM stints), when our lead z/OS systems programmer told me that she was going to go with SMS-management of Db2 data sets in production. She just hoped that the performance penalty wouldn't be too severe. Guess what? At our site, and at other sites, SMS management of Db2 data sets worked just fine. Performance didn't suffer, and tons of Db2 administrator time got freed up for use in higher-value activities. Within a short time, SMS management of Db2 data sets became the norm. Key to that development: the old approach (hand-placement of Db2 data sets within disk subsystems) became increasingly untenable, and technology advances made old worries concerning the new approach (SMS management of Db2 data sets) largely moot (z/OS and IBM Z and disk storage improvements made contention at the spinning-disk level much less of a concern than it had been in the past).

Today, a Db2 administration practice that ought to be reexamined is micro-management of buffer pools. Historically, people have sliced and diced buffer pool memory usage in a very fine-grained way. With ongoing growth in the size and number of Db2 subsystems that individual organizations use, and with Db2 for z/OS administration teams being, typically, small in terms of headcount, the time needed to effectively manage Db2 buffer pool sizing is, more and more, just not there. Turning that task over to Db2 (and the z/OS Workload Manager) is an increasingly attractive alternative. At the same time, technology changes have made buffer pool micro-management less worthy of major time commitment: IBM Z memory resources are getting to be huge by recent standards, and as z/OS LPAR real storage sizes soar, super-fine tuning of the use of that space for Db2 buffer pools becomes less and less of an imperative. Not only that, but many a Db2 administrator's mind-set, regarding appropriate buffer pool sizing, has fallen behind the reality of the current hardware situation. What I mean by that: a 10-GB buffer pool configuration (aggregate size of all pools allocated for a subsystem) may seem large in a Db2 administrator's eyes, given historical standards, but in an LPAR with 100+ GB of memory (pretty commonplace these days), that 10 GB looks dinky. It's like z/OS is saying to these people, "Hey, there are a lot of gigabytes in this system that are just lying around, doing nothing. I can see them. Give me and Db2 the keys to the buffer pool sizing car, and we'll put those sleeping gigabytes to work, and application performance will be the better for it. And don't worry about my overloading my own memory resource - I'm not going to do that. I know how to manage what I've got. Give me a chance."

In what I think might have been conceived as an enticement aimed at Db2 people who look askance at system-managed buffer pool sizing, Db2 11 provided "floor" (VPSIZEMIN) and "ceiling" (VPSIZEMAX) options for the -ALTER BUFFERPOOL command: "OK, Db2, I'll turn on auto-sizing for BP10, but I'm not going to let you make the pool smaller than X or larger than Y." I'd say, give that a shot. You can do that for several of your buffer pools (even all of them), or just one. Say you have a z/OS LPAR with lots and lots of memory (when organizations upgrade a mainframe, they often load it with buckets of additional real storage, due in no small part to attractive memory pricing), and a demand paging rate that doesn't vary from zero, even during the busiest processing times of the day (the demand paging rate for an LPAR is available, among other places, in an RMF CPU Summary Report). Say you have in this system a Db2 for z/OS buffer pool (I'll stick with BP10) that is sized at 100,000 buffers. Consider issuing a command of this nature:

-ALTER BUFFERPOOL(BP10) VPSIZE(100000) 
  VPSIZEMIN(75000) VPSIZEMAX(150000)
  ...
  AUTOSIZE(YES)

Let the workload go for a while, and periodically issue -DISPLAY BUFFERPOOL(BP10) DETAIL to see what's happening with the pool. Even better, have "before" and "after" activity data for the pool, for the same hour of the same day (e.g., 10-11 AM on a Thursday), from either a Db2 monitor-generated statistics report - long or the output of -DISPLAY BUFFERPOOL(BP10) DETAIL (if you go with the command for activity data, and you want data from 10-11 AM, issue the command at 10 AM and again at 11 AM, and you'll have activity data for that 1-hour period in the output of the second issuance of the command). My expectation would be that things will be fine, and some performance metrics might move in a positive direction. And, the system's demand paging rate will likely still be zero. Next, maybe open up auto-sizing for other pools in your system, again with floor and ceiling values with which you're comfortable. And, go back to a pool for which you previously activated auto-sizing, and, if the system-adjusted pool size is larger than its initial size, try another -ALTER BUFFERPOOL with a larger VPSIZE and larger VPSIZEMIN and VPSIZEMAX values, to let the system take the pool size further in the direction that the system has determined is best for overall performance. Over time, you may well become not only comfortable, but happy, with Db2 and z/OS handling sizing for all of a subsystem's buffer pools.

And, just in case you hear a little voice in your head, saying, "Careful, buddy. You spend a pretty good bit of time on buffer pool sizing. If you let the system handle that then you'll be sitting around twiddling your thumbs, and that's not good," tell that voice to shut up. Did you run out of things to do when you let SMS manage Db2 data set placement? Or when you no longer had to manage index page locking because Db2 stopped doing that kind of locking? Or when you could quit managing the package table part of the EDM pool because Db2 quit using EDM pool space for allocating packages to threads? Or when you let Db2 manage secondary disk space allocation for table space and index data sets with MGEXTSZ=YES in ZPARM and Db2's "sliding scale" algorithm? Of course not. In each and every case in which Db2 and/or z/OS took on tasks you used to do (or Db2 stopped doing things that you used to have to work to manage), you ended up staying plenty busy, with a different mix - a higher value mix - of work. System-managed buffer pool sizing offers another opportunity to move your work mix up the value chain. I think the time's right for more people to take that step. Maybe it's your time.