Tuesday, January 31, 2017

Are You Using System Profile Monitoring to Manage Your DB2 for z/OS DDF Workload? Perhaps You Should

Here's a scenario that might sound familiar to you: you have a particular transaction, which I'll call TRNX, that is the source of quite a lot of deadlock activity in a DB2 for z/OS system. It seems that whenever more than one instance of TRNX is executing at the same time, a deadlock situation is highly likely. You went with row-level locking for the table spaces accessed by TRNX, but the trouble persisted. It is conceivable that rewriting the program code associated with TRNX might eliminate the problem, but the task would be challenging, the development team has limited bandwidth to accomplish the recommended modifications, and it could take months -- or longer -- for the fix to get into production. What can you do?

Well, as I pointed out in an entry posted to this blog a few years ago, sometimes the right approach in a case such as this one is to single-thread TRNX. Though it may at first seem counter-intuitive, there are circumstances for which transactional throughput can be increased through a decrease in the degree of transactional multi-threading, and that approach can be particularly effective when the rate of transaction arrival is not particularly high (i.e., not hundreds or thousands per second), transaction elapsed time is short (ideally, well under a second), and probability of a DB2 deadlock is high if more than one instance of the transaction is executing at the same time.

Lots of people know how to single thread a CICS-DB2 or IMS-DB2 transaction, but what about a DDF transaction (i.e., a transaction associated with a DRDA requester, which would be an application that accesses DB2 for z/OS by way of TCP/IP network connections)? Is there a means by which a DDF transaction can be single-threaded?

The answer to that question is, "Yes," and the means is called system profile monitoring. DDF transaction single-threading is just one of many useful applications of this DB2 for z/OS capability. I'll provide a general overview of DB2 system profile monitoring, and then I'll cover transaction single-threading and a couple of other use cases.

System profile monitoring is put into effect by way of two DB2 tables: SYSIBM.DSN_PROFILE_TABLE and SYSIBM.DSN_PROFILE_ATTRIBUTES. Those tables were introduced with DB2 9 for z/OS, and DB2 10 enabled their use in managing a DDF workload in a more granular fashion than was previously possible. Prior to DB2 10, the number of connections from DRDA requesters allowed for a DB2 subsystem, the maximum number of those connections that could be concurrently in use, and the maximum time that an in-use (i.e., non-pooled) DBAT (database access thread -- in other words, a DDF thread) could sit idle without being timed out could be controlled only at the DB2 subsystem level, via the ZPARM parameters CONDBAT, MAXDBAT, and IDTHTOIN, respectively. What if you want to exert control over a part of a DDF workload in a very specific way? With system profile monitoring, that is not at all hard to do.

A row inserted into SYSIBM.DSN_PROFILE_TABLE indicates the scope of a particular DDF workload management action, and the corresponding row (or rows) in SYSIBM.DSN_PROFILE_ATTRIBUTES indicates what you want to manage for this component of your DDF workload (number of connections, number of active connections, idle thread timeout, or two of the three, or all three) and how you want that management function to be effected (e.g., do you want DB2 to take action when a specified limit is exceeded, or just issue a warning message?). The columns of the two tables, and their function and allowable values, are well described in the DB2 for z/OS documentation, and I won't repeat all that information here (the DB2 11 information is available online at http://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/perf/src/tpc/db2z_monitorthreadsconnectionsprofiles.html, and you can easily go from there to the DB2 10 or DB2 12 information, if you'd like). What I will do is take you through a few use cases, starting with the single-threading scenario previously referenced.

To single-thread a DDF transaction, you would first need to identify that transaction by way of a row inserted into SYSIBM.DSN_PROFILE_TABLE. You have multiple options here. You might identify the transaction by workstation name (a string that is easily settable on the client side of a DDF-using application, as described in a blog entry I wrote back in 2014); or, you might identify the transaction via package name if, for example, it involves execution of a particular stored procedure; or, you might use collection name [collection name can be specified as a client-side data source property, and it is increasingly used to manage applications that use only IBM Data Server Driver (or DB2 Connect) packages -- these packages, which belong by default to the NULLID collection, can be copied into other collections, and in that way a DDF-using application can be singled out by way of the name of the Data Server Driver (or DB2 Connect) package collection to which it is pointed]. And there are multiple additional identifier choices available to you -- check the DB2 documentation to which I provided the link above.

In your SYSIBM.DSN_PROFILE_TABLE row used to identify the transaction you want to single-thread, you provide a profile ID. That ID serves as the link to an associated row (or rows) in SYSIBM.DSN_PROFILE_ATTRIBUTES. In a row in that latter table, you would provide the ID of the profile you'll use to single-thread the transaction, 'MONITOR THREADS' in the KEYWORDS column, 1 in the ATTRIBUTE2 column (to show that you will allow one active DBAT for the identified transaction), and 'EXCEPTION' in the ATTRIBUTE1 column to indicate that DB2 is to enforce the limit you've specified, as opposed to merely issuing a warning message (you could also specify 'EXCEPTION_DIAGLEVEL2' if you'd like the console message issued by DB2 in the event of an exceeded threshold to be more detailed than the message issued with EXCEPTION, or its equivalent, EXCEPTION_DIAGLEVEL1, in effect). Then you'd activate the profile with the DB2 command -START PROFILE, and bingo -- you have put single-threading into effect for the DDF transaction in question.
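To make this a bit more concrete, here is a minimal sketch of what that setup might look like if TRNX were identified by the workstation name set on the client side. The profile ID (17) and the workstation name value are made up for illustration, and the column names used (PROFILEID, CLIENT_WRKSTNNAME, PROFILE_ENABLED, KEYWORDS, ATTRIBUTE1, ATTRIBUTE2) should be checked against the documentation for your DB2 version before you try anything like this:

-- Identify the transaction (here, by a made-up client workstation name)
INSERT INTO SYSIBM.DSN_PROFILE_TABLE
  (PROFILEID, CLIENT_WRKSTNNAME, PROFILE_ENABLED)
VALUES (17, 'TRNX_WS', 'Y');

-- For that profile, allow one active DBAT and have DB2 enforce the limit
INSERT INTO SYSIBM.DSN_PROFILE_ATTRIBUTES
  (PROFILEID, KEYWORDS, ATTRIBUTE1, ATTRIBUTE2)
VALUES (17, 'MONITOR THREADS', 'EXCEPTION', 1);

-- Load (or reload) the profile tables into memory
-START PROFILE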

Something to note here: Suppose you have set up single-threading in this way for transaction TRNX, and an instance of TRNX is executing, using the one thread you've made available for the transaction. Suppose another request to execute TRNX arrives. What then? In that case, the second-on-the-scene request for TRNX will be queued until the first-arriving TRNX completes (if TRNX typically executes in, say, less than a second, the wait shouldn't be long). What if a third request for TRNX comes in, while the second request is still queued because the first TRNX has not yet completed? In that case, the third TRNX request will fail with a -30041 SQLCODE. This is so because DB2 will queue requests only up to the value of the threshold specified. If you specify 1 active thread for a transaction, DB2 will queue up to 1 request for that transaction. If you specify a maximum of 4 active threads for a transaction, DB2 will queue up to 4 requests for the transaction if the 4 allowable active threads are busy. With this in mind, you'd want to have the TRNX program code handle the -30041 SQLCODE and retry the request in the event of that SQLCODE being received. Would you like to be able to request a "queue depth" that is greater than your specified threshold value? So would some other folks. That enhancement request has been communicated to the DB2 for z/OS development team.

Something else to note here: What if you are running DB2 in data sharing mode? Does a limit specified via SYSIBM.DSN_PROFILE_TABLE and SYSIBM.DSN_PROFILE_ATTRIBUTES apply to the whole data sharing group? No -- it applies to each member of the group. How, then, could you truly single-thread a DDF transaction in a data sharing environment? It's not too hard. You'd set up the profile and associated threshold as described above, and you'd start the profile on just one member of the group (-START PROFILE is a member-scope command). On the client side, you'd have the application associated with TRNX connect to a location alias, versus connecting to the group's location, and that alias would map to the one member for which the single-threading profile has been started (I wrote about location aliases in an entry posted to this blog a few years ago -- they are easy to set up and change). If that one member is down, have the profile ready to go for another member of the group (you could leave the GROUP_MEMBER column blank in the DSN_PROFILE_TABLE row to show that the profile applies to all members of the group, or you could have two rows -- one for the "primary" member for the transaction in question, and one for an alternate member, in case the "primary" member is not available), and start the profile on that member. You would also change the location alias that maps to the one member, so that it maps instead to the other member (or you could map the alias to two members and only start the alias on one member at any given time -- location aliases can be dynamically added, started, and stopped by way of the DB2 command -MODIFY DDF); so, no client-side changes would be needed to move single-threading for a transaction from one data sharing member to another.
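By way of illustration, the data sharing variation might look something like the sketch below. The member name (DB2A), alias name (TRNXALS), port number, and profile ID are all hypothetical, and the -MODIFY DDF ALIAS syntax is worth double-checking against the command reference for your DB2 release:

-- Profile row that applies only to member DB2A (an accompanying
-- MONITOR THREADS row in DSN_PROFILE_ATTRIBUTES is still needed,
-- as shown in the earlier sketch)
INSERT INTO SYSIBM.DSN_PROFILE_TABLE
  (PROFILEID, CLIENT_WRKSTNNAME, GROUP_MEMBER, PROFILE_ENABLED)
VALUES (18, 'TRNX_WS', 'DB2A', 'Y');

-- On member DB2A: dynamically define and start a location alias to which
-- the TRNX client application connects, instead of the group location
-MODIFY DDF ALIAS(TRNXALS) ADD
-MODIFY DDF ALIAS(TRNXALS) PORT(5047)
-MODIFY DDF ALIAS(TRNXALS) START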

A couple more use cases. What else can be accomplished via DB2 system profile monitoring? There are many possibilities. Consider this one: you have a certain DDF-using application for which you want to allow, say, 50 active threads. Easily done: if the application connects to DB2 using a particular authorization ID -- very commonly the case -- then set up a profile that is associated with the application's ID, and in the associated DSN_PROFILE_ATTRIBUTES row indicate that you want to MONITOR THREADS, that the threshold is 50, and the action is EXCEPTION. Note, then, that up to 50 requests associated with the application could be queued, if the 50 allotted DBATs are all in-use.

Or how about this: there are hundreds (maybe thousands) of people employed by your organization that can connect to one of your DB2 for z/OS systems directly from their laptop PCs. You know that a single individual could establish a large number of connections to the DB2 system, and you are concerned that, were that to happen, your system could hit its CONDBAT limit, to the detriment of other DDF users and applications (and maybe that's actually happened at your shop -- you wouldn't be the first to encounter this situation). How could you limit individuals' laptop PCs to, say, no more than 5 host connections apiece? Would you have to enter hundreds (or thousands) of rows in DSN_PROFILE_TABLE, each specifying a different user ID (or IP address or whatever)? That is what you WOULD have had to do, before a very important system profile monitoring enhancement was delivered with DB2 12 for z/OS (and retrofitted to DB2 11 via the fix for APAR PI70250). That enhancement, in a word: wildcarding. By leveraging this enhancement (explained below), you could limit EACH AND EVERY "laptop-direct" user to no more than 5 connections to the DB2 for z/OS subsystem by way of a single row in DSN_PROFILE_TABLE (and an associated MONITOR CONNECTIONS row in DSN_PROFILE_ATTRIBUTES).

More on wildcarding support for system profile monitoring: with DB2 12 (or DB2 11 with the fix for the aforementioned APAR applied), you can use an asterisk ('*') in the AUTHID or the PRDID column of DSN_PROFILE_TABLE (the latter can identify the type of client from which a request has come); so, an AUTHID value of 'PRD*' would apply to all authorization IDs beginning with the characters PRD (including 'PRD' by itself), and an asterisk by itself would apply to ALL authorization IDs (with regard to rows in DSN_PROFILE_TABLE, a DRDA request will map to the profile that matches it most specifically, so if there were a profile row for auth ID 'PROD123' and another row for auth ID '*', the former would apply to requests associated with auth ID PROD123 because that is the more specific match).

You can also use wildcarding for the IP address in the LOCATION column of a row in SYSIBM.DSN_PROFILE_TABLE, but in a different form. For an IPv4 TCP/IP address, a wildcard-using entry would be of the form address/mm where mm is 8, 16, or 24. Those numbers refer to bits in the IP address. Here's what that means: think of an IPv4 address as being of the form A.B.C.D. Each of those four parts of the address consists of a string of 8 bits. If you want to wildcard an IPv4 address in the LOCATION column of a DSN_PROFILE_TABLE row, so that the row will apply to all addresses that start with A.B.C but have any possible value (1-254) for the D part of the address, the specification would look like this (if A, B, and C were 9, 30, and 222, respectively):

9.30.222.0/24

And note that a specification of 0.0.0.0 applies to all IP addresses from which requests could come for the target DB2 for z/OS system. A similar convention is used for IPv6 addresses -- you can read about that in the text of the aforementioned APAR, PI70250.

Why use this convention, instead of something like 9.30.222.* for addresses 9.30.222.1 through 9.30.222.254, or an * by itself for all IP addresses? Because the convention used is already prevalent in the TCP/IP world, and in this case it made sense to go with the flow.
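Tying the earlier "laptops" scenario to this notation, a single profile covering every client IP address might look something like the sketch below. The profile ID is arbitrary, the column names should be verified against the documentation for your DB2 release, and the per-client effect of the connection limit is as described above for the wildcard-enabled DB2 12 (or DB2 11 + PI70250) function:

-- One row that matches requests coming from any client IP address
INSERT INTO SYSIBM.DSN_PROFILE_TABLE
  (PROFILEID, LOCATION, PROFILE_ENABLED)
VALUES (21, '0.0.0.0', 'Y');

-- Cap connections at 5, with the limit enforced rather than just warned about
INSERT INTO SYSIBM.DSN_PROFILE_ATTRIBUTES
  (PROFILEID, KEYWORDS, ATTRIBUTE1, ATTRIBUTE2)
VALUES (21, 'MONITOR CONNECTIONS', 'EXCEPTION', 5);

-START PROFILE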

So, that's what I have to say about DB2 system profile monitoring. It's a great way to manage a DB2 for z/OS DDF workload in a more granular way than is offered by the ZPARMs CONDBAT, MAXDBAT, and IDTHTOIN (though those ZPARM values remain in effect in an overall sense when system profile monitoring is in effect). If you aren't yet using this functionality, think of what it could do for your system. If you are using system profile monitoring but haven't used the new wildcard support, consider how that enhancement could provide benefits for you. In an age of ever-growing DB2 DDF application volumes, system profile monitoring is a really good thing.

Friday, December 30, 2016

In Praise of the Invisible DB2 for z/OS System

I've been working with DB2 for z/OS -- as an IBMer, as a user, as an independent consultant -- for 30 years. I was with IBM when DB2 was introduced to the market. I've met people who were part of the core team at IBM that invented relational database technology. I was privileged to serve for a year as president of the International DB2 Users Group. DB2 has been very good to me, and the work I do as a DB2 for z/OS specialist I do with commitment and passion. And here I am, getting ready to tell you that I hope your DB2 systems will become invisible to a key segment of the user community. Have I gone off the rails? You can read on and decide that for yourself.

To begin, what is that key segment of the user community for which, I hope, DB2 for z/OS will be invisible? Application developers. Why? Simple: the most meaningful indicator of the vitality of a DB2 for z/OS environment is new application workload. Growth, volume-wise, of existing DB2 for z/OS-accessing applications is certainly welcome, but you know, as a person supporting DB2, that your baby is in really good shape when newly-developed applications that interact with DB2 are coming online. If you want this to happen -- and you should -- then you should understand that a vital success factor is having application developers who are favorably inclined towards writing code that accesses DB2 for z/OS-managed data. And THAT depends largely on those developers not having to do anything different, in creating DB2-accessing applications, than they would do in working with another data server. In other words, you want application developers who are working with DB2 to not have to notice that DB2 is the database management system with which they are working.

Why is invisibility, from an application developer's perspective, of major import when it comes to growing new application workload for a DB2 for z/OS system? Several reasons:
  • First, you need to realize that a fundamental balance-of-power shift has occurred over the past 10-15 years: developers, once beholden, in many shops, to "systems" people, are now in the driver's seat. As far as I'm concerned, this is as it should be. I believe that the value of IT systems (and the value of the data housed in those systems) is largely dependent on the value of the applications that use those systems (and access that data). Applications bring in revenue, serve customers, schedule deliveries, determine product mix, pay suppliers, manage inventory, etc., etc. If you are a DB2 "systems" person (e.g. a systems programmer or a database administrator), my advice to you is this: view your organization's application developers as your customers, and seek to excel in customer service. If what those folks want is to not have to worry about the particulars of a given database management system when developing an application, deliver that.
  • Second (and related to the first point, above), DB2 invisibility, in the sense in which I'm using that phrase, removes a reason for which some developers might be opposed to writing DB2 for z/OS-accessing code. Put yourself in a developer's shoes. Wouldn't your productivity be enhanced if you could write database-accessing code without having to stop and think, "Hmm, for this application, I'm working with DBMS X versus DBMS Y. I know there are some things I need to do differently for DBMS X. Hope I can recall the details of those differences?" Wouldn't it be better if you could just code the same way regardless of the target data server?
  • Third, when a DB2 for z/OS system looks to a developer like other data-serving platforms, what becomes more noticeable is the way it acts versus other platforms. Developers might notice that this one data-serving system (which happens to be powered by DB2 for z/OS), for which they code as they do for other systems, always seems to be up, never seems to get hacked, and performs well no matter how much work gets thrown at it. Noticing these characteristics, developers might start expressing a preference for the data associated with their new applications being stored on the platform with the industrial-strength qualities of service. A friend of mine who is a long-time DB2 for z/OS DBA has a good nickname for the system he supports: the "super-server." I can certainly imagine a team of developers making a request along the lines of, "Could we put this application's data on the super-server?" Who wouldn't want that?

OK, so DB2 invisibility is a good thing. How to achieve it? The answer, in a word, is: abstraction. I want to talk here about two especially important forms of abstraction available for DB2 for z/OS systems -- one with which you are likely familiar, and one which may be news to you.

The familiar form of DBMS abstraction to which I referred above is the kind that makes a particular relational database management system (like DB2 for z/OS) look, to an application program, like a generic relational DBMS. In my experience, the two most common of these DBMS-abstracting interfaces are JDBC (Java Database Connectivity) and ODBC (Open Database Connectivity). ADO.NET is another example. These interfaces are enabled by drivers. In the case of DB2 for z/OS, JDBC and ODBC (and ADO.NET) drivers are provided, for network-attached applications (i.e., applications that access a target data server through a TCP/IP connection), via the IBM Data Server Driver (or DB2 Connect, though the former is recommended, as I pointed out in an entry posted to this blog a couple of months ago). IBM also provides, with DB2, JDBC and ODBC drivers that can be used by local-to-DB2 programs (i.e., programs running in the same z/OS system as the target DB2 for z/OS data server).

Lots and lots of applications, at lots and lots of sites, have been accessing DB2 for z/OS systems for years using JDBC, ODBC, and other interfaces that make DB2 look like a generic relational database management system. More recently, a growing number of developers have indicated a preference for a still-higher level of abstraction with regard to data server interaction. To these developers, using a DBMS-abstracting data interface such as JDBC or ODBC is less than ideal, because such interfaces indicate that the target data-serving system is a relational database management system. While it's true that the use of JDBC or ODBC (or a similar interface) means that a programmer does not have to be aware of the particular relational database management system being accessed, many developers feel -- and they have a point -- that even the form of a target data-serving system (one form being a relational DBMS) is something they should not have to know. From their standpoint, the form of a target data-serving system is a matter of "plumbing" -- and plumbing is not their concern. These developers want to access data-related services (retrieve data, create data, update data, delete data) as just that -- services. When data-as-a-service (DaaS) capability is available to a developer, the nature of what is on the other side of a service request -- a relational DBMS (like DB2), a hierarchical DBMS (like IBM's IMS), a Hadoop-managed data store, a file system -- is not consequential.

What is consequential to developers who want data-as-a-service capability is consistency and ease of use, and those concerns have much to do with the rapid rise of REST as a means of invoking programmatically-consumable services. REST -- short for Representational State Transfer -- is a lightweight protocol through which services can be invoked using HTTP verbs such as GET and PUT (services accessed in this way can be referred to as RESTful services). Adding to the appeal of REST is the use of JSON, or JavaScript Object Notation -- an intuitive, easy-to-interpret data format -- for the data "payloads" (input and output) associated with REST calls.

REST enables client-side programmers to be very productive and agile in assembling services to create applications, and that likely has much to do with REST being widely used for applications that access a server from a mobile device front-end and/or from cloud-based application servers. But DON'T think that the cloud and mobile application worlds are the only ones to which RESTful services are applicable. There are plenty of developers who want to use REST API calls to build in-house applications that may have nothing to do with mobile devices or clouds, for the reasons previously mentioned: productivity and agility. Mobile and cloud application developers aren't the only ones who don't want to deal with IT infrastructure plumbing.

DB2 for z/OS moved into the realm of RESTful services in a big way with the native REST interface built into DB2 12 (and retrofitted to DB2 11 via the fix for APAR PI66828). That DB2 REST API can be invoked directly from a client, or by way of IBM z/OS Connect (as mentioned in my blog entry referenced previously in this post). When z/OS Connect is in the picture, not only DB2 services, but CICS, IMS, WebSphere Application Server for z/OS, and batch services, as well, can be accessed via REST calls from clients, with data payloads sent back and forth in JSON format. z/OS Connect also provides useful tooling that facilitates discovery (by client-side developers) and management of z/OS-based RESTful services. Whether the DB2 REST API is accessed directly or through z/OS Connect, what it does is enable you to make a single SQL statement -- which could be a call to a stored procedure or maybe a query -- invoke-able via a REST call with a JSON data payload. Some very helpful information on creating and using DB2 RESTful services can be found in an IBM developerWorks article by Andrew Mattingly, in examples posted to developerWorks by IBM DB2 for z/OS developer Jane Man, and in the DB2 for z/OS IBM Knowledge Center. Keep in mind that because the DB2 REST API is a function of the DB2 for z/OS Distributed Data Facility (DDF), SQL statements and routines invoked via that interface are eligible for zIIP engine offload (up to 60%) just as are any other SQL statements and routines that execute under DDF tasks.

And lest you think that DB2 invisibility is applicable only to "operational" applications...


An invisibility provider for analytical DB2 for z/OS workloads

Data scientists, like application developers, can be more productive when they don't have to think about the particulars of the data server with which they are working. As interfaces such as JDBC and ODBC and REST provide much-appreciated abstraction of data server "plumbing" to application developers, so, too, for data scientists, does an analytics-oriented software technology called Spark.

Spark, produced by the Apache Software Foundation, is an open-source framework for data analytics. It can run natively under z/OS (and on zIIP engines, to boot) because, from an executable standpoint, it's Java byte code (Spark is written in the Scala programming language, which when compiled executes as Java byte code). The IBM z/OS Platform for Apache Spark provides connectors that enable Spark to ingest data from z/OS-based sources, such as VSAM, that do not have a JDBC interface. Add that to Spark's native ability to access data sources with a JDBC interface, such as DB2 and IMS, and data in Hadoop-managed data stores, and you have, with Spark under z/OS, some very interesting possibilities for federated data analytics (Spark builds in-memory structures from data it reads from one or more sources, and subsequent access to those in-memory data structures can deliver high-performance analytical processing).

If you go to one of your organization's data scientists and say, "There is some data on the mainframe that could probably yield for you some valuable insights," you might get, in response, "Thanks, but I don't know anything about mainframe systems." If, on the other hand, you lead with, "We've got some really important data on one of our systems, and we're running Spark on that system, and you could analyze that data using R and uncover some insights that would be very helpful to the company," you might well get, "Okay!" (the R language is very popular with data scientists, and there is an R interface to Spark); so, platform abstraction can be a big plus for analytical, as well as operational, DB2 for z/OS workloads.


And one more thing...

(As Steve Jobs used to like to say)

I generally can't leave a discussion about DB2 for z/OS invisibility (from the application development perspective) without bringing up DB2 row-level locking. I'm often left shaking my head (and have been for over 20 years) over how reluctant many DB2 for z/OS DBAs are to allow the use of row-level (versus the default page-level) locking granularity. It's as though DBAs -- especially veteran DBAs -- think that they will have to turn in their DB2 for z/OS Club cards if they are caught with table spaces defined with LOCKSIZE ROW. Recall what I pointed out near the beginning of this blog entry: the really meaningful measure of the vitality of the data-serving platform you support is growth in the form of new application workloads. Getting that type of workload growth means presenting a developer-friendly environment and attitude. Platform invisibility is very helpful in this regard, and other relational DBMSs that protect data integrity with locks (including DB2 for Linux, UNIX, and Windows) do so, by default, with row-level locking granularity. Telling a development team with programs that are experiencing lock contention issues with DB2 for z/OS page-level locking in effect that the problem is their code is not going to win you friends, and you want to be friends with your organization's developers. Yes, you should use page-level locking where you can (and that is often going to be for the large majority of your table spaces), but selectively using row-level locking and incurring what is likely to be a relatively small amount of additional CPU cost in return for growth in new-application workload is a good trade in my book. I have more to say about this in an entry I posted to this blog a couple of years ago.
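For a table space where contention analysis points to row-level locking as the right trade-off, the change itself is a one-line ALTER. The database and table space names below are placeholders, and it is worth confirming in the SQL reference for your DB2 release exactly when an altered LOCKSIZE takes effect for work that is already running:

-- Switch this (hypothetical) table space from the default page-level
-- locking granularity to row-level locking
ALTER TABLESPACE MYDB.MYTS
  LOCKSIZE ROW;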

And that's that. Yeah, DB2 for z/OS is your baby, and you like it when people tell you how beautiful your baby is, but to keep that baby healthy and growing it often pays to make the little darling invisible to application developers (and data scientists). When those developers talk about that whatever-it-is data-serving platform that's as solid as a rock, you can smile and think to yourself, "That's my baby!"

Sunday, November 27, 2016

DB2 for z/OS ZPARMs that Organizations Consistently Set in a Sub-Optimal Fashion

Over the past several years, I have reviewed DB2 for z/OS systems running at quite a few client sites. Part of the analysis work I do in performing these reviews involves looking over a DB2 subsystem's DSNZPARM values (or ZPARMs, for short -- the parameters through which the set-up of a DB2 system is largely specified). I have seen that certain ZPARM parameters are very regularly set to values that are not ideal. In this blog entry I will spotlight those ZPARMs, showing how they often ARE set and how they SHOULD be set (and why). Without further ado, here they are:

  • PLANMGMT -- Often set to OFF -- should be set to EXTENDED (the default), or at least BASIC. "Plan management" functionality (which, though not suggested by the name, is actually about packages) was introduced with DB2 9 for z/OS. At that time, it was one of those "not on a panel" ZPARMs (referring to the installation/migration CLIST panels), and it had a default value of OFF. Starting with DB2 10, PLANMGMT landed on a panel (DSNTIP8), and its default value changed to EXTENDED (BASIC is the other possible value). Here is why PLANMGMT should be set to EXTENDED (or BASIC -- and I'll explain the difference between these options momentarily): when that is the case, upon execution of a REBIND PACKAGE command, DB2 will retain the previous "instance" of the package. Why is that good? Because, if the previous instance of a package has been retained by DB2, that instance can very quickly and easily be put back into effect via REBIND PACKAGE with the SWITCH option specified (a sketch of the relevant commands appears after this list) -- very useful in the event that a REBIND results in an access path change that negatively impacts program performance. In this sense, plan management functionality (along with the REBIND PACKAGE options APREUSE and APCOMPARE) is intended to "take the fear out of rebinding" (in the words of a former colleague of mine). That is important because, starting with DB2 10, we (IBM) have been strongly encouraging folks to rebind ALL plans and packages after going to a new version of DB2 (and that should be done upon initial migration to the new version, i.e., in conversion mode if we are talking about DB2 10 or 11). Those package rebinds are critically important for realizing the performance improvements delivered by new versions of DB2, and they also promote stability, because system reliability is likely to be optimized when package code -- which is, after all, executable code, being, essentially, the compiled form of static SQL statements -- is generated in the current-version DB2 environment. As for the difference between the EXTENDED and BASIC options for PLANMGMT, it's pretty simple: while both specifications will cause DB2 to retain the previous instance of a package when a REBIND PACKAGE command is executed, EXTENDED will result in DB2 also retaining a third instance of the package, that being the "original" instance -- the one that existed when REBIND was first executed for the package with plan management functionality enabled (you can periodically execute the FREE PACKAGE command for a package with PLANMGMTSCOPE(INACTIVE) to free previous and original instances of the package, so as to establish a new "original" instance of the package). With plan management functionality serving such a useful purpose, why is PLANMGMT regularly set to OFF? That could be due to a package storage concern that was effectively eliminated by a DB2 directory change wrought via the move to DB2 10 enabling new function mode. Lack of understanding about that change, I suspect, explains why this next ZPARM is commonly set in a sub-optimal way.

  • COMPRESS_SPT01 -- Often set to YES -- should be set to NO (the default). Some organizations have one or more DB2 systems with LOTS of packages -- so many, that hitting the 64 GB limit for the SPT01 table space in the DB2 directory (where packages are stored on disk) became a real concern. This concern was exacerbated by the just-described plan management capability of DB2, because that functionality, as noted above, causes DB2 to retain two, and maybe three, instances of a given package. To allay this concern, DB2 9 for z/OS introduced the COMPRESS_SPT01 parameter in DSNZPARM (another ZPARM that was of the "not on a panel" variety in a DB2 9 environment and landed on a panel -- DSNTIPA2 -- starting with DB2 10). With COMPRESS_SPT01 set to YES, DB2 would compress the contents of SPT01, causing the table space to occupy considerably less disk space than would otherwise be the case. Why would you not want to take advantage of that compression option? Because you don't have to. How is it that you don't have to? Because when the CATENFM utility was run as part of your move to DB2 10 enabling new function mode, one of the many physical changes effected for the DB2 catalog and directory saw the bulk of SPT01 content go to a couple of BLOB columns (and LOB columns can hold a TON of data, as noted in an entry I posted to this blog a few years ago). As a consequence of that change, the relatively small amount of non-LOB data in rows in SPT01 should easily fit within 64 GB of disk space, without being compressed (if, for example, the average row length in the SPTR table in SPT01 were 400 bytes -- and that would mean using unusually long location, collection, and package names -- then SPT01, uncompressed, would accommodate about 160 million packages). It's true that compressing SPT01 does not cost a lot in terms of added CPU overhead, but why spend even a little CPU on something you don't need?

  • UNIT (and UNIT2) -- Often set to TAPE (the default) -- should be set to a value that will cause DB2 to write archive log data sets to disk. Why do you want DB2 archive log data sets to be written to disk? Because, if multiple database objects have to be recovered and those recoveries will all require information from a given archive log data set, the recoveries will have to be SINGLE-THREADED if that archive log data set is on tape. Why? Because multiple jobs cannot access the same archive log data set at the same time if the data set is on tape. If the archive log data set is on disk, the aforementioned recovery jobs can be run in parallel, and the multi-object recovery operation will complete much more quickly as a result. Wouldn't the storage of archive log data sets on disk consume a whole lot of space? Yes, if that's where the archive log data sets stayed. In practice, what typically occurs is an initial write of an archive log data set to disk, followed by an automatic move, a day or two later (by way of HSM, which is more formally known as DFSMShsm), of the archive log data set to tape. In the event of a recovery operation that requires an archive log data set that has been HSM-migrated to tape, there will be a brief delay while the data set is restored to disk, and then multiple concurrently executing RECOVER jobs will be able to read from the archive log data set on disk. [Note that single-threading of RECOVER jobs is forced when archive log data sets are written to virtual tape, even though the actual media used is disk -- it's still tape from a DB2 perspective.]

  • UTSORTAL -- Often set to NO -- should be set to YES (the default). When a DB2 utility such as REORG is executed, and one or more sort operations will be required, by default that sort work will be accomplished by DFSORT. You could, in the JCL of REORG (and other sort-using) utility jobs, explicitly allocate sort work data sets for DFSORT's use. A better approach would be to let DFSORT dynamically allocate required sort work data sets. An EVEN BETTER approach is to have DFSORT dynamically allocate required sort work data sets AS DIRECTED BY DB2. That latter approach depends, among other things, on the UTSORTAL parameter in ZPARM being set to YES. More information on this topic can be found in an entry I posted to this blog back in 2011 -- that information is still applicable today.

  • MGEXTSZ -- Often set to NO -- should be set to YES (the default). This ZPARM pertains to secondary disk space allocation requests for DB2-managed data sets (i.e., STOGROUP-defined data sets -- and all your DB2 data sets should be DB2-managed). For such data sets, you can set a SECQTY value yourself that will be used when a data set needs to be extended, but a MUCH BETTER approach is to have DB2 manage secondary disk space allocation requests for DB2-managed data sets. That will be the case when the ZPARM parameter MGEXTSZ is set to YES, and when SECQTY is either omitted for a table space or index at CREATE time, or ALTERed, after CREATE, to a value of -1 (a sketch of such an ALTER appears after this list). When DB2 manages secondary disk space allocation requests, it does so with a "sliding scale" algorithm that gradually increases the requested allocation quantity from one extend operation to the next. This algorithm is virtually guaranteed to enable a data set to reach its maximum allowable size, if need be, without running into an extend failure situation. DB2 management of secondary space allocation requests has generally been a big success at sites that have leveraged this functionality (as I noted some years ago in an entry posted to the blog I maintained while working as an independent DB2 consultant, prior to re-joining IBM in 2010). About the only time I've seen a situation in which DB2 management of secondary space allocation requests might be problematic is when disk volumes used by DB2 are super-highly utilized from a space perspective. If space on disk volumes used by DB2 is more than 90% utilized, on average, it could be that DB2 management of secondary space allocation requests might lead to extend failures for certain large data sets (caused by reaching the maximum number of volumes across which a single data set can be spread), due to very high degrees of space fragmentation on disk volumes (such high levels of volume space utilization can also preclude use of online REORG, as there might not be sufficient space for shadow data sets). Personally, I like to see space utilization of DB2 disk volumes not exceed 80%, on average. The cost "savings" achieved by utilizing more than that amount of space are, in my mind, more than offset by the reduction in operational flexibility that accompanies overly-high utilization of space on DB2 disk volumes.

  • REORG_PART_SORT_NPSI -- Often set to NO -- should be set to AUTO (the default) in a DB2 11 or later environment. A few years ago, a new option was introduced concerning the processing of non-partitioned indexes (aka NPSIs) for partition-level executions of online REORG. With the old (and still available) processing option, shadow NPSIs would be built by way of a sort of the keys associated with table space partitions NOT being reorganized via the REORG job, and then entries associated with rows in partitions being REORGed would be inserted into the shadow indexes. With the new option, shadow NPSIs would be built by way of a sort of ALL keys from ALL partitions, and then entries for rows of partitions being reorganized would be updated with the new row ID (RID) values reflecting row positions in the reorganized partitions. It turns out that the new option (sort all keys, then update entries with new RID values, as needed) can save a LOT of CPU and elapsed time for some partition-level REORG jobs versus the old option (sort keys of rows in partitions not being reorganized, then insert entries for rows in partitions being reorganized). The REORG_PART_SORT_NPSI parameter in ZPARM specifies the default value for NPSI processing for partition-level REORGs (it can be overridden for a particular REORG job), and when it is set to AUTO then DB2 will use the newer processing option (sort all, then update as needed) when DB2 estimates that doing so will result in CPU and elapsed time savings for the REORG job in question (versus using the other option: sort keys from non-affected partitions, then insert entries for partitions being REORGed). I highly recommend going with the autonomic functionality you get with AUTO.

  • RRF -- Often set to DISABLE -- should be set to ENABLE (the default). With Version 9, DB2 introduced a new mode by which columns of a table's rows can be ordered (and here I am referring to physical order -- the logical order of columns in a table's rows is always determined by the order in which the columns were specified in the CREATE TABLE statement). When reordered row format (RRF) is in effect, a table's varying-length columns (if any) are all placed at the end of a row, after the fixed-length columns, and in between the fixed-length and varying-length columns is a set of offset indicators -- one for each varying-length column (the 2-byte offset indicators replace the 2-byte column-length indicators used with basic row format, and each offset indicator provides the offset at which the first byte of the corresponding varying-length column can be found). Reordered row format improves the efficiency of access to varying-length columns (an efficiency gain that increases with the number of varying-length columns in a table), and can reduce the volume of data written to the log when varying-length column values are updated. The ZPARM parameter RRF can be set to ENABLE or DISABLE. With the former value, new table spaces, and new partitions of existing partitioned table spaces, will be created with RRF in effect; furthermore, tables for which basic row format (BRF) is in effect will, by default, be converted to reordered row format when operated on by REORG or LOAD REPLACE utility jobs. Besides the aforementioned benefits of greater efficiency of access to varying-length columns and potentially reduced data volume for logging, RRF is GOING to be the primary column-ordering arrangement in a DB2 12 environment: the RRF parameter in ZPARM is gone with that version, new table spaces WILL be created with RRF in effect, and existing table spaces WILL be converted to RRF by way of REORG and LOAD REPLACE. My advice is to get ahead of the game here, and setting the ZPARM parameter RRF (in a DB2 10 or 11 environment) to ENABLE will help you in this regard.
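As promised in the PLANMGMT and MGEXTSZ items above, here is a minimal sketch of the commands and DDL involved. The collection, package, database, and table space names are placeholders, and the exact syntax should be confirmed in the command and SQL reference material for your DB2 version:

-- Fall back to the previous instance of a package after a rebind that
-- resulted in a performance-degrading access path change
REBIND PACKAGE(MYCOLL.MYPKG) SWITCH(PREVIOUS)

-- Periodically clean out retained previous/original package instances
-- (a new "original" instance is established at a subsequent REBIND)
FREE PACKAGE(MYCOLL.MYPKG) PLANMGMTSCOPE(INACTIVE)

-- With MGEXTSZ=YES in ZPARM, have DB2 manage secondary space allocation
-- for an existing (hypothetical) object by setting SECQTY to -1
ALTER TABLESPACE MYDB.MYTS
  SECQTY -1;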

You might want to examine ZPARM settings at your site, and see if you spot any of these (in my opinion) less-than-optimal specifications. If you do, consider making changes to help boost the efficiency, resiliency, and ease-of-administration of your DB2 systems.

Monday, October 31, 2016

How Should Your Network-Attached Applications Access DB2 for z/OS?

Some relatively recent developments have plenty of people asking, "How should my applications that connect to a z/OS server via a network link access DB2 for z/OS data?" In writing today I hope to provide some information that will be useful to folks pondering this question. I'll organize my thoughts on the matter around two key questions.

z/OS Connect or DB2 Connect?

z/OS Connect, which is becoming one of my favorite IBM software products, first showed up a couple of years ago. It's been promoted pretty heavily by IBMers presenting at various events (with good reason), and that has caused some people to wonder whether z/OS Connect might be intended as a replacement for the more venerable DB2 Connect product. Such is most definitely NOT the case. DB2 Connect and z/OS Connect address two different DB2 data access situations.

First, DB2 Connect (and through the next several paragraphs, where you see "DB2 Connect," add, in your head, "and the IBM Data Server Driver" -- more on that in the second section of this blog entry). DB2 Connect does, and will continue to do, what it has always done: provide a means whereby an application program, running in an environment other than z/OS and with a TCP/IP connection to a DB2 for z/OS system, can use a non-DBMS-specific data access interface to interact with the target DB2 server.

One of the more popular of these non-DBMS-specific data access interfaces is JDBC, short for Java Database Connectivity. When a Java programmer codes JDBC calls for data access purposes, he or she knows that the target server is a relational database management system (or at least something that can look like an RDBMS -- IBM's IMS, a hierarchical database management system, has a JDBC interface). What the Java developer may not know (and certainly doesn't have to know) is technical information specific to the relational database management system that will be accessed from the Java program being written. The Java developer does not have to be concerned about DBMS-specific technical information because the JDBC driver that his or her program will use is going to take care of mapping non-DBMS-specific JDBC calls to a particular protocol that can be processed natively by the target data server. When the target data server is IBM's DB2 for z/OS, the JDBC driver is provided by IBM's DB2 Connect. Other RDBMS vendors similarly provide JDBC drivers that can be used to access their database server products. Thus it is that DB2 Connect abstracts for a Java programmer the particulars of the specific RDBMS called DB2 for z/OS (and what I've written about JDBC applies as well to other common, non-DBMS-specific data access interfaces such as ODBC and ADO.NET).

Now, suppose that an application developer is writing a program that will send some data to and/or receive some data from a relational database management system on a network-accessible server, but he or she does not care to know that a relational database management system will be involved in the equation. To this programmer, the nature of the data persistence engine on the server side of the application -- even if abstracted as a generic-looking RDBMS (an abstraction presented, as noted above, by way of a driver such as DB2 Connect) -- is "plumbing," and he or she does not want to be concerned with plumbing. Relational database management system? Sequential file? Hierarchical database? Hadoop-managed data store? Doesn't matter. All plumbing. All the programmer wants to do is invoke a service on the remote server. That service will take data from the application being coded by the programmer, and/or send data to the application. How that service does what it does is immaterial.

OK, how to invoke this service on the remote data processing platform? Not long ago, that service invocation mechanism might have been a SOAP call. SOAP is short for Simple Object Access Protocol. Only, it's not all that simple. Kind of involved, actually, with attendant XML documents that spell out the what and how of the associated service. How about something simpler? Something more lightweight? Enter REST, short for Representational State Transfer. REST is a lightweight approach that enables remote services to be invoked through basic HTTP verbs (GET, PUT, etc.), with data sent back and forth between client and server in a format called JSON (JavaScript Object Notation, a data representation involving name/value pairs that is very easy for programs -- and people -- to parse).

But what does this have to do with DB2 for z/OS? That's where z/OS Connect comes in. With z/OS Connect, a z/OS-based service, such as a CICS or an IMS transaction, or a DB2 stored procedure (or just a single SQL DML statement, if desired) can be invoked by a remote client via a REST call (making it, therefore, a so-called RESTful service). The client-side programmer doesn't have to know anything about CICS or IMS transactions or DB2 stored procedures, and z/OS-side programmers don't have to know anything about the REST protocol or the JSON data format. z/OS Connect provides the bridge that links these two worlds.

A bridge, and more: z/OS Connect provides security functions to ensure that z/OS-side services can be invoked only by authorized requesters, and it enables z/OS SMF tracing of service usage activity. z/OS Connect also provides "service discovery" functionality that allows requesters to find available services and to get information on how a service can be invoked and what the service delivers (using "Swagger," more formally known as the OpenAPI Specification -- a lingua franca for providers and consumers of RESTful services).

And when you think about z/OS Connect, think beyond your own organization's developers who would love to have a REST interface to z/OS-based services. Think, as well, about developers outside of your organization. Think of people developing applications with mobile front-ends like smart phones and tablets. Does your organization have (or could you readily have) z/OS-based services that could be exposed, securely and in an industry-standard fashion, to the wider world as REST APIs? Might these APIs be useful components of what could become a widely used mobile app? Could these APIs create new revenue streams for your organization, or, if provided in a no-charge manner, extend your organization's brand to new segments of a population and/or provide information that could deliver new and actionable insights for your company? The possibilities are wide-ranging and exciting. Get with some of your application development leaders and get some conversations started.

By the way, DB2 for z/OS support for z/OS Connect Enterprise Edition is provided by way of the native REST support that was built into the distributed data facility (DDF) for DB2 12 for z/OS, and retrofitted to DB2 11 via the fix for APAR PI66828.


DB2 Connect or the IBM Data Server Driver?

When it's appropriate to use DB2 Connect instead of z/OS Connect (and that's the case when programmers know they are interfacing with a relational database management system and want to use a non-DBMS-specific data interface such as JDBC or ODBC), should you in fact use DB2 Connect itself, or should you use the IBM Data Server Driver?

Increasingly these days, you'll see in presentations from IBM DB2 for z/OS people (myself included) references to functionality provided by the IBM Data Server Driver, and you might find yourself wondering, "Is that same functionality present in DB2 Connect, and if it is, why isn't that mentioned?" Yes, the functionality mentioned is virtually certain to be present in DB2 Connect (given an equivalent version, such as 11.1 or 10.5). Why is DB2 Connect not mentioned? Two reasons: 1) "The IBM Data Server Driver and DB2 Connect" is kind of a mouthful, and (more importantly) 2) we REALLY want you to be using the IBM Data Server Driver instead of DB2 Connect (and by "we" I mean IBM DB2 for z/OS specialists "in the field," such as myself, and people in the DB2 for z/OS development organization and people in the DB2 Connect/Data Server Driver development organization). Why are we so keen on your using the IBM Data Server Driver instead of DB2 Connect? Several reasons:

  1. It's a lighter-weight piece of code.
  2. It should deliver better performance, as it is a "type 4" driver that provides a direct path from an application server to a DB2 for z/OS system, thereby avoiding what would otherwise be an extra "hop" to a DB2 Connect gateway server.
  3. It can help with management of your DB2 for z/OS client-server environment, in particular with things like problem source identification (as pointed out in an entry that I posted to this blog last year).
  4. It's easier to upgrade than DB2 Connect.
  5. It provides almost all of the functionality that you get with DB2 Connect (including connection pooling, connection concentration, and Sysplex workload balancing).

How can you use the Data Server Driver? Easy. Your DB2 Connect license entitles you to use the Data Server Driver instead of DB2 Connect (the one exception of which I'm aware being a "concurrent user" DB2 Connect license, which requires use of a DB2 Connect gateway server).

Is there any DB2 Connect functionality that is not in the Data Server Driver? Two things come to mind. One is the federation capability provided by way of DB2 Connect "nicknames." The other is support for two-phase commit when the application server in use utilizes a so-called dual-transport model. IBM's WebSphere Application Server uses a single-transport model, so no problem there. I believe that the Encina and Tuxedo transaction managers use a dual-transport model. Even if you have a transaction manager that utilizes a dual-transport model, if two-phase commit functionality is not needed you shouldn't have an issue in using the IBM Data Server Driver instead of DB2 Connect.

And there you go. Hope this information is helpful for you. As always, use the right tool for the right job.

Friday, September 23, 2016

DB2 for z/OS: Using PGFIX(YES) Buffer Pools? Don't Forget About Large Page Frames

Not long ago, I was reviewing an organization's production DB2 for z/OS environment, and I saw something I very much like to see: a REALLY BIG buffer pool configuration. In fact, it was the biggest buffer pool configuration I'd ever seen for a single DB2 subsystem: 162 GB (that's the combined size of all the buffer pools allocated for the subsystem). Is that irresponsibly large -- so large as to negatively impact other work in the system by putting undue pressure on the z/OS LPAR's central storage resource? No. A great big buffer pool configuration is fine if the associated z/OS LPAR has a lot of memory, and the LPAR in question here was plenty big in that regard, having 290 GB of memory. The 128 GB of memory beyond the DB2 buffer pool configuration size easily accommodated other application and subsystem memory needs within the LPAR, as evidenced by the fact that the LPAR's demand paging rate was seen, in a z/OS monitor report, to be zero throughout the day and night (I'll point out that the DB2 subsystem with the great big buffer pool configuration is the only one of any size running in its LPAR -- if multiple DB2 subsystems in the LPAR had very large buffer pool configurations, real storage could be considerably stressed).

A couple of details pertaining to this very large buffer pool configuration were particularly interesting to me: 1) the total read I/O rate for each individual buffer pool (total synchronous reads plus total asynchronous reads, per second) was really low (below 100 per second for all pools, and below 10 per second for all but one of the pools), and 2) every one of the buffer pools was defined with PGFIX(YES), indicating that the buffers were fixed in real storage (i.e., not subject to being paged out by z/OS). And here's the deal: BECAUSE the buffer pools all had very low total read I/O rates, page-fixing the buffers in memory was doing little to improve the CPU efficiency of the DB2 subsystem's application workload. Why? Because all of the pools were exclusively using 4K page frames.

Consider how it is that page-fixing buffer pools reduces the CPU cost of DB2 data access. When the PGFIX(YES) option of -ALTER BUFFERPOOL was introduced with DB2 Version 8 for z/OS, the ONLY CPU efficiency gain it offered was cheaper I/O operations. Reads and writes, whether involving disk volumes or -- in the case of a DB2 data sharing configuration on a Parallel Sysplex -- coupling facilities, previously had to be bracketed by page-fix and page-release actions, performed by z/OS, so that the buffer (or buffers) involved would not be paged out in the midst of the I/O operation. With PGFIX(YES) in effect for a buffer pool, those I/O-bracketing page-fix and page-release requests are not required (because the buffers are already fixed in memory), and that means reduced instruction pathlength for DB2 reads and writes (whether synchronous or asynchronous).

DB2 10 extended the CPU efficiency benefits of page-fixed buffer pools via support for 1 MB page frames. By default, in a DB2 10 (or 11) environment, a PGFIX(YES) buffer pool will be backed by 1 MB page frames if these large frames are available in the LPAR in which the DB2 subsystem runs. How does the use of 1 MB page frames save CPU cycles? By improving the hit ratio in the translation lookaside buffer, leading to more cost-effective translation of virtual storage addresses to corresponding real storage addresses for buffer pool-accessing operations. DB2 11 super-sized this concept by allowing one to request, via the new FRAMESIZE option for the -ALTER BUFFERPOOL command, that a page-fixed pool be backed by 2 GB page frames (note that 2 GB page frames may not save much more CPU than 1 MB frames, unless the size of the buffer pool with which they are used is 20 GB or more).
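By way of illustration, the command involved might look like the following for a hypothetical pool, BP8, in a DB2 11 environment (the pool name and VPSIZE value are made up; note also that a change to PGFIX takes effect the next time the buffer pool is allocated, so the pool has to be deallocated and reallocated -- or DB2 recycled -- for the new specification to take hold):

-- Page-fix the pool and ask for 1 MB page frames (FRAMESIZE is a DB2 11 option)
-ALTER BUFFERPOOL(BP8) VPSIZE(200000) PGFIX(YES) FRAMESIZE(1M)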

Having described the two potential CPU-saving benefits of page-fixed buffer pools, I can make the central point of this blog entry: if you have a PGFIX(YES) buffer pool that has a low total read I/O rate, and that pool is backed by 4 KB page frames, the PGFIX(YES) specification is not doing you much good because the low read I/O rate makes cheaper I/Os less important, and the 4 KB page frames preclude savings from more-efficient virtual-to-real address translation.

This being the case, I hope you'll agree that it's important to know whether a page-fixed buffer pool with a low read I/O rate is backed by large page frames. In a DB2 11 environment, that is very easy to check: just issue the command -DISPLAY BUFFERPOOL, for an individual pool or for all of a subsystem's buffer pools (in the latter case, I generally recommend issuing the command in the form -DISPLAY BUFFERPOOL(ACTIVE)). In the output for a given pool you'll see one or more instances of message DSNB546I. That message information might look like this:

DSNB546I  - PREFERRED FRAME SIZE 1M
        0 BUFFERS USING 1M FRAME SIZE ALLOCATED
DSNB546I  - PREFERRED FRAME SIZE 1M
        10000 BUFFERS USING 4K FRAME SIZE ALLOCATED

What would this information tell you? It would tell you that DB2 wanted this pool to be backed with 1 MB page frames (the default preference for a PGFIX(YES) pool), but the pool ended up using only 4 KB frames. Why? Because there weren't 1 MB frames available to back the pool (more on this momentarily). What you'd rather see, for a PGFIX(YES) pool that is smaller than 2 GB (or a pool larger than 2 GB for which 2 GB page frames have not been requested), is something like this:

DSNB546I  - PREFERRED FRAME SIZE 1M
        43000 BUFFERS USING 1M FRAME SIZE ALLOCATED

(This information is also available in a DB2 10 environment, though in a somewhat convoluted way as described in an entry I posted to this blog a couple of years ago.)

So, what if you see that a PGFIX(YES) pool is backed only by 4 KB page frames, and not by the larger frames (which, as noted above, are VERY much preferred for a pool that has a low total read I/O rate)? Time then for a chat with your friendly z/OS systems programmer. That person can tell you whether the LPAR has been set up to have some portion of its real storage resource managed in 1 MB (and maybe also 2 GB) page frames. Large frames are made available by way of the LFAREA parameter of the IEASYSxx member of the z/OS data set SYS1.PARMLIB. Ideally, the LFAREA specification for a z/OS LPAR should provide 1 MB page frame-managed space sufficient to allow PGFIX(YES) buffer pools to be backed to the fullest extent possible by 1 MB frames (and/or by 2 GB frames, as desired). It may be that DB2 is the one major user of large real storage page frames in a z/OS LPAR, and if that is the case then the amount of 1 MB (and maybe 2 GB) page frame-managed space could reasonably be set at just the amount needed to back page-fixed DB2 buffer pools (in the case of 1 MB frames, I'd determine the amount needed to back PGFIX(YES) buffer pools, and increase that by about 5% to cover some smaller-scale uses of these frames in a z/OS environment). If WebSphere Application Server (WAS) is running in the same z/OS LPAR as DB2, keep in mind that WAS can use 1 MB page frames for Java heap memory -- your z/OS systems programmer should take that into account when determining the LFAREA specification for the system.
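
To make that sizing guidance concrete with made-up numbers: if the PGFIX(YES) pools on a subsystem total 50 GB, adding roughly 5% suggests something in the neighborhood of 52-53 GB of 1 MB frame-managed storage. In IEASYSxx that might be expressed along the lines shown below -- and I'll stress that this is an illustrative sketch, not something to copy verbatim; the accepted LFAREA forms (including the variant that also reserves 2 GB frames) vary by z/OS release, so verify the syntax with your z/OS systems programmer or the MVS Initialization and Tuning Reference.

  LFAREA=53G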

There you have it. To maximize the CPU efficiency advantages of page-fixed buffer pools, make sure they are backed by large page frames. This is particularly true for pools with a low total read I/O rate. The more active a buffer pool is (and the GETPAGE rate is a good measure of activity -- it can be thousands per second for a buffer pool), the greater the CPU cost reduction effect delivered by large page frames.

And don't go crazy with this. Don't have a buffer pool configuration that's 80% of an LPAR's memory resource, and all page-fixed. That would likely lead to a high level of demand paging, and that would be bad for overall system performance. Know your system's demand paging rate, and strive to keep it in the low single digits per second or less, even during times of peak application activity. Leveraging z Systems memory for better performance is a good thing, but like many good things, it can be overdone.

Monday, August 29, 2016

DB2 for z/OS: Clearing Up Some Matters Pertaining to Database Access Threads

I have recently received a number of questions pertaining to DB2 for z/OS database access threads, or DBATs. DBATs are threads used in the execution of SQL statements that are sent to DB2 from network-attached applications (i.e., from DRDA requesters that access DB2 for z/OS by way of DB2's distributed data facility, also known as DDF). Thinking that these questions (and associated answers) might be of interest to a good many people in the DB2 for z/OS community, I'm packaging them in this blog entry. Without further ado, here are the Qs and As.


Question: What makes a DBAT a high-performance DBAT?

Answer: This question actually came to me in reverse form, as (and I'm paraphrasing), "What makes a DBAT a 'regular' DBAT versus a high-performance DBAT?" The answer's pretty simple: a high-performance DBAT is one to which at least one package bound with RELEASE(DEALLOCATE) has been allocated for execution, and a "regular" DBAT is simply one that does not have a RELEASE(DEALLOCATE) package allocated to it. Note that this answer presupposes that high-performance DBAT functionality has been enabled on the DB2 subsystem in question (more on this below) -- if that has not happened then all DBATs will be of the "regular" variety because there cannot be any high-performance DBATs.


Question: Can I get high-performance DBATs by binding the IBM Data Server Driver (or DB2 Connect, if that's what you use) packages with RELEASE(DEALLOCATE)?

Answer: Yes, but you'll want to do that the right way. This question is of greatest interest to organizations having DDF application workloads characterized by little (if any) in the way of static SQL statements. Execution of a static SQL statement, of course, involves execution of code in a DB2 package, and packages can be bound with RELEASE(DEALLOCATE), and, as noted above, executing a RELEASE(DEALLOCATE) package by way of a DBAT makes the DBAT a high-performance DBAT, if it wasn't such already. For a DDF-using application, static SQL-related packages could belong to stored procedures called by the application, or they might be associated with static SQL statements issued by the application itself -- for example, SQLJ statements embedded in a Java program.

What if a DDF-using application uses dynamic SQL exclusively? That's quite common, as SQL statements issued through non-DBMS-specific interfaces such as JDBC and ODBC (both very widely used) are dynamically prepared and executed on the target DB2 for z/OS server. Are such applications blocked from the CPU efficiency gains that can be delivered via high-performance DBATs? No, they are not. Why not? Because even when a DDF-using application issues SQL statements that are exclusively dynamic from the DB2 perspective, packages are still used. Which packages? IBM Data Server Driver -- or, alternatively, IBM DB2 Connect -- packages (your license for the latter entitles you to use the former, and it's recommended that you use the former). Those packages are bound, by default, into a collection called NULLID; so, should you bind all the packages in the NULLID collection with RELEASE(DEALLOCATE), so as to make your DBATs high-performance DBATs? No, you should not do that. Why? Because that would cause ALL of your DBATs to be of the high-performance variety, and that would not be an optimal use of this functionality.

You want to be smart about your use of RELEASE(DEALLOCATE) -- for DDF-using applications and for "local-to-DB2" applications, as well. That means, for a DDF workload, utilizing high-performance DBATs for frequently executed transactions that have a relatively small average in-DB2 CPU time (for such transactions, the proportional cost of repeatedly acquiring, releasing, and re-acquiring the same table space-level locks and package sections -- greatly reduced via high-performance DBATs -- is relatively significant versus transactions with higher in-DB2 times). You want, then, to use high-performance DBATs selectively. How can you do that? By binding the IBM Data Server Driver (or DB2 Connect) packages into the default NULLID collection with RELEASE(COMMIT), and into a collection not named NULLID with RELEASE(DEALLOCATE). THEN you can selectively use high-performance DBATs by pointing an application, by way of a client-side data source property, to the collection into which the Data Server Driver (or DB2 Connect) packages were bound with RELEASE(DEALLOCATE), when you want that application to use high-performance DBATs. And remember, by the way, to give consideration to upping your MAXDBAT value before starting to use high-performance DBATs.
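
To illustrate the server-side part of that setup: you could first see which Data Server Driver (or DB2 Connect) packages are in NULLID, then copy them into a second collection bound with RELEASE(DEALLOCATE). The collection name HIPERF and the package names shown below are placeholders -- the package names actually present depend on the driver level and how the packages were bound -- and the application would then be pointed at the new collection via a client-side property (for example, the currentPackageSet property of a JDBC data source, or the corresponding setting in db2dsdriver.cfg).

  SELECT NAME FROM SYSIBM.SYSPACKAGE WHERE COLLID = 'NULLID';

  BIND PACKAGE(HIPERF) COPY(NULLID.SYSSH200) ACTION(REPLACE) RELEASE(DEALLOCATE)
  BIND PACKAGE(HIPERF) COPY(NULLID.SYSLH200) ACTION(REPLACE) RELEASE(DEALLOCATE)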


Question: I'm using RELEASE(DEALLOCATE) packages with my DBATs, but my DB2 monitor indicates that I'm not getting high-performance DBAT usage. Why is that?

Answer: First of all, about this monitoring thing: check a DB2 monitor-generated statistics long report for your subsystem (or an online display of DDF activity provided by your monitor), and in the DDF activity section of the report, look for the field that is labeled HWM ACTIVE DBATS-BND DEALLC (or something like that -- field names can differ slightly from one DB2 monitor to another). If that field shows zero, your DDF applications are not using high-performance DBATs. If you see a zero in that field and think that you shouldn't, because you have RELEASE(DEALLOCATE) packages being executed via DBATs, issue the command -DISPLAY DDF DETAIL on the DB2 subsystem of interest. In the output of that command, look for the line with message DSNL106I. If what you see there is PKGREL = COMMIT, you're not getting high-performance DBATs because your DDF is not enabled for high-performance DBAT functionality. To change that, issue the command -MODIFY DDF PKGREL(BNDOPT). As a result of this command being executed, DDF will honor the RELEASE(DEALLOCATE) specification for packages allocated to DBATs for execution, and you'll then see instantiation of high-performance DBATs in the system.

Note that it is possible for certain BIND/REBIND, ALTER, and pending DDL-materializing online REORG operations to be blocked by a RELEASE(DEALLOCATE) package that is allocated to a persistent thread such as a high-performance DBAT. Even though DB2 11 delivered a RELEASE(DEALLOCATE) "break-in" feature to enable BIND/REBIND and ALTER and pending DDL-materializing online REORG operations to proceed in the face of RELEASE(DEALLOCATE) packages that would otherwise be in the way, it still may be necessary to issue -MODIFY DDF PKGREL(COMMIT) to temporarily "turn off" high-performance DBATs when you need to perform these operations. When you're done with the database administration actions, issue -MODIFY DDF PKGREL(BNDOPT) to turn high-performance DBAT functionality back on.
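
Putting that together, the check-and-enable sequence might look like this (the DSNL106I line shown is abbreviated, illustrative output, not verbatim):

  -DISPLAY DDF DETAIL
    ...
    DSNL106I PKGREL = COMMIT      <-- high-performance DBAT functionality not enabled
    ...
  -MODIFY DDF PKGREL(BNDOPT)      <-- enable: honor RELEASE(DEALLOCATE) for packages on DBATs
  -MODIFY DDF PKGREL(COMMIT)      <-- temporarily revert if a BIND/REBIND, ALTER, or
                                      pending DDL-materializing online REORG is being blocked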


Question: I'm seeing more idle thread timeouts for my DBATs than I'd like. What can I do about that?

Answer: The best way to prevent DBATs from timing out due to inactivity is to let the connections that use DBATs go into an inactive state when DDF transactions complete (and I'm referring here to "regular" DBATs -- a high-performance DBAT will remain tied to the connection through which it was instantiated until it terminates, typically as a result of being reused 200 times). It is important to realize that an application's connection to DB2 going inactive at the end of a transaction is a GOOD THING -- it's a big part of the DB2 for z/OS client-server scalability story. The virtual storage footprint of an inactive connection is very small, and the CPU cost of switching a connection from an active to an inactive state and back again is very small.

When an application's connection to DB2 goes inactive, the DBAT that it was using goes into a "disconnected" state (NOT inactive), which is to say that it goes into the DBAT pool, ready to be called up to service another transaction coming from some other connection. Pooled DBATs are not subject to the idle thread timeout limit (specified via the DB2 ZPARM parameter IDTHTOIN), and inactive connections don't time out.

How do you get inactive connection functionality? First, the DB2 ZPARM parameter CMTSTAT needs to be set to INACTIVE (that's been the default value for a LONG time). Second, transactions using DBATs need to finish with a "clean" commit. Basically, this means that nothing is allocated to the DBAT when a transaction using the DBAT completes -- so, no locks of any kind, no WITH HOLD cursors left un-closed, no declared global temporary tables left with data in them, etc. The most important thing that goes into getting "clean" commits is the COMMIT itself. Some client-side application developers think that a COMMIT isn't needed for a transaction that only reads data. Not true. If a read-only DDF transaction doesn't issue a COMMIT, the associated DBAT will hold one or more table space-level locks, and that would be enough to keep the connection from going inactive and the associated DBAT from going back into the DBAT pool -- if the DBAT stays in a connected state because no COMMIT has been issued, and if the amount of time specified in the ZPARM parameter IDTHTOIN goes by with no activity for the thread, it will time out.

Oh, and another thing: if you have client-side developers occasionally issuing SQL statements like SELECT 1 FROM SYSIBM.SYSDUMMY1 in a well-intentioned effort to keep connections to DB2 "alive," TELL THEM TO STOP DOING THAT. Such actions can keep DB2 connections from going inactive when transactions complete, and as I stated previously it is GOOD for DB2 connections to go inactive when DDF transactions finish processing.
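
As an aside, if you want to see connections that have gone inactive on your DB2 subsystem, a command along these lines will list them (I say "along these lines" because the qualifying keywords you'll want can vary with what you're trying to see):

  -DISPLAY THREAD(*) LOCATION(*) TYPE(INACTIVE)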


Question: Can the KEEPDYNAMIC(YES) package bind specification lead to idle thread timeouts for DBATs?

Answer: Yes. Here's the deal: with KEEPDYNAMIC(YES) in effect, prepared dynamic SQL statements are kept in a thread-specific local cache. Reusing one of these locally cached dynamic SQL statements is even more CPU efficient than getting a hit in the global dynamic statement cache in the DB2 EDM pool. That's good, but here's the flip side: with KEEPDYNAMIC(YES) in effect, you'll have prepared dynamic statements allocated locally to the thread, and as I noted above, a DBAT cannot go into a disconnected state (into the DBAT pool) at the end of a transaction if something is allocated to the DBAT. With the DBAT thus prevented from going into the DBAT pool (because of the prepared statements allocated locally to the DBAT), it is subject to hitting the idle thread timeout limit and being cancelled for that reason.

Now, in a couple of situations you get some relief from this stricture: if either Sysplex workload balancing or seamless failover is in effect (these are DB2 Connect or IBM Data Server Driver configuration options) then at commit time, if the only thing keeping a DBAT in the connected state is the local cache of prepared dynamic SQL statements that exists because KEEPDYNAMIC(YES) is in effect, the DBAT can go unused for 20 minutes before the DB2 idle thread checker will drive a cancellation of the thread. Also, when either of the aforementioned DB2 client configuration options is in effect (Sysplex workload balancing or seamless failover), and a DBAT associated with a KEEPDYNAMIC(YES) package is being repeatedly used (so that idle thread timeout does not occur), the DBAT can stay in the connected state for an hour before being terminated (this so that resources allocated to the DBAT will be periodically freed up).

Given all this, my thinking on KEEPDYNAMIC(YES) can be summed up as follows: if you have a DDF-using application that is specifically designed to take advantage of the KEEPDYNAMIC(YES) bind option, you can certainly consider going that route to achieve related CPU savings; otherwise, my inclination would be to use KEEPDYNAMIC(NO), go for hits in the global dynamic statement cache (not quite as good, efficiency-wise, as reuse of a locally cached prepared statement, but still a whale of a lot better than full PREPAREs of dynamic statements), and use high-performance DBATs to achieve additional CPU savings.
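
If you do have an application specifically designed to benefit from KEEPDYNAMIC(YES), the option is specified at package bind or rebind time -- for example (collection and package names below are placeholders):

  REBIND PACKAGE(MYCOLL.MYPKG) KEEPDYNAMIC(YES)

Keep in mind, too, that the number of prepared statements that DB2 will retain past commit points for KEEPDYNAMIC(YES) threads is capped by the ZPARM parameter MAXKEEPD.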

That's all for now. I hope that this information will be of use to you.

Thursday, July 28, 2016

DB2 for z/OS: Clearing the Air Regarding CLOSE YES for Table Spaces and Indexes

This is another of my DB2 for z/OS blog entries motivated by a desire to clear up a matter about which there is some misunderstanding in the DB2 community. In this case, the misunderstanding concerns the CLOSE option of the CREATE and ALTER statements for DB2 for z/OS table spaces and indexes.

The confusion here, as I've encountered it, is mainly related to differences between "soft close" and "hard close" of DB2 data sets (table spaces or indexes, or partitions of same if the objects are partitioned). "Soft close" is the term that some people use to describe pseudo-close processing, while "hard close" refers to the physical closing of a data set that had been open and allocated to DB2. The primary point I want to make can be summed up simply: specification of CLOSE YES or CLOSE NO affects "hard close" activity, not "soft close" activity. Information provided below is intended to flesh out and provide context for that point. 

I'll start by explaining what pseudo-close (aka "soft close") is about. Pseudo-close is done largely for the purpose of updating a table in the DB2 for z/OS directory called SYSLGRNX. This table contains information about the time periods (expressed as ranges within the DB2 transaction log) during which DB2 data sets are open and in a read/write state. Why does DB2 record this information? To speed up RECOVER utility jobs. Typically, when RECOVER is run for a table space or index (or partition of same, if partitioned), the job involves recovery "to currency" -- that is, to what would be the current state of the object. In that case, the RECOVER utility restores the most recent image copy backup of the target object, and applies data changes made subsequent to that backup to the object to bring it to currency. These data changes are obtained from the transaction log, and therein we find the benefit of pseudo-close processing. [Note that even if recovery will be to a prior point in time, versus "to currency," if that point in time is "later" than the time of the restored image copy, post-copy changes will be applied as necessary from the transaction log. RECOVER with the BACKOUT option also involves transaction log access, but no image copy restoration.]  
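
By way of a concrete -- and deliberately minimal -- example of the recovery-to-currency case (DBTEST1.TSTEST1 is a placeholder database.tablespace name, and a real job would wrap this control statement in the usual utility JCL):

  RECOVER TABLESPACE DBTEST1.TSTEST1

Point-in-time and backout variations add options such as TOLOGPOINT or BACKOUT to that basic statement.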

When a DB2 data set that is open for read/write access has gone for a certain interval of time without any data changes, it will be pseudo-closed. That interval of time is determined by the value of two DB2 ZPARM parameters, PCLOSEN and PCLOSET. The former parameter refers to a number of checkpoints (the default is ten), and the latter to a number of minutes (the default is ten), and the pseudo-close action is taken when the first of those thresholds is reached; so, if PCLOSEN and PCLOSET are at their default values, a DB2 data set open for read/write will be pseudo-closed if it goes for 10 DB2 system checkpoints or 10 minutes (whichever happens first) without being updated (with the default pseudo-close parameters in effect, 10 minutes will typically pass before 10 DB2 checkpoints have occurred -- it's unusual for DB2 checkpoints to be separated by less than a minute).

When a data set is pseudo-closed, DB2 switches the data set's state to read-only, and records that action (in terms of log RBA, or relative byte address, and -- in a data sharing environment -- log record sequence number, or LRSN, information) in the SYSLGRNX table in the DB2 directory. Though the data set, after being pseudo-closed, is in a read-only state, it is still available for update, and the state will be switched back to read/write from read-only when the next data-change operation (e.g., INSERT, UPDATE, DELETE) targeting the data set comes along -- and that (the switching of the data set's state from read-only back to read/write) is also recorded in SYSLGRNX. If the data set is recovered via the RECOVER utility at some time, RECOVER will get information pertaining to the data set that will indicate portions of the log that can be skipped over during change-apply processing -- if SYSLGRNX information indicates that the data set was in a read-only state between points X and Y (RBA or LRSN values) in the log, there's no need for RECOVER to process that part of the log because there's no way that data-change operations involving the data set will be recorded in that log range. The more of the log that RECOVER can skip over in recovering an object, the sooner the job will complete.

By the way, you can monitor pseudo-close activity for a DB2 subsystem by way of a DB2 monitor-generated statistics long report (or an online display provided by your monitor). In a statistics long report, find the section of information with the heading OPEN/CLOSE ACTIVITY. In that section, find the field with a label like DSETS CONVERTED R/W -> R/O. That shows the rate of pseudo-close actions. As a very rough rule of thumb, if the number of open data sets for the subsystem is somewhere in the vicinity of 10,000, I'm comfortable with a pseudo-close rate of around 20-40 per minute (you'd adjust that range proportionately for smaller or larger numbers of open data sets). 

That, then, is pseudo-close, and it is NOT affected by the specification of CLOSE YES or CLOSE NO for a DB2 data set. The CLOSE specification for a data set does affect "hard close" processing -- that is, the physical closing of data sets that had been open and allocated to DB2. These physical close actions can happen for two reasons -- one of which applies only to a DB2 data sharing environment (and I'm not talking about the data set close actions that are a normal part of shutdown processing for a DB2 subsystem). First, every DB2 subsystem has a limit on the number of data sets that can be open and allocated to the subsystem at one time. That limit is determined by the value of the ZPARM parameter DSMAX. If the number of data sets open and allocated to a DB2 subsystem reaches the DSMAX limit, DB2 will physically close some of those open data sets, starting with those that are a) defined with CLOSE YES and b) have gone the longest time without being accessed. If the DSMAX limit is reached for a subsystem and there are no open data sets defined with CLOSE YES, or if DB2 needs to close a number of open data sets beyond those defined with CLOSE YES, some data sets defined with CLOSE NO that have gone a long time without being referenced will be physically closed. Thus, CLOSE YES provides a means whereby you can indicate to DB2 that, in the event of the DSMAX open data set limit being reached, certain data sets are to be candidates for physical closing before other data sets are so considered. With this said, I'll note that my preference is to have DSMAX set to a value that is either not reached, or reached infrequently. More information about DSMAX, including how associated "hard close" activity can be monitored for a DB2 subsystem, can be found in an entry I posted to this blog a few months ago.
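
For reference, an object's CLOSE attribute is set via DDL and can be checked in the catalog (the CLOSERULE column); the names below are placeholders:

  ALTER TABLESPACE DBTEST1.TSTEST1 CLOSE YES;

  SELECT DBNAME, NAME, CLOSERULE
  FROM SYSIBM.SYSTABLESPACE
  WHERE DBNAME = 'DBTEST1';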

In a DB2 data sharing environment, there is an interesting relationship between "soft close" and "hard close" activity. Imagine that you have a 2-way DB2 data sharing group comprised of subsystems DB2A and DB2B. Suppose that data set X is being updated ("update" meaning "changed" -- via INSERT, UPDATE, and/or DELETE) by processes running on both DB2A and DB2B. If data set X goes for a pseudo-close interval without being changed on DB2B, it will be pseudo-closed on that member, and its state will be changed to read-only. If another pseudo-close interval goes by without any access at all (no read, no nothing) to data set X from DB2B, data set X will be physically closed on DB2B if it is defined with CLOSE YES. This physical closing of data set X on DB2B could result in DB2A getting an exclusive page set P-lock (physical lock -- used for data coherency versus access concurrency purposes) on the data set, causing it to become non-group-buffer-pool-dependent. That would reduce data sharing overhead by eliminating, from the system on which DB2A is running, group buffer pool-related coupling facility requests that would otherwise be required as part of accessing data set X.

So, in a DB2 data sharing environment, should objects be defined with CLOSE YES or CLOSE NO? There is, as you might expect, no one-size-fits-all answer to this question. CLOSE YES could be a good choice for an object if you think it likely that there will be significant stretches of time during which the object (or a partition thereof, if it is a partitioned table space or index) will be accessed exclusively by a process or processes running on a single member of the DB2 data sharing group (meaning, no access from other group members). In that case, CLOSE YES could enable acquisition by the one DB2 member of an exclusive page set P-lock on the object (or partition thereof). During the time interval in which that exclusive page set P-lock is held by the DB2 member, a lot of group buffer pool accesses could be eliminated, and that would be good from a CPU efficiency perspective. On the other hand, if you think that it would be unlikely for objects to be accessed for any significant period of time solely from one member of the data sharing group, CLOSE NO could be a better choice. Why? Because if there is a lot of pseudo-close activity going on in the data sharing group -- that is, if data sets are quite frequently being switched by DB2 members to a read-only state and then, very shortly, back to a read-write state -- then CLOSE YES could result in data sets frequently going from group buffer pool-dependent to non-group buffer pool-dependent and back again, and that could drive a lot of coupling facility activity (due to things like large-scale page registration and group buffer pool write activity). Ultimately, the CLOSE YES / CLOSE NO decision in a data sharing environment comes down to knowing how a given database object is likely to be accessed by processes running on various members of the DB2 data sharing group.

And there you have it. CLOSE YES has an impact on DB2 data set "hard close" (physical close) activity, but not on "soft close" (pseudo-close) activity, though in a data sharing environment CLOSE YES can lead to the physical closing of a data set as a result of a pseudo-close action. I hope that this information will be helpful to you.