Friday, May 28, 2021

The REST Interface: Db2 for z/OS as a Utility, and the Role of SQL Programmers

Recently, I was out on the West Coast of the United States to celebrate my son's graduation from college. At a party with some students and their parents, I conversed for a while with one of my son's good friends. This young man was about to get his degree in computer science. We talked shop, and I asked him about the code he'd typically write to interact with a data server. Perhaps he used a generic (i.e., non-DBMS-specific) SQL interface, such as JDBC or ODBC? No, he said, he really didn't know SQL that well. I was a little surprised (on the inside) to hear that. I mean, SQL is kind of fundamental, isn't it? Then I reframed my question: "OK - I guess you'd rather just invoke a data service via something like a REST API, and let that service do what you want done, without having to worry about how the service works or being concerned with any of the particulars of the back-end system on which the service executes?" "Yeah," he said. "More like that."

I know this friend of my son is a smart and hard-working guy. It's not at all like he's too lazy to write data-access code like SQL. He doesn't care to do that because he doesn't need to. He's part of a new generation of developers who see a universe of services of all kinds. Sometimes they'll be involved in the creation of new services. Often they'll look to discover and invoke existing services - data-related and otherwise - to provide application functionality needed by organizations and end-users. This change in the application-development mindset is gaining momentum, and it will have some fairly profound implications for people who administer Db2 for z/OS and other data-serving platforms. Earlier this year I posted a blog entry that focused on the nuts and bolts of Db2's REST interface. In this entry, I'm going to shift from the "how" of the Db2 REST interface to the "so what?"

A really big "so what" for Db2 for z/OS people will be the change in how the data server they support is perceived by application developers - particularly, developers on the client side of client-server applications. For growing numbers of such developers, Db2 for z/OS itself will be seen as, essentially, a utility. And I'm not using "utility" in the technical Db2 sense (e.g., the REORG utility); rather, I mean "utility" as in, "electric utility." Think about that. When you flip the light switch in a room, you expect electricity to flow and lights to come on. You probably don't think about where the electricity came from (maybe a power plant located some miles from your home), or how it was generated (maybe a natural gas-powered generator, maybe a wind farm, maybe a hydroelectric installation by a dam). Your "electricity interface" - the light switch - is easy to use.

If you go from your home to someone else's home, or to a hotel room or to a conference room in an office building, you easily locate the light switch because it looks like the ones in your home; and, you benefit from "consistency of electricity invocation": whether you're at home or in an office or wherever, you flip the switch up and electricity flows. Flip the switch down, flow stops and the lights go out.

Suppose you have an electrical outlet hooked to a switch. You have all manner of electrical devices - a music player, a blender, a printer. They can all be utilized through the same standard interface (the outlet). If you take these devices to another room in another building, the interface is the same: plug 'em in and use 'em. Your thinking is about what you'll do with the electricity you can invoke through the standard interface - not about the way in which the electricity was generated for your use.

OK, you're a client-side application programmer, coding in accordance with the REST architectural style. You have a consistent interface by which you can invoke services, and some of those services might be data-related. You (or, more likely, your program) will "flip the switch" by issuing a REST request, and the requested data service (retrieve such-and-such data associated with thus-and-so input, or persist this value, or update this value, or whatever) will get done, as you expect (the lights come on), and you (or your program) will move on to some other action. How did that data service get performed? Where did that data service get performed? DON'T KNOW. DON'T CARE. It worked, didn't it? Do you care about that? Probably not. Why? Because the result - successful execution of the requested service - was expected. When you flip the light switch in a room, do you think, "Cool! THE LIGHTS CAME ON!" No? Neither do I.

Shift the perspective now to that of the people - like DBAs - who work at the "data-services utility," which they know as a Db2 for z/OS system. That system has such cool stuff, doesn't it? Partition-by-growth table spaces! Automatic clean-up of pseudo-deleted index entries! Relative page numbering! LOB in-lining! OLAP expressions! Global locking on a Parallel Sysplex! We geek out about this stuff at conferences. Wouldn't client-side developers like to know about that cool stuff? No, they would not, and that is especially true if their way of interacting with your data-serving utility - invoking REST APIs - completely abstracts your world from their point of view.


And let's not forget, in our survey of personas in this new client-server world, the group of people who are key colleagues of the DBAs and the systems programmers at the Db2 for z/OS utility: the SQL programmers. That's right. Remember, when the REST architectural style is employed there is no client-side SQL. Any SQL statements executed at your site will be those invoked via REST requests issued by client-side developers. The SQL - whether an individual statement or a stored procedure - that performs a service associated with a REST request has to be developed by someone, and that someone is generally going to be, from the client-side developer's perspective, a person who works at the data-service utility. A client-side developer may need a new service (as opposed to the situation in which you REST-enable an existing service), and that service may be performed on the utility side through execution of SQL. That detail doesn't matter to the client-side developer, who just says, "I want to send this input and have this action performed and receive this output." Someone on the utility side will analyze the specifications of the required service and will code SQL to get it done. The SQL will be REST-enabled, and away things go: service is invoked (switch is flipped) on the client side, and on the utility side the associated SQL statement (about which the consumer likely has no clue) is executed, and it executes quickly and efficiently and reliably because the DBAs and systems programmers have constructed and maintain an environment that delivers those qualities of service. Thus it is that the SQL developers' role will come to overlap more and more with the role of the DBAs, and vice versa. They're on the same team, and together they provide the so-good-it's-not-noticed level of data service to the client-side consumers.

I've been doing what I do for the better part of 35 years, and this emerging application development landscape has me feeling young again. Client-side developers, I am A-OK with you having little to no visibility with regard to the technology I support and what I do to support it. When you flip the switch, and my utility is responsible for the lights coming on right away and as expected so that you (your program) can proceed to do something that needs doing, I'll be happy even if you don't give it another thought. Trust me on that one.

Thursday, April 29, 2021

Db2 for z/OS: Dealing With Application Compatibility Errors (-4743 SQL Error Code)

The notion of "application compatibility," in a Db2 for z/OS context, has been around for a while - it was introduced with Db2 11, which came out in 2013; still, in my experience it's been only recently that a number of Db2 for z/OS administrators have found themselves dealing with problems in this area. The incidents typically involve dynamic SQL statements. A person - maybe a Db2 DBA, maybe an application developer, maybe a user - issues a dynamic SQL statement via some tool (Db2's built-in SPUFI, the Db2-supplied DSNTEP2 program, a query tool such as IBM QMF, whatever) and execution of the statement fails, with Db2 returning a -4743 SQL error code: "ATTEMPT TO USE NEW FUNCTION WHEN THE APPLICATION COMPATIBILITY SETTING IS SET FOR A PREVIOUS LEVEL." Quite often, it's the first time that a Db2 DBA has encountered this error code, and he or she is left wondering, "Huh? What does THAT mean, and what am I supposed to do about it?" I'll try to provide some helpful information through this blog entry.

Why now?

If the -4743 Db2 SQL error code has been around since 2013 (and it has), why is it that a number of people have encountered problems of this nature just within the last year or so? I'd say that's largely due to a couple of factors:

  • Very helpful new SQL syntax and functionality introduced with Db2 12 for z/OS, with function level 500 or higher activated. Wait - hasn't Db2 12 itself been out for over four years? Yes, but there's often a lag between a new Db2 feature being introduced and people acting to take advantage of the new feature. Db2 12 delivered a number of important SQL-related enhancements - piece-wise DELETE, result set pagination and advanced triggers being just a few examples - and as people try these enhancements out they find that the appropriate application compatibility level needs to be in effect. Want to use the LISTAGG function, introduced with Db2 12 function level 501? Application compatibility level V12R1M501 (or higher) needs to be in effect.

  • In a Db2 12 environment, application compatibility level affects DDL statements (e.g., CREATE and ALTER) as well as DML statements such as SELECT and DELETE. When the Db2 for z/OS concept of application compatibility level was introduced with Db2 11, it applied only to SQL DML statements (DML is short for data manipulation language); so, if you tried to execute, in a Db2 11 new-function mode environment, a query that referenced a Db2 global variable (something introduced with Db2 11), and the application compatibility level in effect were V10R1, the statement would fail with a -4743 SQL error code. If on the other hand you issued, in that same Db2 11 environment, a CREATE VARIABLE statement, with application compatibility level V10R1 again being in effect, that statement would execute successfully (assuming you had the Db2 privileges needed to create a global variable). Why would creating a global variable be OK in a Db2 11 environment with the application compatibility level being V10R1, while in the same Db2 environment with that same application compatibility level being in effect a query referencing a global variable would fail with a -4743 error code? The reason: CREATE VARIABLE is an SQL DDL statement, and SELECT is a DML statement, and as noted the application compatibility level affected execution of DML statements but not DDL statements in a Db2 11 system. In a Db2 12 system, the application compatibility level affects both DDL and DML statements.

Examples of -4743 error situations in a Db2 12 environment

A few weeks back, a Db2 for z/OS DBA, working in a Db2 12 environment with function level 500 activated, contacted me to see why the following rather straightforward-looking CREATE LOB TABLESPACE statement was failing with - yep - a -4743 error code (I have masked the actual names of the table space, database, etc.):


"What application or tool did you use to issue the CREATE statement?" I asked. The DBA's response: "SPUFI." "OK," I said, "Your SPUFI packages probably have an APPLCOMPAT value of V10R1 or V11R1. Rebind them with APPLCOMPAT(V12R1M500)." The DBA rebound the SPUFI packages with the suggested APPLCOMPAT value, and the CREATE LOB TABLESPACE statement worked just fine.

What messed things up initially? The COMPRESS NO specification. Why? Because the ability to compress a LOB table space (done using a mainframe compression feature called zEDC) was introduced with Db2 12 with function level 500 activated. "So what?" you might say. "The DBA didn't specify COMPRESS YES for the LOB table space. He specified COMPRESS NO. He wasn't requesting compression for the LOB table space." True, but you can't include a COMPRESS specification - YES or NO - for a LOB table space until function level 500 has been activated for your Db2 12 system. That means application compatibility level V12R1M500 (or higher) needs to be in effect when you issue the CREATE LOB TABLESPACE statement with a COMPRESS specification, and THAT'S where APPLCOMPAT comes in. I've repeatedly referred to a Db2 application compatibility level being "in effect." What determines the application compatibility level that is in effect when a SQL statement is issued? It's the APPLCOMPAT value of the package associated with the SQL statement; and, there is ALWAYS a package associated with a Db2 SQL statement, whether the statement is static or dynamic, DML or DDL or whatever. If the SQL statement is dynamic, the package might be a SPUFI package (as in this case), or a DSNTEP2 package, or maybe an IBM Data Server Driver / Db2 Connect package, if the SQL statement is issued by a DRDA requester application or tool.

OK, example number 2. A Db2 for z/OS team at an organization here in the USA had recently activated function level 500 for a Db2 12 system, and they were eager to try out the way-improved MERGE statement enabled via that function level. They gave it a shot, and the statement failed with a -4743 SQL error code. In a meeting with the team, I asked what tool or application had been used to issue the MERGE statement. They told me: a workstation-based tool that used IBM's JDBC driver to access the target Db2 for z/OS system. Probably an issue with the APPLCOMPAT value of the IBM Data Server Driver packages, I explained; but, these folks were ahead of me. "We already rebound all of those packages with APPLCOMPAT(V12R1M500)." Oh. Still, Db2 was telling us via the -4743 that the application compatibility level in effect when the MERGE statement was issued was not what it needed to be. I requested that someone use the workstation-based tool to issue this statement:

SELECT CURRENT APPLICATION COMPATIBILITY FROM SYSIBM.SYSDUMMY1;


Here's what the query returned:

V10R1


Aha! OK, some explanation: CURRENT APPLICATION COMPATIBILITY is one of the Db2 special registers. Where does its value come from? Well, the special register's initial value for a Db2 session comes from the APPLCOMPAT value of the package being used. The V10R1 returned by the query meant that the workstation-based tool was using a package bound with an APPLCOMPAT value of V10R1. Using that piece of information, and knowing the names of the IBM Data Server Driver packages, one of the DBAs did a quick query of the Db2 system's SYSPACKAGE catalog table, selecting packages having an APPLCOMPAT value of V10R1. Sure enough, there was a set of the IBM Data Server Driver packages bound with APPLCOMPAT(V10R1) in a collection other than the default (for those packages) NULLID collection, and the workstation-based tool was using a package in that alternate collection (these days, it is increasingly common for organizations to have several collections of the IBM Data Server Driver packages, with the packages in the various collections differentiated by bind specifications). The DBA rebound the IBM Data Server Driver packages in the alternate collection with APPLCOMPAT(V12R1M500), the MERGE was issued again via the workstation-based tool, and it worked like a champ.
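The comparison Db2 is making in both examples can be modeled in a few lines. What follows is a minimal Python sketch, not Db2 code: the function names parse_level and statement_allowed are made up for illustration, and treating a level with no M-part (such as V10R1) as modification 0 is an assumption made only so that levels sort sensibly.

```python
import re

# Hypothetical helpers that model the rule described in the text.
# Nothing here talks to Db2.
_LEVEL = re.compile(r"^V(\d+)R(\d+)(?:M(\d+))?$")


def parse_level(level):
    """Turn 'V12R1M500' or 'V10R1' into a comparable tuple, e.g. (12, 1, 500)."""
    match = _LEVEL.match(level.strip().upper())
    if match is None:
        raise ValueError(f"not an application compatibility level: {level!r}")
    version, release, mod = match.groups()
    # Assumption: a level written without an M-part compares as modification 0.
    return (int(version), int(release), int(mod or 0))


def statement_allowed(in_effect, required):
    """True when the level in effect is at least the level a feature needs."""
    return parse_level(in_effect) >= parse_level(required)


# The MERGE example: package bound with V10R1, feature needs V12R1M500.
print(statement_allowed("V10R1", "V12R1M500"))      # False, i.e. the -4743 case
print(statement_allowed("V12R1M500", "V12R1M500"))  # True after the rebind
```

The "in effect" argument stands for what CURRENT APPLICATION COMPATIBILITY would return for the session, which in turn starts out as the APPLCOMPAT value of the package being used.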

"OK, I understand this," you might be thinking, "But in the case of a tool or application used with dynamic SQL statements, couldn't you just issue SET CURRENT APPLICATION COMPATIBILITY = 'V12R1M500' (or whatever higher application compatibility level is needed), instead of rebinding the Db2 package being used?" To answer that question: you can indeed use a SET CURRENT APPLICATION COMPATIBILITY statement to change the Db2 application compatibility level in effect for subsequent SQL statements that you or a program might issue, but you can only change that level to a value BELOW the APPLCOMPAT value of the package being used - you can't dynamically go to a level higher than the package's APPLCOMPAT value. If you're wondering, "What good would dynamically going to a LOWER application compatibility level do you?" I'll cite the example of needing to create a traditional segmented table space using a package bound with an APPLCOMPAT value of V12R1M504 (or higher). You can get the details of that scenario in an entry I posted to this blog exactly one year ago.

And that wraps up this blog entry. I hope that it will help you to avoid -4743 SQL errors (or help you to get past those errors if they show up at your site).

Wednesday, March 31, 2021

Db2 for z/OS: Clearing Up Some Matters Pertaining to PGSTEAL(NONE) Buffer Pools

I have written a few entries for this blog in which I provided information about Db2 for z/OS buffer pools defined with PGSTEAL(NONE), including one from a few years ago in which I differentiated the PGSTEAL and PGFIX buffer pool specifications, and another from just a few months ago in which I explained - among other things - why a PGSTEAL(NONE) buffer pool is referred to as a "contiguous buffer pool" in a Db2 12 environment. With z/OS LPAR real storage sizes getting larger and larger and Db2 people looking - rightly - for more ways to exploit "big memory" to gain performance advantages, PGSTEAL(NONE) buffer pools are getting more attention than ever. That said, I have noticed that there continues to be some misunderstanding among Db2 users about a couple of matters pertaining to PGSTEAL(NONE) pools: the use of such pools for read/write versus read-only tables, and the appropriate use of large real storage page frames for PGSTEAL(NONE) pools. In this blog entry I'll try to clear things up on those two points.

First, a little background (for more background information you can check one or both of the blog entries for which I provided links, above). The PGSTEAL(NONE) specification does NOT mean that Db2 cannot steal buffers in such a pool - Db2 will always steal a buffer in a pool when it needs to, and it needs to when the pool's buffers are all occupied and a page of an object assigned to the pool needs to be read from disk storage into memory. PGSTEAL(NONE), rather, is your way of telling Db2, "There should not be a need to steal buffers in this pool, because I have made it (or I think I have made it) large enough to hold all pages of all objects assigned to the pool." A PGSTEAL(NONE) buffer pool, then, is intended for use as a "pinning" pool (i.e., as a buffer pool in which assigned database objects - table spaces and/or indexes - are "pinned" in memory, which is to say, cached in memory in their entirety). Of course, you could set up a pinning pool long before we got the PGSTEAL(NONE) buffer pool specification with Db2 10 for z/OS - all you had to do was make the pool large enough to completely cache the assigned object or objects in memory. Why, then, was the PGSTEAL(NONE) option provided?

PGSTEAL(NONE) was introduced so that Db2 would know that the pool in question is intended for the pinning of objects in memory. You see, when Db2 realizes that the pool is to be used for completely caching objects in memory, it will pitch in to help in achieving your no-read-I/O goal (no read I/Os after an object is fully in memory, that is). When an object assigned to a PGSTEAL(NONE) buffer pool is first accessed after the pool has been allocated, Db2 will immediately get for the associated process (e.g., an application program) the needed page or pages, so that the process can get on with whatever it's doing; then, Db2 will asynchronously (i.e., via prefetch) read into the buffer pool all of the remaining pages of the object. Voila! The object is now completely in-memory. Db2 12 goes a step further and enhances the efficiency of access to an object's pages in a PGSTEAL(NONE) buffer pool by arranging the pages in a contiguous fashion in the pool, as explained in the second of the two blog entries for which I provided links, above.

That is the gist of PGSTEAL(NONE) buffer pool functionality. On now to the two misunderstandings I want to clear up via this blog entry.

Read/write versus read-only tables

This misunderstanding regarding PGSTEAL(NONE) buffer pools has persisted for quite some time. Somehow, a lot of Db2 for z/OS people got the idea that a PGSTEAL(NONE) buffer pool is appropriate only for objects that are read-only (or very nearly so) in nature. WRONG. Read/write versus read-only access will generally not be an important factor when deciding whether an object would be a good candidate for a PGSTEAL(NONE) buffer pool. So what if a page of an object cached in a PGSTEAL(NONE) buffer pool gets updated? The updated page will be externalized (to disk, probably in an asynchronous fashion, or to a group buffer pool, likely in a synchronous manner as part of commit processing, when the Db2 subsystem is part of a data sharing group) as needed and in the usual way. No problem.

There are basically two things to consider when you're looking for database objects (table spaces or indexes) that would be good candidates for a PGSTEAL(NONE) buffer pool: size and frequency of access. Let's take a closer look at these two factors:

  • Size - That this is a factor should be obvious. Suppose you have a table with a billion rows of data (not unusual - an organization with which I often work has more than 50 Db2 for z/OS tables that each hold more than 1 billion rows of data). That table's rows might take up several hundred gigabytes of space. Could you cache that table in memory in its entirety? Sure you could - an individual Db2 buffer pool can be multiple terabytes in size. Do you want to use that much memory for a single database object? Probably not. With that said, when you think about size as a factor in evaluating candidates for a PGSTEAL(NONE) buffer pool, you don't have to think, "Tiny." As I mentioned in my opening paragraph, z/OS LPAR real storage sizes are getting larger all the time (I have seen plenty of LPARs with several hundred gigabytes of real storage - the largest real storage size for an LPAR that I've personally seen in the real world is 1.1 TB). Db2 buffer pool configurations are growing along with z/OS LPAR real storage sizes (I've seen a real-world Db2 for z/OS subsystem with an 879 GB buffer pool configuration; I've seen an individual buffer pool sized at 253 GB). With a large real storage resource available on a system, could you reasonably consider assigning to a PGSTEAL(NONE) buffer pool an object that takes up several hundred megabytes of space? Maybe a gigabyte or more? Yes, you could. The size-appropriateness, then, of an object for a PGSTEAL(NONE) buffer pool is going to depend in large part on the size of the relevant z/OS LPAR's real storage resource and the degree to which that resource can be used for the Db2 subsystem (even a large real storage resource can be over-utilized - I like to see a z/OS LPAR's demand paging rate at either zero or less than 1 per second).

[Something else to think about when considering the size of an object that you might assign to a PGSTEAL(NONE) buffer pool: online REORG. If you execute an online REORG for an object in a PGSTEAL(NONE) buffer pool, that buffer pool will also be used - understandably - for the object's shadow data set(s). That, in turn, means that you'll either size the PGSTEAL(NONE) buffer pool to accommodate the shadow data set(s), or you'll be prepared for a lot of I/O activity for the buffer pool when the REORG job is running, or you'll think about not REORGing the object (and that may not be so far-fetched - some objects truly do not have to be reorganized).]

  • Frequency of access - Yes, z/OS LPAR real storage sizes are getting bigger and bigger, but mainframe memory isn't free. You want to use it effectively - maximize its value. Can you put a lot of dinky objects in a PGSTEAL(NONE) pool? Yes. If those objects are not accessed particularly frequently, would that be a good use of mainframe memory? Probably not. What you want in a PGSTEAL(NONE) buffer pool are objects that are accessed very frequently - that will give you the biggest bang for your buck, as we say. You may already know which not-too-big (see the discussion of size, above) table spaces and/or indexes in your system get a great deal of access activity. If you don't know that, information in the real-time statistics tables in the Db2 catalog can provide clarity (for example, the GETPAGES column of SYSTABLESPACESTATS and SYSINDEXSPACESTATS).

So, consider object size, and consider frequency of access when looking for good candidates for a PGSTEAL(NONE) buffer pool. Type of access, in the read/write versus read-only sense, is generally not going to be an important factor (yes, read/write access could impact the need for online REORG, and that has an implication that I noted in the [bracketed] paragraph above).
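
The two-factor screening described above could be sketched roughly as follows. This is a Python illustration only: the function name shortlist, the dictionary keys, the sample rows and both thresholds are made up, and the thresholds are placeholders rather than recommendations. In practice the page counts and GETPAGE counts would come from your own real-time statistics data.

```python
def shortlist(objects, max_pages, min_getpages):
    """Keep objects that are small enough and busy enough, busiest first."""
    picked = [
        obj for obj in objects
        if obj["pages"] <= max_pages and obj["getpages"] >= min_getpages
    ]
    return sorted(picked, key=lambda obj: obj["getpages"], reverse=True)


# Made-up sample rows, for illustration only.
sample = [
    {"name": "TS_LOOKUP", "pages": 20, "getpages": 9_000_000},
    {"name": "TS_HISTORY", "pages": 40_000_000, "getpages": 12_000_000},
    {"name": "IX_ACCOUNT", "pages": 5_000, "getpages": 30_000_000},
    {"name": "TS_ARCHIVE", "pages": 3_000, "getpages": 1_200},
]

for obj in shortlist(sample, max_pages=250_000, min_getpages=1_000_000):
    print(obj["name"], obj["pages"], obj["getpages"])
```

With these sample rows, the very large object and the rarely accessed object drop out, and the remaining two are listed with the busier one first.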

Large real storage page frames

I've often recommended to Db2 for z/OS-using organizations that they take steps to ensure that busy buffer pools - those, I'd say, for which the GETPAGE rate is greater than 1000 per second - are fully backed by large real storage page frames. As I explained in an entry posted to this blog a few years ago, large real storage page frames improve page-access performance by making the translation of virtual storage to real storage addresses more CPU-efficient.

It's likely that a PGSTEAL(NONE) buffer pool in your system will be quite busy, in a GETPAGEs per second sense (I just got through explaining why frequency of access is an important consideration when evaluating candidate objects for a PGSTEAL(NONE) pool). That being the case, you should ensure that such a pool is backed by large real storage page frames, right? Well, maybe - maybe not.

Why the equivocation? Here's the deal. As I explained in my blog entry covering the "contiguous" nature of a PGSTEAL(NONE) buffer pool in a Db2 12 for z/OS system, a given real storage page frame used for a PGSTEAL(NONE) pool will hold pages belonging to one and only one database object. Suppose you want to assign to a PGSTEAL(NONE) buffer pool a pretty good number of small, very frequently accessed table spaces and/or indexes. Fine. That could be a very good use of a PGSTEAL(NONE) pool. Should that pool be backed by 1 MB real storage page frames, for a performance boost? Maybe not. Consider that it takes 256 buffers of 4 KB each to exactly fill a single 1 MB page frame. What if you assign, to a 4K pool defined with PGSTEAL(NONE) and FRAMESIZE(1M), a frequently accessed "code" or "look-up" table whose table space consists of only, say, 20 pages? Those 20 pages will go in one 1 MB page frame, and none of the rest of the space in that page frame will be usable for any other object assigned to the pool. What if there are quite a few such small objects assigned to the pool? That could mean that the actual size of the pool will have to be quite a bit larger than you'd first thought, owing to the unusable space in the 1 MB frames that each are limited to holding the few pages of a single object. Are you OK with that? Maybe you'll decide to just go with 4 KB page frames for this PGSTEAL(NONE) pool, to make more efficient use of memory, and I'd find that decision to be very much understandable. What if objects assigned to a PGSTEAL(NONE) pool have, on average, a few thousand pages apiece? Might you still have some "wasted" space in 1 MB page frames if it's a FRAMESIZE(1M) pool? Yes, but in a proportionate sense this won't be such a big deal. What if an object assigned to the pool has 5000 pages? The first 4864 of those pages will exactly fill nineteen 1 MB page frames, and the 136 pages left over will go in another 1 MB frame, and the remaining space in that 1 MB frame, which could hold another 120 pages of 4 KB each, will not be usable for another object. That would be nineteen 1 MB page frames fully utilized, and about 47% of the twentieth 1 MB page frame "wasted," or a little over 2% wastage when you consider all of the twenty 1 MB page frames used for the object. Personally, I wouldn't lose any sleep over that.
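
If you'd like to run that arithmetic for your own objects, here is a small Python sketch of it. The function name frame_usage and its result keys are made up for illustration; the only facts it relies on are the ones stated above (256 buffers of 4 KB per 1 MB frame, and a frame holding pages of just one object).

```python
def frame_usage(object_pages, frame_bytes=1024 * 1024, page_bytes=4096):
    """Frames needed, and buffers left unusable, for one object in the pool."""
    if object_pages <= 0:
        raise ValueError("object_pages must be positive")
    pages_per_frame = frame_bytes // page_bytes      # 256 for 4 KB pages, 1 MB frames
    frames = -(-object_pages // pages_per_frame)     # ceiling division
    unused = frames * pages_per_frame - object_pages
    return {
        "frames": frames,
        "unused_buffers": unused,
        "waste_in_last_frame": unused / pages_per_frame,
        "waste_overall": unused / (frames * pages_per_frame),
    }


for pages in (20, 5000):
    usage = frame_usage(pages)
    print(pages, usage["frames"], usage["unused_buffers"],
          f"{usage['waste_overall']:.1%}")
# 20 pages   -> 1 frame,   236 unused buffers, 92.2% of the frame space wasted
# 5000 pages -> 20 frames, 120 unused buffers,  2.3% of the frame space wasted
```

The contrast between the two rows is the whole argument: the same frame size that is nearly free for a few-thousand-page object is very expensive for a 20-page look-up table.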

Here's something that follows from what I've just explained: if you have identified a set of objects that you want to assign to a PGSTEAL(NONE) pool, and a number of those objects are quite small (with several having maybe fewer than 50 pages apiece) and others are larger (maybe a few thousand pages apiece), use two different PGSTEAL(NONE) pools. Specify FRAMESIZE(1M) for one of the pools, and assign the larger objects to that pool, and assign the smaller objects to the pool for which you're using the default frame size of 4K (assuming we're talking about objects with 4 KB-sized pages). That will give you some performance boost from large frames for the one pool, while ensuring efficient use of memory for the other pool.

By the way, I'm not talking about FRAMESIZE(2G) for a PGSTEAL(NONE) pool because Db2 does not allow 2 GB page frames to be used for a PGSTEAL(NONE) pool. If you want to back a PGSTEAL(NONE) pool with large frames, go with FRAMESIZE(1M).

I hope that this blog entry will be helpful for you. Thanks for being a reader of the blog.

Wednesday, February 24, 2021

Are You Using the REST Interface to Db2 for z/OS?

Db2 for z/OS got a built-in REST interface about four years ago. Have you put it to work? Perhaps you should.

Some information about REST

REST is short for representational state transfer, an architectural style that enables invocation of services through a straightforward, consistent mechanism. It is increasingly popular among application developers in large part because it is lightweight, easy to use, and provides a high degree of abstraction with respect to the particulars of a service-providing system. That can be really helpful when an application's client-side programs have to interact with a number of disparate back-end systems - if the service requests your programs issue look largely the same in form, even though the responding systems are quite different from each other, that is likely to be a big boost to your coding productivity, versus a scenario in which you have to interact in different ways (maybe very different) with the various systems that provide services your programs consume.

Let me show you what I mean. A REST request to get information related to account ID 123456789, issued by a client program and fielded by a Db2 for z/OS system, might look something like this:


Body: { "ID": 123456789 }


The response received by the client from the Db2 for z/OS system could look like this:



{
  "FIRSTNAME" : "John",
  "LASTNAME"  : "Smith",
  "BALANCE"   : 1982.42,
  "LIMIT"     : 3000.00
}


There is nothing in the REST request issued by the client program, and nothing in the request response received by the program, that gives any indication that the request was processed by a Db2 for z/OS system, and that's the point. This type of programming model, sometimes referred to as data-as-a-service, eliminates the need for a client-side developer to know the technical particulars of a request-processing system. Is it a z/OS system? A Linux or a Windows system? Will a relational database management system be involved? Will data be retrieved from flat files, or maybe from a Hadoop data store or a NoSQL database? For the developer of the REST request-issuing program, the answer to these questions can be, "Don't know, don't care - all that stuff's plumbing, as far as I'm concerned." What matters to the developer is that a requested service - retrieval of data for a given account ID, in the example above - worked as expected.
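For readers who like to see the client side in code, here is a minimal Python sketch of how a program might build a request like the one described above. The host name, service path, user ID and password are all hypothetical, and I'm assuming HTTP basic authentication for the ID and password. The sketch only builds the request; it does not send it.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint and credentials, for illustration only.
SERVICE_URL = "https://db2host.example.com:4711/services/ACCOUNTS/getAccountInfo"
USER_ID = "appuser"
PASSWORD = "secret"


def build_request(account_id: int) -> urllib.request.Request:
    """Build (but do not send) a POST request with a JSON input document."""
    body = json.dumps({"ID": account_id}).encode("utf-8")
    # Assumption: ID and password travel in a basic-authentication header.
    token = base64.b64encode(f"{USER_ID}:{PASSWORD}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        SERVICE_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


req = build_request(123456789)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req), then json.load() the response.
```

Notice that nothing in this client code is specific to Db2 or to z/OS. Point the same code at a different URL and it could just as well be talking to some other back-end system.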

When Db2 for z/OS is the service-providing system, what happens when the REST request is received, and how did that REST service come to be, in the first place? Read on.

What Db2 does with a REST request

So, let's say that the first part of the example REST request seen above (the address to which the request is sent) resolves to a Db2 for z/OS system. In that case, the port specified in the request, 4711, would be the Db2 system's SQL listener port (it could be the system's secure SQL port, and if that is true then Db2 will require SSL encryption for the transaction). The request would be received by the Db2 distributed data facility (DDF), as Db2's REST interface is an extension of DDF functionality (that brings a cost-of-computing advantage: as many Db2 people know, when a SQL statement is executed via DDF, up to 60% of the CPU cost of SQL statement execution can be offloaded to a zIIP engine). Db2 parses the HTTP and JSON associated with the request (JSON is short for JavaScript Object Notation, and I'm referring to the request's input of { "ID": 123456789 } - more on that in a moment), and then does security checks. Security check number one is for user authentication purposes. Db2 requires that a REST request be associated with either an ID and a password or an ID and a certificate (if a password is supplied, it will be in the HTTP header of the request). The ID and password (or certificate) will be validated by RACF (or an equivalent z/OS security manager), and if authentication is successful then we move on to security check number two (note that, additionally, a RACF profile for REST access to the target Db2 subsystem can be defined, giving you the ability to permit an ID to access the Db2 subsystem via the REST interface but not through another interface such as CICS, or vice versa).

That second security check is of an authorization nature: does the ID associated with the REST request have the EXECUTE privilege on the Db2 package associated with the requested service? If that required Db2 privilege has indeed been granted to the ID (or to a group ID to which the ID associated with the REST request is connected), it's package execution time: the package is allocated for execution to the REST request's Db2 thread (a DDF thread, also known as a DBAT - short for database access thread), and the corresponding SQL statement is executed (with the input that accompanied the request, if input is required and supplied).


The Db2 REST service example I have used involves execution of this SQL statement:









The input value supplied with the REST request, account ID 123456789, is substituted for the parameter marker (the question mark) in the query text, and the result set is sent, in JSON format, to the REST client. That output document, as we saw before, looks like this:



{
  "FIRSTNAME" : "John",
  "LASTNAME"  : "Smith",
  "BALANCE"   : 1982.42,
  "LIMIT"     : 3000.00
}



This JSON document shows what a one-row query result set would look like when sent to the REST client. If it were a multi-row result set, the output JSON document would contain several sets of name-value pairs - one such set for each result set row.
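Here is a short Python sketch of what handling a multi-row result might look like on the client side. I'm assuming, purely for illustration, that the rows arrive as a JSON array with one object per result set row; the envelope that a real service returns may differ. The second row is invented for the example.

```python
import json

# Assumed payload shape: a JSON array with one object per result set row.
# The envelope a real service returns may differ.
sample_response = """
[
  {"FIRSTNAME": "John", "LASTNAME": "Smith", "BALANCE": 1982.42, "LIMIT": 3000.00},
  {"FIRSTNAME": "Jane", "LASTNAME": "Jones", "BALANCE": 250.00,  "LIMIT": 1000.00}
]
"""


def available_credit(rows):
    """Return {"FIRSTNAME LASTNAME": LIMIT - BALANCE} for each row."""
    return {
        f'{row["FIRSTNAME"]} {row["LASTNAME"]}': round(row["LIMIT"] - row["BALANCE"], 2)
        for row in rows
    }


rows = json.loads(sample_response)
for name, credit in available_credit(rows).items():
    print(f"{name}: {credit:.2f} available")
```

The names in each name-value pair become dictionary keys, so the client program works with column names (or their aliases) and never sees a cursor or a fetch loop.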


A word about these JSON documents used for Db2 REST service input and output "payloads": this is another aspect of the REST model that appeals to a lot of developers. Though it is not technically required that input and output data associated with REST requests be in the form of JSON documents, that tends to be the case, and with good reason: JSON documents, as you can see, are really easy to form and to parse (by programs and by people). That's not always true for, say, XML documents. JSON also tends to be "lighter weight" versus XML - fewer bytes are transmitted between client and server.


With regard to the name-value pairs seen in the JSON documents associated with a Db2 REST service, let's consider first the input. We saw, for my example, { "ID": 123456789 }. Where did that "ID" come from? It's the name of the Db2 table column referenced in the query predicate (referring to the SELECT statement associated with the REST service) that is coded with a parameter marker. "ID" is a pretty intuitive column name. What if that were not the case? What if you wanted to have a Db2 REST service for retrieval of information related to a product ID, and the table column had an obscure name like PRID? In that case, the predicate of the query to be REST-enabled could be coded with a host variable instead of a parameter marker, and the host variable name could then be used in the input document for the REST request instead of the column name (so, if the query predicate were of the form WHERE PRID = :PRODUCT_ID, the input JSON document could be { "PRODUCT_ID": 56789 } instead of { "PRID": 56789 }).


For an output JSON document that contains data from a query result set, the names in the name-value pairs are, by default, the names of the columns in the query's select-list. As noted above, you could have a table column with a not-very-intuitive name. That situation can be addressed by renaming the column in the query select-list. If, for example, you want to return a mobile phone number from a column named MONUM, you could code the query that you're going to REST-enable in this way: SELECT MONUM AS MOBILE_NUMBER, ... Having done that, you'll have an output JSON document with "MOBILE_NUMBER" : "111-222-3333" instead of "MONUM" : "111-222-3333".


How Db2 for z/OS REST services are created


I've referred to packages associated with Db2 REST services, and that might cause you to think that what you're REST-enabling is static SQL, and you'd be right about that: the Db2 capability I'm talking about enables invocation of a pre-coded, server-side, static SQL statement in response to receipt of a properly coded (and security-checked) REST request. And yes, I do mean a single statement. Sometimes that's all you need for a given Db2 REST service, and in such a case the single SQL statement can be a SELECT, INSERT, UPDATE, DELETE or TRUNCATE. What if you want a Db2 REST service to involve execution of several SQL statements? Can that be done? Absolutely, because the single static SQL statement for which you create a REST service can be a CALL of a Db2 stored procedure, and the called stored procedure could issue any number of SQL statements (if you REST-enable a CALL of a stored procedure, I'd recommend going with a native SQL procedure because execution of such a stored procedure, when the call is through DDF, is up to 60% zIIP-offload-able).


With a SQL statement that you want to REST-enable, you have two service-creation options. One option is to use the Db2-supplied REST service called DB2ServiceManager (a REST service that can be used to create a REST service from a SQL statement). Information on using DB2ServiceManager to create a REST service from a Db2 SQL statement (including an example) can be found in the Db2 for z/OS Knowledge Center on IBM's Web site.

Another option for creating a REST service from a SQL statement is the Db2 command BIND SERVICE, which can be used in a batch job such as the one shown below:



//         DD  DSN=DB2A.SDSNLOAD,DISP=SHR
//SYSTSPRT DD  SYSOUT=*
//SYSPRINT DD  SYSOUT=*
//SYSUDUMP DD  SYSOUT=*

//             DISP=SHR
//SYSTSIN  DD  *
 DSN SYSTEM(DB2A)

  NAME("simpleSelect1") -
  SQLENCODING(1047) -
  DESCRIPTION('return a list of deptname-
 based on input location')


More information about BIND SERVICE can be found in the Db2 Knowledge Center. Also in the Knowledge Center is additional information on invoking a Db2 REST service. And, check out the series of helpful videos on creating and invoking Db2 REST services (in - yes - the Db2 Knowledge Center).

Db2 REST service capabilities, made even better

As I noted at the beginning of this blog entry, the REST interface is built into Db2 for z/OS: if you have Db2, you have what you need to create and use Db2 REST services. With that said, if you want to have even greater ease of use in creating Db2 REST services, have your Db2 REST services described using Swagger (a standard specification with which many developers are familiar), have more flexibility in invoking Db2 REST services, and have more options for formatting the JSON output document associated with a Db2 REST service, you'll want to take a look at an IBM product called z/OS Connect. In an entry I'll post to this blog in the near future, I'll expand on what z/OS Connect can do to make Db2 REST services even more valuable for your organization.

Wednesday, January 27, 2021

Db2 for z/OS: Why You Might See a Lot of Catalog Access for Package Authorization Checks

From time to time I will receive a note from a Db2 for z/OS DBA, asking for help in understanding why there is a larger-than-expected amount of Db2 catalog access activity associated with package authorization checks, and what can be done to address that situation. In my experience there are two primary causes of this observed Db2 behavior, one more straightforward than the other. In this blog entry I'll describe these two drivers of head-scratchingly high numbers for catalog accesses related to package authorization, and I'll provide associated mitigating actions that you can take.

Before proceeding, a couple of points of clarification. When I mention "package authorization," what I'm talking about, in the context of this blog entry, is authorization to execute a package (not authorization to create a package by way of a BIND PACKAGE operation - that's a different subject). Also, this whole discussion is relevant primarily to packages that are the compiled and executable form of what Db2 people refer to as static SQL statements (there are other packages that essentially enable preparation and execution of dynamic SQL statements). Such a static SQL package could be associated with a Db2 stored procedure, or with a CICS-Db2 transaction program, or with a batch job. If the application process requesting execution of a static SQL package is local to the Db2 subsystem (i.e., it is not accessing the Db2 system by way of a network connection and the Db2 distributed data facility), the package will be executed in conjunction with a Db2 plan, with which it will be associated by way of the plan's package list (more on that to come).

OK, on to the first of the two aforementioned drivers of unexpectedly high catalog access activity related to package authorization checking.

A too-small package authorization cache

In a Db2 system in which the workload involves execution of a lot of static SQL statements, there can be a LOT of package authorization checks - like, hundreds per second. To help make this authorization checking as efficient as it can be, Db2 has, in its database services address space, an in-memory cache of package authorization information. When Db2 checks to see if ID ABC is authorized to execute package XYZ, it will check the package authorization cache first, to see if the authorization information is there. If that information is not found in the cache, Db2 checks the catalog (the SYSPACKAUTH table, for example) to see if ID ABC is authorized to execute package XYZ. The information found in the catalog is then placed by Db2 in the package authorization cache in memory, so that the next time a check of ID ABC's authorization to execute package XYZ is needed, the check can be very quickly and efficiently accomplished via the cache.

All well and good, but sometimes there's a bit of a problem here, and it has to do with the size of the in-memory package authorization cache. If it's too small, there could be a lot of cache "misses" for package authorization checks, and those misses will drive catalog accesses. How can you tell if this is happening? Use your Db2 monitor to generate a statistics long report for the Db2 subsystem of interest, covering a busy hour of a busy day (depending on the Db2 monitor in use at your site, this report might be called a statistics detail report). In that report, you should see a set of fields under the heading AUTHORIZATION MANAGEMENT, such as those shown in the report snippet below (headings and field names might vary slightly from one Db2 monitor to another).


AUTHORIZATION MANAGEMENT     QUANTITY  /SECOND
---------------------------  --------  -------
PKG-AUTH SUCC-W/O CATALOG     5885.1K  1634.76
PKG-AUTH UNSUCC-CACHE        43464.00    12.07

These numbers look good. Why do I say that? Because more than 99% of the time, a package authorization check was completed without a requirement for catalog access (1634.76 + 12.07 package authorization checks per second, with only 12.07 of those requiring catalog access). There were very few package authorization cache "misses" (as indicated by the smallness of the values in the PKG-AUTH UNSUCC-CACHE row relative to the values in the PKG-AUTH SUCC-W/O CATALOG row).
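If you want to check that arithmetic yourself, here is a tiny Python sketch. The function name is my own; the two inputs are the per-second values from the PKG-AUTH SUCC-W/O CATALOG and PKG-AUTH UNSUCC-CACHE rows of the report snippet.

```python
def cache_hit_ratio(succ_without_catalog: float, unsucc_cache: float) -> float:
    """Fraction of package authorization checks satisfied without catalog access."""
    total = succ_without_catalog + unsucc_cache
    return succ_without_catalog / total if total else 0.0


# Per-second values from the report snippet above
ratio = cache_hit_ratio(1634.76, 12.07)
print(f"{ratio:.2%} of package authorization checks avoided catalog access")
```

Plug in the numbers from your own statistics long report. A ratio well below 99% during a busy hour suggests that the cache deserves a closer look.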

Sometimes, the numbers in this part of a Db2 monitor statistics long report don't look so good. I've seen that the value in the PKG-AUTH UNSUCC-CACHE row can be as high as several hundred per second (even over 1000 per second). When that's the case, it can be an indication that the package authorization cache is way smaller than it should be. Why might that be the case? Well, the size of the cache is determined by the value of the CACHEPAC parameter in ZPARM, and prior to Db2 10 the default CACHEPAC value was only 100 KB. Sometimes, an existing ZPARM value is carried forward when a Db2 system is migrated to a new version, and that old and small CACHEPAC value did indeed go forward at quite a few sites; so, check the value of CACHEPAC for your Db2 system. If it is anything less than the max value of 10M (meaning, 10 MB), change the value to 10M.

Having said all this, I'll tell you that depending on the maintenance level of your Db2 system, you might look for CACHEPAC in your ZPARM listing and NOT SEE IT. Why is that? It's because the fix for APAR PH28280, which came out in the latter part of 2020, removed CACHEPAC from ZPARM and set its value internally to 10M (that fix did the same for CACHERAC, the parameter that specifies the size of the routine authorization cache, used to check authorization to execute Db2 routines such as stored procedures and user-defined functions).

OK, so a too-small package authorization cache can lead to high levels of catalog access activity related to package authorization checks. You can make the size of that cache what it should be, which is 10M (or IBM will do that for you, via the fix for the aforementioned APAR), but there's something else that could cause a lot of catalog access for package authorization checks...

Location asterisks in a plan's PKLIST

I noted earlier that a static SQL package used by a local-to-Db2 application will be executed in conjunction with a Db2 plan (remote client applications that access Db2 via the Db2 distributed data facility are associated, by default, with plan DISTSERV, which is not a plan in the traditional sense). The plan for a local-to-Db2 application has something called a PKLIST, which is short for package list. In a plan's PKLIST are the packages that can be executed by way of the plan. You can make use of asterisks in a plan's PKLIST in a couple of ways. One way, which is very common and helpful, is to use an asterisk for the packages in a given package collection; so, if packages in collection COLL1 are to be executed through plan PLAN1, the PKLIST for PLAN1 can have the entry COLL1.*. That covers all the packages in collection COLL1, and it's a lot more convenient than putting each individual package in COLL1 in the PKLIST for PLAN1.

Another use of asterisks in a plan's PKLIST can be problematic. I'm referring here to the location of a collection. You could have, in the PKLIST for PLAN1, the entry *.COLL1.*, and here's what that high-order asterisk means: it means that PLAN1 can be used to execute packages in collection COLL1 at the local Db2 system and at any remote location at which a collection named COLL1 might exist. Sometimes, there actually is a collection COLL1 at a remote server, and there is a need for packages in the remote COLL1 collection to be executable through PLAN1. Usually, that is not the case - the plan is only going to be used with local packages. "So what?" you might say, "If PLAN1 is only going to execute packages in the local COLL1 collection, it can do that if it has an *.COLL1.* entry in its PKLIST." True, but here's the deal: if there is an asterisk in the location-name part of an entry in a plan's PKLIST, there will be a package authorization check done every time a package in that collection is executed via the plan. That, in turn, could lead to a lot of associated catalog access activity. If, instead, the entry in the plan's PKLIST is COLL1.* (i.e., nothing - not even an asterisk - is specified for the collection's location), that will be as though the local Db2 had been explicitly specified as the collection's location, and that single-location specificity will mean that authorization to execute the packages in COLL1 is checked at BIND PLAN time and will NOT be checked when packages in COLL1 are subsequently executed via the plan (the authorization checked at BIND PLAN time is that of the ID of the process issuing the BIND PLAN command - for example, that ID may have been granted EXECUTE ON PACKAGE COLL1.*).

With this in mind, if for a local-to-Db2 application involving static SQL execution you see an unexpectedly high level of catalog accesses (specifically, accesses to catalog tables such as SYSPACKAUTH), check the entries in the plan's PKLIST. If you see an entry of the form *.COLL1.*, you might want to do one of two things: remove the asterisk in front of the collection's name if the collection is local; or, if packages in a collection with the specified name at remote location LOC2 will be executed via the plan, change an entry of the form *.COLL1.* to LOC2.COLL1.*. What if the plan will be used to execute packages both in a local collection named COLL1 and in a COLL1 collection at location LOC2? In that case, have a COLL1.* entry in the plan's PKLIST for the local collection COLL1, and a LOC2.COLL1.* entry for the collection COLL1 at location LOC2. Yes, this is more of a hassle than just putting an *.COLL1.* entry in the plan's PKLIST, but it can result in elimination of execution-time package authorization checks (and associated catalog access activity).
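If you have a lot of plans to review, a little script can do the eyeballing for you. Here is a minimal Python sketch that flags PKLIST entries with an asterisk in the location position. It assumes that you have already extracted the entries as strings, each in either collection.package or location.collection.package form; the sample list is hypothetical.

```python
def flag_location_asterisks(pklist):
    """Return PKLIST entries whose location qualifier is an asterisk.

    Entries are assumed to be either collection.package (two parts, local)
    or location.collection.package (three parts).
    """
    flagged = []
    for entry in pklist:
        parts = entry.strip().split(".")
        if len(parts) == 3 and parts[0] == "*":
            flagged.append(entry.strip())
    return flagged


# Hypothetical package list for PLAN1
pklist = ["COLL1.*", "*.COLL2.*", "LOC2.COLL1.*", "*.COLL3.PKGA"]
print(flag_location_asterisks(pklist))
```

Each flagged entry is a candidate for one of the two changes described above: drop the leading asterisk for a local collection, or replace it with an explicit location name.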

And there you have it. To make package authorization checking as efficient as it can be, make the size of the package authorization cache as large as it can be, which is 10M (if you have not already done this or if Db2 has not already done that for you via the fix for APAR PH28280), and - for local-to-Db2 applications using static SQL packages - don't use an asterisk for the location of the packages' collection in the relevant plan's PKLIST (instead, don't specify anything - not even an asterisk - for the location of a local collection, and explicitly specify the remote location name of a not-local collection).

I hope this information will be useful for you.

Thursday, December 31, 2020

Db2 for z/OS: What Do You Know About Rebind Phase-In?

For many years, DBAs at a lot of Db2 for z/OS sites have been challenged in performing an important operational task: rebinding a frequently executed package. This can be tricky because a package cannot be rebound while it is in use, and if a package is executed with great frequency - in some environments, certain packages are executed hundreds of times per second during busy periods - then its use count may only rarely go to zero. When a REBIND command is issued for such a package, there are two likely outcomes, neither of which is desirable: either the REBIND command times out because the target package is continuously in-use, or the REBIND gets going thanks to a little gap that emerges in the package's in-use status. Why is that latter outcome potentially undesirable? It is undesirable because once the rebind of a package is underway, a subsequent request to execute the package must wait for the rebind to complete. Yes, a package rebind is usually completed quite quickly, but if rebind processing for a package is delayed for some reason, even if only for a few seconds, that could trigger application-side time-outs - not good.

Could you just rebind a high-use package at a less-busy time? Perhaps, but sometimes there is a pressing need to rebind a package now, not later. Besides that, in an increasingly 24X7 world you might have packages that are rarely not-in-use even outside of peak periods. Think, for example, about the packages associated with the IBM Data Server Driver (which you can think of as the successor to Db2 Connect). These packages, whose default collection is named NULLID, are used by applications (other than REST clients) that access a Db2 for z/OS system via Db2's distributed data facility (DDF). A DDF transaction rate in the thousands per second is not terribly unusual these days, and that rate can remain pretty high around the clock, making rebind of the NULLID packages a daunting prospect for some DBA teams.

The rebind challenges associated with ever-higher and ever more round-the-clock package execution frequency are made more serious by the growing need for package rebind actions. There was a time when an organization might rebind packages (and plans) only infrequently, outside of the rebinds that are recommended when you migrate a system to a new version of Db2 (if your organization does not rebind all plans and packages following migration to a new version of Db2, I recommend that you address that situation). Nowadays, we have newer reasons to rebind packages - you might change a RELEASE specification to DEALLOCATE, perhaps to enable high-performance DBAT functionality; you might rebind a package to get a performance gain after executing RUNSTATS to leverage the statistics profile enhancements introduced with Db2 12; you might rebind a package with the new (with Db2 12) CONCENTRATESTMT option, to "turn on" Db2's statement concentration feature for an application from the server side (statement concentration could formerly only be activated by a client-side action); and so on.

The IBM Db2 for z/OS development team recognized some time ago that rebind concurrency was an issue that needed to be addressed. The bind concurrency feature provided by Db2 11 for z/OS definitely helped, but more was needed. That "more" was largely delivered via the rebind phase-in capability enabled by way of Db2 12 function level 505. Although the fix (for APAR PH09191) that takes a Db2 system's code to the 505 level came out about 18 months ago, I have found that quite a few Db2 for z/OS people are not aware of what rebind phase-in is and how it works; so, I'll provide explanatory information in the remainder of this blog entry.

In a nutshell, rebind phase-in allows a package to be immediately and successfully rebound even if the package is in-use at the time the REBIND PACKAGE command is issued. Not only that, but when rebind phase-in is in effect, a package execution request that comes in following the issuance of the REBIND command will not be held up, even if the rebind of the package has not yet completed. Gotta love that.

So, how is this goodness made possible? Well, it's based on a Db2 capability, called plan management, that was introduced with Db2 9. When plan management is fully in effect (when the PLANMGMT parameter of ZPARM is set to its default value of EXTENDED), Db2 can retain up to three instances (referred to as copies) of a given package: the original copy, the previous copy, and the current copy. These package copies are retained primarily for the purpose of "taking the fear out of rebinding" (as a former colleague of mine cleverly put it) - if you rebind a package and access paths change in a way that negatively impacts performance, you can quickly and easily go back to the previous copy of the package by issuing a REBIND PACKAGE command with the SWITCH(PREVIOUS) option.

The copies of a package retained by way of plan management functioning in extended mode are differentiated by a number: copy 0 is the current instance of a package, copy 1 is the previous instance, and copy 2 is the original instance. Rebind phase-in builds on this functionality, and in fact having PLANMGMT in ZPARM set to EXTENDED is required if you want to take advantage of the rebind phase-in feature. I'll illustrate with an example: suppose that you issue a REBIND command for package ABC, and Db2 (Version 12, with function level 505 or higher activated) sees that the package is currently in use. In that case, Db2 will utilize rebind phase-in processing (if the package is not in-use at the time of the issuance of the REBIND PACKAGE command, the rebind request will be processed in the traditional way). A new instance of package ABC will be generated, even as the current instance of the package is being used, and this will be do-able because the newly-generated instance of the package will be identified as copy 4 (we jump from copy identifier 0 to 4 because - as noted above - 1 identifies the previous instance of the package and 2 identifies the original, and identifier 3 has been reserved for future use by Db2). If a request to execute package ABC comes in before generation of the new copy 4 has completed, copy 0 of the package will be allocated to the requester's Db2 thread for execution. Requests for execution of package ABC that come in after copy 4 has been generated will result in allocation of copy 4. 
In time (quickly if the package is bound with RELEASE(COMMIT) - a little longer if the package is bound with RELEASE(DEALLOCATE) and is allocated to a persistent thread such as a high-performance DBAT), instances of copy 0 of package ABC will be released from the threads to which they had been allocated for execution, and copy identifier 0 will then go back into the pool of numbers available for package differentiation when package ABC is again involved in rebind phase-in processing. That pool, for any given package, is the number 0 together with the numbers 4-16. When package ABC, the current copy of which is now 4, is again rebound while in-use, the new instance of the package will be identified as number 5. If package ABC is rebound in a phase-in way enough times for its current copy identifier to be 16, and it is again rebound while in-use, the new copy will be identified as 0, after which the numbers 4-16 will again be used as needed.
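The way the copy identifiers cycle can be modeled in a few lines of Python. This is a simplified sketch of the numbering scheme just described, not a representation of Db2's internal logic: it ignores, for instance, whether an older copy is still allocated to a thread when its number comes around again. The function and variable names are my own.

```python
# Copy identifiers available for phase-in: 0 plus 4 through 16.
# 1 (previous), 2 (original) and 3 (reserved) are never handed out here.
PHASE_IN_IDS = [0] + list(range(4, 17))


def next_copy_id(current: int) -> int:
    """Return the identifier the next phased-in copy would get."""
    position = PHASE_IN_IDS.index(current)
    return PHASE_IN_IDS[(position + 1) % len(PHASE_IN_IDS)]


copy_id = 0
sequence = []
for _ in range(15):
    copy_id = next_copy_id(copy_id)
    sequence.append(copy_id)
print(sequence)
```

Starting from copy 0, the sequence runs from 4 up through 16, wraps around to 0, and then starts again at 4.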

Note, by the way, that rebind phase-in processing is also used when a REBIND PACKAGE command with a SWITCH specification is issued and the target package is found to be in-use. Let's say that the current instance of package XYZ is copy 4 (meaning that copy 0 was current when package XYZ was rebound in a phase-in way). If a REBIND PACKAGE command with SWITCH(PREVIOUS) is issued for package XYZ while the package is in-use, the previous instance of the package (always designated as copy 1) will be replicated to generate the new current instance of the package, which will be identified as copy 5. New requests to execute package XYZ will result in copy 5 being allocated to the requester's thread. Copy 4 of package XYZ, the previously current copy which is now the phased-out copy, will be replicated to become the new previous instance of the package (copy 1, as always).

Also note that rebind phase-in functionality is not applicable to packages associated with SQL PL routines (native SQL procedures, compiled SQL scalar functions and advanced triggers). This is not as big a deal as it may seem, owing to the fact that there were already ways to non-disruptively generate and activate a new instance of a SQL PL routine. For example, a new instance of a native SQL procedure can be non-disruptively generated and activated via an ALTER PROCEDURE statement with an ADD VERSION specification, followed by a second ALTER PROCEDURE statement with ACTIVATE VERSION specified (referencing the version added via the ALTER PROCEDURE with ADD VERSION).

Rebind phase-in, enabling successful and non-disruptive rebind operations even for packages executed with great frequency, is one more way in which the IBM Db2 for z/OS development team has made life easier for Db2 DBAs. I hope that you will find this functionality to be useful at your site.