A few days ago I received a note in which a DBA reported strange and inconsistent information in some DSNL030I messages generated periodically by a DB2 for z/OS subsystem in his charge. The messages in question were tied to authentication failures associated with DDF-using applications (i.e., applications that access DB2 via network connections). Such a failure could occur if an application used an invalid password on a DB2 for z/OS connection request -- for example, because the password for the application's DB2 authorization ID was changed on the z/OS side but, through an oversight on someone's part, the corresponding change was not made on the application server side.
When a failure of this nature occurred for a DB2 subsystem at the DBA's company, he would see a DSNL030I message from DB2 with a 00F30085 reason code ("The requester's password could not be verified"). Nothing strange there. What struck the DBA as odd was the information he would see in the LUWID part of the DSNL030I message text. Now, at first glance the information in the LUWID part of the message might look weird: a three-part string of seemingly meaningless letters and numbers. In fact, the first of the three parts of that string is a client IP address, the second part is the associated port number, and the third part is just a unique sequence number generated by DB2 Connect (or by the IBM Data Server Driver Package, if that DB2 client-side software is being used). OK, but what if the first part of the LUWID string looks like this:
J56F045C
That sure doesn't look like an IP address. And it couldn't be a hex (i.e., hexadecimal) representation of an IP address, could it? Not with a 'J' in the string.
Actually, it is an IP address in hex form, with the twist being that the high-order letter stands in for a digit (the letters G through P represent the digits 0 through 9), as described in the IBM "Technote" at this URL:
http://www-01.ibm.com/support/docview.wss?uid=swg21055269
As pointed out in the Technote, the string J56F045C in the first part of the LUWID section of a DB2 for z/OS DSNL030I message would resolve to an IP address of 53.111.4.92 (and using the same scheme for substituting a numerical value for the high-order letter in the second part of the DSNL030I LUWID string, G422 would be seen to be a representation of port number 1058).
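To make the decoding concrete, here is a minimal sketch in Python (my own illustration -- the Technote itself provides no code) that applies the letter-to-digit substitution and converts the two example values above:

```python
# Minimal sketch (not from the Technote): decode the IP address and port
# tokens of a DSNL030I LUWID string, assuming the letter-for-digit scheme
# in which G through P stand for the digits 0 through 9.

def decode_token(token):
    """If the high-order character is a letter G-P, map it back to its digit."""
    first = token[0]
    if 'G' <= first <= 'P':
        token = str(ord(first) - ord('G')) + token[1:]
    return token

def luwid_ip(token):
    """Render an 8-hex-digit LUWID token as a dotted-decimal IPv4 address."""
    h = decode_token(token)
    return '.'.join(str(int(h[i:i+2], 16)) for i in range(0, len(h), 2))

def luwid_port(token):
    """Render a 4-hex-digit LUWID token as a decimal port number."""
    return int(decode_token(token), 16)

print(luwid_ip('J56F045C'))  # 53.111.4.92
print(luwid_port('G422'))    # 1058
```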
The DBA who wrote to me wasn't thrown off by the representation of the IP address in the LUWID part of a DSNL030I message, because he'd already seen the aforementioned Technote and therefore knew how to derive the actual IP address. What perplexed the DBA was the variability he saw in these IP addresses when there should not have been any. Sometimes he would see the IP address of a DB2 Connect "gateway" server (more on this in a moment), and sometimes he would see the address of a client application server "upstream" from the DB2 Connect gateway server. On top of that, when the DBA did see an upstream client IP address in the LUWID section of a DSNL030I message, that address did not always match the application server that had actually encountered the authentication failure. What was going on?
The inconsistent IP address information that the DBA saw in the LUWID part of DB2 DSNL030I messages is related to the fact that client application servers at his site access DB2 for z/OS via DB2 Connect gateway servers, as opposed to going directly to DB2 using the IBM Data Server Driver Package. Here's how the two situations are linked: when a client application requests a connection to a DB2 for z/OS server through a DB2 Connect gateway server, authentication is a very early step in the process of establishing that connection. IF authentication is successful, the DB2 Connect gateway server will send the client's IP address to the DB2 for z/OS subsystem. If authentication is NOT successful, the DB2 Connect gateway server will not send the client's IP address to DB2 for z/OS.
If the DB2 Connect gateway server does not send the upstream client's IP address to DB2 for z/OS when client application authentication is not successful, why did the DBA sometimes see a client IP address in the LUWID part of a DSNL030I authentication failure message? That can happen when the DB2 Connect gateway server connection associated with the client authentication failure is being reused following a previously successful client authentication (keep in mind that the DB2 Connect gateway server, by default, keeps a pool of connections to a downstream DB2 subsystem that it reuses for upstream clients -- a feature that boosts efficiency by avoiding frequent termination and re-establishment of connections to the DB2 for z/OS host system). In that case -- when an authentication failure occurs on a pooled connection from the DB2 Connect gateway server to DB2 for z/OS that had previously been used for a successful authentication -- you will see an upstream client IP address in the LUWID part of the DSNL030I message.
What client IP address will that be? It will be the address of the last client to successfully authenticate using the DB2 Connect gateway server connection in question. That may OR MAY NOT be the client for which authentication failed. It WILL be the IP address of the client that encountered the authentication failure IF the same client was the last one to successfully authenticate to DB2 for z/OS using the connection. If the last client to successfully authenticate to DB2 using the connection between the DB2 Connect gateway server and the DB2 subsystem is DIFFERENT from the client that encountered the authentication failure, you'll see the IP address of that SUCCESSFULLY authenticated client application in the LUWID part of the DSNL030I authentication failure message.
But sometimes the DB2 DBA saw the IP address of a DB2 Connect gateway server, instead of a client IP address, in the LUWID part of a DSNL030I authentication failure message. Why? That can happen when the first client to use a connection between the DB2 Connect gateway server and the DB2 subsystem gets an authentication failure. In that case, the LUWID part of the DSNL030I message will contain the IP address of the "adjacent" (to DB2 for z/OS) server, and that will be, given a DB2 Connect gateway server set-up, the IP address of a DB2 Connect gateway server.
So, what you know is this: the LUWID part of a DB2 for z/OS DSNL030I authentication failure message will contain an IP address. Depending on the particular circumstances of the authentication failure, the IP address in the LUWID part of the DSNL030I message will be the IP address of the client that encountered the failure, or the IP address of a different client that had previously used the connection (and successfully authenticated to DB2), or the IP address of the DB2 Connect gateway server (if no client had previously used the connection and had successfully authenticated to DB2). The bottom line: you may not see a client IP address in the LUWID part of a DSNL030I message, and even if you do, that client IP address may be different from the address of the client that encountered the authentication failure.
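A toy model may help fix the three cases in mind. The sketch below is emphatically not DB2 Connect internals -- just a hypothetical simulation of the rule described above: a pooled connection remembers the last successfully authenticated client, and that remembered address (or, failing that, the gateway's own address) is what surfaces in the LUWID when authentication fails:

```python
# Hypothetical simulation (not DB2 Connect internals) of which IP address
# surfaces in the LUWID part of a DSNL030I message, per the cases above.

GATEWAY_IP = '10.0.0.1'  # assumed address of the DB2 Connect gateway server

class PooledConnection:
    def __init__(self):
        self.last_authenticated_client = None  # no client has used this connection yet

    def authenticate(self, client_ip, password_ok):
        if password_ok:
            # Success: the gateway forwards the client's IP to DB2 for z/OS,
            # and the connection now "remembers" this client.
            self.last_authenticated_client = client_ip
            return 'connected'
        # Failure: the gateway does not forward the client's IP, so the LUWID
        # shows the last successfully authenticated client, if any -- and
        # otherwise the adjacent server, i.e., the gateway itself.
        return 'DSNL030I LUWID shows ' + (self.last_authenticated_client or GATEWAY_IP)

conn = PooledConnection()
print(conn.authenticate('192.168.1.5', password_ok=False))  # gateway IP: 10.0.0.1
conn.authenticate('192.168.1.5', password_ok=True)          # success is remembered
print(conn.authenticate('192.168.1.7', password_ok=False))  # other client's IP: 192.168.1.5
```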
To ensure some consistency in DSNL030I output, the fix for DB2 APAR PM82054 causes DB2 to consistently record the IP address of the "adjacent" server in the THREAD-INFO part of the DSNL030I message when an authentication error occurs. When DB2 Connect is running on a gateway server, that IP address will be the DB2 Connect gateway server's IP address. The information in the LUWID part of the message will not be consistent, and if it does contain a client IP address that address may or may not be that of the client that encountered the authentication failure.
This is another good reason to go to an IBM Data Server Driver Package, direct-to-DB2 connection set-up, versus a DB2 Connect gateway server set-up: if an authentication error occurs, the IP address of the application server on which the Data Server Driver Package is installed will show up -- consistently -- in the THREAD-INFO part of the DSNL030I message, because that server will be the "adjacent server" to the DB2 for z/OS subsystem.

Note that entitlement to deploy the IBM Data Server Driver Package is based on DB2 Connect licensing: if you're licensed for the latter, you can deploy the former -- and you should. That's true not only for the reason I've just mentioned (having the IP address of the server "adjacent" to DB2 for z/OS be that of an application server versus a DB2 Connect gateway server), but also for a simplified IT infrastructure, better performance (through elimination of a "hop" between application servers and DB2 for z/OS), and easier upgrades to new releases of the DB2 client code. And, speaking of ease, if you license DB2 Connect Unlimited Edition for System z, you can deploy the IBM Data Server Driver on any application server or other client that directly accesses a DB2 for z/OS system without having to have a license file on each of those client servers -- the client licenses are managed on the DB2 for z/OS host system.

On top of that, going from a DB2 Connect gateway server configuration to the IBM Data Server Driver Package direct-to-DB2 configuration typically involves little to nothing in the way of application code changes -- it should just be a matter of updating the client's connection string for the target DB2 for z/OS server. In (usually) rare cases, there could be an application dependency on DB2 Connect, such as when an application needs two-phase commit capability AND the client transaction manager uses a dual-transport processing model (IBM's WebSphere Application Server uses a single-transport processing model).
The more you know about the IBM Data Server Driver Package, the better it looks. There was a time when DB2 Connect gateway server configurations made sense, but for most DB2 for z/OS-using organizations that time has passed.
Thursday, February 12, 2015
The New IBM z13 Mainframe: A DB2 for z/OS Perspective
Last month, IBM announced the z13 -- the latest generation of the mainframe computer. The z13 is chock full of great technical features, and there are already plenty of presentations, white papers, and "redbooks" in which you can find a large amount of related information. Going through all that information is an option, but you might be thinking, "I'm a DB2 for z/OS person. What's in the z13 for me?" In this blog entry, I'll give you my take on that question.
My favorite z13 feature is the larger and less expensive memory resource available on the servers. From a technical perspective, I like what I call Big Memory because nothing boosts application performance and CPU efficiency in a DB2 for z/OS environment like expansive real storage (as I pointed out in an entry that I posted to this blog a couple of months ago). A single z13 server can be configured with as much as 10 TB of memory (up from a maximum of 3 TB on a zEC12, the previous top-of-the-line mainframe), and a single z/OS LPAR on a z13 can use up to 4 TB of memory (with z/OS V2.2, or z/OS V2.1 with some PTFs) -- that's up from 1 TB for a z/OS LPAR on a zEC12. How much memory should a z/OS LPAR have? I will tell you that for an LPAR in which a production DB2 for z/OS subsystem is running, I like to see at least 20-40 GB of real storage per engine -- and I mean total engines, zIIP as well as general-purpose (so, for example, if a z/OS LPAR with a production DB2 for z/OS subsystem has eight engines -- four general-purpose and four of the zIIP variety -- then my recommendation would be to configure that LPAR with at least 160-320 GB of memory).
How would I want to exploit that memory for DB2 performance purposes? Well, for starters I'd like to have a big buffer pool configuration -- to the tune of 30-40% of the LPAR's memory resource (so, for example, if I had a z/OS LPAR with 300 GB of memory then I'd want the aggregate size of all the buffer pools allocated for a production DB2 subsystem in that LPAR to be 90-120 GB). I'd also want to have PGFIX(YES) specified for most of these buffer pools -- certainly for the more active pools. I'd also give consideration to specifying PGSTEAL(NONE) for one or more pools, and using those to cache some really performance-critical table spaces and/or indexes in memory in their entirety. [Note: my 30-40% of memory guideline for the size of a production DB2 subsystem's buffer pool configuration assumes one production DB2 subsystem in the LPAR. If there were several production DB2 subsystems in a z/OS LPAR, you would not want each of them to have a buffer pool configuration sized at 30-40% of the LPAR's real storage resource. If there were multiple production DB2 subsystems in an LPAR, I would generally try to keep the combined size of all of the subsystems' buffer pool configurations at not much more than 50% of the LPAR's memory.]
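For what it's worth, the two rules of thumb above reduce to simple arithmetic. Here's a small sketch (my own illustration, using the figures from the text) that turns an engine count and an LPAR memory size into the recommended ranges:

```python
# Illustrative arithmetic only, using the rules of thumb from the text:
# 20-40 GB of real storage per engine (zIIP and general-purpose alike),
# and a buffer pool configuration of 30-40% of LPAR memory (assuming one
# production DB2 subsystem in the LPAR).

def lpar_memory_range_gb(total_engines):
    """Recommended LPAR real storage range, in GB, for a given engine count."""
    return (20 * total_engines, 40 * total_engines)

def bufferpool_range_gb(lpar_memory_gb):
    """Recommended aggregate buffer pool size range, in GB, for an LPAR."""
    return (0.30 * lpar_memory_gb, 0.40 * lpar_memory_gb)

print(lpar_memory_range_gb(8))   # (160, 320) -- the 8-engine example above
print(bufferpool_range_gb(300))  # (90.0, 120.0) -- the 300 GB LPAR example
```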
I would not stop my exploitation of a z13 Big Memory environment with a large buffer pool configuration. In addition to that, I'd look at enlarging a production DB2 subsystem's dynamic statement cache, to drive the cache "hit ratio," ideally, to 95% or more (avoided "full" prepares of dynamic SQL statements can reduce CPU consumption significantly). I'd also go for a larger in-memory sort work area (specifically, I'd consider a value of SRTPOOL in ZPARM of 40-60 MB or more, and a MAXSORT_IN_MEMORY value of 20-30 MB or more). Additionally, I'd make greater use of the RELEASE(DEALLOCATE) package bind option, together with persistent threads (e.g., high-performance DBATs and CICS-DB2 protected entry threads), to improve the CPU efficiency of frequently executed programs. With the kind of LPAR memory I'm talking about, I think that I could do all of these real-storage-leveraging things and still have a demand paging rate of zero (though I wouldn't get too concerned if I had a small but non-zero demand paging rate of maybe 1 or 2 per second).
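As a side note, the dynamic statement cache hit ratio I'm referring to is simply the share of PREPAREs satisfied from the cache out of all PREPARE requests. A quick sketch, with made-up counter values (take the real numbers from your DB2 monitor's statement cache statistics):

```python
# Hedged illustration with hypothetical counts: the hit ratio is the share
# of PREPAREs satisfied from the dynamic statement cache ("short" prepares)
# out of all PREPAREs (cache hits plus "full" prepares).

cache_hits    = 980_000  # prepares satisfied from the cache (assumed value)
full_prepares =  20_000  # prepares that missed the cache (assumed value)

hit_ratio = cache_hits / (cache_hits + full_prepares)
print(f'{hit_ratio:.1%}')  # 98.0% -- above the 95% target mentioned above
```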
Now, I mentioned that I like the z13 memory picture from both an expanse and an expense point of view. I just covered the expanse angle. Now a look through the expense lens. The cost of memory on the z13 starts out substantially lower than the cost of memory on a zEC12 or z196 server (the latter being the mainframe generation that preceded the zEC12). Depending on how much real storage you order for a z13, beyond what you have on a zEC12 or z196 from which you are upgrading to a z13, the memory cost can go lower still -- a LOT lower. Talk to an IBM z Systems sales representative for details about the memory deals available when upgrading your mainframe to a z13. They are impressive, if I do say so myself. Time to load up on gigabytes (or terabytes).
While I am definitely enamored with Big Memory (as DB2 for z/OS people ought to be), that's not all that I like about the z13. I'm also a fan of two enhancements that, in different but complementary ways, enable z13 processors to work smarter for enhanced application efficiency and throughput. One of these processing enhancements is called SMT -- short for simultaneous multi-threading. SMT allows multiple software threads to run on the same processor core at the same time. On a z13, two threads can execute concurrently on one SMT-supporting processor, and that is why the feature is sometimes referred to as SMT2. With SMT in effect, each of the two threads using one core will run more slowly than would be the case for a single-thread core, but throughput is boosted. My colleague Jim Elliott likes to use the example of a two-lane road with a 45 miles-per-hour speed limit versus a single-lane road with a speed limit of 60 miles per hour (and don't read anything into these speed limit numbers -- the only point is that one is lower than the other). Cars go faster on the one-lane road, but considerably more cars per unit of time travel a given distance using the two-lane road. In other words, the two-lane road with the lower speed limit has a greater carrying capacity than the one-lane road with the higher speed limit. Similarly, an SMT-exploiting processor should provide greater transactional throughput than it would if it were running in the traditional single-thread mode.
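Jim's analogy can even be put into numbers. A toy calculation (mine, not IBM's, and the speeds are just the arbitrary figures from the analogy):

```python
# Toy throughput model for Jim Elliott's road analogy (illustrative only).
# Carrying capacity ~ lanes x speed: the slower two-lane road still moves
# more cars per unit of time than the faster one-lane road.

def carrying_capacity(lanes, speed_mph):
    """Relative cars moved per unit time, up to a constant factor."""
    return lanes * speed_mph

print(carrying_capacity(2, 45))  # 90 -- SMT2: two slower threads per core
print(carrying_capacity(1, 60))  # 60 -- single-thread mode: one faster thread
```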
SMT is not applicable to all z13 engines. Specifically, the z13 processors that support SMT are zIIP engines and IFLs (the latter being an acronym for Integrated Facility for Linux -- processors dedicated to Linux workloads on z Systems servers, running either in native, "on the metal" LPARs or as virtual Linux systems in z/VM LPARs). That these engines support SMT is good from a DB2 for z/OS standpoint. zIIP engine utilization is often driven to a large extent by DB2 for z/OS DDF workloads (i.e., client-server applications that access DB2 data over network connections). DB2 for z/OS also utilizes zIIP engines for significant portions of utility processing, and (starting with DB2 10) for prefetch read and database write operations. Another reason z13 zIIP support for SMT is good for DB2: Java programs running in z/OS systems use zIIP engines, and z/OS is an increasingly popular environment for Java applications, and those applications very often involve access to data managed by DB2. z13 IFL support for SMT is great performance news for applications that run under Linux on z Systems, and many such applications interact with DB2 in an adjacent z/OS LPAR.
The other "work smarter" z13 processing enhancement that I like a lot is SIMD, or Single Instruction Multiple Data. With SIMD, if the same operation needs to be performed on several data elements, the operation can be performed once for all of the data elements at the same time, versus being performed for first data element, then performed again for the second element, then again for the third, etc. Again, fellow IBMer Jim Elliott provided a nice analogy: suppose you need to get three packages of the same type from point A to point B. It would be more efficient to do that by sending all three packages in one truck, as opposed to sending three trucks carrying one package apiece. Here's the DB2 for z/OS angle: SIMD should boost the performance of two kinds of application in particular: those that are data-intensive (referring to the volume of data operated upon) and those that are compute-intensive (referring to operations performed on data elements). Put those two application characteristics together, and what do you get? Analytics (among other things). In recent years, there has been a marked increase in the use of z Systems servers for analytics applications, driven in part by the steady convergence of transactional and analytics processing. z13 SIMD technology will add fuel to this trend, and it's a trend that most definitely benefits DB2 for z/OS. Software has to be modified to take advantage of SIMD, but that work is already underway. Look for SIMD exploitation by IBM analytics tools in the near future. And, in a z/OS V2.2 system (and V2.1 with some PTFs), SIMD will be exploited by XML System Services (used for things such as schema validation when XML data is stored in DB2 for z/OS tables), by our Java SDKs (good for WebSphere Application Server), and by our COBOL and PL/I compilers.
And that, folks, is the short and sweet summary of what I most like about the z13: Big Memory (like jet fuel for DB2, and priced to sell), SMT (like multi-lane highways), and SIMD (like shipping multiple packages in one truck). There's plenty more z13 technology that is cool in its own right, but as a DB2 for z/OS specialist I'm keeping my "big three" features front-of-mind.