Saturday, March 16, 2013

Bringing Control to a DB2 for z/OS Client-Server Workload (Part 1)

At System z sites all over the world, DB2 for z/OS workloads are growing -- often at a brisk clip. You'd think that this would be unqualified good news for the folks who support mainframe DB2 subsystems, but in fact some people view a lot of this new DB2 activity with trepidation. Why? Because the lion's share of application work being implemented these days on a DB2 for z/OS foundation is of the client-server variety. In other words, it involves data access through DB2's distributed data facility, aka DDF. OK, so what's the problem with that? The problem has to do with the belief of some individuals that a DB2 DDF workload is one that they cannot control.

I'll tell you up front that this notion of a DB2 client-server workload as being a chaotic swirl of data requests that execute without regard to rules is NOT grounded in facts -- at least, not in today's facts. And there you have a root cause of a lot of misunderstanding: knowledge of DDF that was current 20 years ago but which hasn't kept pace with enhancements that have been delivered with each succeeding release of DB2. In this blog post and in the part 2 post that will follow, I will provide information on several mechanisms that can be used to establish various controls on a DB2 for z/OS client-server application workload.

Let me start by saying that I don't consider people who are concerned about managing a DB2 DDF workload to be wound too tightly, as we say. No, they're being good mainframers. They support a platform that is known for rock-solid reliability, and a pillar of that reliability is control. In a DB2 sense, that means control over data access privileges, over SQL statement access paths, over workload classification and management, and over DB2 connection resources provided to particular applications. It's true that in the early years of DDF (back in the 1990s), some controls that you'd like to have weren't there. Now they are. In the paragraphs below I'll describe a few of the techniques that can be utilized to bring control to a DB2 for z/OS client-server workload. More information will be provided in another week or so, in the companion to this part 1 entry.

Native SQL procedures, for data access control and SQL statement access path stability

Now, in truth, the SQL statement access path and data access control benefits of stored procedures are available whether external or native SQL procedures are utilized; however, in a client-server application context I'm particularly keen on native SQL procedures (about which I first started blogging while working as an independent DB2 consultant), for a couple of reasons (a sketch of such a procedure appears right after the list):
  • All kinds of people can write them. This includes people who have worked predominantly with DB2 for LUW, or even with relational database management systems other than DB2. If you can write SQL, you can code a native SQL procedure (and it's even easier when you use Data Studio).
  • They give you lots of zIIP engine offload when invoked through DDF. Unlike an external DB2 stored procedure, which runs under a TCB in a WLM-managed stored procedure address space, a native SQL procedure runs under the task of the calling program. When the CALL comes from a DRDA requester, that task is a preemptible SRB in the DB2 DDF address space, and so the processing associated with the native SQL procedure's execution can be largely offloaded (to the tune of about 60%) to an available zIIP engine. Native SQL procedures are pretty CPU-efficient anyway (and more so in a DB2 10 environment). Toss in the high degree of zIIP eligibility in a client-server setting, and native SQL procedures look even more attractive from a cost-of-computing standpoint.
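
To make the first point concrete, here is a minimal sketch of a native SQL procedure. The procedure, table, and column names are made up for illustration; the point is that the body is nothing but SQL -- there is no external program and no compile-and-link step:

  -- No EXTERNAL clause and no FENCED: this is a native SQL procedure
  CREATE PROCEDURE GET_BALANCE
    (IN  P_CUSTNO  CHAR(10),
     OUT P_BALANCE DECIMAL(11,2))
    LANGUAGE SQL
  BEGIN
    -- Static SQL: this statement is bound into the procedure's package
    SELECT ACCT_BALANCE
      INTO P_BALANCE
      FROM CUSTOMER
     WHERE CUST_NO = P_CUSTNO;
  END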

The data access control and access path stability benefits of native SQL procedures owe to the fact that the SQL statements issued by such a procedure are static (or typically so -- one can, of course, code PREPARE statements in native SQL procedures). Data security is enhanced when static SQL is used because the only authorization required for successful execution of SQL statements in that case is the EXECUTE privilege on the package into which the statements were bound (plus, in the case of a stored procedure, the EXECUTE privilege on the stored procedure). By contrast, successful execution of dynamic SQL DML statements requires that the authorization ID of the process issuing the statements have the requisite access privileges (SELECT and/or INSERT and/or UPDATE and/or DELETE) on the tables targeted by the statements. Stored procedures further enhance data security by abstracting from client-side developers the details of the database schema (table names, column names, etc.).
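
To put that in concrete terms using the hypothetical procedure sketched above: the application's authorization ID (also made up here) needs just one grant, and no table privileges whatsoever:

  -- EXECUTE on the procedure is the privilege the app ID needs;
  -- no SELECT privilege on the underlying CUSTOMER table is required
  GRANT EXECUTE ON PROCEDURE GET_BALANCE TO APPID1;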

The stable access paths that characterize static SQL statements help to ensure consistent performance when the statements are executed. Static SQL also minimizes CPU consumption: even with a high "hit" percentage in the DB2 dynamic statement cache, dynamic SQL might approach -- but will not equal -- the CPU efficiency of static SQL.

So, static SQL is good, but do you need to utilize stored procedures to get the benefits of static SQL for DB2 client-server applications? Not necessarily. If your client-side code is written in Java, those programs can issue static SQL statements by way of SQLJ; however, while SQLJ is a standard at some sites, it is often the case that Java programmers will have a strong preference for using JDBC, and JDBC means dynamic SQL at the DB2 data server. That's what makes stored procedures a great fit in a DB2 client-server application environment: they provide a means whereby static SQL can be dynamically invoked. If your client-side developers prefer a database interface such as JDBC or ODBC, fine -- let them use JDBC or ODBC to call DB2 stored procedures and to process output parameters and/or result sets returned by those stored procedures.
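
And from the client's perspective there is nothing exotic about this: whatever the interface, the statement that gets prepared is just a CALL. Using the hypothetical procedure sketched earlier:

  -- A JDBC or ODBC program prepares this statement, sets the input
  -- parameter, and retrieves the output parameter through its own API
  CALL GET_BALANCE(?, ?)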

Roles and trusted contexts for data access control

I won't repeat here everything that I wrote in a blog entry on roles and trusted contexts that I posted a couple of years ago. Let me instead focus on setting the stage as to why these features, introduced with DB2 9 for z/OS, are so valuable when it comes to controlling data access in a DB2 client-server context.

Sometimes, DB2 client-server applications are going to issue dynamic SQL DML statements, whether you like it or not. That will very often be the case when the application in question is vendor-supplied (e.g., a purchased enterprise resource planning or human resources application). You'll be dealing with dynamic SQL, as well, if statements are generated by an end-user query or reporting tool. You could have dynamic SQL statements coming from an in-house-developed application, if the aforementioned stored procedure route (or the SQLJ route, for Java programs) is not taken. Whatever the reason for dynamic SQL DML statements, they create a potential data security exposure. As mentioned previously, successful execution of a dynamic SELECT, INSERT, UPDATE, or DELETE statement requires that the application's DB2 authorization ID have the corresponding privilege (i.e., SELECT, INSERT, UPDATE, DELETE) on the targeted table. Now, it's likely that an application issuing dynamic SQL DML statements will connect to DB2 using an authorization ID and password that are unique to the application, but anyone who knows that authorization ID and password would seem to be free to use the table access privileges granted to the ID to access data -- maybe very sensitive data -- outside of the application. That's kind of a scary thought.

Enter roles and trusted contexts. In essence, here's how these two DB2 features work together to tighten data access control for client-server applications that issue dynamic SQL DML statements: first, the table access privileges needed for an application to work are granted not to an authorization ID, but to a role (something created by way of the CREATE ROLE statement). Next, the conditions of the use of the role's privilege set are defined through the creation of a trusted context. In plain English, the CREATE TRUSTED CONTEXT statement says something like this: "OK, I will allow the privileges granted to role XYZ to be used by an application that connects to the DB2 subsystem using authorization ID ABC, but only when that connection is established from the application server with IP address 9.30.131.203." If someone who knows the application's authorization ID and password attempts to use these credentials to connect to the DB2 subsystem from his personal workstation, the connection might be successful but no security exposure will exist, because no DB2 privileges were granted to the authorization ID -- they were granted to role XYZ. The unauthorized user of the application's ID will not be able to use the table privileges granted to role XYZ, because he has connected to the DB2 subsystem from an IP address other than the one specified in the CREATE TRUSTED CONTEXT statement. If the rogue user isn't permitted to log in to the app server at (in this example) IP address 9.30.131.203, he can't do anything, DB2-wise, with the application's authorization ID and password.

If need be, a trusted context can further tighten data access control by specifying that only user IDs SMITH, JONES, and MILLER can use the privileges granted to role XYZ, and again only when the "come from" IP address is 9.30.131.203. With roles and trusted contexts, people responsible for safeguarding DB2 data can rest easier, even when client-server applications are issuing dynamic SQL DML statements.
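
Pulling that scenario together in SQL form (role XYZ, authorization ID ABC, the IP address, and the three user IDs come straight from the example above; the trusted context name and the table name are made up), the set-up might look something like this -- a sketch, not a complete definition:

  CREATE ROLE XYZ;

  -- Table privileges go to the role, NOT to authorization ID ABC
  GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE PRODSCHM.CUST_INFO TO ROLE XYZ;

  -- The role's privileges are usable only over a trusted connection
  -- established from the specified "come from" address
  CREATE TRUSTED CONTEXT APP_CTX
    BASED UPON CONNECTION USING SYSTEM AUTHID ABC
    ATTRIBUTES (ADDRESS '9.30.131.203')
    DEFAULT ROLE XYZ
    ENABLE
    WITH USE FOR SMITH, JONES, MILLER;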

And more to come...

In my part 2 entry, I'll provide information about additional mechanisms -- namely, the z/OS workload manager, DB2 10 profiles, and the DB2 10 query repository -- that you can use to control a DB2 client-server workload.

10 comments:

  1. Robert, you wrote "to the tune of about 60%." This change was implemented via a V9 APAR. There were some discussions about how this was implemented, and there were subsequent APARs. With current maintenance, how do V9 and V10 apply the 60% across DDF threads?

    Replies
    1. Not sure what you mean by the question, "With current maintenance, how do V9 and V10 apply the 60% across DDF threads?" If you're asking what the zIIP offload percentage is these days for work done under DDF preemptible SRBs at DB2 9 and DB2 10 sites, I can only relate what I've seen in DB2 monitor accounting reports generated at DB2 for z/OS sites: zIIP offload in the range of 55-60%.

      Robert

    2. Robert, with the original APAR it was reported by at least one user that they experienced erratic response times because it appeared that 60% of their DDF threads got 100% zIIP offload and 40% of their threads got 0% zIIP offload. It seemed to be the luck of the draw, with the net result for a site with uniform threads of an average of 60% zIIP offload. But if you had widely varying threads then your "mileage would vary."

      There was a subsequent APAR with vague wording indicating that this behavior would be made more uniform, but it didn't say whether it returned to the back-and-forth switching that the first APAR was meant to avoid because of the overhead it caused.

    3. The "vague wording" APAR
      https://www-304.ibm.com/support/docview.wss?crawler=1&uid=swg1PM28626

      might solve the issue of "more variation" by making every thread consistently slow.

      I was curious whether the APAR was "A cure worse than the disease."

    4. Based on what I've learned about the internals of the implementation, the "cure" provided via APAR PM28626 (more consistent zIIP offload at the statement level for SQL processed through the DB2 DDF address space) is definitely NOT worse than the "disease." In fact, the z/OS overhead associated with managing zIIP offload should be the same or even REDUCED with application of the fix for PM28626, IF the fix for z/OS APAR OA35146 is also applied. Note that z/OS APAR OA37201 is also related, and the fix for that APAR should be applied, as well.

      Robert

  2. Robert, "All kinds of people can write them," but only people who have demonstrated that they have written them well on LUW or other platforms should be allowed to do so.

    I refer you back to a previous article of yours where it appeared that some host person had written a Native Stored Procedure using a DTT when a CTE would have been better. This could have been done by anyone, from any platform. Poorly written SQL is being churned out across a wide range of platforms.

    Failure to adapt to the performance characteristics and options of the implementation platform should be a concern.

    What is perfectly normal and appropriate for one platform or old version, becomes a bad habit with the current SW on a different platform.

    Stored Procedure writers across a wide variety of platforms have probably gotten into the habit of using DTTs to return results. In general, for DB2 10 for z/OS, this is a habit that should be broken.

    Replies
    1. I hear what you're saying. I didn't mean to imply that anyone who knows SQL well, on any platform, can (or should) step right into developing native SQL procedures for a production DB2 for z/OS system. My point is that these folks - the people who know SQL well, and who have written SQL procedures on DB2 for LUW or even non-DB2 RDBMS platforms - should have a not-too-steep learning curve with regard to DB2 for z/OS native SQL procedure development. In other words, these folks should be able to be pretty productive, pretty quickly, when it comes to coding native SQL procedures in a DB2 for z/OS environment.

      Interesting that you brought up declared global temporary tables as a sub-optimal means of making the result set generated by a stored procedure-declared cursor available to a calling program. The DB2 10 for z/OS cursor declaration option WITH RETURN TO CLIENT enables one to make a stored procedure-generated result set available to a program more than "one level up" from said stored procedure without having to utilize a DGTT. Because DB2 for LUW had RETURN TO CLIENT cursors before DB2 for z/OS did (DB2 for z/OS previously offered only the RETURN TO CALLER capability, which made a cursor's result set available to the program that directly called the procedure generating the result set), some sites may find that their DB2 for LUW SQL procedure developers are already comfortable with using RETURN TO CLIENT in DECLARE CURSOR statements in DB2 stored procedures.
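
      Just to illustrate (the cursor, table, column, and parameter names are made up), such a declaration in a procedure's body would look something like this:

        -- The result set is returned all the way to the client program,
        -- no matter how many levels of nested CALLs sit in between
        DECLARE ORDER_CSR CURSOR WITH RETURN TO CLIENT FOR
          SELECT ORDER_NO, ORDER_DATE
            FROM ORDERS
           WHERE CUST_NO = P_CUSTNO;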

      Robert

    2. Robert, wrt your 2nd paragraph, that DB2 10 enhancement was exactly why I wrote, "In general, for DB2 10 for z/OS, this is a habit that should be broken."

      FYI, one of the rationales for adding DTT (DGTT) function via APAR to V5 was to aid in porting stored procedures from other platforms, where apparently this use of DTTs in stored procedures was relatively common.

      Thanks again for your insights and experiences.

  3. Hi,

    How can I tell whether a dynamic SQL statement is executed using a role's privileges rather than the user ID's authorization?

    Thanks

    Replies
    1. Interesting question. Here's the deal: a dynamic SQL statement can be successfully executed (from an authorization perspective) if the required execution privileges are included in the privilege set of the associated process. For most SQL statements, the privilege set of a process is the union of the privileges held by all of that process's DB2 authorization IDs (the primary ID and, if applicable, all secondary IDs) AND the role (if any) associated with the process. There will be a role associated with the process only if the process establishes a trusted connection to DB2 by way of a trusted context. If that is the case, then a -DISPLAY THREAD command will show both the process's primary auth ID and its associated role.

      If the privileges needed to execute a certain dynamic SQL statement are held by an authorization ID of a process AND by a role associated with that process, I don't know that the question, "Which privilege - the one held by the auth ID, or the one held by the role - enabled the execution of the statement?" is valid. I see it as being analogous to a question that one might ask if the privilege needed to execute a statement is held by two of a process's secondary auth IDs, A and B: "Was it the privilege held by secondary auth ID A, or by secondary auth ID B, that enabled successful execution of the statement?" In either case, the question is probably moot. You're dealing with a set of privileges, and it's the set that matters.

      Given that reality, a sure-fire way to determine whether successful execution of a dynamic SQL statement owes to privileges held by an auth ID or a role is to start revoking privileges from authorization IDs. The fact is, if you want to use roles and trusted contexts as a means of tightening up security in your DB2 for z/OS environment, you're going to have to get away from privileges being held by auth IDs (primary or secondary). If the privileges needed to execute dynamic SQL statements in your DB2 system are widely held by authorization IDs, roles won't make much of a difference, security-wise.
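
      In practice, that shift boils down to statement pairs along these lines (the table, auth ID, and role names are made up):

        -- Take the table privilege away from the authorization ID...
        REVOKE SELECT ON TABLE PRODSCHM.CUST_INFO FROM APPID1;
        -- ...and give it to the role instead
        GRANT SELECT ON TABLE PRODSCHM.CUST_INFO TO ROLE APP_ROLE;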

      Robert
