Thursday, July 12, 2012

DB2 for z/OS: What's With DDF CPU Time?

Recently I got a question from a mainframer who was seeing rather high CPU times for the DB2 DDF address space. DDF, of course, is the distributed data facility -- the DB2 component that handles communications with client systems (usually application servers) that send requests to the DB2 for z/OS server using the DRDA protocol (DRDA being Distributed Relational Database Architecture). Anyway, the questioner wanted to know what he and his colleagues could do to reduce DDF CPU consumption.

This query prompted me to write a blog entry about DDF CPU utilization, because there remains today -- even 20 years after DDF was introduced with DB2 V2.2 -- a good bit of misunderstanding concerning this subject. I believe that the confusion comes from people looking at the DDF address space as they do the other DB2 "system" address spaces (the database services address space, also known as DBM1; the system services address space, or MSTR; and the internal resource lock manager, or IRLM). Generally speaking, these address spaces show very little in the way of CPU consumption (this is particularly true of the IRLM and MSTR address spaces). Here is some information from a real-world mainframe DB2 environment, as seen in a DB2 monitor statistics detail report covering a busy two-hour time period:

                       TOTAL TIME
                     ------------
SYSTEM SERVICES       2:16.529524
DATABASE SERVICES    55:38.969257
IRLM                    21.249774

So, on this system (a big one, with multiple engines, running a large DB2 workload), the IRLM address space consumed 21 seconds of CPU time and the DB2 system services address space consumed just over 2 minutes and 16 seconds of CPU time. The DB2 database services address space consumed considerably more CPU time than these other two (a little more than 55 minutes' worth), but that's not unusual when there's a lot of prefetch and database write I/O activity on a system (true in this case), as the CPU time for such I/O operations is charged to DBM1. That the CPU consumption of these address spaces is relatively low on a system with a great deal of DB2 data access activity is not at all unusual: overall CPU consumption for a DB2 workload is associated primarily with SQL statement execution, and the vast majority of that time is charged to the address spaces (e.g., CICS regions and batch initiators) through which data access requests get to DB2.

Now, during the same two-hour period on the same system on which the numbers shown above were seen, CPU consumption of the DB2 DDF address space was as follows:

                       TOTAL TIME
                   --------------
DDF ADDRESS SPACE  3:16:34.642514

"Three hours and sixteen minutes?!? What's with that? I thought that the DB2 address spaces are supposed to show smaller numbers for CPU consumption! Is there a problem in my system? What should I do?!?"

First of all, you should relax. You can't equate DDF CPU consumption with that of the other DB2 "system" address spaces, because it really is a different animal. IRLM, MSTR, and DBM1 are about data access. DDF is about transaction management (specifically, client-server transactions) -- and that understanding should point you to the reason for sometimes-high DDF CPU utilization: it's driven by execution of SQL statements that get to DB2 through the DDF address space (typically, SQL statements issued from programs running in network-attached application servers, or from DB2 stored procedures called by such programs). Recall that I mentioned, a couple of paragraphs up, that the vast majority of the CPU time associated with SQL statement execution is charged to the "come from" address space. If the statement comes from a CICS transaction program, that time is charged to the CICS region in which the program executed. If the statement comes from a DRDA requester, the CPU time is going to be charged to the DDF address space (and stored procedure usage doesn't change this picture: CPU time consumed by a stored procedure is charged to the address space through which the stored procedure CALL got to DB2). So, in a system in which there is a lot of client-server DB2 activity, you'll see higher numbers for DDF CPU consumption (at plenty of sites, DDF-related activity is the fastest-growing component of the overall DB2 for z/OS workload, and at more and more locations it is already the largest component of the DB2 workload).

Back to the DB2 monitor statistics detail report referenced earlier: the breakdown of address space CPU time tells the story with respect to DDF CPU consumption. Here's the additional information from the DDF address space line that I didn't show before (I'm showing times down to the millisecond level instead of the microsecond level to save space):

       TCB TIME    PREEMPT SRB  NONPREEMPT SRB    TOTAL TIME
        --------    ----------- --------------   -----------
DDF    1:02.659    3:13:21.699       2:10.283    3:16:34.642   

You can see that almost all of the DDF address space CPU time (about 98%) is in the "preemptible SRB" category. This is "user" CPU time, associated with the execution of SQL statements issued by DRDA requesters (work tied to database access threads -- aka DBATs -- is represented by preemptible SRBs, which you can think of as "dispatchable" SRBs). The other 2% of the DDF CPU time is "system" time, related to work done under DDF's own tasks on behalf of the aforementioned "user" tasks. So, DDF is in fact a very CPU-efficient manager of DB2-accessing client-server transactions.

Now, what would you do to reduce the CPU consumption of a DDF address space? You'd take actions to reduce the execution cost of SQL statements that get to DB2 through DDF. Those actions might include the following:
  • Convert external stored procedures to native SQL procedures. This is actually more about reducing the monetary cost of CPU cycles consumed by a DB2 client-server workload, versus reducing the cycles consumed. When a native SQL procedure (introduced with DB2 9 in new-function mode, and available in a DB2 10 new-function mode environment if you're migrating to DB2 10 from DB2 V8) is executed through DDF, around 60% of the associated CPU time will be zIIP eligible (the same is not true for external stored procedures). This zIIP offload will reduce the cost of your DB2 client-server workload, because zIIP MIPS are less expensive than general-purpose CPU MIPS.
  • Take greater advantage of DB2 dynamic statement caching. Often, a lot of dynamic SQL statements are executed through DDF. If you are getting a low hit ratio in your dynamic statement cache (check this via your DB2 monitor), consider making the dynamic statement cache larger (done via a ZPARM change). This assumes that you have enough real storage on your system to back a larger dynamic statement cache.
  • Go to DB2 10 (if you're not already there) and take advantage of high-performance DBATs. This is done by binding packages executed via database access threads with RELEASE(DEALLOCATE). I blogged on this topic last year.
  • Improve SQL statement CPU efficiency by tuning your DB2 buffer pool configuration. This, of course, would beneficially affect all SQL statements -- not just those executed via DDF. I have "part 1" and "part 2" blog entries on this subject.
  • Turn dynamic SQL statements executed via DDF into static SQL statements. We have a nice tool, IBM Optim pureQuery Runtime, that can help you to do this without having to change client-side program code. pureQuery Runtime can also be used to "parameterize" dynamic SQL statements that were coded with references to literal values in predicates -- something that will boost the effectiveness of the DB2 dynamic statement cache. I recently wrote about pureQuery Runtime on my DB2 for z/OS tools blog.
  • Tune your higher-cost DDF-related SQL statements. We have a tool that can help you to do this in a proactive, productive way. It's called the IBM InfoSphere Optim Query Workload Tuner. You can find out more about this product in an entry I posted to my DB2 for z/OS tools blog just last week.

Now you know what's behind the DDF address space CPU times that you're seeing on your system, and what you can do to reduce DDF address space CPU consumption. I hope that this information will be useful to you.

9 comments:

  1. Around the middle of this blog entry, I have a line from a DB2 for z/OS monitor report that shows, for the DDF address space, four CPU time figures: TCB time, preemptable SRB time, non-preemptable SRB time, and total CPU time. This information is from a Statistics Long report generated via IBM Tivoli OMEGAMON XE for DB2 Performance Expert on z/OS. I want to point out that the CPU times I included in the aforementioned line of the blog entry above are indicative of time consumed on general-purpose engines. The Statistics Long report includes a fifth CPU time field for each DB2 address space. That fifth field is preemptable SRB time consumed on zIIP engines. The preemptable-SRB-time-on-zIIP value that would complete the referenced line in the blog entry above is 2:55:43.179. Compare that to the preemptable SRB time figure of 3:13:21.699 (which, again, does NOT include preemptable SRB time consumed on zIIP engines), and you'll see that a little over 47% of the total preemptable SRB time for the DDF address space was consumed on zIIP engines (DDF preemptable SRB time, as pointed out in the blog entry, is basically the CPU time consumed in executing SQL statements that get to the DB2 DBM1 address space via the DDF address space). In some cases, the percentage of CPU time for DDF-related SQL statement execution (i.e., for client-server SQL statement execution) consumed on zIIP versus general-purpose engines is as high as 60%.

    ReplyDelete
  2. great explanation, my next tasl is to find the fields in the smf data where these metrics are reported, have you done a blog on this.
    bob.schwarz@firstdata.com

    ReplyDelete
    Replies
    1. Haven't blogged about that, Bob, but I think I can answer your question here. Obviously, if you're using a DB2 monitor to check out CPU utilization of the DB2 address spaces, the monitor will format the information in the trace records for you. If you want to process the records in some other way, here's what you need to know: first, the DB2 address space CPU time data is found in trace records associated with DB2 IFCID 0001. Those records are written when statistics trace class 1 is active (as is likely to be the case -- it's a very low-overhead trace class). When the records are written to SMF, it is in the form of SMF type 100 records. Descriptions of all the fields in all of the DB2 trace records can be found in a DB2-supplied file called DSNWMSGS. This file is a member of a DB2 library called SDSNIVPD. Information about the structure of DB2 trace records written to SMF can be found in chapter 51 of the manual titled, "DB2 for z/OS Managing Performance." The DB2 10 version of that manual is available at http://www-01.ibm.com/support/docview.wss?uid=swg27019288#manuals, and the DB2 11 version can be found at http://www-01.ibm.com/support/docview.wss?uid=swg27039165#manuals.

      Robert

      Delete
    2. Hi Robert,

      I conducted several performance tests - SQL statements execution in CICS/DB2 pgms and Stored Procedure ( external and native) , and found that CPU consumption for the same statements in SP , 2 -2.5 times higher than in CICS/DB2 pgm.
      For example , INSERT stmt CPU in CICS PGM - 68 microseconds, while the same INSERT in SP is 139 microseconds. And such pattern is for every INSERT ( 2 -2.5 times higher ).
      Is this related to the way how DDF CPu is charged ? But , in this case, how to compare CPU consumption for application running thru DDF and application executing in CICS region ?
      I compared 2 runs of the same apps ( same workload ) between 2 environments , and it's almost 2 times difference. And it's only CP CPU ( I didnt even include zIIP CPU - so difference would be even bigger ).

      Could you shed some light on this ?

      Regards Ilya

      Delete
    3. 1) Where are you getting the performance numbers?

      2) In what language is the COBOL program written?

      3) How are the stored procedures invoked? From DRDA requesters? From CICS?

      Robert

      Delete
    4. Hi Robert,

      1)The performance numbers per statement , I'm getting from CA detector . But summary numbers per Test ( multiple executions of SPs or pgms) , I'm getting from SMF , by running PM report
      2) CICS/DB2 programs - Cobol . SPs - all PSQL

      3) SPs are invoked from DRDA requester - Java apps ( coming thru Webshere to zOS )

      Regards Ilya

      Delete
    5. I can't comment on information provided by Detector -- you'll have to check with the vendor about that.

      If you have OMEGAMON for DB2, you can send some accounting reports to me for comparison. If you send me a note at rfcatterall@gmail.com, I'll tell you how these accounting reports should be generated.

      Robert

      Delete
  3. Hi. We got similiar problem. Cpu +zip for ddf is about ywice higer than cpu reported for cics. The same sql, mostly single selects both static one. Reports generated from omegamon.

    ReplyDelete
    Replies
    1. That does not make sense to me. From what OMEGAMON for Db2 reports are you getting this information? From what field or fields in the report are you getting the CPU data?

      Robert

      Delete