Twice in the past month, I've encountered a misunderstanding pertaining to dynamic SQL statements issued by applications that access DB2 for z/OS via network connections (these could also be called DDF-using applications, or DRDA requesters). Now seems as good a time as any to clear things up. I'll use this blog entry for that purpose.
The misunderstanding of which I speak: some people are under the impression that dynamic SQL statements issued by DDF-connected applications are zIIP-eligible when executed, while static SQL statements issued by DDF-connected applications are not zIIP-eligible.
This is not true. zIIP eligibility is not a dynamic or static SQL thing. It is a task thing (as described in a blog entry I wrote about 15 months ago). Here, "task" refers to the type of task in a z/OS system under which a SQL statement executes. zIIP-eligible work executes under a type of task called an enclave SRB (also referred to as a preemptible SRB); thus, when a SQL statement -- dynamic or static -- runs under an enclave SRB, it is zIIP-eligible. If it runs under a TCB, it is not zIIP-eligible. When does a SQL statement run under an enclave SRB in a z/OS system? Three scenarios come to mind:
When the SQL statement is issued by a DDF-connected application (i.e., by a DRDA requester). In that case, the SQL statement will run under an enclave SRB (again, that's a preemptible SRB) in the DB2 DDF address space. The statement's execution will be off-loadable -- as much as 60% so -- to a zIIP engine (if significantly less than 60% zIIP offload is seen for SQL statements issued by DDF-connected applications, the cause could well be zIIP engine contention in the LPAR). Note that static versus dynamic is a non-issue here -- either way, the statement runs under an enclave SRB and so is zIIP-eligible.
When the SQL statement is issued by a native SQL procedure that is called by a DDF-connected application. The word "native" matters here: a SQL statement (again, static or dynamic) issued by an external stored procedure (such as an external SQL procedure, or a stored procedure written in COBOL or C or Java) will not be zIIP-eligible, regardless of whether it is called by a DDF-connected application or by a local-to-DB2 program (such as a CICS transaction or a batch job). A SQL statement issued by an external DB2 stored procedure will not be zIIP-eligible because such a stored procedure always runs under a TCB in a stored procedure address space, no matter what type of application -- local to DB2, or remote -- issues the CALL to invoke the stored procedure. Conversely, a native SQL procedure always runs under the task of the calling application process. If that process is a DDF-connected application, the application's z/OS task will be, as pointed out above, an enclave SRB in the DDF address space. That being the case, a native SQL procedure called by a DDF-connected application will run under the DDF enclave SRB representing that application in the z/OS LPAR, and the SQL statements issued by the native SQL procedure (and all statements of that type of stored procedure are SQL statements -- it's a stored procedure written in SQL) will execute under that enclave SRB and so will be (as previously noted) up to 60% zIIP-eligible.
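To make the native-versus-external distinction concrete, here is a minimal sketch. All schema, procedure, table, and load module names are invented, and the syntax is abbreviated -- check the CREATE PROCEDURE documentation for your DB2 level before using anything like this.

```sql
-- Native SQL procedure: no EXTERNAL clause; the body is written in SQL.
-- It runs under the caller's task -- an enclave SRB when the CALL comes
-- in through DDF -- so its SQL statements are zIIP-eligible.
CREATE PROCEDURE MYSCHEMA.GET_BALANCE
  (IN P_ACCT_ID INTEGER, OUT P_BALANCE DECIMAL(15,2))
  LANGUAGE SQL
BEGIN
  SELECT BALANCE INTO P_BALANCE
    FROM MYSCHEMA.ACCOUNTS
   WHERE ACCT_ID = P_ACCT_ID;
END

-- External stored procedure: always runs under a TCB in a stored
-- procedure address space, so its SQL statements are not zIIP-eligible,
-- even when the CALL comes from a DRDA requester.
CREATE PROCEDURE MYSCHEMA.GET_BALANCE_EXT
  (IN P_ACCT_ID INTEGER, OUT P_BALANCE DECIMAL(15,2))
  LANGUAGE COBOL
  EXTERNAL NAME 'GETBAL'
  PARAMETER STYLE GENERAL;
```

The point of the sketch: the difference that matters for zIIP eligibility is not what the procedure does, but which kind of task it runs under, and that is determined by whether the procedure is native or external.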
You might think, "Hey, isn't a Java stored procedure zIIP-eligible?" Yes, the Java part of that stored procedure program will be zIIP eligible, but SQL is not Java, and the SQL statements issued by a Java stored procedure will run under a TCB in a stored procedure address space and so will not be zIIP-eligible (actually, they might be a little zIIP-eligible, because the Java stored procedure might "hold on" to a zIIP processor for just a bit after a SQL statement issued by the stored procedure starts executing).
When the SQL statement is parallelized by DB2. A query can be a candidate for parallelization by DB2 if: it is static and the associated package was bound with DEGREE(ANY); it is dynamic and the value of the CURRENT DEGREE special register is 'ANY'; or there is a row for the query in the SYSIBM.SYSQUERY and SYSIBM.SYSQUERYOPTS catalog tables (introduced with DB2 10 to enable statement-level control over execution behaviors such as parallelization and degree of parallelization, for static and dynamic SQL). If a query is parallelized by DB2, the "pieces" of the split query will run under enclave SRBs and so will be zIIP-eligible (up to 80%).
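The first two routes to parallelism candidacy can be shown in a couple of lines. The collection and package names below are invented, and the REBIND is a DB2 command rather than a SQL statement, so it appears as a comment:

```sql
-- Static SQL: bind or rebind the package with DEGREE(ANY).
-- Issued as a DB2 command (e.g., from a DSN session), not as SQL:
--   REBIND PACKAGE(MYCOLL.MYPKG) DEGREE(ANY)

-- Dynamic SQL: set the special register before issuing the query.
SET CURRENT DEGREE = 'ANY';
```

Either way, DB2 still decides at run time whether a given query is actually split; these settings only make the query a candidate for parallelization.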
And one more thing... Since DB2 10 for z/OS, prefetch read operations have been 100% zIIP-eligible; thus, even if a query is running under a TCB and is therefore not zIIP-eligible, prefetch reads executed by DB2 on behalf of the query are zIIP-eligible. Prefetch read CPU time, as always, shows up in the DB2 database services address space (DBM1), not in the address space associated with the DB2-accessing application process (e.g., a CICS region, or the DB2 DDF address space, or a batch initiator address space).
So there you have it. To repeat a point made up front: zIIP eligibility of SQL statement execution is a task thing, not a dynamic versus static SQL thing. A static SQL statement issued by a DDF-connected application (i.e., by a DRDA requester) will be zIIP-eligible because it will run under an enclave SRB (i.e., a preemptible SRB). A dynamic SQL statement issued by a CICS transaction program will not be zIIP-eligible, because it will execute under a TCB. Clear? I hope so.
Hey Robert, I always like your blogs. I checked with IBM on the idea of parallelism being offloaded to the zIIP, and their response was that DB2 uses some kind of internal calculation and only uses the zIIP if the results of that calculation meet the requirements. In a nutshell, I was told by IBM that the query would have to be a very long-running query to qualify.
An important question here is, what does "very long running" mean? I think that you're talking about queries with run times that could be measured in seconds. If, for example, a query has a 45-second elapsed time, that would be considered "very long-running" (and almost certainly unacceptable) in an online transaction processing context. If that query accessed data in multiple partitions of a partitioned table space, I would expect that it could be parallelized by DB2. My point is this: a query doesn't have to run for hours in order to be parallelized by DB2. A query shouldn't even have to run for minutes in order to be parallelized. This is not a big deal, in my mind. You wouldn't want query parallelization in an online transactional processing environment, anyway. I have never encountered a situation in which a DB2-using organization complained because a quick-running query was not parallelized by DB2. I suspect that the bulk of the queries that people want DB2 for z/OS to parallelize are those that run for a minute or more in non-parallel mode.
Robert
Understood, but my point is that unless changing a program to take advantage of parallelism would result in it using the zIIP processor, we wouldn't change it unless we were having an issue with a program taking too long to complete. We do have a few with parallelism enabled. The fear is the possibility of intense CPU spikes on the mainframe. If we knew these would be offloaded to the zIIP, then sure, why not?
Question: why do you say "you wouldn't want query parallelization in an online transactional processing environment anyway"? This is exactly where we would want it, to allow an online transaction to complete quickly. I would only use parallelism in batch processing if I knew it would offload to the zIIP, which is the point of my comments. How would we know? So the safe way is to only use parallelism where we really need it, to speed up online transactions. My point is that, according to IBM, just enabling parallelism will not result in zIIP usage. In fact, according to IBM there is little chance of a parallelized query using the zIIP processor.
Thanks for the response, and thanks for your great blogs. I get more useful information from your blogs than from any other I've read.
"The fear is the possibility of intense CPU spikes..." Understood. That could happen if a query split into many "pieces." You can control this, at a DB2 subsystem level, via the PARAMDEG parameter in ZPARM. If, for example, you set PARAMDEG to 4, that would be the maximum degree to which a query would be parallelized. Starting with DB2 10, you can also use the new catalog tables SYSQUERY and SYSQUERYOPTS to control the degree of parallelism for individual queries.
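As a sketch of those two levels of control: PARAMDEG is a ZPARM, set in the subsystem parameter load module (via the installation job) rather than through SQL, so it appears below only as a comment; the SELECT simply shows where the DB2 10 statement-level options are recorded.

```sql
-- Subsystem-wide cap: the PARAMDEG ZPARM (set via the installation
-- job that builds the subsystem parameter module, not via SQL) limits
-- the maximum degree of parallelism; e.g., PARAMDEG=4 means no query
-- is split into more than four "pieces."

-- Statement-level options (DB2 10 and later) land in the catalog's
-- access path repository; you can inspect what has been recorded:
SELECT * FROM SYSIBM.SYSQUERYOPTS;
```

With a PARAMDEG cap in place, enabling parallelism for selected packages becomes a bounded experiment rather than an open-ended CPU-spike risk.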
You generally don't want query parallelism in an online transactional processing (OLTP) environment because parallelism, even if it were possible for queries in such programs, would likely SLOW THINGS DOWN. In an OLTP environment, queries typically involve index access, very little in the way of large-scale page scans (of either indexes or table spaces), and small result sets (singleton SELECTs are not unusual in OLTP programs). Parallelism would typically offer NO ADVANTAGE for those kinds of queries. Even if DB2 decided to parallelize one of the quick-running queries that typify OLTP programs (and it wouldn't, because that wouldn't make sense), the extra CPU required to spawn split queries, manage same, and merge results of the split queries would more than offset any elapsed time improvement that query parallelism might offer.
Batch jobs are precisely those for which query parallelism is often - and rightly - used. When a query in a batch job accesses data in several partitions of a partitioned table space, and scans a good number of pages in these different partitions, splitting the query can definitely improve elapsed time. Not only that, but the split queries should definitely run on zIIP engines in that case (specifically, they'd be up to 80% zIIP-offloadable). Whoever at IBM told you that parallelized queries are not likely to run on zIIP engines gave you incorrect information. You can see the zIIP execution of parallelized queries in accounting reports generated by a DB2 monitor.
Robert