I recently encountered a situation that illuminated several important aspects of application processing in the context of the DB2 for z/OS distributed data facility (DDF). I'll go through the scenario here, in the hope that you will find the information useful.
A systems programmer contacted me about a problem his organization had run into with a new application that accesses DB2 for z/OS by way of a network connection (and so uses DDF). The application drives large numbers of row-insert operations, and was generating huge numbers of DB2 accounting trace records (written to SMF data sets). In one test, executed in a pre-production environment, the application issued 140 million INSERT statements, and that resulted in DB2 cutting 140 million accounting trace records. The trace records filled the SMF data sets faster than they could be offloaded, and as a consequence some of the records were lost.
To reduce the torrent of trace records flooding the SMF data sets, the systems programmer changed the value of the ZPARM parameter ACCUMACC in the target DB2 subsystem to 100. In doing that, he made it possible for DB2 to "roll up" activity for as many as 100 DDF units of work into a single accounting trace record. That action addressed the immediate problem, but only for the test system: in the production environment, the organization had a requirement for one accounting trace record per DDF unit of work (ACCUMACC on the production DB2 subsystem was set to NO).
With roll-up of activity for multiple DDF transactions into a single accounting record not an option in production, the systems programmer looked into multi-row INSERTs (also referred to as "bulk inserts") as a means of reducing the volume of trace records that would be generated when the new application executed. He put this question to me: Could multi-row INSERT be utilized for a Java application accessing DB2 for z/OS-managed data via DDF? I responded affirmatively, and pointed out that information on multi-row INSERT for Java programs can be found in the DB2 for z/OS Application Programming Guide and Reference for Java (the DB2 10 manual can be downloaded from http://www-01.ibm.com/support/docview.wss?uid=swg27019288#manuals, and the DB2 11 manual from http://www-01.ibm.com/support/docview.wss?uid=swg27039165#manuals).
The systems programmer got together with one of the developers of the new application, and they ran a test in which 1000 rows were bulk-inserted into a DB2 table, in chunks of 100 rows. Instead of 1000 accounting trace records, execution of the test program generated 10 trace records. Here's the interesting part: the program was not issuing any explicit commits. That led the systems programmer to surmise that perhaps accounting record volume was being driven by the volume of INSERT statements issued by the application program. After all, when 140 million single-row INSERTs had been issued by the program, 140 million accounting records had been written to SMF. When 1000 rows were inserted by way of 10 bulk inserts of 100 rows apiece, 10 trace records had been cut. I explained that accounting records are generated not by issuance of data-changing SQL DML statements, but by commits issued following execution of one or more SQL DML statements. [That's the way it works for application processes, such as DDF-using programs, that are transactional in nature -- a z/OS batch program is different, in that one accounting record will be generated when the program completes, even if the program issued hundreds of commits while executing.] As the Java application developer was not issuing explicit commits from his program, I speculated that AutoCommit was enabled on the client side of this client-server application. Not so. The systems programmer verified that AutoCommit was set to FALSE for the new application. This information left us scratching our heads.
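To make the commit-versus-INSERT distinction concrete, here is a minimal JDBC sketch of a batched insert with AutoCommit off and an explicit commit after every chunk. It is an illustration only, not the code from the application discussed here: the class and method names, the table DEMO.T1, and its NAME column are all hypothetical. My understanding is that the IBM JDBC driver can send such a batch to DB2 for z/OS as a multi-row INSERT, but treat that as an assumption to verify for your driver level. The small helper `commitsNeeded` simply does the arithmetic: with ACCUMACC set to NO, the number of accounting records should track the number of commits, not the number of rows.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchInsertSketch {

    /** Commits (and, with ACCUMACC=NO, accounting records) expected for a load. */
    static long commitsNeeded(long totalRows, long rowsPerCommit) {
        if (totalRows < 0 || rowsPerCommit <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and rowsPerCommit > 0");
        }
        // Ceiling division: a final partial chunk still needs its own commit.
        return (totalRows + rowsPerCommit - 1) / rowsPerCommit;
    }

    /** Inserts the given values in chunks, committing once per chunk. Returns the commit count. */
    static int insertInBatches(Connection conn, List<String> names, int rowsPerCommit)
            throws SQLException {
        if (rowsPerCommit <= 0) {
            throw new IllegalArgumentException("rowsPerCommit must be > 0");
        }
        conn.setAutoCommit(false); // commits happen only where we ask for them
        int commits = 0;
        // Hypothetical table and column, for illustration only.
        String sql = "INSERT INTO DEMO.T1 (NAME) VALUES (?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
                if (++pending == rowsPerCommit) {
                    ps.executeBatch();
                    conn.commit(); // end of one unit of work
                    commits++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush and commit the last partial chunk
                ps.executeBatch();
                conn.commit();
                commits++;
            }
        }
        return commits;
    }

    public static void main(String[] args) {
        System.out.println(commitsNeeded(1000, 100));       // prints 10
        System.out.println(commitsNeeded(140_000_000L, 1)); // prints 140000000
    }
}
```

Changing `rowsPerCommit` is the knob: a larger value means fewer commits and fewer accounting records, at the cost of locks being held longer between commits.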
The sysprog next suggested that perhaps the DBAT used by the application was going inactive after each INSERT, and the accounting records were being generated as a result of DBATs going inactive (DBAT is short for database access thread -- the kind of thread used by applications that connect to DB2 via DDF). Not the case, I told him. DBATs don't go inactive in the usual sense. A connection (from a client application to DB2) will go inactive when a DDF transaction completes. Yes, the DBAT that had been used to service the transaction will go back to the DBAT pool at that time, but I'd say that "disconnected" is a better term than "inactive" for the pooled DBATs (I'm talking here about "regular" DBATs -- a high-performance DBAT will remain associated with the connection through which it was instantiated, and that connection will not go inactive until the high-performance DBAT is terminated after having been reused 200 times). In any case, I said, the key is transactions completing, not connections going inactive. Connections going inactive, like accounting records getting cut (with ACCUMACC set to NO), are a response to DDF transactions completing, and transaction completion involves commit processing.
Asked the sysprog: Could packages bound with RELEASE(COMMIT) be the cause of the large volume of accounting trace records associated with execution of the new application? My response: No. Accounting trace records are generated at DDF transaction boundaries, and RELEASE(COMMIT) and RELEASE(DEALLOCATE) do not affect transactional boundaries. Commits establish transaction boundaries. Could DB2 itself be driving the commit activity that was in turn driving trace record generation? No. DB2 does not cause commits. DB2 responds to commits (an exception to that rule is the commit driven upon completion of a DB2 stored procedure defined with COMMIT ON RETURN YES).
Finally, in searching for the elusive commits that we knew were coming from somewhere, the systems programmer and his application development colleagues found the answer: the new application was using the open-source Spring Framework, and in response to the application indicating completion of a transaction, the Spring Framework would drive a commit that in turn would drive generation of a DB2 accounting trace record. With this now clarified, the application team can vary commit frequency as desired to balance volume of trace records generated (don't want too many commits) against accumulation of DB2 child X-locks associated with INSERT processing (don't want too few commits).
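The effect of a framework-issued commit can be sketched with a toy model. To be clear, this is not Spring's API: `FakeConnection` and `inTransaction` are hypothetical stand-ins I made up to show the shape of the behavior, namely that the wrapper commits whenever the application signals that a transaction is complete, so the way the application draws its transaction boundaries determines how many commits flow to DB2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class FrameworkCommitSketch {

    /** Hypothetical stand-in for a database connection; it only counts calls. */
    static final class FakeConnection {
        int inserts = 0;
        int commits = 0;
        void insert(String row) { inserts++; }
        void commit() { commits++; }
    }

    /** Hypothetical stand-in for a framework's transaction wrapper. */
    static void inTransaction(FakeConnection conn, Consumer<FakeConnection> work) {
        work.accept(conn);
        conn.commit(); // the commit the application code never wrote itself
    }

    /** One framework transaction per row: one commit per INSERT. */
    static int commitsWithOneTransactionPerRow(List<String> rows) {
        FakeConnection conn = new FakeConnection();
        for (String row : rows) {
            inTransaction(conn, c -> c.insert(row));
        }
        return conn.commits;
    }

    /** One framework transaction per chunk of rows: one commit per chunk. */
    static int commitsWithChunkedTransactions(List<String> rows, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be > 0");
        }
        FakeConnection conn = new FakeConnection();
        for (int start = 0; start < rows.size(); start += chunkSize) {
            List<String> chunk = rows.subList(start, Math.min(start + chunkSize, rows.size()));
            inTransaction(conn, c -> chunk.forEach(c::insert));
        }
        return conn.commits;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add("row-" + i);
        }
        System.out.println(commitsWithOneTransactionPerRow(rows));     // prints 1000
        System.out.println(commitsWithChunkedTransactions(rows, 100)); // prints 10
    }
}
```

In a real framework the wrapper would be a transaction annotation or template rather than a hand-written method, but the lesson is the same: to change commit frequency, change where the application marks its transaction boundaries.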
The key take-away here: for a lot of reasons, issuance of commits is really important for DB2-accessing application processes. Sometimes, developers of DDF-using applications have direct visibility of commit-issuing logic. Other times, use of something like a framework between a client program and a DB2 for z/OS data server (and the Spring Framework is just one example of this kind of thing) abstracts issuance of commits in a way that removes commit visibility from an application developer. When that is the case, commit frequency can still be controlled and changed as needed, but that requires an understanding of how interaction by a program with the framework drives the commits that flow to DB2. If your DB2-accessing client programs interface with an application framework, take the time to understand how commits are made to flow to DB2. In doing that, you'll avoid surprises and you'll be less likely to spend time scratching your head and spinning your wheels.