Monday, April 29, 2013

Application Performance: When Reducing Concurrency can Improve Throughput

There's an interesting application performance situation that I've encountered several times over the past twenty years or so, most recently a couple of weeks ago. I was talking with a DBA over lunch at a regional DB2 users group meeting. He mentioned that he and some of his colleagues had been working to address a performance problem involving a purchased application that accessed a DB2 for z/OS-managed database. A particular data-changing application process was not performing well, and the cause was clear: lock contention -- especially deadlocks. Row-level locking was in use, so that contention-reducing card had already been played. One of the moves being contemplated was a further reduction in the already-low deadlock detection cycle time. I had a different suggestion: reduce the level of execution concurrency for the process. In other words, reduce the degree of multi-threading for the process. I said that serious consideration should be given to single-threading the application process. That recommendation ended up putting the DBA's team on the path toward success.

The advice I gave to my DBA friend may seem contradictory at first blush. I mean, increasing the level of multi-threading for a process is supposed to boost application performance, right? Well, yeah, if the process is designed for concurrency. Some applications are so designed, and some aren't. Design for concurrency is, of course, chiefly applicable to processes that change data, because such processes acquire X-locks on DB2 data pages or rows. Most especially, design for concurrency applies to processes that update or delete data rows, since inserts are generally not blocked by locks (I say "generally" because an insert will wait for an X-lock held by another process on the target page to be released, if the table is hash-organized -- I described hash-organized tables in an entry I posted to this blog a few months ago). If a process is such that the same set of rows can be changed by overlapping units of work (given a multi-threading scenario), it is possible that deadlocks will negatively impact application throughput (of course, if concurrently executing data-changing processes act on different sets of rows -- as is very often the case -- then the level of execution concurrency can go very high with no lock contention problems whatsoever).

If an application process that drives a lot of deadlocks was developed in-house by your organization's programmers, a design modification might improve concurrency by changing the order in which data-changing operations occur (for example, row-changing actions might be moved to the "end" of a unit of work, or a sequence of change A / change B / change C might be altered to B-A-C or otherwise re-sequenced -- a sketch of this idea follows below). If, on the other hand, the application in question is a product purchased from a vendor (as was the case for the DBA to whom I spoke at the user group meeting), code modification likely won't be an option. And even for an in-house-developed application, changing a program's design to alleviate a deadlock problem may not be feasible or fruitful.
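To make the re-sequencing idea a bit more concrete, here is a minimal sketch (the table, column, and host variable names are hypothetical -- the point is simply that when every program changes the same tables in the same order within a unit of work, one common source of deadlocks disappears):

    -- Before: program 1 updates CUST_MASTER and then ORDER_TABLE, while program 2
    -- updates ORDER_TABLE and then CUST_MASTER -- overlapping units of work can
    -- deadlock. Re-sequenced, both programs make their changes in the same order:

    UPDATE ORDER_TABLE                 -- the "detail" change first
       SET ORDER_STATUS = 'SHIPPED'
     WHERE ORDER_NO = :hv1;

    UPDATE CUST_MASTER                 -- the "master" change last, near the end
       SET LAST_ACTIVITY_TS = CURRENT TIMESTAMP   -- of the unit of work
     WHERE CUSTNO = :hv2;

    COMMIT;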

What then? That's when your thoughts should turn to reducing the level of multi-threading for the process -- perhaps as far as going the single-thread route. I know for a fact that reducing concurrency of execution for a process can improve performance, even if you go all the way to one-at-a-time. Here's why: if your deadlock detection cycle time (set via the DEADLOK parameter in your IRLM PROC) is at the default of 1 second, two deadlocked transactions could wait for up to a second (that is, 1000 milliseconds) before one is selected by DB2 to be the "loser." That transaction's work then gets rolled back, and it has to be re-tried (if the associated program has retry logic). Now, suppose that the process is single-threaded. If the average transaction time is, say, 50 milliseconds, and a transaction gets there "second" (i.e., arrives when another instance of the process is in the midst of execution, requiring a wait because of the single-thread situation), it will typically wait only about 50 milliseconds -- the time needed for the in-flight transaction to finish -- before it can get going, AND there won't be any rollbacks or required retries. Much better, right?

"Wait," you might say, "'Better' depends on the transaction arrival rate," and you'd be right in focusing on that as a key factor -- that, and average transaction elapsed time. If the average elapsed time for a transaction that you want to single-thread is 50 milliseconds, then an arrival rate of up to 20 transactions per second should be OK in terms of avoiding big back-ups in a single-threaded scenario. That may not sound like a very high transaction rate to you, but I've seen that some processes that have had deadlocking problems when the level of multi-threading is high in fact have transaction arrival rates that make single-threading a very viable throughput-boosting technique. And, keep in mind that reducing the level of multi-threading for a process to two or three -- as opposed to one -- might be sufficient to alleviate a deadlock problem while providing support for a higher transaction arrival rate than could be handled with single-threading in effect.

How might one go about reducing multi-threading for a process in order to boost throughput through elimination (or near-elimination) of deadlocks? There are several alternatives, depending on the nature of the process and the other database-accessing applications running in the system. If the process with deadlock issues is the only one accessing a table or set of tables when it executes, single-threading could be achieved through LOCK TABLE IN EXCLUSIVE MODE statements. If other processes require access to the target table or tables while the deadlocking transactions execute, serialization can be effected in a more granular fashion. Suppose, for example, that a process changes some rows in a set that share a common identifier, such as a customer number. If, at the beginning of a transaction, a column in the row for a given customer number in a "master" or "control" table is updated to equal itself (e.g., UPDATE CUST_MASTER SET COL_A = COL_A WHERE CUSTNO = :hvar), the transaction can then change data in rows with that customer number without deadlocking with another instance of the process -- updating a column to its own value is a very low-cost way to get a serializing X-lock on a row or page.
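Here is a minimal sketch of both approaches (the table names are hypothetical, and the serializing UPDATE is the one from the paragraph above):

    -- Approach 1: the process has the table to itself while it runs, so take a
    -- table-level X-lock for the unit of work (with the RELEASE(COMMIT) bind
    -- option, the lock is released at commit).
    LOCK TABLE ORDER_TABLE IN EXCLUSIVE MODE;
      ...data-changing statements...
    COMMIT;

    -- Approach 2: other work needs the table, so serialize per customer number.
    -- The "update a column to itself" statement takes an X-lock on the customer's
    -- master row (or page); another instance of the process working on the same
    -- customer waits here instead of deadlocking later.
    UPDATE CUST_MASTER SET COL_A = COL_A WHERE CUSTNO = :hvar;
      ...change rows for that customer number...
    COMMIT;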

If the process is executed as a CICS transaction, specifying THREADLIMIT(1) and THREADWAIT(YES) in the associated DB2ENTRY CICS resource definition will result in one-at-a-time execution: a transaction that arrives while the entry's single thread is in use simply waits for that thread, rather than overflowing to the pool.
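In RDO terms, such a definition might look something like the following (the entry, group, transaction, and plan names are made up for illustration, and the attribute details should be checked against your CICS release):

    DEFINE DB2ENTRY(PAYUPD) GROUP(DB2DEFS)
           TRANSID(PYUP)
           PLAN(PAYUPDPL)
           THREADLIMIT(1)
           THREADWAIT(YES)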

For a DRDA-using client-server process, the number of connections to a DB2 for z/OS system for an application can be controlled on the client side. DB2 10 for z/OS provided a server-side means of controlling the number of connections from an application server: profile tables (something about which I blogged a few weeks ago). DB2 server-side control of client connections is even more dynamic and comprehensive if you have IBM InfoSphere Optim Configuration Manager.
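As a rough sketch of the profile table approach (the profile ID and IP address below are made up, and the column names and keyword values should be verified against the DB2 10 documentation), capping the threads for requests coming from a particular application server amounts to a couple of inserts followed by a -START PROFILE command:

    -- Identify requests from a particular application server by its IP address.
    INSERT INTO SYSIBM.DSN_PROFILE_TABLE
           (PROFILEID, LOCATION, PROFILE_ENABLED)
    VALUES (17, '10.1.1.25', 'Y');

    -- Cap the number of concurrently active threads for that profile at two.
    INSERT INTO SYSIBM.DSN_PROFILE_ATTRIBUTES
           (PROFILEID, KEYWORDS, ATTRIBUTE1, ATTRIBUTE2)
    VALUES (17, 'MONITOR THREADS', 'EXCEPTION', 2);

    -- Then activate the profile with the DB2 command -START PROFILE.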

Whatever your preferred means of achieving single-threading (or just a reduced level of multi-threading), make sure that this is in your application tuning tool box. Under certain circumstances, you really can get more throughput for an application process by having less of it executing in the system at one time.
