Not long ago, a Db2 for z/OS DBA sent to me, via a colleague of mine, a question. He described in an email the procedure that his team regularly used to consolidate extents, when the number of these got into double digits for a Db2 table space-related data set, back down to one. He noted that this extent-consolidation procedure was more time-consuming and CPU-intensive than desired, and he wanted to know if I had any suggestions for making the procedure more efficient. In fact, I did have a suggestion for improving the efficiency of the Db2 data set extent consolidation procedure used at this person's site. My suggestion: STOP DOING THAT.
It might have been the comedian Henny Youngman who'd get laughs with this joke: "I told my doctor, 'It hurts when I do this.' He told me, 'Stop doing that.'" In all seriousness, Henny (or whoever it was) had an important point there. When some procedure causes pain in the form of CPU consumption and/or labor intensity, the best way to take that pain down to zero is to dispense with said procedure. In the context of Db2 data set extent-reduction efforts, my "Stop doing that" suggestion might engender this response: "But, if we dispense with our Db2 data set extent consolidation procedure, we'll end up with Db2 data sets that have a lot of extents!" My response to that response: "So?"
Here's the deal, folks: extents matter WAY less for Db2 for z/OS data sets than they did a long time ago (like, back in the 1990s). Way back when, a real concern about a Db2 table space data set going into a lot of extents was the impact this could have on prefetch read performance. Such a negative extent effect could in fact occur because - again, way back when - Db2 prefetch read operations were satisfied from spinning disk. Disk controller cache memory sizes were so small back in the day that Db2 would bypass cache for prefetch reads, and those multi-page reads from spinning disk could suffer, performance-wise, if the data set holding the page set or partition being accessed (table space or index) had gone into a good number of extents.
Things are different now, in a number of ways:
- Db2 subsystem buffer pool configurations are MUCH larger than they were some years ago, owing largely to 1) mainframe memory getting less expensive all the time (on a per-gigabyte basis), leading organizations to load up on z/OS real storage (often to the tune of several hundred GB for a production z/OS LPAR); and 2) people realizing that if you give Db2 a lot of memory (e.g., for larger buffer pools), it generally performs really well. Much larger buffer pool configurations mean that a much higher percentage of Db2 page requests (synchronous and asynchronous) are satisfied from pages in memory, as opposed to requiring disk subsystem read I/O operations. Obviously, when page requests are satisfied from pages in memory, data set extents on disk are irrelevant.
- Disk controller cache memory sizes have been really big for a long time; and, that large disk controller cache memory resource is managed in a high-performing way by powerful microprocessors that are an integral part of modern enterprise disk subsystems. What these large and intelligently managed disk controller cache resources mean is that a read request (synchronous or asynchronous) that cannot be satisfied from the buffer pool configuration will often result in a read from disk controller cache, as opposed to requiring a read from spinning disk. As is true for a read request that is satisfied from data in memory (in a buffer pool), data set extents are not relevant for a read of data from disk controller cache.
- Even when a Db2 read request leads to accessing data all the way back on spinning disk, the architecture of modern enterprise disk subsystems - primarily RAID in nature - reduces the performance impact of data set extents from what it once was.