Tuesday, November 21, 2017

Db2 12 SQL Enhancement: Temporal Logical Transactions

Temporal data support, introduced with Db2 10 for z/OS, is one of the more interesting SQL-related Db2 enhancements delivered in recent releases of the DBMS. Temporal data support comes in two flavors (which can both be utilized for a single table): business-time temporal and system-time temporal. With business-time temporal support enabled for a table, an organization can put future changes into the table (e.g., price changes for products or services that will not go into effect until some future date) without affecting programs that, by default, access data rows holding information that is currently in effect (among the use cases for business-time temporal: profitability forecasts, utilizing queries that access price values that will be in effect six months from now).

In contrast to business-time temporal, system-time temporal enables a look back, as opposed to forward, regarding in-effect data values. By that I mean that system-time temporal allows an application or a user to see data rows that were current at a time in the past. The mechanism through which this capability is provided is conceptually pretty simple: when system-time temporal support is enabled for a table (referred to as the base table), a logically equivalent history table is associated with the base table ("logically equivalent" means that the history table has the same columns as the base table: same names, same order, same data types - I've pointed out that physical equivalence is NOT a requirement). Subsequently, when a row is made non-current in the base table by way of an UPDATE or a DELETE operation, the "before" image of the updated or deleted row is stored - automatically by Db2 - in the associated history table. Db2 knows, thanks to a couple of timestamp columns in the base table (and the history table) that are maintained by Db2, when a row became current (i.e., when it was either inserted into the base table, or when it was updated) and when it became non-current (i.e., when it was deleted from the base table, or replaced in the base table via an UPDATE). With Db2 having that information, it can respond to a query that contains a temporal predicate, which would be of the form FOR SYSTEM TIME AS OF timestamp-value (or BETWEEN timestamp-value1 AND timestamp-value2 or FROM timestamp-value1 TO timestamp-value2).

OK, with that background information in mind, consider this scenario: a program inserts a row into base table T1, which has been enabled for system-time temporal (one could also say that T1 is enabled for "row versioning"), and then, in the same unit of work, updates the just-inserted row. Then the program commits. You might expect to find, following the completion of this unit of work, the "before" image of the row that was changed by the program's UPDATE statement; but, that row is nowhere to be found in T1_HIST (or whatever you decided to name T1's history table). Why is that so? It's so because the update of the row occurred in the same unit of work as the insert of the row. What can you do about that, if you want to see, in T1_HIST, the "before" image of the row changed by the aforementioned UPDATE? Well, you could break the single unit of work into two units of work, with the row-insert in one and the UPDATE of the row in the other. That might not be feasible, and it could be a big hassle even if feasible. Here's what likely would be a more attractive option: have two temporal logical transactions in the one unit of work.

Two what?

Temporal logical transactions, or TLTs, are a new capability introduced with Db2 12 for z/OS (at function level V12R1M500). How would you put this functionality to work in a program? It's pretty easy: first, you have the program issue the following SQL statement:

SET TEMPORAL_LOGICAL_TRANSACTIONS = 1

That statement, which references a new (with Db2 12) special register, tells Db2 to allow (for your session) multiple TLTs in one unit if work.

Next, the program would issue a SQL statement with this form (and note that timestamp-value1 could be CURRENT TIMESTAMP):

SET TEMPORAL_LOGICAL_TRANSACTION_TIME = timestamp-value1

Then the program would perform the row INSERT into T1 as it had before. In that row, the "start" timestamp value (the value interpreted by Db2 as the time at which the row became current) would come from the value supplied via the TEMPORAL_LOGICAL_TRANSACTION_TIME special register, which was previously set by the program.

After performing the INSERT operation, the program would again issue the statement below (and again, timestamp-value2 could be CURRENT TIMESTAMP - it would just need to be a little "ahead" of the timestamp-value1 previously used, and as the temporal timestamp value goes to the picosecond level, CURRENT TIMESTAMP, used again, would take care of that "little bit ahead" requirement):

SET TEMPORAL_LOGICAL_TRANSACTION_TIME = timestamp-value2

The program would then update the just-inserted row, as it had before. In a similar way as for the INSERT, the "start" timestamp value for the updated row in the base table would be timestamp-value2, based on the value to which the TEMPORAL_LOGICAL_TRANSACTION_TIME special register was set just prior to the UPDATE.

Then the program would issue a commit, as it had before, and now what would one see in T1_HIST? One would see the "before" image of the row updated within the just-completed unit of work, even though the UPDATE targeted a row that had been inserted into T1 in the same unit of work. That's temporal logical transaction functionality in action: you have inserted multiple TLTs into one physical transaction, giving you the row-change-history capability you want even for a row acted on more than once in a unit of work, without having to change the commit scope of the unit of work.

AND, that's not all - TLT functionality can also work in the other direction, allowing you to incorporate several physical transactions into one temporal logical transaction. How is that done? It's not hard: just have your program issue the statement below (keeping in mind that, as previously noted, timestamp-value can be CURRENT TIMESTAMP):

SET TEMPORAL_LOGICAL_TRANSACTION_TIME = timestamp-value

Following issuance of that statement, the program could update a row in system-time-enabled base table T1, then commit, then update the same row, then commit again. What you'd see in T1 and T1_HIST in this case (sticking with the same base and history table names used above) would be what you'd see if the updates had been made in one physical transaction versus the actual pair of physical transactions: one "before" row image captured in T1_HIST, showing the target row as it appeared prior to the first update performed by the program, and the row in T1 as it appeared after the second update performed by the program. Why no second row in T1_HIST, reflecting the change made by the second UPDATE operation? Because you told Db2 (through specifying a value for the TEMPORAL_LOGICAL_TRANSACTION_TIME special register) that you wanted the two physical transactions treated as one from a system-time temporal perspective.

And there's your overview of temporal logical transaction functionality. Having that Db2 12 capability on-hand in your shop could well open up new use cases for system-time temporal data support. Give it some thought.