Thursday, January 19, 2012

A Note on the Data Capacity of a DB2 for z/OS LOB Column

Last week, a DBA asked me a few questions about LOBs (large objects) in a DB2 for z/OS context. Most of you probably know that a LOB is a DB2 data type. There are actually two LOB data types: BLOB (binary large object -- a string of bits) and CLOB (character large object -- a string of characters). One key difference between LOB and non-LOB data types is capacity with respect to the size of individual data values: a table column defined with the VARBINARY or VARCHAR attribute (these being the highest-capacity non-LOB data types) can accommodate individual values with a length of up to 32,704 bytes, whereas one can store up to 2 gigabytes of data in one LOB column of one row in a table; thus, a BLOB column might hold movies or high-resolution photographs or digital audio recordings, and a CLOB column might hold books or lengthy legal documents. Another distinguishing characteristic of LOB columns is their physical separation, storage-wise, from non-LOB columns and from other LOB columns (if any) that are part of the same table: a table's non-LOB data values will be stored in a "base" table space, while each LOB column will have its own LOB table space (and there will be one LOB table space per LOB column and per partition of the base table space, so that a table with two LOB columns in a base table space with four partitions will have eight associated LOB table spaces). Of course, application programs don't "see" table spaces -- they see tables, and to a DB2-accessing program it appears that LOB values are side-by-side with non-LOB values in a table's rows (in a DB2 10 for z/OS system running in new-function mode, a table's LOB values -- or a portion of each LOB value -- can in fact be physically adjacent to non-LOB values in the base table space thanks to a new feature called LOB in-lining, but that's a subject for another blog entry).

Anyway, the aforementioned DBA was looking for answers to these questions:
  1. How much data can be stored in one LOB column?
  2. What is the relationship between a base table space and a LOB table space?
  3. How can one control the amount of disk space that will be occupied by data stored in a LOB column?
I provided the DBA with the information he was seeking, and, in the interest of further disseminating that information, I'm providing it in this blog post.


Maximum amount of data that can be stored in a single LOB column

The short answer: 66,584,576 gigabytes. That's slightly less than 64 petabytes, which would be 67,108,864 GB. It's a little less than 64 petabytes because a LOB table space can have a maximum of 254 data sets, and you'd need a 256-data set LOB table space to get to the precise 64 petabyte figure.

Here's how this maximum amount of data could be stored in a single LOB column:
  • A base table space could have up to 4096 partitions.
  • As previously mentioned, there will be a LOB table space for each partition of an associated base table space (actually, there is one LOB table space per partition per LOB column, but I'm assuming here that the table in question has a single LOB column); so, there would be 4096 LOB table spaces associated with a base table space with 4096 partitions.
  • One LOB table space can be comprised of up to 254 data sets (as noted above), each of which can reach a size of up to 64 GB.

THEREFORE, the maximum amount of data that can be stored in one LOB column is:

(4096 base table space partitions)
X (1 LOB table space per partition for the column)
X (254 data sets per LOB table space)
X (64 GB per LOB table space data set)
= 66,584,576 GB


Relationship between a base table space and a LOB table space

First, it's important to keep in mind that a LOB table space is a unique DB2 for z/OS database object. I like the way that Jay Yothers, a longtime member of IBM's DB2 for z/OS development organization, put it (the acronyms PBG and PBR refer, respectively, to partition-by-growth and partition-by-range universal table spaces): "LOB table spaces are not PBG, PBR, classic partitioned, segmented, or simple. LOB table spaces are LOB table spaces."

Second, remember that the base table space / LOB table space relationship is all about the partitions in the base table space. So, if the base table space is non-partitioned or is partition-by-growth with MAXPARTITIONS 1, there will be one LOB table space associated with the base table space (assuming, as before, that the table in the base table space has one LOB column).

For a range-partitioned base table space, the number of associated LOB table spaces (per LOB column in the base table) will be equal to the NUMPARTS value for the base table space.

For a partition-by-growth base table space, if MAXPARTITIONS is greater than 1 and a new partition is automatically added to the table space by DB2, a new LOB table space for the partition (per LOB column in the base table) will be automatically created by DB2.


Controlling the space occupied by data in a LOB column

I pointed out that one LOB column can hold up to 66,584,576 GB of data. What if you don't want that much data to be stored in a LOB column? No problem -- you have several means of controlling the amount of space that will used to store LOB data in a table. Among these are the following:
  • Limit the number of base table space partitions. The fewer the number of base table space partitions, the fewer the number of associated LOB table spaces.
  • Limit the size of the data sets comprising the LOB table space(s). The DSSIZE specification for a LOB table space controls the size of the data sets that comprise the LOB table space. DSSIZE can be nG, where G represents gigabytes and n is 1 or a multiple of 2 between 2 and 64, inclusive.
  • Limit the length of a value that can be placed in a LOB column. When a base table is created, a LOB column can be defined as (for example) BLOB(100K). In that case, no value larger than 100 KB could be inserted into the LOB column.

Another LOB space consumption control capability is LOB inlining, introduced (as I pointed out in the opening paragraph of this entry) with DB2 10 for z/OS. The significance of that and other LOB-related features of DB2 10 is such that they deserve coverage in a separate blog entry. I'll try to get that entry posted within the next couple of weeks.

One more thing: you can find a lot of very good information about LOBs in a DB2 9 environment in an IBM "red book" titled "LOBs with DB2 for z/OS: Stronger and Faster." It's available on IBM's Web site at http://www.redbooks.ibm.com/abstracts/sg247270.html?Open.

I hope that the information in this blog entry will be useful to you. Check back in a week or two for an overview of LOB-related enhancements delivered in DB2 10 for z/OS.

No comments:

Post a Comment