Robert's Db2 blog: DB2 for z/OS: Something You Might Not Know About Character String Literal Values

Wednesday, October 9, 2013

DB2 for z/OS: Something You Might Not Know About Character String Literal Values

Here's something I imagine a lot of you (or programs written by you) do all the time: issue DB2 for z/OS-accessing SQL statements that contain character string literal values. Below is a query that I issued from Data Studio earlier today, targeting a table that I'd created on one of our test DB2 subsystems:

SELECT PRODUCT_ID, UNIT_SALES
FROM PRODUCT_SALES
WHERE PRODUCT_ID = 'A12345';

Now, a question for you: is that character string literal value, 'A12345', fixed-length or varying-length? Considering this from your point of view, you might answer, "Fixed length." After all, it's six characters, right? No more, no less. And if you look at it again a few minutes from now, it'll still be six characters. Guess what? From DB2's point of view, the value 'A12345' in the above statement is a varying-length character string. I'm not revealing any state secrets here -- check the DB2 10 for z/OS SQL Reference and you'll see, in the first sentence of the section on character string constants, the following:

"A character string constant specifies a varying-length character string."

Of course, even a seasoned mainframe DB2 professional is not likely to have committed the entire SQL Reference to memory, and in fact plenty of people are unaware of this little nugget of information. That's generally not a big deal, because it almost never matters. Why does it hardly ever matter? Because fixed-length and varying-length character strings are compatible for purposes of assignment and comparison in a DB2 for z/OS system. Take the previously referenced SELECT statement. The 'A12345' character string constant is, as noted, treated by DB2 as a varying-length value. The PRODUCT_ID column of my PRODUCT_SALES table is defined as CHAR(6) -- a fixed-length character column. Does that matter? No -- the query runs just fine. What if you have a stored procedure with an input parameter defined as CHAR(10), and you call the stored procedure with a 10-character (or shorter) literal value provided for that input parameter? Does that matter? No -- the CALL will work just fine. Same goes for a user-defined function invocation.

Alright, so I said that the fact that DB2 treats a character string constant as a varying-length value almost never matters. "Almost never" is not the same thing as "never." So, when does it matter? I'm aware of one case, which was recently brought to my attention by a DB2 for z/OS DBA who I've known for years. This individual was trying to create a column mask in a DB2 10 for z/OS new-function mode environment. His CREATE MASK statement looked something like this (I'm using table and column names other than the real ones):

CREATE MASK ABC.COL2_MASK
ON ABC.MY_TABLE
FOR COLUMN COL2 RETURN
   CASE
    WHEN (1 = 1)
     THEN '999999'
    ELSE '000000'
   END
   ENABLE;

When he issued that SQL statement, it failed with a -20474 SQL code, reason code 33, indicating that "The data type of the return expression is not the same as the data type of the column on which the column mask is defined." This didn't make sense to my DBA friend, because COL2 was defined as CHAR(6), and he'd provided 6-character literal values in the RETURN clause of the CREATE MASK statement. In this situation, the problem was related to the fact that the character string literals in the SQL statement were treated as varying-length values by DB2. Why did that matter? Because, per the DB2 10 SQL Reference, one of the rules of column mask creation is as follows:

"The result data type, null attribute, data length, subtype, encoding scheme, and CCSID of the CASE expression must be identical to those attributes of the column that is specified by column-name."

Fixed-length character strings and varying-length character strings are two different DB2 data types. Thus, the -20474 error, with reason code 33. When the DBA changed the statement as shown below, it worked:

CREATE MASK ABC.COL2_MASK
ON ABC.MY_TABLE
FOR COLUMN COL2 RETURN
   CASE
    WHEN (1 = 1)
     THEN CAST('999999' AS CHAR(6))
    ELSE CAST('000000' AS CHAR(6))
   END
   ENABLE;

The statement also worked when coded as follows, because the CHAR scalar function returns a fixed-length character string representation of the argument (this from the DB2 10 SQL Reference, with emphasis added by me):

CREATE MASK ABC.COL2_MASK
ON ABC.MY_TABLE
FOR COLUMN COL2 RETURN
   CASE
    WHEN (1 = 1)
     THEN CHAR('999999')
    ELSE CHAR('000000')
   END
   ENABLE;

Recap time. 1) Character string literals in SQL statements are treated by DB2 for z/OS as varying-length values. 2) That almost never matters. 3) It matters when you want to create a mask for a fixed-length character column.

12 comments:

AnonymousOctober 23, 2013 at 2:16 AM
Hi Robert,
I have a small doubt.
I could not see any Indexspaces entries in the catalog table SYSCOPY.
When I queried the syscopy with OTYPE=I , I am able to find the directory Indexspace entries only, Please help me on this.
Thanks
ReplyDelete
Replies
AnonymousJuly 6, 2014 at 4:14 PM
Recently I faced one issue in char string comparison as below :
SELECT COUNT(*)
FROM U.UPARM
WHERE C_APPL = 'PS'
AND X_KEY > '201406260924396848

'
AND X_KEY < '20140626090013747561'
AND X_TBL NOT LIKE '%TRANSFER%'
FOR READ ONLY
WITH UR;
Count o/p is 13856.

And DDL of table above table is :
CREATE TABLE U.UPARM
(C_APPL CHAR(3) NOT NULL WITH DEFAULT
,X_KEY CHAR(22) NOT NULL WITH DEFAULT
,X_TBL CHAR(200) NOT NULL WITH DEFAULT
) IN U030.UPARM
;

here X_KEY is of CHAR(22) length.

But when I changed the above query to :
1) With X_KEY compared to string less than 22 char :
SELECT COUNT(*)
FROM U.UPARM
WHERE C_APPL = 'PS'
AND X_KEY > '201406260924396848'

AND X_KEY < '20140626090013747561'
AND X_TBL NOT LIKE '%TRANSFER%'
FOR READ ONLY
WITH UR;
Count o/p is 0.
2) With X_KEY comparison to string of 22 characters :
SELECT COUNT(*)
FROM U.UPARM
WHERE C_APPL = 'PS'
AND X_KEY > '201406260924396848 '

AND X_KEY < '20140626090013747561'
AND X_TBL NOT LIKE '%TRANSFER%'
FOR READ ONLY
WITH UR;
Count o/p is 0.

3) With X_KEY comparison to string greater than 22 characters but less than 254 characters :
SELECT COUNT(*)
FROM U.UPARM
WHERE C_APPL = 'PS'
AND X_KEY > '201406260924396848 '

AND X_KEY < '20140626090013747561'
AND X_TBL NOT LIKE '%TRANSFER%'
FOR READ ONLY
WITH UR;
Count o/p is 0.

My concern is that for all the condition where X_KEY input is less than 22 chars and greater than 22 but less than 254 chars; o/p of query is zero.

When I checked QBLOCK_TYPE field in PLAN_TABLE for above queries, queries with X_KEY input less than equal to 22, it is showing "PRUNED"; but when X_KEY i/p gets greater than 22 chars QBLOCK_TYPE will show as "SELECT".

Can you help in understanding the o/p of above queries and QBLOCK_TYPE - "PRUNED" or "SELECT" values while analyzing the EXPLAIN of queries.

Thanks in advance as always.
ReplyDelete
Replies
AnonymousJuly 10, 2014 at 7:44 AM
Robert,

Let me put above query in a more precise way.

I got one sql query as mentioned below :
Query:

SELECT COUNT(*)
FROM U.UPARM
WHERE C_APPL = 'PS'
AND X_KEY > '201406260924396848

'
AND X_KEY < '20140626090013747561'
AND X_TBL NOT LIKE '%TRANSFER%'
FOR READ ONLY
WITH UR

DDL Definition :
CREATE TABLE U.UPARM_TBL
(C_APPL CHAR(3) NOT NULL DEFAULT
,X_KEY CHAR(22) NOT NULL DEFAULT
,X_TBL CHAR(200) NOT NULL DEFAULT
)
IN U030.UPARM
;
Here X_KEY is CHAR(22).

When I tried executing above query in SPUFI with first occurence of X_KEY >

'compared with char string more than 254', guess what it executed fine. It results

into some rows. I expected it to give error with data size mismatch.

Then I check removing the first occurrence of X_KEY from predicate, modified query

looks like as below :

SELECT COUNT(*)
FROM U.UPARM
WHERE C_APPL = 'PS'
AND X_KEY < '20140626090013747561'
AND X_TBL NOT LIKE '%TRANSFER%'
FOR READ ONLY
WITH UR ;

It results into same number of rows and same data.

Does it means that DB2v10 is skipping first condition when more than 254

characters are passed for comparison.

Here I want to know how db2v 10 acts when :

1) predicate CHAR column is compared with char string greater than size defined in the DDL.

2) Predicate CHAR column is compared with char string greater than CHAR data type limit(254 char).

Appreciate your efforts in making us to understand complex db2 concept in a easy way.

Thanks in advance.
ReplyDelete
Replies
AnonymousSeptember 24, 2014 at 12:23 AM
Hi,
I have a question as below.We have a table with colum X defined as CHAR(1) in DDL.
and have values 0,1,2,3,4,X in that column.
when I ran the query "select count(1) from tab where x>'0';
I am getting 4 as count i.e 1,2,3,4 but not X
can some explain why is this?

Note: when we have the same situation on netezza, it is giving 5 as count.i.e 1,2,3,4,X
ReplyDelete
Replies

Add comment

Note: Only a member of this blog may post a comment.