Thursday, January 30, 2020

Db2 for z/OS and RACF, Part 2: Db2-Internal Security

In the Part 1 post of this two-part blog entry, I described how RACF can be used to provide "external" security for a Db2 for z/OS system - external security being concerned with who (i.e., which IDs) can connect to the Db2 system, and how (e.g., in the form of a batch program and/or through CICS and/or as a DRDA requester and/or as a REST client, etc.). That is a very common use of RACF in a Db2 security sense. RACF can optionally be used to also manage Db2-internal security, which is all about what one can do (i.e., what a process with a given ID can do) when one has successfully connected to a Db2 system. This Part 2 blog entry is focused on that second topic: using RACF to manage Db2-internal security. [Note: as in the Part 1 entry, I will be referring explicitly to IBM's RACF - if you use another vendor's z/OS security management product, you can check with the vendor to see how what I have written about RACF pertains to the alternate product.]

For quite a few years, the only way to manage Db2-internal security was through Db2 itself. This was done using the SQL statements GRANT and REVOKE. For example, if an application connecting to Db2 using ID XYZ needed to be able to read data from table T1 using dynamic SQL statements, a DBA would execute a statement like this one:

GRANT SELECT ON T1 TO XYZ;

Several Db2 versions ago, a new option for managing Db2-internal security was introduced: you could do that with RACF, via a Db2-provided exit routine called DSNXRXAC. Before getting further into that, I'll bring up the obvious question: why would an organization want to use RACF to manage Db2-internal security, given that the job could be effectively done with Db2 GRANTs and REVOKEs? I have seen that the RACF route is usually taken because a decision has been made that one team will be responsible for both external and internal Db2 for z/OS security. In such a case, the one team will likely choose the RACF-based approach because RACF can be used for external and internal Db2 security (thanks to the aforementioned Db2-supplied exit), while Db2's built-in functionality is aimed at managing only Db2-internal security.

So, if RACF is going to be used to manage Db2-internal security, you implement the DSNXRXAC routine and that's it, right? Nope. There's more to be done. Why? Because when a Db2 thing requiring some privilege or authority is to be done, Db2 drives the RACF exit and essentially asks RACF whether ID X can do action Y. To answer that question, RACF needs to see that ID X has been RACF-permitted to perform action Y. If RACF has not been set up to manage authorization for performing Db2 action Y, RACF will defer to Db2 ("This doesn't appear to be in my purview, Db2 - you make the call"), and Db2 will check to see if the required privilege or authority has been provided for ID X through a SQL GRANT statement.

How does RACF get set up to manage Db2 privileges and authorities? That involves activating various RACF classes and defining profiles in those classes. Consider the privilege of reading data from a table using a dynamic SQL statement. Permission to do that, from a RACF perspective, requires that a RACF profile, specifying the SELECT privilege for the table in question, be defined in the RACF class MDSNTB (and that RACF class has to be active). Additionally, the ID needing the privilege has to be RACF-permitted to access the profile associated with the privilege. Another example: if a user needs to be able to execute a certain Db2 package then on the RACF side a profile is defined for the execute privilege on the package in question (or perhaps for that package and all others in the associated package collection), and that profile is defined in the RACF class MDSNPK, and that RACF class has to be active. On top of that, the user's ID (or the ID of an application process that needs to be able to execute the package) is RACF-permitted to access the aforementioned profile covering the EXECUTE privilege on the package (or on all packages in the collection, if that is how the RACF profile was defined).

Got it? In RACF, profiles are defined - in the appropriate RACF classes - that cover specific privileges that can be provided for specific resources (e.g., SELECT on table T1, EXECUTE on package PKG1, etc.), and IDs are RACF-permitted to access the various profiles as needed. Information on all of this - the RACF classes that cover various Db2 privileges and authorities, defining profiles, permitting access to profiles, and more - can be found in a Db2 for z/OS manual titled RACF Access Control Module Guide. The information in that manual is also available online, in the Db2 for z/OS Knowledge Center on IBM's Web site, at https://www.ibm.com/support/knowledgecenter/SSEPEK_12.0.0/racf/src/tpc/db2z_racfoverview.html.

I want to wrap up this entry with information about a couple of important enhancements pertaining to RACF management of Db2-internal security, both of which were delivered with Db2 11 for z/OS. The first of these enhancements has to do with the way that RACF would do authorization checking for a package auto-rebind situation. Consider this scenario: package PKG1 is dependent on table T1. A DBA alters T1 in a way that invalidates PKG1. A request to execute PKG1 comes along, and because PKG1 has been marked invalid by Db2, an attempt is made to auto-rebind the package. Historically, in a situation in which RACF is managing Db2-internal security, the Db2-RACF authorization exit is driven, and RACF checks to see if the ID of the process that wants to execute PKG1 has the privileges needed to successfully rebind the package. That was a problem, because the ID that wants to execute the invalidated PKG1 (thereby triggering the auto-rebind) almost certainly does NOT have the privileges needed to successfully rebind the package. More than likely, those privileges are instead held by the ID of the OWNER of the package. So, ID APPL1, which wants to execute the invalidated PKG1, is checked for privileges needed for a successful rebind of PKG1, and because those privileges are not held by ID APPL1 the auto-rebind fails, and PKG1 ends up getting marked "inoperative" by Db2. Bummer. To address that situation, Db2 11 for z/OS introduced a new ZPARM parameter called AUTHEXIT_CHECK. If that parameter is set to its default value of PRIMARY, and an auto-rebind operation occurs and RACF is managing Db2-internal security, the primary authorization ID of the process that triggered the auto-rebind (by attempting to execute an invalidated package) will be checked by RACF for privileges needed to successfully complete the rebind of the package in question. If AUTHEXIT_CHECK is set to DB2, in the auto-rebind scenario RACF will check to see if the ID of the package OWNER has the privileges required for a successful rebind of the package. Problem solved.

The second of these Db2 11-delivered enhancements pertaining to RACF management of Db2-internal security has to do with in-memory caches of authorization information that Db2 maintains to speed up authorization checks. Consider this scenario: ID SMITH has the EXECUTE privilege for package PKG1, and Db2 has that information stored in the package authorization cache for quick look-aside reference (the size of this cache is specified via the CACHEPAC parameter in ZPARM). RACF is being used to manage Db2-internal security, and on the RACF side the ID SMITH is de-permitted access to the profile that relates to the EXECUTE privilege on PKG1. The RACF change is made active via a SETROPTS RACLIST REFRESH command, and then ID SMITH tries to execute PKG1 again, and SMITH is successful in doing that. How can that be, given that SMITH was de-permitted access to the profile relating to the EXECUTE privilege on PKG1? SMITH can still execute package PKG1 because the information in Db2's in-memory package authorization cache has not been changed to reflect the change in package execution authority made on the RACF side. Bummer. To address that situation, the ZPARM parameter AUTHEXIT_CACHEREFRESH was added with Db2 11 for z/OS. When the value of that ZPARM is set to ALL (versus the default value of NONE), authorization information cached in memory by Db2 (in the package cache or the routine cache or the dynamic statement cache) will be updated to reflect RACF-side changes after those changes have been RACF-activated via a SETROPTS RACLIST REFRESH command. Problem solved.

Those two Db2 11-introduced ZPARMs eliminated just about the last two irritating snags associated with RACF management of Db2-internal security. A growing number of organizations are managing both external and internal Db2 security through RACF. It's a totally viable approach, so don't fret if you learn that your organization wants to go this route. Sure, there's some set-up work that has to be done, and RACF people will need to pick up some Db2 terminology and vice versa, but your Db2 security situation will continue to be robust and manageable.

11 comments:

  1. Thanks Robert.

    We went to RACF security some time ago.

    Pros: as our databases and tables are mostly defined with a well known naming standard, we can use generic RACF profiles. We went from thousands of Grant to a bunch of profiles. And they are ready before object creation.

    Contras: Db2 continues to create implicit auths. We would like a zparm to deactivate internal Db2 security at all.

    Extra: we need a role similar to DATAACCESS but for readonly access. A update to the exit was the solution.

    ReplyDelete
    Replies
    1. That ZPARM idea could be a good enhancement request. You could enter that request (or any other Db2 for z/OS-related enhancement request) at this URL: https://ibm-data-and-ai.ideas.aha.io

      Robert

      Delete
  2. Hi Robert
    Thank you so much for your post!!
    We are seeing get pages against SYSIBM auth tables such as SYSUSERAUTH,SYSRESAUTH,SYSPACKAUTH and SYSPACKAGE. The packages are BIND with VALIDATE(BIND). I would like to understand why auth tables are accessed before executing the SQL queries. Thank you so much for your help in Advance !!
    Below is the ZPARM setting
    Authorization option to enable/disable DB2 auth. YES AUTH
    Default authorization cache size. 4096 AUTHCACH
    Authorization exit cache refresh. NONE AUTHEXIT_CACHERE>
    Authorization exit check. PRIMARY AUTHEXIT_CHECK
    Cache store limit. 0 CACHE_DEP_TRACK_>
    Dynamic SQL statement cache support option. YES CACHEDYN
    Free local cached statements to relieve storage. 0 CACHEDYN_FREELOC>
    Max storage for package authorization cache. 10485760 CACHEPAC
    Max storage for routine authorization cache. 10485760 CACHERAC

    ReplyDelete
    Replies
    1. First, of course, SYSPACKAGE is not an authorization table.

      As for the other tables, you would expect SYSRESAUTH to be accessed as Db2 checks privileges related to the use of things such as collections, buffer pools and table spaces. Similarly, you would expect accesses to SYSUSERAUTH as Db2 checks system privileges of users, such as SYSADM, SYSTEM DBADM and BINDAGENT, as well as authorization to do such things as start and stop traces, issue -DISPLAY commands, and issue BIND PACKAGE commands with the ADD option.

      With regard to SYSPACKAUTH, you would expect Db2 to check for the EXECUTE authority on packages when there are requests to execute the packages. A lot of those package authorization checks should be satisfied via the in-memory package authorization cache, but some checks will still likely require access to SYSPACKAUTH. Sometimes, an overly high level of package authorization checks are the result of PKLIST specifications for "local to Db2" plans (i.e., plans other than DISTSERV, which is not a plan in the traditional sense) that have an asterisk (i.e., *) for the location of listed collections - removing those asterisks can resolve that issue (absent an *, location defaults to the local Db2 system). If in fact packages at a remote server need to be executed, explicitly identify that remote location for the relevant collection in the plan's PKLIST.

      Sometimes, elevated levels of package authorization checks can be related to stored procedure calls, when the OWNER of the stored procedure has not been explicitly granted EXECUTE on the stored procedure's package (this is not an issue when the stored procedure OWNER has an authority, such as SYSADM, that already has the ability to execute packages).

      Robert

      Delete
    2. Hi Robert,
      Thank you so much for your reply!. Below report was taken from CA detector. A99S0187 package is a COBOL program and it issues SELECT to a remote ORACLE table. This A99S0187 package was BIND with VALIDATE(BIND).
      I would like to understand why DB2 is doing GETPAGEs against SYSIBM Authorization tables though this package was BIND with VALIDATE(BIND) parameter.

      20.0 > ------------- DETECTOR Plan SQL Display ------------ 12/11/20 05:10
      Command ==> Scroll ==> CSR
      LINE 1 OF 2
      DB2 SSID ==> DSN Planname ==> A99P910U Program ==> A99S0187
      Type ==> RMPK Collid ==> ORAPROD
      Version ==>
      Total/Avg => T

      Interval Date => 11/30/20 Interval Time => 09:00:00 Elapsed Time => 03:00
      -------------------------------------------------------------------------------

      Q -View SQL text, T -View Tables/Indexes, E -Explain, D -View Detail

      SQL_CALL SQL CPUPCT INDB2_TIME INDB2_CPU GETPAGE GETPFAIL
      -------- ---------- ------- ------------ ------------ ---------- ----------
      _ SELECT 189549 50.19% 02:26:00.973 00:32.886688 4928274 0

      Below screen shows the GETPAGEs for the above the qurery

      20.0 > -------- Detector SQL Table Activity Display -------- 12/11/20 05:10
      COMMAND ===> Scroll ==> CSR
      LINE 1 OF 5
      Total/Avg => T DB2 SSID ==> DSN
      Program ==> A99S0187 Type ==> RMPK Statement ==> 0001543
      Ctoken ==> 1B188EE818425BD0 Section ==> 00001
      Collid ==> ORAPROD SQL Call Type ==> SELECT
      Version ==>

      Interval Date => 11/30/20 Time => 09:00:00 Elaps Time => 03:00 Samp => 100%
      -------------------------------------------------------------------------------

      TABLE INDEX TB_SEQ_GP TB_IDX_GP TB_LNK_GP IS_GETP
      ------------------ -------- ---------- ---------- ---------- ----------
      SYSUSERAUTH DSNAUH01 947745
      SYSUSERAUTH 0 947745 0
      SYSRESAUTH DSNAGH01 1137294
      SYSPACKAUTH DSNKAX02 1705941
      SYSPACKAGE DSNKKX02 947745

      Delete
    3. Below screen shows the SQL issued from LOCAL system

      20.0 > ------------- DETECTOR Plan SQL Display ------------ 12/11/20 05:11
      Command ==> Scroll ==> CSR
      LINE 4 OF 4
      DB2 SSID ==> DSN Planname ==> A99P910U Program ==> A99S0068
      Type ==> PACKAGE Collid ==> PROD
      Version ==> 2020-11-03-17.12.44.217010
      Total/Avg => T

      Interval Date => 11/30/20 Interval Time => 09:00:00 Elapsed Time => 03:00
      -------------------------------------------------------------------------------

      Q -View SQL text, T -View Tables/Indexes, E -Explain, D -View Detail

      SQL_CALL SQL CPUPCT INDB2_TIME INDB2_CPU GETPAGE GETPFAIL
      -------- ---------- ------- ------------ ------------ ---------- ----------
      _ SELECT 241723 10.13% 00:05.138987 00:03.691959 559854 0
      ******************************* BOTTOM OF DATA ********************************

      Below screen shows the GETPAGEs for the above query.


      20.0 > -------- Detector SQL Table Activity Display -------- 12/11/20 05:11
      COMMAND ===> Scroll ==> CSR
      LINE 1 OF 7
      Total/Avg => T DB2 SSID ==> DSN
      Program ==> A99S0068 Type ==> PKGE Statement ==> 0005760
      Ctoken ==> 1B188E82167AAC88 Section ==> 00003
      Collid ==> PROD SQL Call Type ==> SELECT
      Version ==> 2020-11-03-17.12.44.217010

      Interval Date => 11/30/20 Time => 09:00:00 Elaps Time => 03:00 Samp => 100%
      -------------------------------------------------------------------------------

      TABLE INDEX TB_SEQ_GP TB_IDX_GP TB_LNK_GP IS_GETP
      ------------------ -------- ---------- ---------- ---------- ----------
      SYSUSERAUTH DSNAUH01 1265
      SYSUSERAUTH 0 1265 0
      SYSRESAUTH DSNAGH01 1518
      SYSPACKAUTH DSNKAX02 2277
      SYSPACKAGE DSNKKX02 1265
      FULFILMNT_MNT RBODFFK0 241532
      FULFILMNT_MNT 0 311462 0
      ******************************* BOTTOM OF DATA ********************************

      Delete
    4. VALIDATE(BIND), from an authorization standpoint, relates only to privileges required to create the package (i.e., to successfully bind the package). When there is a request to execute the package, that will drive authorization checks related to package execution, versus package creation.

      Robert

      Delete
    5. Robert,
      Thank you so much for your reply!!
      A99S0187 is a COBOL package that issues a SELECT SQL against a remote ORCLE table about 189,549 times and it was bind statically BIND to ORACLE and BIND Plan has PKLIST ( *.ORAPROD.*). For this SELECT SQL, I see the below statistics from CA detector.

      947,745 Getpage against DSNAUH01 index of SYSUSERAUTH
      947,745 Getpage against tablespace of SYSUSERAUTH
      1,137,294 Getpage against DSNAGH01 index of SYSRESAUTH
      1,705,941 Getpage against DSNKAX02 index of SYSPACKAUTH
      947,745 Getpage against DSNKKX02 index of SYSPACKAGE

      Do you think the above GETPAGEs are normal for a remote package ?.

      Delete
    6. Hard to say. At least some of the Db2 catalog accesses could be related to whatever data federation software is used to makes the remote Oracle database server appear to the Db2 for z/OS system as a DRDA server. It could also relate to the entries in the Db2 for z/OS system's communications database that pertain to the remote Oracle database server - for example, it may be that outbound ID translation is occurring.

      Accesses to Db2 catalog authorization tables can often be reduced by removing asterisks in location name positions in a plan's PKLIST. You have *ORAPROD.*. If there is an ORAPROD collection at the local Db2 system, remove the asterisk before ORAPROD so that the location name will default to the local system. If there is also an ORAPROD collection at the remote server, list ORAPROD again in the PKLIST, prefixed by the location name for the remote Oracle server.

      Additionally, the CACHEPAC and CACHERAC parameters in ZPARM should both have a value of 10M (the fix for APAR PH28280 removes both of these parameters from ZPARM and sets them both internally to 10M).

      Robert

      Delete
    7. Robert,
      I tested a job by replacing the asterisk in the PKLIST with actual location name and I could see that GETPAGES against SYSIBM auth tables disappeared.Thank you so much for your help!!

      Delete
    8. You are welcome. Good to know that the issue has been resolved.

      Robert

      Delete