Tuesday, October 25, 2011

IOD Dispatch (1) - Of IDAA and DB2 DRDA

Greetings from Las Vegas, Nevada, site of the 2011 IBM Information on Demand conference (aka IOD). This is the first of several entries I'll post this week to my blog, with highlights from some of the DB2 sessions I've attended. Today I want to share with you good stuff that I picked up from two excellent presentations: one, delivered yesterday to a group of IBMers, on the new IBM DB2 Analytics Accelerator, and one about DRDA best practices in a DB2 for z/OS environment.

The IBM DB2 Analytics Accelerator: wow-level performance for complex, data-intensive business intelligence queries. Boy, did this session ever grab my attention. Big Blue recently announced a new product, the IBM DB2 Analytics Accelerator (IDAA), that is already delivering, at beta sites, eye-popping response times for some really hairy data warehouse queries. The IDAA is, in essence, a deep integration of DB2 for z/OS and IBM's Netezza data analytics server. Here's how it works: a user or application directs a report-generating or decision support query to a DB2 for z/OS system. DB2 then does what it always does -- it optimizes the query; BUT -- and this is the really cool part -- the optimizer has, in the form of the IDAA, a new access path that it can select (if the IDAA is added to an existing DB2 system, the extended optimizer functionality is added by way of a DB2 PTF). Here's the deal: some or all of the tables in the DB2-managed data warehouse are copied over to the IDAA, and if a query targets one (or more) of those tables, the DB2 optimizer can analyze the query and send it to the IDAA for processing if it determines that this would be the best-performing option for result set generation; otherwise, DB2 will execute the query locally (the IDAA is attached to the DB2 for z/OS system via a 10 gigabit Ethernet connection).

Here's why I like this solution a lot:
  • The performance results can be pretty astounding (more on this in a moment).
  • In an environment in which a DB2 for z/OS database has been serving business intelligence query requests, the IDAA is invisible to end users and applications -- they continue to direct queries to the DB2 system. DB2 makes the query routing decision, so users and application programmers don't have to.
  • The database is updated, backed up, and secured on the DB2 end of things (data on the IDAA can ONLY be accessed via the attached DB2 system). Those are things that seasoned DB2 for z/OS DBAs do very well, and full advantage is taken of the industry-leading security and availability benefits delivered by the mainframe DB2 platform.
  • It's a best-of-both-worlds solution for modern data warehouse applications that are characterized by high-volume OLTP-like queries (served efficiently by the DB2 system) AND complex, data-intensive SQL requests (processed very effectively by the IDAA).
  • The time from "un-boxing" of the IDAA to ready-for-use is very short. No database schema changes are required (the IDAA can deliver outstanding complex query performance for traditional relational database designs and for star schemas), and IDAA set-up is accomplished via an intuitive graphical interface.
To that last point in the list above: the IDAA administration interface is basically an extension of the IBM Data Studio tool (which is FREE and DOWNLOADABLE, folks). If Data Studio is not already installed on a PC that will be used for IDAA administration, the extended edition with IDAA support can be installed. If Data Studio is already there, a plug-in will provide the IDAA administration functionality. DBA-requested IDAA administration tasks are performed via DB2 stored procedures that are supplied with the solution, including one that copies data to the IDAA (and subsequently updates that data) by invoking the DB2 UNLOAD and LOAD utilities (both the REPLACE and RESUME options of LOAD are supported).

And about performance: I've mentioned several times that the IDAA can deliver huge improvements in response time for queries that are highly complex and/or which require very large-scale data scans when executed. Words are fine, you may be thinking. How about some numbers? Well, we heard from an IT manager with an organization (a big insurance company) that was a beta test site for the IDAA. To quote this gentleman's understated assessment, test results "were impressive." How impressive? How about 4566 seconds for execution of a certain query before IDAA implementation, and 3.7 seconds after. How about 9558 seconds for another query before, and 5 seconds after. We're talking orders of magnitude improvement in elapsed time. More from the beta testing organization:
  • The data warehouse used for IDAA testing is big, with one table having more than a billion rows and some 200 tables having more than 500 million rows apiece.
  • High throughput was seen for IDAA data load operations, with 590 million rows loaded in 9.5 minutes, and 1.5 billion rows loaded in 15 minutes.
  • The IDAA was attached to a DB2 for z/OS data sharing group (attachment to a standalone DB2 subsystem is, of course, supported as well).
I encourage you to check out the IDAA. In particular, if you're already doing data warehousing with DB2 on System z, let this accelerator burn some rubber in your shop.

DRDA best practices for DB2 for z/OS -- This session was delivered by Christian Molaro, a Belgium-based DB2 professional whose expertise has been recognized by Big Blue: Christian is an IBM Gold Consultant and an IBM Champion for Information Management. In his presentation, Christian made a number of very sound recommendations related to the processing of client-server workloads in DB2 a for z/OS environment (DRDA, or Distributed Relational Database Architecture, is the protocol used by DB2 servers and requesters for distributed database access). Among these recommendations were the following:

  • Keep your DB2 client code current. In working with his clients, Christian has encountered several situations in which DB2 distributed database access problems were traced to down-level DB2 client code. He recommended that client applications utilize -- at least -- the version 9.5, fix pack 3 level of DB2 client code, and pointed out that the 9.7, fix pack 3a level of client code is required for full support of DB2 for z/OS Version 10 functionality.
  • Choose the right client package. IBM provides a number of DB2 drivers and client packages (see http://www-01.ibm.com/support/docview.wss?uid=swg21385217). Christian recommended going with the package that has the smallest footprint while delivering the functionality required by the client application.
  • Using an IBM driver to go straight from the client application server to DB2 (versus going through a DB2 Connect gateway server) is usually the right choice -- but not always. The no-gateway approach generally delivers optimal performance and simplifies the application infrastructure. On the other hand, using a DB2 Connect gateway server can be a good approach if the environment is characterized by a large number of "fat" clients, as opposed to a relatively small number of application servers that might front numerous upstream end user workstations.
  • Speaking of DB2 Connect, this can be a very good initial workload for hosting on a Linux image on a System z server. A performance advantage can be realized through the use of HiperSockets for communication with a DB2 subsystem in a z/OS LPAR on the same mainframe server. On top of that, when folks see that Linux on System z provides a great environment for a DB2 Connect gateway, they might be inclined to follow up with the implementation of an application server (such as WebSphere) under Linux on z.
  • DB2 10 for z/OS does not support private protocol, so make sure that your programs aren't using it before you migrate to DB2 10. APAR PK64045 provides useful information in this area.
  • For most efficient use of mainframe memory and cycles, ensure that DB2 support for inactive threads is enabled. This is done via the CMTSTAT parameter of DSNZPARM.
  • Location aliases are a great way to restrict execution of certain client-server workloads to a subset of the members of a DB2 data sharing group. DB2 10 delivered a nice enhancement here, enabling dynamic addition, deletion, and modification of location aliases via the -MODIFY DDF command (with prior releases of DB2, location alias changes required execution of the change log inventory utility (aka DSNJU003), and that utility can can be executed when DB2 is down.
  • High-performance DBATs, a DB2 10 enhancement enabled through binding of DRDA-using packages with RELEASE(DEALLOCATE), can deliver significant improvements in CPU efficiency for client-server workloads. I blogged on this very topic earlier this year.
That's all for now. More to come tomorrow.

No comments:

Post a Comment