Wednesday, November 12, 2008

Free Statistics Software for DB2 for z/OS

I'm sure you've heard that age old expression that there is no such thing as a free lunch, right? Well, sometimes even the old tried and true sayings can be wrong. No, I'm not going to tell you how to eat lunch for free, but you can get some helpful DB2 mainframe software for free!

NEON Enterprise Software continues to offer free software for checking the health of the statistics in your DB2 Catalog.

Statistics HealthCheck for DB2 z/OS enables you to quickly and effectively analyze and judge the quality of your DB2 Catalog statistics. Checking the health of your DB2 subsystem is especially important considering the heightened sensitivity of DB2 V8 and V9 to bad statistics.

The extensive rule system used by Statistics HealthCheck is based on best practice recommendations for maintaining good statistics. Using a violation system, Statistics HealthCheck pinpoints precisely those objects that could benefit from a RUNSTATS utility health check... or that otherwise require statistics housekeeping.

Statistics HealthCheck is particularly useful as a prerequisite to Bind ImpactExpert during a DB2 version migration. First, Statistics HealthCheck identifies the RUNSTATs you need. Then, Bind ImpactExpert provides REBIND insurance to further guarantee consistent and improved access paths.

With Statistics HealthCheck, you know exactly what is “wrong” with your statistics so that you can proactively correct any problems that might negatively affect DB2 subsystem performance.

And come on, it is free after all. What do you have to lose by downloading it and trying it on your DB2 subsystems today?

Labels: , , , , , , ,

Thursday, September 04, 2008

Database Performance and Row Size

Recently I was reading through some posts in a database-related newsgroup or mailing list (actually, right now I can't remember which one it was). The conversation I was reading was in response to a question like "Does the number of columns or size of the row matter in terms of performance?"

Actually, the question asked what kind of a performance impact might be expected if a query was issued against two similar tables. The first table had (say) 20 columns, and the second table had the same 20 columns, as well as 35 additional columns.

Well, most of the basic responses were similar. The consensus was that as long as the query was going against the same columns then performance should be about the same. I disagree. Here is why.

You also need to factor in the I/O requests that are required to return the data. The DBMS will perform I/O at the block (or page) level - this is so whether you return one row or millions of rows. For multi-row results, accessing data from the table with the wider row (more columns) will usually be less efficient. This is so because fewer rows will exist on each page (the row with 100 columns is smaller than the row with 150 columns so more rows can reside in a single, pre-sized block/page). The bigger the result set, the more pronounced the performance degradation can be (because more physical I/Os are required to retrieve the data).

Think about it this way. Is it faster to pull smaller peaches out of a basket than bigger peaches? That is about the same type of question and anybody can envision the process. Say you want 100 peaches. Small peaches fit 25 per basket; big peaches fit ten per basket. To get 100 small peaches you'd need to pull 4 baskets from the shelf. To get 100 big peaches you'd need to pull 10 baskets from the shelf. The second task will clearly take more time.

Of course, the exact performance difference is difficult to calculate - especially over an online forum and without knowledge of the specific DBMS being used. But there will, more than likely, be a performance effect on queries when you add columns to a table.

Labels: ,