Wednesday, June 25, 2008

No Alphabetic Characters Wanted

Here is a question that was posed to me recently:


Q: We have a CHAR(10) column that cannot contain alphabetic characters. How can we make sure that the letters A thru Z are not allowed.


A: Well, think about the characteristics of alphabetic characters versus the other "things" that can be stored in a CHAR column. One thing that separates an alphabetic letter from numbers, punctuation, etc. is that there are upper and lower case versions (e.g. A, a). So, you could use the following predicate to preclude alphabetic characters from being accepted:

LOWER(:string) = UPPER(:string)

Of course, you will not be able to put this into a CHECK constraint because of restrictions on their content (for example, you cannot use function in a CHECK constraint). But you could use this in SQL statements and as a check in your programs before allowing data to be inserted or modified in your CHAR(10) column.

Anyone else have any other ideas?

Labels: ,

Monday, March 03, 2008

Q+A: Nulls and DASD

Recently, I received the following question, which I will answer today on this blog posting.

Question: Lets say I have a table A which has 500 columns. Out of those 500 columns only 5 columns have been defined as not nullable and the rest have been defined as NULLS allowed. And out of those 500 columns I have found that 300 columns are unused(empty) totally. My business allows me to remove those 300 columns. My doubt is if I remove those 300 empty columns will I save on DASD space occupied by DB2? Will empty columns occupy DASD space?Would be really helpful if you can guide me on this.

Answer: I'm happy to try to help out. First of all, the short answer to your question is "Yes!" Those 500 columns are all consuming valuable disk space. To determine how much space is being consumed, you will need to examine the data type and length assigned to each column and add them up. And to make matters worse, you must add an additional 1 byte to each of them because the columns are nullable.

In DB2, a NULL is stored using a special one-byte null indicator that is "attached" to every nullable column. If the column is set to NULL, then the indicator field is used to record this. Using NULL will never save space in a DB2 database design - in fact, it will always add an extra byte for every column that can be NULL. The byte is used whether or not the column is actually set to NULL.

So, a column defined as CHAR(5) NOT NULL will required five bytes of storage space - but if it is defined as nullable, then it requires six bytes of storage space - five bytes for the data, and one byte for the null indicator.

Given all of this, it would seem that there is a very viable case to be made for you to remove those columns that are not being used. Of course, this means that you will likely have to make changes to any programs accessing that table. Because the table definition will change (fewer columns) you will need new DCLGENs and those will have to be included and bound into your programs. Be sure to factor this additional workload into your planning before moving forward with this change.

The better question to ask is "How the heck did all of the empty columns get put into the table to begin with and how did that design get past the DBAs?"

If you have an answer for that one, please share it by posting your answer in a comment here!

Labels:

Wednesday, January 16, 2008

Q+A: Locking

I get questions e-mailed to me all the time. Although I try to read and answer them all, sometimes I don't. I hope those whom I've ignored over the years will forgive me, but I can't always answer everything (not enough time/energy) and sometimes things get lost or drop through the cracks.

Anyway, at times I will take a question I get and blog about it in Q+A format. Today is one of those days!

The question was: I want to perform a retry on an INSERT under DB2 Z/OS when I get a deadlock/timeout. -911 causes a rollback automatically. Is there a ZPARM or other method of turning this off? I am inserting millions of rows and do not want a rollback to the last commit point.

Here is my answer:

Well, first of all, let me recommend that you minimize the size of your unit of work. If you are inserting millions of rows without a COMMIT you are likely causing locking issues in your environment. The pages you have locked while you are waiting for your millions of inserts to finish are all unavailable to any other user of the table (assuming page locking). That means any data on any page that you have locked cannot be read by anyone else until your unit of work is committed. Any other user, running at the same time as you are, trying to get to any page you have modified, would be getting -911 too.

That being said, you can control whether or not the work is rolled back automatically in CICS (on a thread basis) using an RDO parameter (or RCT if on an ancient CICS). The parameter is called ROLBE (RCT) or DROLLBACK (RDO). If it is set to YES a CICS SYNCPOINT ROLLBACK is issued and a -911 SQLCODE is returned to the program. If NO is coded, a CICS SYNCPOINT ROLLBACK is not issued and the SQLCODE is set to -913. You will have to programmatically either specify COMMIT or ROLLBACK for the unit of work.

In a batch environment you will need to code your programs to periodically issue COMMITs after so many modifications (or using some other method like a timer or loop counter). There is no method I am aware of to automatically control this behavior outside of looking into a third party product (for example, Softbase Checkpoint Restart, and others).

Labels: ,

Monday, December 19, 2005

Minimizing Lock Contention Issues

I frequently get e-mails with DB2 questions and I plan to start posting answers to some of the more common ones up here.

One issue that comes up a lot is dealing with locking issues. Usually it is posed by someone who is experiencing timeouts in an online environment and they want to know how to minimize them. Here is some guidance.

When you experience a timeout, it means that another process holds a lock on the data that you are trying to modify. So, it stands to reason that you should try to minimize the duration of locks that are being held in your system. There are some approaches you can take to achieve this, but for the most part, they require programming changes.

First of all, make sure that all of your batch processes -- especially any that run during the same timeframe, but really all batch process -- have a COMMIT strategy. This means that your programs should issue a COMMIT after processing "a set number of" inserts, updates, and deletes. A COMMIT will tell DB2 to make the changes permanent and releases locks.
A good approach is to set a counter that is incremented after every modification. Then, check it and when it exceeds a predefined threshold -- say 25 or 50 or 100 modications -- then issue a COMMIT. You should make the threshold an input parameter so that you can change it as the workload in your system changes. For example, make it 25 when concurrent activity is high, but ramp it up to 100 or higher when it is low. Failure to issue COMMITs will result in timeouts, as well as possibly deadlocks and lock escalation.

(Also, please note that these are just sample numbers and not necessarily the correct numbers to start with at your shop.)

A good article to read regarding COMMIT strategies is "The Woes of Commitment" by Bonnie Baker.

Next, look at all of your programs, batch and online, and try to save the data modification statements to as close to the COMMIT as possible. By saving the data modification until right before you issue a COMMIT, you reduce the overall average lock duration. This should result in reducing contention and therefore, the number of timeouts.

If you want to investigate the timeout details, be sure to examine the statistics trace class 3 and IFCID 0196 for timeouts (IFCID 0172 is for deadlocks).

Labels: ,