Wednesday, January 16, 2008

Q+A: Locking

I get questions e-mailed to me all the time. Although I try to read and answer them all, sometimes I don't. I hope those whom I've ignored over the years will forgive me, but I can't always answer everything (not enough time/energy) and sometimes things get lost or drop through the cracks.

Anyway, at times I will take a question I get and blog about it in Q+A format. Today is one of those days!

The question was: I want to perform a retry on an INSERT under DB2 Z/OS when I get a deadlock/timeout. -911 causes a rollback automatically. Is there a ZPARM or other method of turning this off? I am inserting millions of rows and do not want a rollback to the last commit point.

Here is my answer:

Well, first of all, let me recommend that you minimize the size of your unit of work. If you are inserting millions of rows without a COMMIT you are likely causing locking issues in your environment. The pages you have locked while you are waiting for your millions of inserts to finish are all unavailable to any other user of the table (assuming page locking). That means any data on any page that you have locked cannot be read by anyone else until your unit of work is committed. Any other user, running at the same time as you are, trying to get to any page you have modified, would be getting -911 too.

That being said, you can control whether or not the work is rolled back automatically in CICS (on a thread basis) using an RDO parameter (or RCT if on an ancient CICS). The parameter is called ROLBE (RCT) or DROLLBACK (RDO). If it is set to YES a CICS SYNCPOINT ROLLBACK is issued and a -911 SQLCODE is returned to the program. If NO is coded, a CICS SYNCPOINT ROLLBACK is not issued and the SQLCODE is set to -913. You will have to programmatically either specify COMMIT or ROLLBACK for the unit of work.

In a batch environment you will need to code your programs to periodically issue COMMITs after so many modifications (or using some other method like a timer or loop counter). There is no method I am aware of to automatically control this behavior outside of looking into a third party product (for example, Softbase Checkpoint Restart, and others).

Labels: ,

Wednesday, August 22, 2007

Optimistic Locking [DB2 9 for z/OS]

DB2 Version 9 improves support for coding optimistic locking techniques. What is optimistic locking? Sometimes referred to as optimistic concurrency control, optimistic locking is basically just what it sounds like. We are optimists and think that usually we will be the only ones with interest in the data. In other words, when optimistic locking is implemented you are assuming that most of the time there will be no other programs that are interested in the page of data that you are planning to modify.


Of course, even in the most optimistic world there will be exceptions, so optimistic locking does not assume that there will never be any concurrent processes that need to access your page(s). Basically, with optimistic locking you can improve performance by minimizing locking. So how do we do that?


When an application uses optimistic locking, locks are obtained immediately before a read operation and then released immediately. Update locks are obtained immediately before an update operation and held until the end of the transaction. Optimistic locking uses the RID (Record IDentifier) and a row change timestamp to test whether data has been changed by another transaction since the last read operation.


DB2 knows when a row was changed and so therefore he (I make DB2 masculine) can ensure data integrity even as he minimizes the duration of locks. With optimistic locking, DB2 releases the page (or row) locks immediately after a read operation. And if you are using row locks, DB2 releases the row lock after each FETCH, taking a new lock on a row only for a positioned update or a positioned delete.


Careful readers will have noticed that I talked about a “row change timestamp” but you may not have heard that expression before. DB2 V9 adds support for automatically generated timestamp columns and if you wish to implement optimistic locking you will need to create (or alter) your tables to have a row change timestamp column, defined as follows:


NOT NULL GENERATED ALWAYS
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP

or

NOT NULL GENERATED BY DEFAULT
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP

For tables having a row change timestamp column, DB2 will automatically populate and maintain the timestamp values for each row. Notice how the syntax is similar to the syntax used for other automatically generated DB2 values, such as sequences. DB2 will automatically generate the timestamp value for each row when the row is inserted, and modify the timestamp for each row when any column in that row is updated.

When you add a ROW CHANGE TIMESTAMP column to an existing table, the initial value for existing rows will not be immediately populated. Instead, DB2 places the table space in an advisory-REORG pending state. When you reorganize the table space, DB2 will generates the values for the ROW CHANGE TIMESTAMP column for all rows (and, of course, remove the advisory-REORG pending status).

Now that you have this new column you can use it as a condition for making an UPDATE and specify it in your WHERE clause. Let’s walk thru a couple of examples.

First of all, when a table contains a ROW CHANGE TIMESTAMP you can use it to find out when its rows were modified. Let’s use the following table as an example:

CREATE TABLE CUSTOMER
(CUSTNO CHAR(8) NOT NULL,
CUST_INFOCHANGE NOT NULL GENERATED ALWAYS
FOR EACH ROW ON UPDATE
AS ROW CHANGE TIMESTAMP,
CUST_NAME VARCHAR(50),
CUST_ADDRESS VARCHAR(100),
CUST_CITY CHAR(20),
CUST_STATE CHAR(2),
CUST_ZIP CHAR(9),
CUST_PHONE CHAR(10),

PRIMARY KEY (CUSTNO)
);

Now that the table is defined with the ROW CHANGE TIMESTAMP we can use it in our programs and queries to determine change information about the data. For example, if we want to find all of the customer rows that were changed in the past week (ie. the last 7 days) we could run the following query:


SELECT CUSTNO, CUST_NAME

FROM CUSTOMER
WHERE ROW CHANGE TIMESTAMP FOR CUSTOMER <=
CURRENT TIMESTAMP AND
ROW CHANGE TIMESTAMP FOR CUSTOMER >=
CURRENT TIMESTAMP - 7 DAYS;

But what would happen if you issued a statement like this against a table that was altered to include a ROW CHANGE TIMESTAMP? For example, if we created the CUSTOMER table as shown but without the CUST_INFOCHANGE column, populated the table with data, and then altered the table to include the CUST_INFOCHANGE column? In this case, DB2 will use the time the page was last modified. So the results will not be exactly correct because it would return all the rows on each page that qualifies (because at least one row on the page changed). This is why it is important to clear up the advisory REORG pending as soon as possible after adding the ROW CHANGE TIMESTAMP.

OK, this is all well and good, and you can probably see the value of having this automagically changing timestamp in some of your tables, but where is the optimistic locking part? Well, for programs that use updateable static scrollable cursors DB2 can use optimistic locking as long as the plan/package is bound specifying ISOLATION(CS). If you have this situation, DB2 will deploy optimistic locking to reduce the duration of locks between consecutive FETCH operations and between fetch operations and subsequent positioned UPDATE or DELETE operations.

Remember, though, DB2 cannot use optimistic concurrency control for dynamic scrollable cursors. With dynamic scrollable cursors, the most recently fetched row or page from the base table remains locked to maintain position for a positioned UPDATE or DELETE.

Without optimistic locking, the lock taken at the first FETCH is held until the next FETCH. The lock taken at the last FETCH is held until COMMIT, ROLLBACK, or the end of transaction.

With optimistic locking, the scenario changes significantly. When the application requests a FETCH to position the cursor on a row, DB2 locks that row, executes the FETCH and releases the lock. When the application requests a positioned UPDATE or DELETE on the row, DB2 locks the row and then re-evaluates the predicate to ensure that the row still qualifies for the result table.

So, when you move to DB2 9 for z/OS you should evaluate your applications looking for programs that could take advantage of optimistic locking – and then add the ROW CHANGE TIMESTAMP to the appropriate tables.

Labels: ,

Monday, June 18, 2007

Skipping Locked Rows [DB2 9 for z/OS]

In DB2 9 it is possible for a transaction to skip over rows that are locked. This can be accomplished by means of the SKIP LOCKED DATA option within your SQL statement(s). SKIP LOCKED DATA can be specified in SELECT, SELECT INTO, and PREPARE, as well as searched UPDATE and DELETE statements. You can also use the SKIP LOCKED DATA option with the UNLOAD utility.

Of course, if a program skips over locked data then that data is not accessed and the program will not have it available. When this option is used DB2 will just skip over any locked data instead of waiting for it to be unlocked. The benefit, of course, is improved performance because you will not incur any lock wait time. But it comes at the cost of not accessing the locked data at all. This means that you should only utilize this clause when your program can tolerate skipping over some data.

The SKIP LOCKED DATA option is compatible with cursor stability (CS) isolation and read stability (RS) isolation. But it cannot be used with uncommitted read (UR) or repeatable read (RR) isolation levels. DB2 will simply ignore the SKIP LOCKED DATA clause under UR and RR isolation levels.

Additionally, SKIP LOCKED DATA works only with row locks and page locks. That means that SKIP LOCKED DATA does not apply to table, partition, LOB, XML, or table space locks.
Let's look at an example. Suppose we have a table with 5 rows in it that looks like this:

KEY FNAME LNAME
--- ------ -------
1 JOE MAMA
2 DON KNOTTS
3 KIM PORTANT
4 BOB NOBBLE
5 KIM BIMBO


Assume row level locking. Next assume that an UPDATE statement is run against the table changing FNAME to JIM WHERE LNAME = 'KIM'. And it is hanging out there without a COMMIT. Next, we run:


SELECT COUNT (*)
FROM TABLE
WHERE FNAME >= ’AAA’
SKIP LOCKED DATA;

The count returned would be 3 because DB2 skips the two locked rows (rows 3 and 5). And, of course, if the locks are released the count would be 5 again.

Labels: ,

Monday, December 19, 2005

Minimizing Lock Contention Issues

I frequently get e-mails with DB2 questions and I plan to start posting answers to some of the more common ones up here.

One issue that comes up a lot is dealing with locking issues. Usually it is posed by someone who is experiencing timeouts in an online environment and they want to know how to minimize them. Here is some guidance.

When you experience a timeout, it means that another process holds a lock on the data that you are trying to modify. So, it stands to reason that you should try to minimize the duration of locks that are being held in your system. There are some approaches you can take to achieve this, but for the most part, they require programming changes.

First of all, make sure that all of your batch processes -- especially any that run during the same timeframe, but really all batch process -- have a COMMIT strategy. This means that your programs should issue a COMMIT after processing "a set number of" inserts, updates, and deletes. A COMMIT will tell DB2 to make the changes permanent and releases locks.
A good approach is to set a counter that is incremented after every modification. Then, check it and when it exceeds a predefined threshold -- say 25 or 50 or 100 modications -- then issue a COMMIT. You should make the threshold an input parameter so that you can change it as the workload in your system changes. For example, make it 25 when concurrent activity is high, but ramp it up to 100 or higher when it is low. Failure to issue COMMITs will result in timeouts, as well as possibly deadlocks and lock escalation.

(Also, please note that these are just sample numbers and not necessarily the correct numbers to start with at your shop.)

A good article to read regarding COMMIT strategies is "The Woes of Commitment" by Bonnie Baker.

Next, look at all of your programs, batch and online, and try to save the data modification statements to as close to the COMMIT as possible. By saving the data modification until right before you issue a COMMIT, you reduce the overall average lock duration. This should result in reducing contention and therefore, the number of timeouts.

If you want to investigate the timeout details, be sure to examine the statistics trace class 3 and IFCID 0196 for timeouts (IFCID 0172 is for deadlocks).

Labels: ,