Re: Philosophical question on primary keys

  • From: Tim Gorman <tim@xxxxxxxxx>
  • To: mark.powell@xxxxxxx
  • Date: Fri, 07 Aug 2009 10:58:34 -0600

Jay and Mark,

A recent "situation" involving massive performance problems resulting from sessions contending for TX enqueues probably resulted from a decision similar to this.

An extremely active OLTP table (call it "order activity") had two unique indexes in addition to several other non-unique indexes:
  • ID
  • ID and STATUS
As you can tell, the second index on ID and STATUS was redundant, but it was probably created based on a similar chain of reasoning described here.  The fatal flaw was that unique indexes become essentially "in doubt" when a transaction modifies them but is yet uncommitted.  Other transactions wishing to modify that unique index to add a new unique data value cannot do so until the first transaction commits.  Thus, unique indexes become a point of serialization.

Having the unique index on ID was necessary and manageable, since it was modified only on INSERT and DELETE operations, which were relatively well-controlled.  The unique index on the ID and STATUS columns was very problematic however, because the application modified the STATUS column almost constantly.  While this was causing serious performance problems all along, it was regarded as acceptable, or at least not intolerable.  The application developers, wielding the unassailable argument of "its been like that for more than four years" did not want to discuss either dropping the ID/STATUS index or making it non-unique to reduce the contention and queueing.  If it ain't broke, why fix it?

This was true (of course) until volume increased dramatically.  The argument of "its worked for years and nothing has changed" always focuses on code and fails to notice changes in usage and volume.  The straw that broke the camel's back was an "archiving" batch job which deleted orders and inserted them into an "archive" schema.  This consisted of about 30 insert statements followed by 30 delete statements followed by a single commit.  This batch job usually ran during an overnight window, but with the increase in data volumes it was now spilling over into prime time.  The long-running uncommitted transaction for the archival job meant the two unique indexes were essentially "in doubt" for hours, and any modifications could not proceed until the thing committed.  This was bad enough for the programs performing inserts of new orders, but for the apps which wanted to update status it was like hitting a brick wall.

Moral of the story:  always question the use of unique indexes and always seek to minimize their quantity as well as their scope, if possible, because they are a source of serialization.  With partitioned tables, unique indexes often lead to global indexes, which presents another set of impossibilities and difficulties on top of the original difficulties.  I'm not saying that unique indexes are evil, but I've noticed them spew pea soup occasionally...

Hope this helps...
Tim Gorman
consultant - Evergreen Database Technologies, Inc.
P.O. Box 630791, Highlands Ranch CO  80163-0791
website   =
email     = Tim@xxxxxxxxx
mobile    = +1-303-885-4526
fax       = +1-303-484-3608
Yahoo IM  = tim_evdbt

Powell, Mark D wrote:
 Have you considered?

Dropping the PK, creating a unique index on message_id, folder_id, then
adding the PK constraint.  Oracle should use the existing index to
enforce the PK.

-- Mark D Powell --
Phone (313) 592-5148

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of
Sent: Friday, August 07, 2009 12:09 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Philosophical question on primary keys

We have a table with a primary key of message_id.  This table is
occasionally queried on that column alone but most often on both the id
and a folder_id column (also numeric) with no other data being returned.

There are no foreign keys pointing to this primary key.

I'm trying to squeeze every last bit of performance out of one piece of
SQL that accounts 92% of all database i/o.

Using an index on message_id,folder_id reduces LIOs from 36 to 32 for a
typical query as opposed to using the primary key index on message_id.  

The question is whether to create a new index or change the primary key
to include both columns.  

Arguments against modifying primary key: 
The primary key is just message_id, adding folder_id doesn't make it any
more unique.  Also folder_id currently only exists in this table so if
for some unforeseen reason we someday need to point a foreign key to
this table this might cause a problem.  No one thinks this will ever be
necessary but who knows what might happen in the future.

Arguments for modifying primary key: 
One less index on the table means less overhead for inserts/updates.
One less index is less storage used.  Also, I'm having trouble getting
the optimizer to use the second index in our test environment without
resorting to an index hint which I prefer to avoid.  

Comments welcome.  Thanks!

Here's the SQL in case anyone wants to take a look at it:
SELECT i.message_priority_cd, COUNT(*) AS count_label 
FROM 	CLIENTMSG_ADMIN.message_instance i,
CLIENTMSG_ADMIN.message_transmission m, 
	CLIENTMSG_ADMIN.secure_inbox_message secure_inbox WHERE
i.message_id = secure_inbox.message_id AND m.message_id = i.message_id
AND m.channel_cd = 'SECURE_INBOX' 
AND i.account_nbr = :1
AND (m.delivery_status_cd = 'PENDING' OR m.delivery_status_cd
AND m.message_read_ts IS NULL
AND m.delete_ts IS NULL
AND (i.expiration_ts > current_timestamp OR i.expiration_ts IS NULL) AND
secure_inbox.folder_id <> 3 AND secure_inbox.folder_id <> 2 GROUP BY




Other related posts: