RE: Philosophical question on primary keys

  • From: "Powell, Mark D" <mark.powell@xxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 7 Aug 2009 13:55:37 -0400

The columns in this table appear to be static data values by their
nature so hopefully this will not be a problem.  It is also the only
index on the table if I remember the initial post correctly.
Based on the issue being reported I am not sure there is a good
alternative, other than using both columns as part of the PK, since
Richard Foote has written about the performance penalty of using a
non-unique index to support the PK.  Using both columns was not desired
by the OP,

-- Mark D Powell -- 
Phone (313) 592-5148 



        From: Tim Gorman [mailto:tim@xxxxxxxxx] 
        Sent: Friday, August 07, 2009 12:59 PM
        To: Powell, Mark D
        Cc: oracle-l@xxxxxxxxxxxxx
        Subject: Re: Philosophical question on primary keys
        Jay and Mark,
        A recent "situation" involving massive performance problems
resulting from sessions contending for TX enqueues probably resulted
from a decision similar to this.
        An extremely active OLTP table (call it "order activity") had
two unique indexes in addition to several other non-unique indexes:

        *       ID 
        *       ID and STATUS 

        As you can tell, the second index on ID and STATUS was
redundant, but it was probably created based on a similar chain of
reasoning described here.  The fatal flaw was that unique indexes become
essentially "in doubt" when a transaction modifies them but is yet
uncommitted.  Other transactions wishing to modify that unique index to
add a new unique data value cannot do so until the first transaction
commits.  Thus, unique indexes become a point of serialization.
        Having the unique index on ID was necessary and manageable,
since it was modified only on INSERT and DELETE operations, which were
relatively well-controlled.  The unique index on the ID and STATUS
columns was very problematic however, because the application modified
the STATUS column almost constantly.  While this was causing serious
performance problems all along, it was regarded as acceptable, or at
least not intolerable.  The application developers, wielding the
unassailable argument of "its been like that for more than four years"
did not want to discuss either dropping the ID/STATUS index or making it
non-unique to reduce the contention and queueing.  If it ain't broke,
why fix it?
        This was true (of course) until volume increased dramatically.
The argument of "its worked for years and nothing has changed" always
focuses on code and fails to notice changes in usage and volume.  The
straw that broke the camel's back was an "archiving" batch job which
deleted orders and inserted them into an "archive" schema.  This
consisted of about 30 insert statements followed by 30 delete statements
followed by a single commit.  This batch job usually ran during an
overnight window, but with the increase in data volumes it was now
spilling over into prime time.  The long-running uncommitted transaction
for the archival job meant the two unique indexes were essentially "in
doubt" for hours, and any modifications could not proceed until the
thing committed.  This was bad enough for the programs performing
inserts of new orders, but for the apps which wanted to update status it
was like hitting a brick wall.
        Moral of the story:  always question the use of unique indexes
and always seek to minimize their quantity as well as their scope, if
possible, because they are a source of serialization.  With partitioned
tables, unique indexes often lead to global indexes, which presents
another set of impossibilities and difficulties on top of the original
difficulties.  I'm not saying that unique indexes are evil, but I've
noticed them spew pea soup occasionally...
        Hope this helps...
        Tim Gorman
        consultant - Evergreen Database Technologies, Inc.
        P.O. Box 630791, Highlands Ranch CO  80163-0791
        website   =
        email     = Tim@xxxxxxxxx
        mobile    = +1-303-885-4526
        fax       = +1-303-484-3608
        Yahoo IM  = tim_evdbt

        Powell, Mark D wrote: 

                 Have you considered?
                Dropping the PK, creating a unique index on message_id,
folder_id, then
                adding the PK constraint.  Oracle should use the
existing index to
                enforce the PK.
                -- Mark D Powell --
                Phone (313) 592-5148
                -----Original Message-----
                From: oracle-l-bounce@xxxxxxxxxxxxx
                [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of
                Sent: Friday, August 07, 2009 12:09 PM
                To: oracle-l@xxxxxxxxxxxxx
                Subject: Philosophical question on primary keys
                We have a table with a primary key of message_id.  This
table is
                occasionally queried on that column alone but most often
on both the id
                and a folder_id column (also numeric) with no other data
being returned.
                There are no foreign keys pointing to this primary key.
                I'm trying to squeeze every last bit of performance out
of one piece of
                SQL that accounts 92% of all database i/o.
                Using an index on message_id,folder_id reduces LIOs from
36 to 32 for a
                typical query as opposed to using the primary key index
on message_id.  
                The question is whether to create a new index or change
the primary key
                to include both columns.  
                Arguments against modifying primary key: 
                The primary key is just message_id, adding folder_id
doesn't make it any
                more unique.  Also folder_id currently only exists in
this table so if
                for some unforeseen reason we someday need to point a
foreign key to
                this table this might cause a problem.  No one thinks
this will ever be
                necessary but who knows what might happen in the future.
                Arguments for modifying primary key: 
                One less index on the table means less overhead for
                One less index is less storage used.  Also, I'm having
trouble getting
                the optimizer to use the second index in our test
environment without
                resorting to an index hint which I prefer to avoid.  
                Comments welcome.  Thanks!
                Here's the SQL in case anyone wants to take a look at
                SELECT i.message_priority_cd, COUNT(*) AS count_label 
                FROM    CLIENTMSG_ADMIN.message_instance i,
                CLIENTMSG_ADMIN.message_transmission m, 
secure_inbox WHERE
                i.message_id = secure_inbox.message_id AND m.message_id
= i.message_id
                AND m.channel_cd = 'SECURE_INBOX' 
                AND i.account_nbr = :1
                AND (m.delivery_status_cd = 'PENDING' OR
                AND m.message_read_ts IS NULL
                AND m.delete_ts IS NULL
                AND (i.expiration_ts > current_timestamp OR
i.expiration_ts IS NULL) AND
                secure_inbox.folder_id <> 3 AND secure_inbox.folder_id

Other related posts: