Re: cache buffers chain latch contention

  • From: "Jonathan Lewis" <jonathan@xxxxxxxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 25 Mar 2004 16:57:46 -0000

Which version of Oracle ?

How parallel ?

Would you send the full execution plans for
the parallel and serial create statements.



Contention for cache buffers chains latches
only means that multiple processes want to
find blocks that are protected by the same
latch - this could be caused by almost any 
type of activity.

In the case of a complex parallel CTAS,
it could be that you have N slaves which
are scanning one table and indexing into
another (very small) table, and therefore
continually hitting the same couple of blocks 
in the small table concurrently.  If this is the
case, then perhaps identifying that small table
and rebuilding it with one row per block 
would be sufficient to solve the problem.
But you do need to know very specifically
where the problem is before you start rebuilding
objects.

Since the process is stuck anyway, why not
run it with 10046 set at level 8 - this should
propagate to the slaves, and you can see which
of the decomposed pieces of SQL they are
getting stuck on, and this might help pinpoint
the problem.


(In general, a db block doesn't hold rows from
multiple tables - only a block belonging to a 
cluster has that option).

Regards

Jonathan Lewis
http://www.jlcomp.demon.co.uk

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

April 2004 Iceland  http://www.index.is/oracleday.php 
June  2004      UK - Optimising Oracle Seminar


----- Original Message ----- 
From: "Koivu, Lisa" <Lisa.Koivu@xxxxxxxxxxxxxxx>
To: <oracle-l@xxxxxxxxxxxxx>
Sent: Thursday, March 25, 2004 4:10 PM
Subject: cache buffers chain latch contention


Hi everyone,

A couple of days ago I attempted a parallel mv-create statement in my
dev database. Instead of completing, or even showing any progress, it
sat there.  When I didn't run it in parallel, I saw temp space being
chewed up, to me that indicates progress.  When I run it in parallel,
nothing.  It sat there.

Investigation of v$system_event indicated latch free wait.  OK, fine, so
I look at v$latch and find I have enormous misses and sleeps on cache
buffers chains. I chased down the child latches and pinpointed the
affected objects.  I had several hash partitioned indexes (4096
partitions, yes I went crazy, hey why not?) that were on the latches.  I
rebuilt them range partitioned, tried the mv build again, same problem.
OK, fine, then what? 

My impression of this latch, and what happens, is this (PLEASE correct
me if I am wrong):  A db block can house records from multiple tables.
When I fired the build in parallel, the slaves began fighting over who
got the latch for what hot block in cache.  It appears that I can begin
to alleviate this problem by rebuilding objects off of hot child
latches.   

Am I wrong?  Metalink doesn't say much other than "this is your
problem".  Gaja states this is because of excessive i/o according to
Cary Millsap http://www.revealnet.com/newsletter-v3/0402_C.htm , and
Steve Adams theorizes that it could be due to an extreme number of free
extents, http://www.ixora.com.au/q+a/0104/13001720.htm and comes to the
same conclusion as Cary
http://www.ixora.com.au/newsletter/2000_11.htm#hash_latches in that it's
excessive i/o. 

Well, yes, that's why I'm trying to build a mv to stop this nasty join
between 18 tables for queries that someone set the expectation of "it
should run in a few minutes" (don't ask, I'm tired of fighting).  I need
to get through the create. 

My question is, will reorging objects really help?  IN the last few days
I have heard more than once "just separate data and indexes, temp and
redo on different luns" and don't worry too much about it.  This also
came from an oracle "expert" that is coming next week to hold our hands
through a Windows 2003 RAC install.  Seems to me that if moving objects
will help, this idea of just split it up simplistically and let it go is
out the window.  

What am I missing?  Since this is a dw, yes, it's going to have a ton of
i/o.  There isn't much I can do about that - the reports for our first
user group are detailed, not summarized.  They really do pull thousands
of records and look at them.  The bottom line is I will need to turn PQO
on in this database, and right now I am convinced I'll have the same
lockup in prod because I see a large number of sleeps and misses on the
same latch.  

I have no idea why my title is "senior" dba.  Today I feel like I just
started learning Oracle yesterday.  

I am interested in any comments... they sure would be appreciated...
thanks all

Lisa Koivu
Oracle Database Monkey
Orlando, FL, USA



----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Other related posts: