Predicting high "PX Deq Credit" waits in PQ

  • From: "Rich Jesse" <rjoralist@xxxxxxxxxxxxxxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Wed, 30 Jul 2008 14:11:26 -0500 (CDT)

Hey all,

Environment is 10.1.0.5.0 on a 4-core development AIX P5 box with all
"parallel%" init.ora parameters defaulted.  Due to Oracle bugs (especially
        6765819), these cannot be altered at the system level.  For testing
purposes, I've done an "alter session force parallel query parallel 3" in a
login trigger for my schema.  Also, the DB has a blocksize of 8KB with a
somewhat confusing DB_FILE_MULTIBLOCK_READ_COUNT at 128 -- perhaps a
leftover from previous testing of IO.  This is a test system afterall...

We have a 12-table query (10/11 joins are LEFT OUTERs, same one as my
previous post on hints) for data warehousing whose resulting rows are then
used to fetch data from another table one at a time.  I've incorporated that
last table into the primary query and am testing PQ with it now.

If the new table is (incorrectly) inner joined, the execution is relatively
fast.  Our friend Grid Control shows the three parallel processes happly
[sic] in "db file sequential read" wait while slamming through the tables.

If I correct the join on the added table to LEFT OUTER, the explain plan
changes, obviously, but the parallel execution changes drastically.  For a
few seconds, three processes hammer the IO -- this is what I'd expect. 
After that initial burst however my process that's running the query does
all the IO while two of the parallel procs sit in the "PX Deq Credit: send
blkd" idle wait for the 15-minute duration of the query.

This has the effect of negating the benefit of the parallel processing for
me.  While three procs are concurrently issuing IO, "nmon" shows 40-120MB/s
reads on the datafile volume, with the R:W-SizeKB stat being 24-100+.  When
the parallelism stops in the second case, I get 5-10MB/s reads with the
R:W-SizeKB being firmly planted at the predictable number of 8.

I've been looking at All Things Oracle Parallel via MetaLink and Google, but
no luck so far other than the definition of the wait event and a somewhat
unhelpful explanation of "_px_trace" output.

Thoughts anyone?  I'm thinking that I should be able to predict this
behavior from the explain plan, but not sure where to look.  I won't try to
post the 69-line explain plan here...

TIA!
Rich

--
//www.freelists.org/webpage/oracle-l


Other related posts: