RE: I/O and db_file_multiblock_read_count

hmm. from the numbers it looks like 16 at a time are coming from cache (1.3
milliseconds each call) and 128 at a time are coming from spinning rust
(17.96 milliseconds each call). Probably Oracle?s db file scattered read
instrumentation includes some overhead above the actual disk calls, big
*probably* we can ignore that since you?re looking for a difference of a
factor of almost 14.

So I guess first I?d want to see if an OS routine sucking back the data and
doing nothing with it has similar characteristics.

If you dump the file block of the data you get back and group them up that
way, then you could use something like OD (octal dump) and run a timing
command to toss the data at /dev/null. I guess you?d want to be sure you got
back the file blocks in the same order both times (I sure hope so). You?ll
need to account for skipping around in case the data is not allocated in
order, that is, you need to check for streams of blocks shorter than 16 in
the one case and 128 in the other. But, despite having a million rows you
should crunch down to under 18000 blocks (140*128 is less than that,
likewise 1008*16 ? a bit mysterious that 128 needs 140 reads which at max is
17920 while 16*1008 is only 16128 blocks, but that probably means you have a
bunch of 128 attempts that are longer than the contiguous chunk you?re
trying to read.) So anyway, the amount of data you?ll need to manipulate as
input to od is reasonable to handle. Presuming nothing is pre-warming the
Oracle buffer cache, those block stretches are what I think Oracle should be
trying to read, ignoring multiblock rows and out of band objects.

Using the parameters of starting address and how many blocks to dump, you
can simulate dumping at 16 and 128 blocks at a time by feeding od from an
input file.

If that gives you a similar speed disparity, then it is time to stop looking
at Oracle and look at the details of your SAN configuration and exactly how
it works.

If there is a size request larger than which your SAN bypasses cache and
goes to the rust, for example, that would explain it. (Please notice I?m not
claiming that is the case in your case, it is just an example of slightly
too clever engineering I?ve seen over the years that serves someone?s notion
of the statistical fairness at the cost of the owner of the hardware.)

If the OS level test does *NOT* give a better speed to the smaller
chunkiness, and if the size of the data chunks themselves don?t give you a
clue, then I?d focus on Oracle?s handling of the data.

I?ll be really interested in your results, too.

Regards,

mwf

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]On
Behalf Of Kevin Lidh
Sent: Friday, December 08, 2006 3:46 PM
To: oracle-l
Subject: I/O and db_file_multiblock_read_count

I was reading an article about the appropriate setting for
db_file_multiblock_read_count.  I'm on a HP-UX 11.11 64-bit system with
Oracle 9.2.0.7.0.  The original value was 16 and I bounced the database and
ran a million record full-scan test (10046 trace) and then set the value to
128 (max value) and re-ran the test.  Although I did have less calls to 'db
file scattered read' (140 compared to 1008), the time to get the same number
of blocks was longer.  A random example from the trace files looks like
this:

WAIT #1: nam='db file scattered read' ela= 21614 p1=6 p2=3979 p3=126  -->
126 blocks = 21614
WAIT #1: nam='db file scattered read' ela= 10724 p1=6 p2=4107 p3=126

WAIT #1: nam='db file scattered read' ela= 577 p1=6 p2=3979 p3=16
WAIT #1: nam='db file scattered read' ela= 1524 p1=6 p2=3995 p3=16
WAIT #1: nam='db file scattered read' ela= 916 p1=6 p2=4011 p3=16
WAIT #1: nam='db file scattered read' ela= 1022 p1=6 p2=4027 p3=16
WAIT #1: nam='db file scattered read' ela= 1095 p1=6 p2=4043 p3=16
WAIT #1: nam='db file scattered read' ela= 1026 p1=6 p2=4059 p3=16
WAIT #1: nam='db file scattered read' ela= 829 p1=6 p2=4075 p3=16
WAIT #1: nam='db file scattered read' ela= 826 p1=6 p2=4091 p3=14   --> 126
blocks = 7815
WAIT #1: nam='db file scattered read' ela= 437 p1=6 p2=4107 p3=16

And I ran the test again with 16 after my 128 test and the results were
similar to the first test.  The cumulative times for the 'db file scattered
read' was 1.3181s for the 16 and 2.5149s when it was 128.  We use a Hitachi
SAN and I know it has caching but I would still think that the fewer, bigger
requests would be faster.  Is there a measurement or setting on the
operating system or even SAN that I can look at to help determine the
optimal setting?

Other related posts: