Re: 10g System statistics - single and multi

  • From: Tim Gorman <tim@xxxxxxxxx>
  • To: "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
  • Date: Wed, 18 May 2005 10:12:49 -0600

Ladies and gentlemen,

Some weeks/months ago there was a discussion about what constitutes an
"Oracle scientist" and whether the term is valid or not, pretentious or not,
justified or not, etc.  I believe the discussion began on "AskTom.com" but I
think it spilled over onto this forum, and grew rather rancorous.

I don't want to revisit any of that.  However, I just want to say the
following...

I don't know how long it took Wolfgang to perform all this testing and
collate the results into this concise format to post on this free forum, but
I think that, beyond the shadow of a doubt, Mr Breitling qualifies as a
"scientist" by any definition of the term.  He saw the question, decided how
to prove or disprove it, tested, and posted the reasoning and the results.
Granted the results are neither conclusive nor complete, but the method is
flawless and benefits everyone.

It is a genuine pleasure to read what you write, Wolfgang!

That's all I have to say.

-Tim


on 5/18/05 9:30 AM, Wolfgang Breitling at breitliw@xxxxxxxxxxxxx wrote:

> I would say that concluding from your example that "in all modern SANs,
> unless your dfmbrc is such that you will read > 512 Kb, your mread will
> be lower than sread" is a rather bold statement.
> 
> Excluding caching, mreadtm should always be higher than sreadtm since on
> average it should take the same amount of time to position the head and
> wait for the rotational delay until the data shows up under the head.
> But it takes longer to transmit nK of data than mK when n > m.
> 
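The argument above can be put into a toy model: service time is
positioning (seek plus rotational delay) plus transfer, and only the
transfer part grows with the request size. The sketch below is Python; the
function name and every default value (4 ms seek, 3 ms rotational delay,
50 MB/s transfer) are illustrative assumptions, not measurements from this
thread, and caching is deliberately ignored.

```python
def read_time_ms(blocks, block_kb=8, seek_ms=4.0, rotational_ms=3.0,
                 transfer_mb_per_s=50.0):
    """Toy estimate of one uncached read: positioning + transfer.

    All defaults are hypothetical; only block_kb=8 mirrors the 8K block
    size mentioned in the test description.
    """
    # Positioning cost is the same whether we read 1 block or 128.
    positioning = seek_ms + rotational_ms
    # Transfer cost grows linearly with the amount of data requested.
    megabytes = blocks * block_kb / 1024.0
    transfer = megabytes / transfer_mb_per_s * 1000.0
    return positioning + transfer


if __name__ == "__main__":
    for n in (1, 8, 32, 128):
        print(n, round(read_time_ms(n), 3))
```

Under these assumptions a multiblock read can never be cheaper than a
single-block read; only caching or prefetch can reverse the ordering.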
> I did a quick test, setting db_file_multiblock_read_count to 1, 2, 8,
> 16, 32, 64, 128, and 256 for a table on an 8K blocksize LMT with uniform
> 4M extents stored on an IBM ESS (Shark) 700. These are the numbers from
> the ELA of the extended 10046 trace sequential and scattered read wait
> events:
> 
> blocks         ela
>      1     539.095
>      2     682.760
>      3     795.782
>      4     911.000
>      6    1066.778
>      7    1171.429
>      8    1274.440
>      9    1824.500
>     10    1912.000
>     11    1994.800
>     15    2569.000
>     16    2812.132
>     25    3794.500
>     26    3880.000
>     31    4688.000
>     32    4790.857
>     36    5218.000
>     38    5260.000
>     40    5332.667
>     56    7578.000
>     57    7565.833
>     64    8454.308
>    102   12553.500
>    108   13349.000
>    128   15635.545
> 
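For anyone who wants to produce the same kind of table, averages like
these can be pulled out of an extended 10046 trace with a few lines of
Python. This is only a sketch: it assumes WAIT lines of the usual form
(blocks= in 10g traces, p3= in older ones), the function name is made up,
and the sample lines are invented for illustration rather than taken from
the test above.

```python
import re
from collections import defaultdict

# Matches single-block and multiblock read waits; the block count is the
# "blocks=" parameter in 10g traces and "p3=" in earlier releases.
WAIT_RE = re.compile(
    r"^WAIT #\d+: nam='db file (?:sequential|scattered) read' "
    r"ela=\s*(\d+) .*?\b(?:blocks|p3)=(\d+)"
)


def average_ela_by_blocks(lines):
    """Return {blocks_read: average ela} for the read waits in lines."""
    totals = defaultdict(lambda: [0, 0])  # blocks -> [sum of ela, count]
    for line in lines:
        m = WAIT_RE.match(line)
        if m is None:
            continue  # not a db file read wait
        ela, blocks = int(m.group(1)), int(m.group(2))
        totals[blocks][0] += ela
        totals[blocks][1] += 1
    return {b: s / n for b, (s, n) in sorted(totals.items())}


# Invented sample lines, not real trace output.
SAMPLE = [
    "WAIT #1: nam='db file sequential read' ela= 500 file#=4 block#=10 blocks=1 obj#=100 tim=1",
    "WAIT #1: nam='db file sequential read' ela= 600 file#=4 block#=11 blocks=1 obj#=100 tim=2",
    "WAIT #1: nam='db file scattered read' ela= 1200 p1=4 p2=12 p3=8",
    "WAIT #1: nam='SQL*Net message from client' ela= 99 p1=1 p2=1 p3=0",
]

if __name__ == "__main__":
    print(average_ela_by_blocks(SAMPLE))  # {1: 550.0, 8: 1200.0}
```

Grouping by the block count actually read, rather than by the dfmrc
setting, is what makes short reads visible as separate rows.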
> I failed to clear the buffer between reads, which is why some "odd"
> counts show up that do not coincide with any of the dfmrc settings. But
> in general, with the exception of multiblock reads 56 and 57, more
> blocks take longer to read than fewer, and thus mreadtm should be higher
> than sreadtm.
> 
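The claim about the table is easy to check mechanically. The Python sketch
below uses the figures exactly as posted above; the helper names are
hypothetical. It lists the adjacent rows where a larger read was faster
than a smaller one and computes the elapsed time per block.

```python
# (blocks per read, ela) pairs copied from the table in the post.
MEASURED = [
    (1, 539.095), (2, 682.760), (3, 795.782), (4, 911.000),
    (6, 1066.778), (7, 1171.429), (8, 1274.440), (9, 1824.500),
    (10, 1912.000), (11, 1994.800), (15, 2569.000), (16, 2812.132),
    (25, 3794.500), (26, 3880.000), (31, 4688.000), (32, 4790.857),
    (36, 5218.000), (38, 5260.000), (40, 5332.667), (56, 7578.000),
    (57, 7565.833), (64, 8454.308), (102, 12553.500), (108, 13349.000),
    (128, 15635.545),
]


def non_monotonic_pairs(rows):
    """Adjacent block counts where the larger read took less time."""
    rows = sorted(rows)
    return [(a[0], b[0]) for a, b in zip(rows, rows[1:]) if b[1] < a[1]]


def ela_per_block(rows):
    """Elapsed time divided by the number of blocks read."""
    return {blocks: ela / blocks for blocks, ela in rows}


if __name__ == "__main__":
    print(non_monotonic_pairs(MEASURED))  # [(56, 57)]
    per_block = ela_per_block(MEASURED)
    print(round(per_block[1], 1), round(per_block[128], 1))
```

The only inversion is the 56/57 pair. The per-block figure does fall as
the reads get larger, which is the usual argument for multiblock reads,
but the time per read call keeps rising.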
> If system statistics are gathered over a long enough representative
> workload, mreadtm should definitely come out higher than sreadtm. If
> mreadtm is consistently less than sreadtm then I would investigate why
> that is.
> 
> 
> 
> Christo Kutrovsky wrote:
>> I've profiled my SAN. IBM FastT 700
>> 
>> Stripe size matters very little for sequential or random IO. Actually,
>> a larger stripe size is a bit better.
>>
>> Sequential reads at sizes from 512 bytes to 128 Kb are under 1 ms,
>> compared to random IO always being in the 6 ms range.
>> 
>> So in all modern SANs, unless your dfmbrc is such that you will read >
>> 512 Kb, your mread will be lower than sread.
>> 
>> P.S.
>> Not sure why you send this to me only, and not to the list.
>> 
>> On 5/17/05, Wolfgang Breitling <breitliw@xxxxxxxxxxxxx> wrote:
>> 
>>> Actually, depending on your SAN, it could just as easily be reverse. If you
>>> have a large db_file_multiblock_read_count (I always refer to it as dfmrc,
>>> taking the initials of each word) the SAN microcode could very well detect
>>> a sequential read pattern and prefetch the next chunk so that cumulatively
>>> the average multiblock read comes out very fast because later reads
>>> are serviced from the cache and do no real physical IO, whereas if you leave
>>> dfmrc at a moderate value of say 32, it may be below the prefetch radar.
>>> Christian Antognini has an interesting chart on the relationship between
>>> dfmrc and IO time on different systems. Unfortunately there is no data
>>> about the different storage architectures on those systems.
>>> If prefetch is not a factor, stripe size can come into the equation. If
>>> dfmrc is greater than the stripe size, the average IO time goes up
>>> depending on the # of physical disks involved. The IO rate is spread more
>>> evenly, avoiding hot disks, but a single large IO request can get slower.
>>> 
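The stripe-size point can be illustrated by counting how many stripe units
a single multiblock request spans. This Python sketch assumes plain
round-robin striping with no parity and no prefetch; the function names
and the sizes used in the example are hypothetical, not a description of
any particular array.

```python
def stripe_units_touched(offset_kb, size_kb, stripe_kb):
    """Number of stripe units a request of size_kb (> 0) overlaps."""
    first = offset_kb // stripe_kb
    last = (offset_kb + size_kb - 1) // stripe_kb
    return last - first + 1


def disks_touched(offset_kb, size_kb, stripe_kb, disks):
    """Physical disks involved, assuming simple round-robin striping."""
    return min(disks, stripe_units_touched(offset_kb, size_kb, stripe_kb))


if __name__ == "__main__":
    # dfmrc = 32 with 8 KB blocks is a 256 KB request.
    print(disks_touched(0, 32 * 8, 64, 8))   # 4
    # A request no larger than the stripe unit stays on one disk
    # only if it is aligned to a stripe boundary.
    print(disks_touched(0, 64, 64, 8))       # 1
    print(disks_touched(32, 64, 64, 8))      # 2
```

A request that spans several disks completes only when the slowest of
them has answered, which is one way a single large IO can get slower even
though the load is spread more evenly.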

--
http://www.freelists.org/webpage/oracle-l
