Re: log writer tuning


Thanks for all the suggestions and the explanation of the whole process.

Mark W. Farnham wrote:
Switch related "log file sync" could be driven by competition for i/o by
ARCH. That can be cured by alternating the online logs on areas of the disk
farm that are not in competition with each other for i/o. (If you're not
archiving, this is not an issue. If you have a similar test box you can
switch to noarchivelog mode and run a load test, then you can see if this is
the problem pretty easily.)
That is a very good point. I will take it into consideration, thanks.
If shrinking the log_buffer size reduces "log file sync" waits then you just
slowed your system down, but that is quite unlikely unless you have commits
larger than one-third the buffer size after you shrink it. Since you have
many, many small commits that is unlikely to be the case.

Having the log buffer bigger than it needs to be just steals memory you
could be devoting to something else, so if you're memory rich it doesn't
hurt much. Having the log buffer too small will hurt, especially if you have
some otherwise unmolested large commits, but the size that is "too small" is
a bit difficult to measure and varies by circumstance in any case. So
usually I make it big enough to remove it from consideration without wasting
too much.
So decreasing the log_buffer size would only give us some memory back. I am now convinced it would not help us much.
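
To make the one-third remark concrete for myself, here is a minimal Python sketch. It assumes, as the comment above implies, that one-third of the log buffer is the size at which a single commit starts to matter. The function name and the sample sizes are made up for illustration and are not measurements from our system.

```python
def commit_exceeds_one_third(commit_redo_bytes, log_buffer_bytes):
    """Return True if one commit's redo is larger than one-third of the log buffer.

    Assumption: one-third of log_buffer is the relevant threshold,
    taken from the rule of thumb quoted above.
    """
    if log_buffer_bytes <= 0:
        raise ValueError("log_buffer_bytes must be positive")
    # Integer comparison avoids rounding issues with division by three.
    return commit_redo_bytes * 3 > log_buffer_bytes


# Hypothetical sizes: a small 4 KB commit and a large 512 KB commit,
# both checked against a 1 MB log buffer.
print(commit_exceeds_one_third(4 * 1024, 1024 * 1024))    # False
print(commit_exceeds_one_third(512 * 1024, 1024 * 1024))  # True
```

With many small commits like ours, the first case is the typical one, which matches the conclusion that shrinking the buffer should not change much.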

Cary Millsap wrote:
Another comment worth making: when <log file sync> is a dominant
response time contributor, it can be because of CPU starvation as well.
(By "CPU starvation", I mean too much CPU workload for the CPU capacity
you have. ...Such as what happens when several CPU-intensive queries run
concurrently.)

A <log file sync> is basically the time between the committing process's
posting a message to LGWR and then getting the response back. In between
those two events, these things have to happen:

A. LGWR has to be scheduled and begin executing its code path,
B. ...which of course includes the flush of the buffer.
C. Then the committing process has to be scheduled and returned to its
code path, where it can issue an OS timer call to see how long the <log
file sync> event took.

Most people who see <log file sync> automatically jump to the conclusion
that the problem is B. ...It's always disk for some people, even when
it's really not. But on a system that's CPU starved, A and C can be the
dominant time consumers. Step A is also a dominant time consumer on
systems where people (DO NOT DO THIS!!!) renice their LGWR process to
have a diminished slot in the OS scheduler's pecking order.
Thanks for this insight. However, I suppose CPU starvation is not the case here: we usually run at an average load of 6 with 12 processors (I believe the Solaris load number is closely tied to the number of processors and is a good estimate here).
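
As a rough check of that reasoning, here is a small Python sketch that divides the load average by the processor count. The 1.0 cutoff is only my rule-of-thumb assumption, not a limit defined by Solaris, and both function names are hypothetical.

```python
def load_per_cpu(load_average, cpu_count):
    """Load average divided by the number of processors."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average / cpu_count


def looks_cpu_starved(load_average, cpu_count, threshold=1.0):
    """Crude heuristic with an assumed threshold: more runnable work than CPUs."""
    return load_per_cpu(load_average, cpu_count) > threshold


# Our figures: average load of 6 on 12 processors.
print(load_per_cpu(6, 12))       # 0.5
print(looks_cpu_starved(6, 12))  # False
```

An average of 0.5 per processor can still hide short bursts, so this only suggests that steps A and C are unlikely to dominate, not that they never contribute.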

To conclude, we will have an occasion to reconfigure our storage in the near future. I suppose it would be good to separate the log files from the datafiles and, since it should be possible, to split the redo storage into 2 mount points, with odd logs on one and even logs on the other.
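
The odd/even layout I have in mind can be sketched in a few lines of Python. The mount point names /redo_a and /redo_b are placeholders, not our real paths, and the sketch assumes that log switches move through the groups in numeric order, so that the group being archived and the group being written sit on different mount points.

```python
def redo_mount_for_group(group_number, odd_mount="/redo_a", even_mount="/redo_b"):
    """Pick a mount point so that consecutive log groups land on different storage.

    The mount point defaults are hypothetical placeholders.
    """
    if group_number < 1:
        raise ValueError("log group numbers start at 1")
    return odd_mount if group_number % 2 == 1 else even_mount


# Planned placement for four log groups.
for group in range(1, 5):
    print(group, redo_mount_for_group(group))
```

This is only a planning aid for deciding where each log file goes. The actual move would still be done with the usual database and storage tools.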

Thanks once again
Remigiusz


--
---------------------------------------
Remigiusz Sokolowski <rems@xxxxxxxx>
WP/PTI/DIP/ZAB (+04858) 52 15 770
MySQL  v.  4.x
Oracle v. 10.x
---------------------------------------

--
http://www.freelists.org/webpage/oracle-l

