Re: cache buffer chains/where in code

  • From: Christo Kutrovsky <kutrovsky.oracle@xxxxxxxxx>
  • To: Martin Berger <martin.a.berger@xxxxxxxxx>
  • Date: Wed, 9 Dec 2009 22:04:15 -0500

Hi All, an update on this issue. I am actively working on this trying to
find and eliminate the cause.

I've separated the CPUs on which Oracle runs. I've assigned it 32 of the 64
threads. Comparable to 8 'real' CPUs. I have 3 applications connecting to
this database. Two are remote (same type, different servers) and are read
only, 1 is local and is read/write. The remote applications are C++, the
local one is java. The applications are "driven" by a java test suite
simulating users.

For whatever reason, at some random point BOTH remote applications send a
storm of requests at a specific instant. This is observed as a very deep run
queue (400-600 via vmstat) and matching v$active_session_history data.
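
For reference, this is roughly how I line the run queue spikes up against
ASH (just a sketch via sqlplus; the 10-minute window and the log file name
are arbitrary, adjust to your environment):

sqlplus -s / as sysdba <<'EOF' > ash_per_sec.log
set pagesize 500 linesize 200
-- active sessions per second over the last 10 minutes
select to_char(sample_time, 'YYYY-MM-DD HH24:MI:SS') sample_sec,
       count(*) active_sessions
  from v$active_session_history
 where sample_time > sysdate - 10/1440
 group by to_char(sample_time, 'YYYY-MM-DD HH24:MI:SS')
 order by 1;
EOF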

When this happens, there are 2 scenarios, with one of them occurring most of
the time:
- high user mode CPU with corresponding "cursor: pin S" waits and many
sessions on CPU. This occurs rarely.
- high system (kernel mode) CPU (100% of all assigned threads), which shows
up in v$active_session_history either as missed samples or as cache buffers
chains waits.
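
To tell the two scenarios apart after the fact, I break the same ASH window
down by state and event (again only a sketch, same assumptions as above):

sqlplus -s / as sysdba <<'EOF' > ash_by_event.log
set pagesize 500 linesize 200
-- per-second breakdown: ON CPU vs "cursor: pin S" vs latch waits
select to_char(sample_time, 'HH24:MI:SS') sample_sec,
       decode(session_state, 'ON CPU', 'ON CPU', event) what,
       count(*) sessions
  from v$active_session_history
 where sample_time > sysdate - 10/1440
 group by to_char(sample_time, 'HH24:MI:SS'),
          decode(session_state, 'ON CPU', 'ON CPU', event)
 order by 1, 3 desc;
EOF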

What I am after is the second scenario, the high kernel mode CPU. At this
point I will not be addressing the reason for the storm of connections, nor
reducing the number of connections.

When this high kernel mode CPU is happening, mpstat reports a very high
number of smtx - spins on mutexes. Aggregated over all 32 threads (virtual
CPUs) it's in the 1-2 million range, versus the 1-2 thousand range during
"normal" processing. This "condition" lasts for 5-8 seconds at most.
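
Since mpstat -a already aggregates per processor set, pulling the smtx
spikes out of the collected log (see the collectors below) is just a filter
over the tab-separated fields. A rough sketch - the column offset assumes
the usual Solaris layout (SET minf mjf xcal intr ithr csw icsw migr smtx
...) and the 100000 threshold is arbitrary, so check both against your
output first:

# print only the intervals where the aggregated smtx exceeds the threshold
perl -F'\t' -ane 'print if $F[10] > 100000' mpstat_a1.log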

Now I have to find a way to do long-term reporting on "lockstat -C" and to
time/graph where it occurs. At this point I am considering OS bugs as well.
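
The rough plan for the lockstat side is something like this (only a sketch -
the sampling and sleep intervals are guesses, and I still have to verify the
probe effect is acceptable before leaving it running for long):

# every minute, take a 5 second kernel lock contention profile with 10-frame
# stacks, keep the top 20 events, and timestamp it so it can be lined up
# with the vmstat/mpstat logs later
nohup sh -c 'while true; do
  echo "=== `date`"
  lockstat -C -s 10 -D 20 sleep 5
  sleep 55
done' > lockstat_C.log 2>&1 &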

For those interested in how I am collecting and plotting the stats, I used
this simplistic approach:

# vmstat every second, timestamped, with the header lines filtered out
nohup vmstat 1 | perl -ne '/kthr|r/ || print join("\t",scalar(localtime()),split),"\n"' > vmstat_1.log &
# mpstat aggregated per processor set (-a), every second
nohup mpstat -a 1 | perl -ne '/^SET/ || print join("\t",scalar(localtime()),split),"\n"' > mpstat_a1.log &
# per-CPU mpstat every 5 seconds
nohup mpstat 5 | perl -ne '/^CPU/ || print join("\t",scalar(localtime()),split),"\n"' > mpstat_5.log &

And then I copy/paste directly into Excel. The new version has some very
pretty conditional formatting - it shows relative 'bars' inside the cells
and makes things really easy to spot.

If the list is interested I will keep posting updates. Any ideas are also
welcome.


On Sat, Nov 28, 2009 at 2:21 PM, Martin Berger <martin.a.berger@xxxxxxxxx> wrote:

>
> It's very interesting to follow this thread, as it seems there are 2
> discussions:
> Christo wants to know why this particular latch (and only this one) causes
> problems at big load (maybe even not at such big system load, this is not
> clear to me right now);
> but Greg warns to consider the special 'features' of a CMT system regarding
> general CPU queues in an OLTP system.
> To hunt Christo's hot latch, I'd follow Tanel's suggestions (LatchProfX) and
> more or less ordinary tuning methods.
> I totally agree with Greg's considerations; I'd just like to see valid numbers
> (response times in relation to any of the possible 'processors') - this
> might enable us to pinpoint Greg's 65% to dedicated numbers (processes,
> transactions, whatever).
> If I'm totally wrong, please tell me; otherwise I'll follow these
> conversations (both) with high interest!
>
>
> On 28.11.2009 at 18:05, Greg Rahn wrote:
>
>  Given that config, I'd say that system has over 4X the number of
>> db connections it probably should (and needs to work well) - I'd back
>> it down to 64 as a starting point and make sure the connection pool
>> does not grow.  Set the initial and max connections to be the same
>> number.  One might think that you need more sessions to keep the CPUs
>> busy (and you may need more than 1 per CPU thread) but the reality is
>> this:  With a high number of sessions, the queue is longer for
>> everything.   The chance of getting scheduled when it needs to goes
>> down, and if the load is fairly steady and medium to high, any
>> "blip" will cause a massive queue for a resource.  Consider what
>> happens when calls are taking milliseconds and, for a split second,
>> some session holds a shared resource - it may take the system tens of
>> minutes to recover from that backlog.  This is why most high
>> throughput OLTP systems only want to run at a max of 65% (or so) CPU
>> utilization with very short run queues - so that if there is any slow
>> down, there is enough resource headroom to recover.  Otherwise the
>> system will likely be in an unrecoverable flat spin at Mach 5.
>>
>> On Sat, Nov 28, 2009 at 12:13 AM, Christo Kutrovsky
>> <kutrovsky.oracle@xxxxxxxxx> wrote:
>>
>>> Greg,
>>>
>>> It's a single UltraSPARC T2 CPU, which is 8 cores with 8 threads per core.
>>> Note that each core has 2 integer pipelines, so you could assume 16 CPUs
>>> and 64 threads.
>>>
>>> There are many things that are wrong with this setup, and reducing the
>>> number of connections is something I am considering. However, it's not that
>>> simple. Imagine that instead of CPU those sessions were doing IO - you want
>>> a relatively deep IO queue to allow the raid array to deliver.
>>>
>>> One thing that puzzles me: given that the suspicion is that a deep CPU run
>>> queue is the problem, why is only one very specific latch causing the
>>> problem? There are several different types of queries running at the same
>>> time - why is only one specific query causing latch contention, and not the
>>> other ones?
>>>
>>
>> --
>> Regards,
>> Greg Rahn
>> http://structureddata.org
>> --
>> //www.freelists.org/webpage/oracle-l
>>
>>
>>
>


-- 
Christo Kutrovsky
Senior Consultant
Pythian.com
I blog at http://www.pythian.com/blogs/
