Re: P3 for "gc buffer busy"

  • From: "Jonathan Lewis" <jonathan@xxxxxxxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 30 Jul 2013 23:17:21 +0100

Idle speculation - when a transaction needs a new undo block it updates the 
transaction table slot (and when a transaction start or ends, it updates a 
transaction table slot, of course). But these changes tend to be relatively 
infrequent because transactions are spread across a number of undo 
segments.

When a large transaction is rolling back, it decrements the undo block 
counter in the transaction table slot each time it moves (backwards) from 
one undo block to the previous used undo block.

If you had a session rolling back a very large transaction it could result 
in a large number of updates to the segment header (hence current gets, 
leading to BBWs from other sessions) in a short period of time. Since RAC 
nodes can collide when checking for transaction commit time (ktugct) the 
effect could be exaggerated cross-instance.

Rebooting the system would result in a "less urgent" rollback after the 
database restarted - and might even bypass the blockwise update to the 
transaction table slot.


Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com/all-postings

Author: Oracle Core (Apress 2011)
http://www.apress.com/9781430239543

----- Original Message ----- 
From: "Herring, David" <HerringD@xxxxxxx>
To: <oracle-l@xxxxxxxxxxxxx>
Sent: Tuesday, July 30, 2013 7:57 PM
Subject: P3 for "gc buffer busy"


| Folks,
|
| I'm wondering if one of you fine listers would be able to point me in the 
right direction concerning a recent problem we had.  The system in question 
is a 4-node RAC on 10.2.0.2, RHEL 4.6 and the database is primarily an OLTP 
db.
|
| Yesterday afternoon a bunch of INSERT statements on nodes 1 and 2 started 
running really slow.  We get thousands of these small INSERTs - straight 
data, no joins.  I checked on top waits and saw "gc buffer busy" being the 
top wait by a factor of 200.  Checking that wait, p1 = 6 and p2 = 69.  I 
checked for FILE_ID = 6 and saw it was UNDOTBS3, our undo tablespace on 
node 3.  I dumped this block and the tracefile ends with the transaction 
table listing (TRN TBL) so I assume file# 6, block# 69 in UNDOTBS3 is an 
undo segment header.
|
| This is where I am stuck.  I had a ton of statements running VERY slow, 
all waiting on an undo segment header for node 3.  What I couldn't 
determine is why?  I can understand needing to lookup a block in undo but I 
couldn't determine why that access was so extremely slow.  In the end 
because of the extreme pressure applied by those above me we decided to 
shutdown instance 3, which resolved the issue a few seconds after it was 
officially down.  I hated the brute force method but that's at that point 
my vote didn't count.
|
| What I'm stuck with is trying to answer why access of that block was so 
slow.  Is it possible/probable that an active transaction at some point 
modified a block in TABLEA and then continued for a long time without 
commit/rollback, then all these thousands of INSERTs needed to possibly 
modify a block in TABLEA so they were all forced to try and get the 
transaction table of this undo segment on UNDOTBS3?  We had someone review 
all hardware resources, including the interconnect, and all those checks 
came back clean.
|
| I should add that before instance 3 was restarted I did a CTAS of 
GV$ACTIVE_SESSION_HISTORY, just to preserve the ASH detail so I can step 
through it and find plenty of details.  I'm just at a loss of trying to 
find out why everyone wanted this block, and who, if anyone, was somehow 
blocking on it.
|
| Dave Herring
|
|
| --
| //www.freelists.org/webpage/oracle-l
|
|
|
|
| -----
| No virus found in this message.
| Checked by AVG - www.avg.com
| Version: 2013.0.3349 / Virus Database: 3209/6536 - Release Date: 07/30/13
| 

--
//www.freelists.org/webpage/oracle-l


Other related posts: