Re: Poor Performance of undo space management in 10.2 - bug info
- From: Daniel Fink <daniel.fink@xxxxxxxxxxxxxx>
- To: john.hallas@xxxxxxxxxx
- Date: Fri, 15 Feb 2008 11:50:42 -0700
These are actually problems that date back to 9i. The first problem we
encountered was the incorrect status of the extent. The second was the
excessive growth of the undo tablespace as new undo segments were
created in excess of actual transaction load (eventually crashing the
instance).
We found (and could reproduce) a situation where extents containing
transactions committed less than 1 minute earlier (with a 60 minute
undo_retention) were marked as expired, while other extents containing
data committed more than 24 hours earlier were still marked as unexpired.
We never received an adequate explanation of the behavior.
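A minimal sketch of how the extent status described above might be inspected, assuming access to the standard DBA_UNDO_EXTENTS dictionary view. The queries are illustrative and are not taken from the original investigation.

```sql
-- Summarize undo extents by status (ACTIVE, EXPIRED, UNEXPIRED)
-- for each undo tablespace.
SELECT tablespace_name,
       status,
       COUNT(*)                        AS extents,
       ROUND(SUM(bytes) / 1024 / 1024) AS mb
FROM   dba_undo_extents
GROUP  BY tablespace_name, status
ORDER  BY tablespace_name, status;

-- List individual extents with their commit time so that the status
-- can be compared by eye against undo_retention.
-- COMMIT_WTIME is a character column, so it is displayed here
-- rather than used for sorting.
SELECT segment_name,
       extent_id,
       status,
       commit_wtime
FROM   dba_undo_extents
ORDER  BY segment_name, extent_id;
```

An extent showing EXPIRED with a commit time inside the retention window, or UNEXPIRED with a commit time far outside it, would match the behavior described.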
The second issue was hard to reproduce, as it occurred only on large OLTP
systems with very high transaction load (thousands of transactions per
second, though only about 2 thousand concurrent transactions) and was
very sporadic. New segments were created even when existing segments were
usable (no active transactions in them). Eventually, thousands of undo
segments were created, and once created an undo segment is *never*
dropped (a real weakness in automatic undo). This was the result of how
SMON managed the segments and tracked used/usable segments.
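One way to watch for this kind of segment growth, sketched under the assumption that the standard DBA_ROLLBACK_SEGS and V$ROLLSTAT views are available. The column aliases are illustrative.

```sql
-- Count undo segments per tablespace and status. A count that keeps
-- climbing well beyond the number of concurrent transactions would be
-- consistent with the growth described above.
SELECT tablespace_name,
       status,
       COUNT(*) AS undo_segments
FROM   dba_rollback_segs
GROUP  BY tablespace_name, status
ORDER  BY tablespace_name, status;

-- Compare online segments with those actually holding active
-- transactions. V$ROLLSTAT lists online segments only, and XACTS is
-- the number of active transactions in each.
SELECT COUNT(*)                                   AS online_segments,
       SUM(CASE WHEN xacts > 0 THEN 1 ELSE 0 END) AS segments_in_use,
       SUM(xacts)                                 AS active_txns
FROM   v$rollstat;
```

Sampling both queries at intervals during peak load should show whether new segments appear while existing ones sit idle.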
All in all, automatic undo (from 9.2 on) is pretty good and works well
for 99% of the systems out there. Like anything, you do need to be
careful when working with a high volume system. If you want to learn
more about automatic undo (9i...I never updated the research nor paper
for 10g), you can download my paper on Automatic Undo Internals at
http://www.optimaldba.com/papers/AutomaticUndoInternals.pdf
Regards,
Daniel Fink
--
Daniel Fink
Oracle Performance, Diagnosis and Training
OptimalDBA http://www.optimaldba.com
Oracle Blog http://optimaldba.blogspot.com
John Hallas wrote:
This note is a heads-up on a bug which has caused me some problems over
the last few days and yet is easily identifiable and resolvable (once
you recognize the issue).
We have spent the last 2 days trying to run a benchmark to prove end
to end performance of a new code set. We have been plagued by undo
tablespace problems which had the following symptoms:
* Rapid growth of used undo tablespace
* Serious deterioration in performance as the undo tablespace got
very full
* Loss of application connectivity as responses were not received
in time (trading system with about 7 servers being used to hold
components of the system)
The complex set up of the test rig tended to mask the database aspect
on each run and it was only this morning that we really focused on the
undo tablespace.
We had AUM set with a 60 second retention period, and undo tablespaces
of various sizes on both RamSan and local disk.
The problem was identified as undo extents remaining marked as
unexpired well past the retention period, despite all connections
being terminated and no active transactions running.
Searching Metalink turned up Note 5387030.1, which refers to a bug with
TUNED_UNDORETENTION. The value can be seen in v$undostat. Once we had run
alter system set "_smu_debug_mode" = 33554432; the
v$undostat.tuned_undoretention statistic dropped from 345600 to 2188 and
performance improved, with unexpired undo segments hardly rising despite
heavy throughput.
This bug is present in 10.2.0.1 through 10.2.0.3 and is fixed in 10.2.0.4 and V11.
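A rough sketch of the check described above, assuming the standard V$UNDOSTAT and V$PARAMETER views. For scale, the reported value of 345600 seconds is 4 days and 2188 seconds is about 36 minutes, against a configured retention of 60 seconds. The ALTER SYSTEM statement is the one quoted in this note; hidden underscore parameters are normally changed only on the advice of Oracle Support.

```sql
-- Configured retention, in seconds.
SELECT value AS undo_retention
FROM   v$parameter
WHERE  name = 'undo_retention';

-- Auto-tuned retention and undo block counts per statistics interval,
-- most recent first.
SELECT begin_time,
       end_time,
       txncount,
       maxquerylen,
       tuned_undoretention,
       unexpiredblks,
       expiredblks,
       activeblks
FROM   v$undostat
ORDER  BY begin_time DESC;

-- Workaround as quoted in this note.
ALTER SYSTEM SET "_smu_debug_mode" = 33554432;
```

Running the V$UNDOSTAT query again after the change should show whether TUNED_UNDORETENTION has fallen back toward the configured value.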
John
+44 (0)113 223 2274 (direct)
+44 (0)113 297 9797