Re: are redo records always flushed in order?
- From: Jessica Mao <jessica.mao@xxxxxxxxxx>
- To: Jeremy Paul Schneider <jeremy.schneider@xxxxxxxxxxxxxx>
- Date: Fri, 27 Apr 2007 02:34:45 -0700
by introducing batch nowait, oracle has left tx durability in the users'
hands. i believe the db will still be physically consistent. (engineers at
oracle ain't that bad. ;o) ) but logically the data could be corrupted.
tx1 and tx2 below belong to different sessions. that's the point: if some
sessions are running in batch nowait and some in immediate wait, and their
data overlaps or has dependencies, is there any chance of data corruption?
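
to make the question concrete, here is a toy model in python. it is only a sketch under stated assumptions, not a description of how oracle actually works: the function name `durable_after_crash`, the single shared buffer, and the `flush_in_order` switch are all hypothetical. it compares two cases for a crash right after the last commit: a waiting commit forces out everything buffered before it, or (hypothetically) only its own record.

```python
def durable_after_crash(commits, flush_in_order=True):
    """Toy model, NOT Oracle internals.

    commits: list of (tx_name, waits) tuples in commit order.
    waits=True models an immediate-wait commit, waits=False a nowait commit.
    Returns the set of transactions whose commit record is on disk
    if the instance crashes right after the last commit.
    """
    buffer = []      # commit records still only in memory, oldest first
    on_disk = set()  # commit records that reached the log file
    for name, waits in commits:
        buffer.append(name)
        if not waits:
            continue  # nowait: the record stays in the buffer for now
        if flush_in_order:
            # Assumption A: a flush writes the whole buffered prefix.
            on_disk.update(buffer)
            buffer.clear()
        else:
            # Assumption B (hypothetical): only this record is forced out.
            on_disk.add(name)
            buffer.remove(name)
    return on_disk


# tx1 commits nowait, then tx2 (another session, depends on tx1) commits with wait
scenario = [("tx1", False), ("tx2", True)]
ordered = durable_after_crash(scenario, flush_in_order=True)
unordered = durable_after_crash(scenario, flush_in_order=False)
print(sorted(ordered), sorted(unordered))
```

under assumption A the dependent tx2 can never survive without tx1. under assumption B it can, which is exactly the logical corruption i'm worried about. which assumption matches the real lgwr is the open question.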
coming to redo writes and physical writes: many platforms can handle 1MB
per physical write if the db is on raw devices, so that is enough. but if
the db is on a file system, it could be as small as 8K per write and is
subject to tuning. (i didn't use lgwr but did test i/o size using simple
os cmds.) but today, as our storage specialist pointed out, os/storage
have rollback too! they should be able to make sure 1 big write request
from lgwr is either completed or failed and then cleaned up. so i could
probably put my assumption / worry about corruption from a partially
flushed redo write to rest.
p.s. it wasn't our choice to join, and we're still getting lost on the
campus. i'll come back once i have something (that's not confidential) ;o)
p.p.s. thanks for pointing to the very interesting thread. exactly why i
chose to post my Qs here.
-jessica
Jeremy Paul Schneider wrote, On 4/26/2007 6:22 AM:
FWIW here's a good discussion of private redo strands (aka zero-copy
redo):
http://www.freelists.org/archives/oracle-l/02-2005/threads.html#00630
- The thread is called "latch-free SCN scheme (10.1.0.3)"
On 4/26/07, Jeremy Paul Schneider <jeremy.schneider@xxxxxxxxxxxxxx> wrote:
Yeah... I wasn't thinking about nowait or private strands... and
I don't (yet) know a lot about the specifics of how these work
internally. Also, in addition to private strands which were
apparently introduced in 10g there's also log parallelism which
was introduced in 9i allowing multiple processes to write to
different areas of the main redo buffer simultaneously.
I don't know what the implications are of this; but as I said
before I have a hunch that this has already been carefully worked
through by the engineers at Oracle - considering the fanfare with
the release of COMMIT NOWAIT and considering the importance of
crash recoverability in Oracle.
A few other thoughts - based on my understanding of redo and crash
recovery my guess is the opposite of yours - that in your example
using COMMIT NOWAIT *any* records whose COMMIT made it into the
redo log will not be rolled back. But another thought - from what
I can gather (based on reading a few old oracle-l emails,
presentations, and my own guesses) - private redo strands and
individual buffer latches (when using parallelism) are allocated
per-process; so assuming that TX1 and TX2 are happening in the
same session, I think that their log entries would probably be
written out in order to the logfiles even if private strands or
parallelism were enabled. But that's just conjecture on my part.
Hmmm... maybe you could make a test tablespace and a test table
with a few rows and one row per block, then put the tablespace in
backup mode and spawn a few processes that update the table. Then
strace (or truss on sun) the LGWR process and see if the writes
are sequential and how big the writes are... also it's worth
pointing out that even if we're issuing 1MB writes to the OS we'd
still want to ensure that the OS is writing that data in order
(if the device itself doesn't support 1MB writes). I think it
does but I can't prove this either at the moment.
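
The strace check suggested above could be post-processed along these lines. This is a minimal sketch, assuming the trace shows LGWR writing with plain `pwrite64(fd, buf, count, offset) = ret` calls (with async I/O the trace would show other calls, and this would not apply). The function names `parse_pwrites` and `summarize` and the sample lines are made up for illustration, not captured from a real LGWR.

```python
import re
from collections import defaultdict

# Matches strace lines such as: pwrite64(18, "..."..., 1024, 512) = 1024
PWRITE_RE = re.compile(r"pwrite64\((\d+),\s.*,\s*(\d+),\s*(\d+)\)\s*=\s*(\d+)")


def parse_pwrites(lines):
    """Return (fd, requested_bytes, offset, written_bytes) for each pwrite64 line."""
    writes = []
    for line in lines:
        match = PWRITE_RE.search(line)
        if match:
            fd, count, offset, ret = map(int, match.groups())
            writes.append((fd, count, offset, ret))
    return writes


def summarize(writes):
    """Per file descriptor: write count, size range, and whether every write
    starts at or after the end of the previous write to the same fd."""
    by_fd = defaultdict(list)
    for fd, _count, offset, ret in writes:
        by_fd[fd].append((offset, ret))
    report = {}
    for fd, items in by_fd.items():
        in_order = all(
            nxt[0] >= prev[0] + prev[1] for prev, nxt in zip(items, items[1:])
        )
        sizes = [size for _offset, size in items]
        report[fd] = {
            "writes": len(items),
            "in_order": in_order,
            "min_size": min(sizes),
            "max_size": max(sizes),
        }
    return report


# Made-up sample lines in strace's output format (hypothetical fds and offsets)
sample = [
    r'pwrite64(18, "\1\"\0\0"..., 1024, 512) = 1024',
    r'pwrite64(18, "\1\"\0\0"..., 2048, 1536) = 2048',
    r'pwrite64(19, "\1\"\0\0"..., 512, 4096) = 512',
    r'pwrite64(19, "\1\"\0\0"..., 512, 1024) = 512',
]
report = summarize(parse_pwrites(sample))
for fd, stats in sorted(report.items()):
    print(fd, stats)
```

Note that an out-of-order offset is not proof of a problem by itself: online redo logs are reused in a circle, so a wrap-around or log switch would also show up as a lower offset.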
-Jeremy
PS - considering the domain name of your email address, if this is
such a critical question for your "bosses" then is there any way
they can make an inquiry to some of the engineers who actually
work on this stuff?
PPS - maybe someone who's got a lot more experience than I will
add their thoughts... then I could learn a bit more about this
too. :)
--
http://www.freelists.org/webpage/oracle-l