Re: My worst nightmare - ORA-8103

  • From: Kenny Payton <k3nnyp@xxxxxxxxx>
  • To: Maureen English <maureen.english@xxxxxxxxxx>
  • Date: Tue, 10 Jun 2014 19:02:16 -0400

If the blocks appear to be empty Oracle formatted blocks I would suspect a
bug on the Oracle side.   It will be interesting to hear what support comes
back with from a root cause perspective.

If, by lost IO, they are speculating an IO to the database was written but
somewhere between Oracle and the disk was lost there should be clear signs
of the block changes in the redo stream.

Kenny
On Jun 10, 2014 6:55 PM, "Maureen English" <maureen.english@xxxxxxxxxx>
wrote:

> This is really an odd problem.  We ran the rman validate commands on
> datafiles and the whole database.
> We also ran dbverify on all of the datafiles.  Nothing comes back marked
> as corrupt.  So, maybe the
> word corrupt isn't quite right.  The data is truly inaccessible, though.
>  Oracle says it's lost I/O:
>
>  The ora-8103 was raising accessing rdba: 0x30c76bb9 (195/486329), because
>> the type=0x00= unkown
>>
>> scn: 0x0000.00000000 seq: 0x01 flg: 0x05 tail: 0x00000001
>> frmt: 0x02 chkval: 0x5ad8 type: 0x00=unknown
>>
>> This block looks as a new block, I don't think had any data in.
>> Means you had lost IO.
>>
>
> But when I copy all of the rows from the table that don't give me an
> ORA-8103 error, using an Oracle
> provided script, I'm missing 480 rows....  Whether or not those rows have
> needed data in them or not
> is a good question....
>
> - Maureen
>
> On 6/10/2014 1:23 PM, Kenny Payton wrote:
>
>>
>> You might want to run a “validate database” in RMAN and check
>> v$database_block_corruption.  You could also start at a datafile level if
>> you can narrow down where you think the corruption exists with a "validate
>> datafile” command.
>>
>> Once you identify the blocks you you can then look to RMAN for block
>> recovery as Jonathan suggested.  If you have had successful backups, and
>> you haven’t set MAXCORRUPT to a non-zero value, then you should be able to
>> use those backups to recover the blocks.  Once the blocks are recorded in
>> v$database_block_corruption you could use “RECOVER CORRUPTION LIST” to
>> restore/recover these blocks.
>>
>>
>> Kenny
>>
>>
>>
>> On Jun 10, 2014, at 5:05 PM, Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>
>> wrote:
>>
>>
>>> The error number is still a puzzle - I would have expected 1578 or 1410
>>> (corrupt block, or invalid rowid), but I guess if a block has got corrupted
>>> so that it seems to belong to an object with a higher data object id than
>>> expected then you might get an 8103.  (That's speculation - there may be
>>> other more obvious reasons why you'd get an 8103 that I've overlooked).
>>>
>>> Recover from last know is the "obvious" strategy - but if you've been
>>> doing rman backups then there's a "block recover" option that could tell
>>> rman to pick specific blocks from the file backup and roll forward through
>>> archived redo logs from there.  I guess this is why Oracle support have
>>> been trying to find the rowids of the lost data. Traditionally you do this
>>> by forcing a tablescan (e.g. select count(*) where non-indexed, nullable
>>> column = value you don't expect to see) and checking what gets into the
>>> dump file.  Recover the guilty block, and repeat until no more corrupt
>>> blocks. Perhaps you've already got the list of blocks since you know there
>>> are 480 rows missing, though.
>>>
>>>
>>>
>>> Regards
>>> Jonathan Lewis
>>> http://jonathanlewis.wordpress.com
>>> @jloracle
>>>
>>> ________________________________________
>>> From: Maureen English [maureen.english@xxxxxxxxxx]
>>> Sent: 10 June 2014 21:39
>>> To: Jonathan Lewis; oracle-l@xxxxxxxxxxxxx
>>> Subject: Re: My worst nightmare - ORA-8103
>>>
>>> No partitioned objects.
>>>
>>> Per Oracle Support, it's due to lost I/O.  The blocks are apparently
>>> empty, not formatted,
>>> data is probably not recoverable.  The Support Analyst suggested restore
>>> the datafiles for
>>> the tablespace from the last good backup then doing a restore/recover to
>>> roll forward to
>>> the current time.
>>>
>>> - Maureen
>>>
>>> On 6/10/2014 11:07 AM, Jonathan Lewis wrote:
>>>
>>>>
>>>>
>>>> Ora-8103 is "object no longer exists" - how does this tie in with a
>>>> corrupted block ?  Is this a partitioned object with a missing partition ?
>>>>
>>>>
>>>> Regards
>>>> Jonathan Lewis
>>>> http://jonathanlewis.wordpress.com
>>>> @jloracle
>>>>
>>>> ________________________________________
>>>> From: oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on
>>>> behalf of Maureen English [maureen.english@xxxxxxxxxx]
>>>> Sent: 10 June 2014 19:28
>>>> To: oracle-l@xxxxxxxxxxxxx
>>>> Subject: My worst nightmare - ORA-8103
>>>>
>>>> Hi,
>>>>
>>>> Original post to BOracle list...apologies to those who are seeing this
>>>> again.
>>>>
>>>> On 5/31, the refresh of a materialized view in our reporting instance
>>>> failed
>>>> with an ORA-8103 error, as did a gather stats job for the same table in
>>>> the
>>>> production database.  We had an application upgrade done on 6/1 and
>>>> copied
>>>> our production database to a preprod version on 6/2, so the errors
>>>> weren't
>>>> caught immediately.
>>>>
>>>> Users complained that the current data wasn't available in the
>>>> reporting instance
>>>> but my attempts to refresh and recreate all failed with the same
>>>> ORA-8103 error.
>>>>
>>>> I've been working with Oracle Support since early last week and keep
>>>> hitting
>>>> brick walls.  At the moment, we've managed to copy 'uncorrupted' rows
>>>> out of the
>>>> table in our preproduction database and are 480 rows short in a table
>>>> with 550M
>>>> rows in it.  Oracle is currently working to identify the rowids of the
>>>> corrupt
>>>> blocks based on the output in a trace file generated by a failing query.
>>>>
>>>> I'm looking at Document 336133.1 while I wait for more info from Oracle.
>>>>
>>>> Anyone have any comments/suggestions/other info that might help
>>>> identify and fix
>>>> the problem?  We really can't afford to lose that much data.  We're
>>>> working on
>>>> recovering our database to a different location to try to get back any
>>>> data that
>>>> we lose, but since I don't have any idea what caused the corruption,
>>>> I'm lost.
>>>>
>>>> - Maureen
>>>> --
>>>> //www.freelists.org/webpage/oracle-l
>>>>
>>>>
>>>> --
>>>> //www.freelists.org/webpage/oracle-l
>>>>
>>>>
>>>>
>>> --
>>> //www.freelists.org/webpage/oracle-l
>>>
>>>
>>>
>>
> --
> Maureen English
> Lead Database Administrator
> University of Alaska
> Fairbanks, AK
> (907) 450-8329
>

Other related posts: