ORA-00376 errors - oracle-l / Investigating ORA-01552

Investigating ORA-01552 / ORA-00376 errors

From: Renee Dekenah <renee.dekenah@xxxxxxxxxxxxxx>
To: oracle-l@xxxxxxxxxxxxx
Date: Wed, 21 Jun 2006 14:21:44 +0100

Hi,

I am investigating the cause of a problem we had with an undo tablespace and would appreciate hearing your thoughts on this. While I have fixed the problem by creating a new replacement undo tablespace, and everything now seems fine, I am interested in why the problem occurred, and whether there is a better solution than the one I employed.

Problem Description:

We are running IBM Rational ClearQuest on Oracle 10g R2. On Monday ClearQuest users started getting the error ORA-01552: cannot use system rollback segment for non-system tablespace 'CQ_USER' when trying to add a new entry within ClearQuest.

Full error message from ClearQuest ERROR! SQLExecute: RETCODE=-1, State=HY000, Native Error=1552 SQL statement="update dbglobal set next_request_id=906 where site_id=2 and next_request_id=905" [DataDirect][ODBC Oracle Wire Protocol driver][Oracle]ORA-01552: cannot use system rollback segment for non-system tablespace 'CQ_USER'.

Further Details:

Upon investigation using Enterprise Manager, it appeared that the CQ_USER tablespace had no space left - a critical error for this was logged on the preceding Friday - and it was not set to autoextend. We rectified this by changing it to autoextend in Enterprise Manager and applying the changes, thus clearing the tablespace critical errors. (please don't kill me for the tablespace being full, I'm just helping out an ex-employer, monitoring the database was not my responsibility)

However, this did not resolve the initial Oracle error shown in ClearQuest, so 
we investigated further.

On Thursday/Friday, the alert error log in BDUMP logged a few ORA-1653 errors relating to tables that could not be extended in the (then) almost full CQ_USER tablespace.

On the Saturday, there was the following: KCF: write/open error block 0x29 online=1 file=2 D:\ORACLE\ORADATA\...\UNDOTBS01.DBF error=27072 text: 'OSD-04008: WriteFile() failure, unable to write to file O/S-Error: (OS 33) The process cannot access the file because another process has locked a portion of the file.' Automatic datafile offline due to write error on file 2: D:\ORACLE\ORADATA\...\UNDOTBS01.DBF

Since then - even after extending the CQ_USER tablespace on Monday - the system logged many of these: Errors in file c:\oracle\admin\...\..._pmon_2604.trc: ORA-00376: file 2 cannot be read at this time ORA-01110: data file 2: 'D:\ORACLE\ORADATA\...\UNDOTBS01.DBF'

Enterprise Manager claimed the undo tablespace (UNDOTBS1) was ReadWrite status, and 0MB in size. When I used the SQLPlus command line to confirm the status of this tablespace it was listed as AVAILABLE. Dbverify on the file UNDOTBS01.DBF showed nothing failing or corrupt, but Enterprise Manager did list this file as "Offline (Needs Recovery)". Based on this, and some information I found online, I decided to create a new undo tablespace and switched the database to use that instead. The ClearQuest error no longer occurs.

Questions:

1. Was the undo datafile being locked/going offline a result of the tablespace maxing out, or could it have been an unrelated error? (note that we have Symantec backups running daily on the database server and database, I seem to remember reading somewhere that backup software can sometimes cause datafile problems?)

2. Was recreating the undo tablespace the best solution, or should I have done something else?

Many thanks for any assistance you can offer,

Renee


--
//www.freelists.org/webpage/oracle-l

Follow-Ups:
- Re: Investigating ORA-01552 / ORA-00376 errors
  - From: Paul Drake

Investigating ORA-01552 / ORA-00376 errors

Other related posts: