RE: A tale of data dictionary corruption in Oracle10gR2

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: "'Hameed, Amir'" <Amir.Hameed@xxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 6 Oct 2006 12:04:02 -0400

There is part of your problem. You went in the wrong direction. Try 1000.
There is a reason that is the example in the manual. 1000 directory
references will be easily cached and quickly resolved, and then you'll be in
a directory structure accessing sweet spot up to like 1 million compiled
routines. If you pretend you had only apps, it would give you 35 or so per
directory instead of 3500.

Now that might not solve your problem, but it is a step in the right
direction. To see whether this is material to you in your OS environment,
copy the directory structure you've got to a 100 fold bushier tree and time
od'ing all the files to /dev/null for both setups. Presuming no extraneous
competition on your lab environment, the delta was the difference in
threading through the file system structure for two different topologies. If
the difference is very tiny, then all this advice is moot, but be careful to
scale the difference versus the huge number of references thinking about
this as program calling subroutines.

The other bit about doing it in upgrade mode and doing the whole database
probably introduces a lot of efficiencies, because it can probably ignore a
lot of dependency evaluation.

Again,

Good luck!

-----Original Message-----
From: Hameed, Amir [mailto:Amir.Hameed@xxxxxxxxx] 
Sent: Friday, October 06, 2006 11:34 AM
To: mwf@xxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: A tale of data dictionary corruption in Oracle10gR2

Before compiling the APPS schema, I had compiled the SYS and SYSTEM
schemas. I had also set plsql_native_library_subdir_count to 10 (the
default was 149 which I thought was a bit high ) to reduce the number of
files in a sub-directory. This can be increased if needed.

-----Original Message-----
From: Mark W. Farnham [mailto:mwf@xxxxxxxx] 
Sent: Friday, October 06, 2006 10:05 AM
To: mwf@xxxxxxxx; Hameed, Amir; oracle-l@xxxxxxxxxxxxx
Subject: RE: A tale of data dictionary corruption in Oracle10gR2

One more thing. In addition to using the subdirectory structure count
thing, you might consider the "do the whole database" thing. That is
done in a "not really available" UPGRADE mode, so not nearly as many
catch-22 situations are possible. As a fraction of APPS' routines, the
rest is pretty trivial.

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Mark W. Farnham
Sent: Friday, October 06, 2006 9:49 AM
To: Amir.Hameed@xxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: A tale of data dictionary corruption in Oracle10gR2

First, since I'm usually pretty sure about APPS stuff and I am
definitely guessing here, I want to make sure everyone knows this is
just a guess. I'm not using compiled PL/SQL with APPS anywhere yet.

Have you tried compiling schema in this order:

SYS, SYSTEM, then APPS?

My guess is based on the quite possibly misguided notion that one of the
storage management procedures is forced to recompile under the chain of
events from recompiling APPS, and that it is not functioning at the time
it is trying to be recompiled and some place in the code is retrying
forever instead of telling you what is really wrong.

If I'm even in the right ballpark, you might work around this by
pre-allocation of extents to the relevant tables (and clusters) in the
SYS schema if the order of compilation thing doesn't work.

The other big deal is that 30,000+ probably trashes your file system
inode structure pretty badly, so you're going to want to set up the
multiple subdirectory thingy as described in the very fine manual. If I
recall correctly, Oracle tells you to do this if you're over 10K or 15K
entries.
I'd probably do it much smaller than that.

Good luck.

mwf

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Hameed, Amir
Sent: Friday, October 06, 2006 8:41 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: A tale of data dictionary corruption in Oracle10gR2

I just wanted to share some information with this list on a corruption
scenario that I have recently experienced in my lab environment. I
recently upgraded an Oracle applications 11.5.9 database from version
9.2.0.6(64-bit) to version 10.2.0.2 (64-bit) on Solaris9. The upgrade
went fine. While testing the new plsql setting where the plsql code can
be natively compiled, I ran into some issues that eventually corrupted
the data dictionary. This is what happened:

1. The plsql_code_type parameter was set to "native". The goal was to
natively generate the plsql code of the APPS schema. A very brief
description of the APPS schema (for those who do not know what the APPS
schema is) is that this schema owns all the standard PLSQL code in the
11i applications and the count of the combined
packages/procedures/triggers/functions is in thousands. In my
environment, the count was 34,647

2. Because in order to natively generate the PLSQL code, the code needs
to be re-compiled. The option that I used was to execute procedure
dbms_utility.compile_schema for the APPS schema as shown below:
        dbms_utility.compile_schema('APPS',true)

The caveat of running this procedure is that it compiles everything
(unconditionally) including the views.

3. This procedure first invalidated all the plsql code as well as views.
It then started to compile them but the compilation process was very
slow and after approximately 16 hours, I checked and the invalid-count
showed that it was almost half way done. I decided to terminated the
process so I issued the "startup force" command. The instance got
terminated but then would not start in NORMAL mode and kept returning
ORA-0600 error as shown below:
        ORA-00600: internal error code, arguments:
[kksfbc-reparse-infinite-loop], [0xFFFFFFFF7A7F62F0], [], [], [], [],
[], [] I was able to start the instance only in the RESTRICT mode

4. At this point there were a lot of invalid packages owned by SYS;
which was not the case when I had started the compilation of APPS code

5. I tried to re-compile the INVALID packages owned by SYS manually via
SQLPLUS but I started to get the same ORA-0600 error

6. I opened a TAR. The analyst suggested running the catproc.sql script
which I did but it did not help and the script errored out with message:
                ERROR at line 1:
                ORA-04045: errors during recompilation/revalidation of
SYS.DBMS_SCHEDULER
                ORA-00942: table or view does not exist
                ORA-00942: table or view does not exist
                ORA-00942: table or view does not exist
                ORA-00942: table or view does not exist
        I tried to also manually re-compile this package but got the
same error

7. Oracle has concluded that because I forcefully disrupted the
compilation process, some of the timestamps in the dictionary dependency
tree have gone out of synch and that the dictionary is now corrupted.
They are suggesting that I re-run the upgrade, which I can but they have
not been able to provide me the answer on how to natively compile the
PLSQL code in a relatively quick time because we have mission critical
systems and if the only way to compile PLSQL is via the compile_schema
procedure then based upon the test timing, it is not feasible

Does anyone has any suggestions/comments on this?

Thanks
Amir

--
//www.freelists.org/webpage/oracle-l



--
//www.freelists.org/webpage/oracle-l





--
//www.freelists.org/webpage/oracle-l


Other related posts: