shared pool latching issue and missing time in trace file

From: "Sigrid Keydana" <keydana@xxxxxx>
To: oracle-l@xxxxxxxxxxxxx
Date: Tue, 11 Jun 2013 13:05:33 +0200 (CEST)
Hi,

sorry for the misformatted mail this morning - I'm trying again.

I'd like to ask for your advice debugging a shared pool latching problem. For 
background, let me state that in general, in the application concerned, shared 
pool fragmentation has been a known issue, but less so in the present since 
quite some work has been done to reduce hard parsing. However, there is a 
statement (which I cannot reproduce here but might do so in a masked form if 
required) that regularly causes the alert (deemed "harmless" by Oracle Support, 
see 330239.1)

Memory Notification: Library Cache Object loaded into SGA
Heap size 75750K exceeds notification threshold (51200K)

I have been tracing this statement in 3 different conditions:
- as SYS user on the database server
- as non-SYS user from sqlplus, on the database server
- as non-SYS user from a client machine, using PL/SQLDeveloper


1) SYS on db server
In this case, I see a small number of recursive statements in the tracefile, 
and parse cpu/elapsed time is c=419936,e=420700. No "memory notification" in 
the alert log. Query executes smoothly.


2) non-SYS on db server, using sqlplus
Here, the query does not execute smoothly. Before the entries for the recursive 
sqls, I have the following lines:

..
*** ACTION NAME:() 2013-06-10 15:31:32.544

WAIT #47026981749560: nam='latch: shared pool' ela= 65 address=1611705384 
number=307 tries=0 obj#=0 tim=1370871092544435

*** 2013-06-10 15:31:46.136
WAIT #47026981749560: nam='latch: row cache objects' ela= 119 
address=32940374584 number=280 tries=0 obj#=0 tim=1370871106136480

*** 2013-06-10 15:33:50.873
WAIT #47026981749560: nam='latch: shared pool' ela= 9308 address=1611705384 
number=307 tries=0 obj#=0 tim=1370871230873117
WAIT #47026981749560: nam='latch: shared pool' ela= 204 address=1611705384 
number=307 tries=0 obj#=0 tim=1370871230873406

Parse time is c=261363267,e=261410049 - so parsing took more than 4 minutes. 
What I don't understand is what happened during all this time - the 4 wait 
events reported look far "too short" to match the actual time elapsed (e.g., 
the longest shared pool latch wait accounting for 9 ms only…) (And just in 
parentheses, the recursive queries turn out to be very quick).

In spite of the long query duration, no memory notification occurred.


3) non-SYS from PL/SQLDeveloper
Lastly, apart from the fact that the PL/SQLDeveloper trace file is messed up by 
lots of (for me :-)) unexpected sql (at dep=0!), it resembles the second case 
for the "how was the time spent" part:

** 2013-06-06 12:04:21.832
WAIT #47816534037856: nam='latch: shared pool' ela= 1012 address=1611705224 
number=307 tries=0 obj#=-1 tim=1370513061832831
WAIT #47816534037856: nam='latch: row cache objects' ela= 94 
address=33352528640 number=280 tries=0 obj#=-1 tim=1370513061833689

*** 2013-06-06 12:04:21.881
WAIT #47816534037856: nam='latch: row cache objects' ela= 170 
address=33352528640 number=280 tries=0 obj#=-1 tim=1370513061881325

*** 2013-06-06 12:04:57.512
WAIT #47816534037856: nam='latch: row cache objects' ela= 641 
address=32940374584 number=280 tries=0 obj#=-1 tim=1370513097512746

*** 2013-06-06 12:05:05.639
WAIT #47816534037856: nam='latch: shared pool' ela= 3 address=1611705224 
number=307 tries=0 obj#=-1 tim=1370513105639833

*** 2013-06-06 12:07:59.226
WAIT #47816534037856: nam='latch: row cache objects' ela= 76 
address=32824711104 number=280 tries=0 obj#=-1 tim=1370513279226650

*** 2013-06-06 12:08:13.053
WAIT #47816534037856: nam='latch: shared pool' ela= 369 address=1611705224 
number=307 tries=0 obj#=-1 tim=1370513293053337

*** 2013-06-06 12:08:22.800
WAIT #47816533789368: nam='library cache lock' ela= 265 handle 
address=33747902984 lock address=33507635152 100*mode+namespace=1632087638018 
obj#=-1 tim=1370513302800299
WAIT #47816533789368: nam='library cache pin' ela= 180 handle 
address=33747902984 pin address=33507634896 100*mode+namespace=1632087638018 
obj#=-1 tim=1370513302800606
WAIT #47816533853720: nam='library cache lock' ela= 458 handle 
address=33475669128 lock address=33527516264 100*mode+namespace=124554379266 
obj#=-1 tim=1370513302802113

In this scenario, however, contrary to the sqlplus execution, I can bet I will 
get the memory notification in the alert log.


Please excuse the long description - now these are my questions:

1) In cases 2 and 3, where is the time spent?
2) What might be the relation between the the long waits and the "memory 
notification" alert, given there seems to be no 1:1 relationship? What do you 
think, how should on judge/react on this alert?
3) In general, would you have any advice how to proceed with this issue? Is it 
even worthwhile to follow up questions like "why this is parsed quickly when 
issued by SYS, but not otherwise", or to try to do more of "shared pool 
diagnostics / debugging" (given that in general, some level of fragmentation 
will be unavoidable due to how the application works)? So perhaps, concentrate 
on tuning this specific sql, rather?

I'd be very glad about any advice / opinions on this.
Thanks in advance, Sigrid


--
//www.freelists.org/webpage/oracle-l
Follow-Ups:
- Re: shared pool latching issue and missing time in trace file
  - From: David Fitzjarrell
- Re: shared pool latching issue and missing time in trace file
  - From: Martin Klier
shared pool latching issue and missing time in trace file

Other related posts: