RE: Manual execution vs v$session discrepancy

  • From: Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>
  • To: Ram Raman <veeeraman@xxxxxxxxx>
  • Date: Thu, 10 Sep 2015 06:24:32 +0000


The 0.04 seconds for 1,671 executions looks like a figure you can't trust
because of possible granularity error (in this case a larger number of very
small times being seen as zero).

Do the two child cursors have the same plan_hash_value.
How many disk_reads and buffer_gets (total, and per execution)
What about the other times reported in v$sql - any clues there.

There's also v$sql_optimizer_env which you can address by sql_id and
child_number - this will list the optimizer environments for the two child
cursors: check if they differ, check if your optimizer environment for testing
differs from the production one. Bear in mind, also, that there's a
"secure_view_merging" effect that means different users may get plans that
differ in view merging.



Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: Ram Raman [veeeraman@xxxxxxxxx]
Sent: 10 September 2015 01:37
To: Jonathan Lewis
Cc: ORACLE-L
Subject: Re: Manual execution vs v$session discrepancy


ok not a great stat, but this seems to indicate that it is waiting on something
since the time taken is more about 5x.

SQL_ID EXECUTIONS FIRST_LOAD_TIME CPUSECS ELAPSEDSECS
------------- ---------- ------------------- ---------- -----------
4abcc3v6tdfqj 14081 2015-09-09/10:49:54 12381.73 58403.7323
4abcc3v6tdfqj 1673 2015-09-09/10:49:54 .04 1.450262


On Wed, Sep 9, 2015 at 7:30 PM, Ram Raman
<veeeraman@xxxxxxxxx<mailto:veeeraman@xxxxxxxxx>> wrote:
Jonathan,

We have run into a very similar situation with another SQL in our prod
environment again. The awr reports also show average of 4 seconds/exec for this
query in consideration.

I decided to trace the query anyway, but I am not lucky there. I know the
sessionid, but when I set the trace I am not able to locate the trace file. I
cannot modify the query in the app server since that is a different group; I
set the trace on at level 8 with dbms_system.set_ev. v$session shows trace
enabled for the session in qn, but I do not know which one is the trace file. I
tried trcsess by giving it sid.serial#, that seems to produce a file with just
4 lines and no useful info.

I can sample the sql bind capture but as before, the values from there return
superfast when i run the query for those values. I used the values used while
peeking (thanks to Jonathans blog), still the results were fast with those
values. fyi, I test these values in prod itself. child_cursors - maybe one or
two. I should have checked for variation in the child cursors, but was too
busy. this is a psft fin app. as before several 1000 executions per day.

At this point, I would really like to see the bind var values for which the
queries are taking several seconds. Any help is greatly appreciated.

Thanks.


On Thu, Sep 3, 2015 at 1:51 AM, Jonathan Lewis
<jonathan@xxxxxxxxxxxxxxxxxx<mailto:jonathan@xxxxxxxxxxxxxxxxxx>> wrote:

It's perfectly feasible for a session to report a much longer run-time for a
query than v$sql shows. The session time you're looking at may include all the
network roundtrip time and any client processing time that goes on between
round-trips.

As far as dealing with 7 seconds vs. 2/10th seconds - some of that may be down
to the way the client and server communicate, some of it may be down to the
server being busy while the production query was being run, some may be down
very specifically to the concurrency of usage (competing for latches and
mutexes, driving up CPU time), some may be down to the production data being
subject to update and read-consistency overheads while the queries were being
run. However, the difference does look a little extreme so ...


Apart from acquiring sets of bind values, did you also acquire the bind values
used when the cursors were optimised ? Did you check the plan(s) actually used
matched the plan(s) your tests produced.

When you tested did you use the bind variable sets as literals to the query, or
did you have a framework that injected them as bind variables of the correct
type? When you tested was it against the same data set with the same set of
object statistics and similar concurrent rate of change of data ? When you
queried v$sql how many child cursors were in existence (and what was the
highest child number) ? If there were multiple child cursors was there much
variation in the stats between children. Were the figures in v$sqlstat
consistent with those in v$sql. vary Did the details from v$sql show you
similar buffer_gets in the production and test cases. There are several
time-related columns in v$sql beyond elapsed_time and cpu_time (e.g.
plsql_exec_time, java_exec_time, user_io_wait_time, cluster_wait_time) are
there any clues where the time was spent?


Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>
[oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>] on behalf
of Ram Raman [veeeraman@xxxxxxxxx<mailto:veeeraman@xxxxxxxxx>]
Sent: 03 September 2015 03:50
To: ORACLE-L
Subject: Manual execution vs v$session discrepancy


List,

We had a SQL that was being run repeatedly over 10K times in our prod system
today. Turned out to be something related to fiscal year. It spiked up our CPU
big time. I got several value sets for the SQL from the sql bind capture and
executed them myself for about a dozen value sets; each and everyone of my
manual execution returned in about 2/10ths of a second. However the v$session
view would show the sessions run that SQL (sql_id, last_call_et, status) for a
few minutes. That was the case for several users the whole day. Per v$sql, the
average execution time for that SQL was 7 seconds since the first load this
morning. AWR reports showed that the averages are anywhere from 4 seconds and
up depending on the hour. I was confused by what the v$session showed me vs my
execution times. Am I missing something here? bind capture interval is left at
default 900 seconds.

Could it be that the values I checked (about 15) just happened to be 'good
values' or the environment could have been different for the executed SQLs? I
was hesitant to turn on the trace with the CPU usage through the roof. v10.2

Thanks.

--





--





--


Other related posts: