RE: Missing SQL in DBA_HIST_SQLSTAT
- From: "Herring Dave - dherri" <Dave.Herring@xxxxxxxxxx>
- To: "Petr Novak" <Petr.Novak@xxxxxxxxxxxx>, "Teehan, Mark" <mark.teehan@xxxxxxxxxxxxxxxxx>, "ORACLE-L" <oracle-l@xxxxxxxxxxxxx>
- Date: Tue, 2 Nov 2010 11:04:24 -0500
SELECT topnsql
FROM sys.wrm$_wr_control;
TOPNSQL
------------------
2000000000
As an example of what I'm talking about:
SELECT inst_id, plan_hash_value, first_load_time, MAX(last_active_time)
FROM cgv$sql
WHERE sql_id = '8qfp7bf2tcw57'
GROUP BY inst_id, plan_hash_value, first_load_time
ORDER BY inst_id;
I PLAN_HASH_VALUE FIRST_LOAD_TIME MAX(LAST_ACTIVE_TI
- ------------------ ------------------- ------------------
1 2586770207 2010-10-10/15:09:48 02-NOV-10 11:39:49
2 2586770207 2010-10-10/15:19:15 02-NOV-10 08:24:58
SELECT sq.instance_number, sq.plan_hash_value, MAX(end_interval_time)
FROM dba_hist_sqlstat sq, dba_hist_snapshot s
WHERE sq.sql_id = '8qfp7bf2tcw57'
AND ( sq.dbid = s.dbid
AND sq.instance_number = s.instance_number
AND sq.snap_id = s.snap_id)
GROUP BY sq.instance_number, sq.plan_hash_value
ORDER BY 1, 2;
INSTANCE_NUMBER PLAN_HASH_VALUE MAX(END_INTERVAL_TIME)
------------------ ------------------ ----------------------------------------
1 690739501 22-OCT-10 04.00.17.953 AM
1 2586770207 22-OCT-10 04.00.17.953 AM
2 690739501 03-SEP-10 04.30.42.593 PM
2 2586770207 03-SEP-10 04.30.42.593 PM
Dave Herring | DBA
Acxiom Global Technology Solutions
630-944-4762 office | 630-430-5988 cell | 630-944-4989 fax
1501 Opus Pl | Downers Grove, IL, 60515 | U.S.A. | www.acxiom.com
Service Desk: 888-243-4566, https://servicedesk.acxiom.com, GSCA@xxxxxxx
-----Original Message-----
From: Petr Novak [mailto:Petr.Novak@xxxxxxxxxxxx]
Sent: Tuesday, November 02, 2010 10:32 AM
To: Herring Dave - dherri; Teehan, Mark; ORACLE-L
Subject: AW: Missing SQL in DBA_HIST_SQLSTAT
Hallo Dave,
in Metalink ist 10.2.0.4 Bug 8484269 , fixed in 11.2 . Topnsql must be in range
(30,50000), otherwise you get ORA-13530.
The question is how TOPNSQL setting works under 11.2. Where do you find out,
that default is 2 billion ? It does not make sense trying to catch so much
statements (if you would have such v$SQL). Maybe such big value is
ignored/corrected internally.
Best Regards,
Petr
________________________________________
Von: Herring Dave - dherri [Dave.Herring@xxxxxxxxxx]
Gesendet: Dienstag, 2. November 2010 16:14
Bis: Petr Novak; Teehan, Mark; ORACLE-L
Betreff: RE: Missing SQL in DBA_HIST_SQLSTAT
True, smaller statements may not be as important, but per my original post, why
would they be skipped if the default for TOPNSQL is 2 billion?
I got into this research because I was looking into SQL statements suffering
from bind variable peeking and skewed values. I wanted to check AWR history to
see the frequency of different plans chosen for some statements and the average
resource impact per SQL_ID + PLAN_HASH_VALUE. In reviewing the results I saw
that some days certain plans weren't chosen at all. Those were for recent
entries and it just so happened that those entries were still in the shared
pool, even after a few days (because they're executed multiple times per day
and haven't been invalidated). Because of this, AWR won't help because it's
inaccurate, which sent me down this whole path.
Dave Herring | DBA
Acxiom Global Technology Solutions
630-944-4762 office | 630-430-5988 cell | 630-944-4989 fax
1501 Opus Pl | Downers Grove, IL, 60515 | U.S.A. | www.acxiom.com
Service Desk: 888-243-4566, https://servicedesk.acxiom.com, GSCA@xxxxxxx
-----Original Message-----
From: Petr Novak [mailto:Petr.Novak@xxxxxxxxxxxx]
Sent: Tuesday, November 02, 2010 10:01 AM
To: Herring Dave - dherri; Teehan, Mark; ORACLE-L
Subject: AW: Missing SQL in DBA_HIST_SQLSTAT
Hallo Dave,
I executed your query and checked one of the returned SQL_IDs.
SQL> select FIRST_LOAD_TIME,EXECUTIONS,LAST_LOAD_TIME,LAST_ACTIVE_TIME
,PARSING_SCHEMA_NAME,CHILD_NUMBER,PLAN_HASH_VALUE,OPTIMIZER_ENV_HASH_VALUE
from v$sql where sql_id='cjkmnxsvny74t';
FIRST_LOAD_TIME EXECUTIONS LAST_LOAD_TIME LAST_ACTIVE_TIME
PARSING_SCHEMA_NAME CHILD_NUMBER PLAN_HASH_VALUE
OPTIMIZER_ENV_HASH_VALUE
------------------- -------------- ------------------- -------------------
------------------------------ -------------- ---------------
------------------------
2010-09-01/05:00:29 45 2010-11-02/14:55:50 02.11.2010 14:55:49 TMU
0 1721219251
2180547739
2010-09-01/05:00:29 7 2010-11-01/05:00:48 01.11.2010 05:07:01 TMU
1 1721219251
2180547739
AWR reports only one interval for this query:
End Plan Hash Optim Env CPU Elapsed IOTime AppTime
Cluster Concurr Execs Loa PaC Ela/Ex Buffer/Ex DReads/Ex
Rows/Ex Prf
----------- ------------- ----------- ---------- ---------- --------- --------
-------- -------- -------- ---- ------ ----------- ------------- -----------
----------- ---
25.10 11:00 1721219251 2180547739 .01 .05 .0 .0
.0 .0 3 0 3 .02 20.0 14.0
3.0
I thing it is natural, that AWR does not capture every statement executed in
the interval,even if the statement was not aged out from V$SQL.
Such statements are very 'small' , concerning resource consumption.
I didnt try to play with TOPNSQL, but you could check the maximum count of
captured statements for snapshot and instance, if TOPNSQL has any impact.
It is not important how many statements are captured, but how big percentage of
DB load (reads,gets,writes,CPU) you can explain with captured statements.
Best Regards,
Petr
________________________________________
Von: oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx]" im
Auftrag von "Herring Dave - dherri [Dave.Herring@xxxxxxxxxx]
Gesendet: Dienstag, 2. November 2010 14:37
Bis: Teehan, Mark; ORACLE-L
Betreff: RE: Missing SQL in DBA_HIST_SQLSTAT
Mark,
Thanks for the detailed response!
I can understand issues where SQL is aged out/invalidated inbetween AWR
snapshots, causing statements to be missed. But I'm finding SQL statements
having changes in executions ACROSS snapshots, which is where I'm confused.
Here's one of the SQL statements I'm using to try and determine the extent of
this problem. It'll list statements whose last active time is greater than
then date of the last AWR snapshot + snapshot interval, meaning it should have
been captured but hasn't:
SELECT sql_id, TO_CHAR(awr_last_active_dt, 'DD-MON-YY HH24:MI:SS')
chr_awr_last_active_dt
, TO_CHAR(last_snapshot_dt, 'DD-MON-YY HH24:MI:SS') chr_last_snapshot_dt
, shared_pool_last_active_dt
FROM (SELECT MAX(ss.end_interval_time) awr_last_active_dt
, sq.sql_id, snap_interval, ss2.last_snapshot_dt,
s.shared_pool_last_active_dt
FROM dba_hist_sqlstat sq
, dba_hist_snapshot ss
/* Get the date of the most recent snapshot */
, (SELECT MAX(begin_interval_time) last_snapshot_dt
FROM dba_hist_snapshot) ss2
/* Get the date of LAST_ACTIVE_TIME for every SQL stmt in the
shared pool */
, (SELECT sql_id, plan_hash_value
, MAX(last_active_time) shared_pool_last_active_dt
FROM v$sql
GROUP BY sql_id, plan_hash_value) s
/* Get the SNAP_INTERVAL, to help determine the offset for the
most recent
possible LAST_ACTIVE_TIME */
, (SELECT snap_interval FROM dba_hist_wr_control)
WHERE ( sq.snap_id = ss.snap_id
AND sq.instance_number = ss.instance_number
AND sq.dbid = ss.dbid)
AND ( sq.sql_id = s.sql_id
AND sq.plan_hash_value = s.plan_hash_value)
GROUP BY sq.sql_id, snap_interval, ss2.last_snapshot_dt,
s.shared_pool_last_active_dt)
WHERE LEAST(shared_pool_last_active_dt, last_snapshot_dt) - snap_interval >
awr_last_active_dt
ORDER BY LEAST(shared_pool_last_active_dt, last_snapshot_dt) - snap_interval -
awr_last_active_dt DESC;
Dave Herring | DBA
Acxiom Global Technology Solutions
630-944-4762 office | 630-430-5988 cell | 630-944-4989 fax
1501 Opus Pl | Downers Grove, IL, 60515 | U.S.A. | www.acxiom.com
Service Desk: 888-243-4566, https://servicedesk.acxiom.com, GSCA@xxxxxxx
-----Original Message-----
From: Teehan, Mark [mailto:mark.teehan@xxxxxxxxxxxxxxxxx]
Sent: Tuesday, November 02, 2010 12:53 AM
To: Herring Dave - dherri; ORACLE-L
Subject: RE: Missing SQL in DBA_HIST_SQLSTAT
Dave,
I've noticed that the quality of AWR data collection varies a lot depending on
the application profile.
I am not an expert on database internals ("Tanel!!") but this is what I have
worked out; my problems in this area are due to flushing of the sql area. As
you can see the metrics in gv$sqlstat; then it may not be relevant.
A database hosting an application that executes most of its workload during a
batch window seems to have lower quality of AWR data collection during its
busiest period. One major reason for this is aging of single-execution
statements.
The SQL Area sizes automatically based on the workload of the entire day, so
when a large burst of statements is run (= overnight batch cycle), the SQL area
will not resize up immediately (if at all), and single-execution statements
will be aged out first before the next AWR snap occurs. I've seen this happen a
lot - the statement metrics are no longer visible in gv$sqlstat at snap-time so
it never goes into AWR.
Aging of metrics for statements is only a problem if the database needs to
flush the sql area: it can be flushed by other activity. Single-execution
statements appear to be chosen first, irrespective of how long they ran for -
AFAIK no attempt is made to classify which single-execution statements may
still have important metrics.
A flush offender is a stats gather on objects with a lot of column statistics:
it generates a large number of recursive statements for each column (select
min(), max() etc) in a short span, which can flush a lot of metrics from the
sql area.
The AWR partition rolling job can also generate a lot of statements - this
generally runs once a day.
I've raised an SR to find out if there is any way to keep statements in the SQL
area until they have been snapped into AWR, but support told me to go away, its
supposed to be like that. It is not possible to directly set the size of the
sql area to improve metric retention. Statement pinning doesn't work either:
the text is kept, but the metrics (gv$sqlstat) are flushed!
I hit a brick wall on this. It seems a design flaw to me - the sql area flush
process should assess the metrics for each statement to see if it is eligible
for AWR retention. Not trivial I know, but a lot depends on AWR quality.
Interesting that you have seen this on a 4-node RAC - the worst offender I have
is a 3-node RAC - is there any chance AWR is not picking up metrics across all
instances?
HTH
Mark Teehan
Singapore
-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of Herring Dave - dherri
Sent: 26 October 2010 06:08
To: ORACLE-L
Subject: Missing SQL in DBA_HIST_SQLSTAT
Has anyone checked into the accuracy and reliability of AWR SQL stats? From
checks I've done it appears that a lot of SQL statements are not getting
captured in AWR snapshots.
Here's our config: 4-node RAC, Oracle 10.2.0.2, RH 4.x, 30-minute AWR snapshots.
We've left TOPNSQL as "DEFAULT", which in our case should be enough:
SELECT topnsql
FROM sys.wrm$_wr_control;
TOPNSQL
------------------
2000000000
This means that the first 2 BILLION SQL statements per category will be
selected (seems rather excessive!). Yet I ran a few checks where I compared
(GV$SQL.LAST_ACTIVE_TIME - snapshot interval) to the MAX(end_interval_time) for
snapshots (all by SQL_ID) and found over 1600 statements that are executed yet
no updates are captured by AWR.
As a double check, I reviewed data in DBA_HIST_ACTIVE_SESS_HISTORY and found
that 1 of the 1600 statements from above is executed multiple times per day,
yet the last DBA_HIST_SQLSTAT snapshot it's in is from 4 months ago.
Has anyone else seen this type of thing and/or done detailed checking on the
accuracy of AWR SQL data?
BTW, I tried tracing the snapshot activity and can't get past what's done to
populate internal objects X$KEWRSQLIDTAB and X$KEWRSQLCRIT.
Dave Herring | DBA
Acxiom Global Technology Solutions
630-944-4762 office | 630-430-5988 cell | 630-944-4989 fax
1501 Opus Pl | Downers Grove, IL, 60515 | U.S.A. | www.acxiom.com Service Desk:
888-243-4566, https://servicedesk.acxiom.com, GSCA@xxxxxxx
***************************************************************************
The information contained in this communication is confidential, is intended
only for the use of the recipient named above, and may be legally privileged.
If the reader of this message is not the intended recipient, you are hereby
notified that any dissemination, distribution or copying of this communication
is strictly prohibited.
If you have received this communication in error, please resend this
communication to the sender and delete the original message or any copy of it
from your computer system.
Thank You.
****************************************************************************
--
http://www.freelists.org/webpage/oracle-l
===============================================================================
Please access the attached hyperlink for an important electronic communications
disclaimer:
http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
===============================================================================
--
http://www.freelists.org/webpage/oracle-l
--
http://www.freelists.org/webpage/oracle-l
Other related posts: