Re: Enforcing SQL execution plan

  • From: Chris Taylor <christopherdtaylor1994@xxxxxxxxx>
  • To: jonathan@xxxxxxxxxxxxxxxxxx
  • Date: Fri, 2 Sep 2016 13:44:27 -0500

Tacking onto Jonathan's point here.

When you find the bind variable for the bad plan, it is likely that stats
have been updated on the table and its likely (if Jonathan is correct) that
histograms exist for this column.  So what is likely happening is that when
you test, the stats now have info on the bind variable you're using for the
histogram so you're getting an unexpected good plan.

To test that theory you need to review the histograms (if any) that exist
on the table for the bind variable column(s) , when they were last
collected and what the high value is in the histogram.  Once you know the
high value, choose a bind variable that is above the histogram value and
re-run your test.

If you get the bad plan, you know that histograms are causing you the
headache.  To remedy a histogram problem if you have a few choices:

1. Disable bind peeking for the session thats experiencing the odd behavior
(or for the whole system) , or
2. Delete the histogram on the table and prevent new histograms from being
collected or
3. Create a manual sql profile that will always enforce the plan you want

Chris


On Fri, Sep 2, 2016 at 1:28 PM, Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>
wrote:


There shouldn't really be any need to ask that question since the tkprof
summary of the trace file from a bad run is a few lines further down the
email.

The problem appears to be an awful choice of index for the second half of
a concatenation that is (presumably) only triggered occasionally, and
perhaps requires an out of range (above high value) bind variable to cause
it to be considered a good choice.


Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
------------------------------
*From:* oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on
behalf of fmhabash@xxxxxxxxx [fmhabash@xxxxxxxxx]
*Sent:* 02 September 2016 18:49
*To:* Hameed, Amir; oracle-l@xxxxxxxxxxxxx
*Subject:* RE: Enforcing SQL execution plan

So, how do the 2 runs compare when broken down by wait event class? What
class is contributing the most to the bad run?



select st.SNAP_ID, st.sql_id, st.plan_hash_value, ss.begin_interval_time
,EXECUTIONS_DELTA execs

, IOWAIT_DELTA

, CLWAIT_TOTAL

, CCWAIT_DELTA

, APWAIT_DELTA

, CPU_TIME_DELTA

, ELAPSED_TIME_DELTA

, round((IOWAIT_DELTA/ELAPSED_TIME_DELTA)*100,2) io_wt_perc

, round((CLWAIT_TOTAL/ELAPSED_TIME_DELTA)*100,2) clust_wt_perc

, round((CCWAIT_DELTA/ELAPSED_TIME_DELTA)*100,2) concur_wt_perc

, round((APWAIT_DELTA/ELAPSED_TIME_DELTA)*100,2) app_wt_perc

, round((CPU_TIME_DELTA/ELAPSED_TIME_DELTA)*100) cpu_wt_perc

from dba_hist_sqlstat st, dba_hist_snapshot ss

where

ss.snap_id = st.snap_id

and st.instance_number = (select instance_number from v$instance )

and EXECUTIONS_DELTA > 0

--and sql_id = '&sql_id'

and rownum < 1000

and begin_interval_time >= sysdate - 7

order by snap_id

;






----------------------------------------
Thank you



*From: *Hameed, Amir <Amir.Hameed@xxxxxxxxx>
*Sent: *Friday, September 2, 2016 1:17 PM
*To: *fmhabash@xxxxxxxxx; oracle-l@xxxxxxxxxxxxx
*Subject: *RE: Enforcing SQL execution plan



Thanks

The problem is that the plan is the same for the good run and the bad run,
which might mean that because of bind-peeking, the skewing of data could
potentially be causing the issue.



*From:* fmhabash@xxxxxxxxx [mailto:fmhabash@xxxxxxxxx]
*Sent:* Friday, September 02, 2016 1:12 PM
*To:* Hameed, Amir <Amir.Hameed@xxxxxxxxx>; oracle-l@xxxxxxxxxxxxx
*Subject:* RE: Enforcing SQL execution plan



Hameed …



From your description, it looks like this job did run acceptably in the
past. If this is the case, why not grab one of the known good xplans and
use it to create a baseline.



We stabilized most our EBS critical tasks using plan baselines. However,
for some SQL, the baselines are becoming UNREPRODUCIBLE. Seeking MOS
support, we are advised to apply 21463894.



Hope this helps as well.


----------------------------------------
Thank you



*From: *Hameed, Amir <Amir.Hameed@xxxxxxxxx>
*Sent: *Thursday, September 1, 2016 9:13 PM
*To: *oracle-l@xxxxxxxxxxxxx
*Subject: *Enforcing SQL execution plan



Hi,

We run a certain job in the Oracle E-Business Suite (EBS) through a
concurrent program. The environment information is listed below:

DB version – 11.2.0.4

EBS version – 11.5.10.2

OS – Solaris



The statement uses bind variables that are supplied as part of running the
concurrent program. As far as I can tell, the statement always produces the
following execution plan:



------------------------------------------------------------
------------------------------------------------------------
---------------------------------

| Id  | Operation                       | Name                        |
Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   |
Buffers |

------------------------------------------------------------
------------------------------------------------------------
---------------------------------

|   0 | SELECT STATEMENT                |
|     17 |        |       |    12 (100)|          |     17 |00:00:00.01
|     145 |

|   1 |  CONCATENATION                  |
|     17 |        |       |            |          |     17 |00:00:00.01
|     145 |

|*  2 |   FILTER                        |
|     17 |        |       |            |          |     17 |00:00:00.01
|     145 |

|   3 |    NESTED LOOPS                 |
|     17 |      1 |   197 |     6   (0)| 00:00:01 |     17 |00:00:00.01
|     145 |

|*  4 |     TABLE ACCESS BY INDEX ROWID | WSH_DELIVERY_DETAILS
|     17 |      1 |   184 |     3   (0)| 00:00:01 |     17 |00:00:00.01
|      73 |

|*  5 |      INDEX UNIQUE SCAN          | WSH_DELIVERY_DETAILS_U1
|     17 |      1 |       |     2   (0)| 00:00:01 |     17 |00:00:00.01 |
   51 |

|*  6 |     TABLE ACCESS BY INDEX ROWID | WSH_DELIVERY_ASSIGNMENTS
|     17 |      1 |    13 |     3   (0)| 00:00:01 |     17 |00:00:00.01
|      72 |

|*  7 |      INDEX RANGE SCAN           | WSH_DELIVERY_ASSIGNMENTS_N3
|     17 |      1 |       |     2   (0)| 00:00:01 |     17 |00:00:00.01
|      53 |

|*  8 |   FILTER                        |
|      0 |        |       |            |          |      0 |00:00:00.01
|       0 |

|   9 |    NESTED LOOPS                 |                             |
0 |      1 |   197 |     6   (0)| 00:00:01 |      0 |00:00:00.01 |       0 |

|  10 |     NESTED LOOPS                |
|      0 |      1 |   197 |     6   (0)| 00:00:01 |      0 |00:00:00.01 |
      0 |

|* 11 |      TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_DETAILS
|      0 |      1 |   184 |     3   (0)| 00:00:01 |      0 |00:00:00.01
|       0 |

|* 12 |       INDEX RANGE SCAN          | WSH_DELIVERY_DETAILS_N1
|      0 |      1 |       |     3   (0)| 00:00:01 |      0 |00:00:00.01
|       0 |

|* 13 |      INDEX RANGE SCAN           | WSH_DELIVERY_ASSIGNMENTS_N3
|      0 |      1 |       |     2   (0)| 00:00:01 |      0 |00:00:00.01
|       0 |

|* 14 |     TABLE ACCESS BY INDEX ROWID | WSH_DELIVERY_ASSIGNMENTS
|      0 |      1 |    13 |     3   (0)| 00:00:01 |      0 |00:00:00.01
|       0 |

------------------------------------------------------------
------------------------------------------------------------
---------------------------------



For the past few weeks, the concurrent program has been intermittently
taking time to complete. When traced, one of the “bad” runs produced the
following information:

call     count       cpu    elapsed       disk      query
current        rows

------- ------  -------- ---------- ---------- ---------- ----------
----------

Parse        1      0.00       0.00          0          0
0           0

Execute      1      0.00       0.00          0          0
0           0

Fetch        1    614.23   11147.25     941317   18000438
0           1

------- ------  -------- ---------- ---------- ---------- ----------
----------

total        3    614.23   11147.25     941317   18000438
0           1



Event waited on                             Times   Max. Wait  Total Waited

  ----------------------------------------   Waited  ----------
------------

  db file sequential read                    941317        1.00
3563.29

  Disk file operations I/O                       74        0.00
0.00

  gc current grant busy                           3        0.00
0.00

  gc cr grant 2-way                              44        0.08
0.09

  read by other session                     1706610        1.49
6482.34

  gc buffer busy acquire                      83710        1.00
431.39

  latch: cache buffers chains                173199        0.01
9.25



(only the main wait events are shown above)



Rows (1st) Rows (avg) Rows (max)  Row Source Operation

---------- ---------- ----------  ------------------------------
---------------------

         1          1          1  CONCATENATION  (cr=18000438 pr=941317
pw=0 time=2557323227 us)

         0          0          0   FILTER  (cr=0 pr=0 pw=0 time=2 us)

         0          0          0    NESTED LOOPS  (cr=0 pr=0 pw=0 time=0
us cost=6 size=197 card=1)

         0          0          0     TABLE ACCESS BY INDEX ROWID
WSH_DELIVERY_DETAILS (cr=0 pr=0 pw=0 time=0 us cost=3 size=184 card=1)

         0          0          0      INDEX UNIQUE SCAN
WSH_DELIVERY_DETAILS_U1 (cr=0 pr=0 pw=0 time=0 us cost=2 size=0
card=1)(object id 165859)

         0          0          0     TABLE ACCESS BY INDEX ROWID
WSH_DELIVERY_ASSIGNMENTS (cr=0 pr=0 pw=0 time=0 us cost=3 size=13 card=1)

         0          0          0      INDEX RANGE SCAN
WSH_DELIVERY_ASSIGNMENTS_N3 (cr=0 pr=0 pw=0 time=0 us cost=2 size=0
card=1)(object id 165871)

         1          1          1   FILTER  (cr=18000438 pr=941317 pw=0
time=2557323217 us)

         1          1          1    NESTED LOOPS  (cr=18000438 pr=941317
pw=0 time=2557323212 us cost=6 size=197 card=1)

         1          1          1     NESTED LOOPS  (cr=18000361 pr=941296
pw=0 time=2557119556 us cost=6 size=197 card=1)

         1          1          1      TABLE ACCESS BY INDEX ROWID
WSH_DELIVERY_DETAILS (cr=18000336 pr=941295 pw=0 time=2557112429 us cost=3
size=184 card=1)

  42599164   42599164   42599164       INDEX RANGE SCAN
WSH_DELIVERY_DETAILS_N1 (cr=1798095 pr=99233 pw=0 time=600382285 us cost=3
size=0 card=1)(object id 917318)

         1          1          1      INDEX RANGE SCAN
WSH_DELIVERY_ASSIGNMENTS_N3 (cr=25 pr=1 pw=0 time=7116 us cost=2 size=0
card=1)(object id 165871)

         1          1          1     TABLE ACCESS BY INDEX ROWID
WSH_DELIVERY_ASSIGNMENTS (cr=77 pr=21 pw=0 time=203643 us cost=3 size=13
card=1)





It seems that when it runs longer, it scans a lot of rows from index
WSH_DELIVERY_DETAILS_N1. This index is defined on a date column and is not
very selective.



As part of the troubleshooting exercise, I captured bind variables of the
*bad* run from DBA_HIST_SQLBIND, defined them as variables with
appropriate data type and then ran them through SQL*Plus. The problem that
I am facing is that from SQL*Plus, by using this technique, it always
produces the following plan, regardless of whether I use values from the
good run or the bad run:

------------------------------------------------------------
------------------------------------------------------------
--------------------------------

| Id  | Operation                      | Name                        |
Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   |
Buffers |

------------------------------------------------------------
------------------------------------------------------------
--------------------------------

|   0 | SELECT STATEMENT               |
|      1 |        |       |    12 (100)|          |      0 |00:00:00.01
|       4 |

|   1 |  CONCATENATION                 |
|      1 |        |       |            |          |      0 |00:00:00.01
|       4 |

|*  2 |   FILTER                       |
|      1 |        |       |            |          |      0 |00:00:00.01
|       0 |

|   3 |    NESTED LOOPS                |
               |      0 |      1 |   197 |     6   (0)| 00:00:01 |      0
|00:00:00.01 |       0 |

|*  4 |     TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_DETAILS
|      0 |      1 |   184 |     3   (0)| 00:00:01 |      0 |00:00:00.01
|       0 |

|*  5 |      INDEX UNIQUE SCAN         | WSH_DELIVERY_DETAILS_U1
|      0 |      1 |       |     2   (0)| 00:00:01 |      0 |00:00:00.01
|       0 |

|*  6 |     TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_ASSIGNMENTS
|      0 |      1 |    13 |     3   (0)| 00:00:01 |      0 |00:00:00.01
|       0 |

|*  7 |      INDEX RANGE SCAN          | WSH_DELIVERY_ASSIGNMENTS_N3
|      0 |      1 |       |     2   (0)| 00:00:01 |      0 |00:00:00.01
|       0 |

|*  8 |   FILTER                       |
            |      1 |        |       |            |          |      0
|00:00:00.01 |       4 |

|   9 |    NESTED LOOPS                |
|      1 |      1 |   197 |     6   (0)| 00:00:01 |      0 |00:00:00.01
|       4 |

|* 10 |     TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_DETAILS
|      1 |      1 |   184 |     3   (0)| 00:00:01 |      0 |00:00:00.01
|       4 |

|* 11 |      INDEX UNIQUE SCAN         | WSH_DELIVERY_DETAILS_U1
|      1 |      1 |       |     2   (0)| 00:00:01 |      1 |00:00:00.01
|       3 |

|* 12 |     TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_ASSIGNMENTS
|      0 |      1 |    13 |     3   (0)| 00:00:01 |      0 |00:00:00.01
|       0 |

|* 13 |      INDEX RANGE SCAN          | WSH_DELIVERY_ASSIGNMENTS_N3
|      0 |      1 |       |     2   (0)| 00:00:01 |      0 |00:00:00.01
|       0 |

------------------------------------------------------------
------------------------------------------------------------
--------------------------------



I have tried to capture hints through outline from the bad run and added
them as hints to the statement that I am running from SQL*Plus but I am
still not able to enforce the “bad” plan. The outline hints are shown below:



  /*+

      BEGIN_OUTLINE_DATA

      IGNORE_OPTIM_EMBEDDED_HINTS

      OPTIMIZER_FEATURES_ENABLE('11.2.0.4')

      DB_VERSION('11.2.0.4')

      OPT_PARAM('_b_tree_bitmap_plans' 'false')

      OPT_PARAM('_fast_full_scan_enabled' 'false')

      OPT_PARAM('_index_join_enabled' 'false')

      ALL_ROWS

      OUTLINE_LEAF(@"SEL$1")

      OUTLINE_LEAF(@"SEL$1_1")

      USE_CONCAT(@"SEL$1" 8 OR_PREDICATES(6) PREDICATE_REORDERS((23 22)
(24 23) (26 24) (22 26)))

      OUTLINE_LEAF(@"SEL$1_2")

      OUTLINE(@"SEL$1")

      INDEX_RS_ASC(@"SEL$1_1" "WDD"@"SEL$1" ("WSH_DELIVERY_DETAILS"."
DELIVERY_DETAIL_ID"))

      INDEX_RS_ASC(@"SEL$1_1" "WDA"@"SEL$1" ("WSH_DELIVERY_ASSIGNMENTS"."
DELIVERY_DETAIL_ID"))

      INDEX_RS_ASC(@"SEL$1_2" "WDD"@"SEL$1_2"
("WSH_DELIVERY_DETAILS"."DATE_SCHEDULED"))

      INDEX(@"SEL$1_2" "WDA"@"SEL$1_2" ("WSH_DELIVERY_ASSIGNMENTS"."
DELIVERY_DETAIL_ID"))

      LEADING(@"SEL$1_1" "WDD"@"SEL$1" "WDA"@"SEL$1")

      LEADING(@"SEL$1_2" "WDD"@"SEL$1_2" "WDA"@"SEL$1_2")

      USE_NL(@"SEL$1_1" "WDA"@"SEL$1")

      USE_NL(@"SEL$1_2" "WDA"@"SEL$1_2")

      NLJ_BATCHING(@"SEL$1_2" "WDA"@"SEL$1_2")

      END_OUTLINE_DATA

  */



The SQL statement is also pasted below:

SELECT WDD.SOURCE_LINE_ID, WDD.SOURCE_HEADER_ID, WDD.INVENTORY_ITEM_ID,

  WDD.SHIPMENT_PRIORITY_CODE, WDD.SOURCE_CODE SOURCE_CODE,

  WDD.SOURCE_HEADER_NUMBER SOURCE_HEADER_NUMBER, WDD.ORGANIZATION_ID,

  WDD.MOVE_ORDER_LINE_ID, WDD.SHIP_FROM_LOCATION_ID, WDD.SHIP_METHOD_CODE,

  WDD.SHIPMENT_PRIORITY_CODE, WDD.DATE_SCHEDULED, WDD.REQUESTED_QUANTITY,

  WDD.REQUESTED_QUANTITY_UOM, WDD.PREFERRED_GRADE,
WDD.REQUESTED_QUANTITY2,

  WDD.REQUESTED_QUANTITY_UOM2, WDD.PROJECT_ID, WDD.TASK_ID,
WDD.SUBINVENTORY,

  WDD.SUBINVENTORY, WDD.RELEASED_STATUS, WDD.SHIP_MODEL_COMPLETE_FLAG,

  WDD.TOP_MODEL_LINE_ID, WDD.SHIP_SET_ID, WDA.DELIVERY_ID,

  WDD.LAST_UPDATE_DATE

FROM

WSH_DELIVERY_DETAILS WDD, WSH_DELIVERY_ASSIGNMENTS WDA WHERE

  WDD.DELIVERY_DETAIL_ID = :B15 AND WDA.DELIVERY_DETAIL_ID =

  WDD.DELIVERY_DETAIL_ID AND WDD.DATE_SCHEDULED IS NOT NULL AND

  WDD.DATE_REQUESTED >= NVL(:B14 , WDD.DATE_REQUESTED) AND
WDD.DATE_REQUESTED

  <= NVL(:B13 , WDD.DATE_REQUESTED) AND WDD.DATE_SCHEDULED >= NVL(:B12 ,

  WDD.DATE_SCHEDULED) AND WDD.DATE_SCHEDULED <= NVL(:B11 ,
WDD.DATE_SCHEDULED)

   AND NVL(WDD.REQUESTED_QUANTITY,0) > 0 AND WDD.RELEASED_STATUS IN
('R','B',

  'X') AND NVL(WDD.SUBINVENTORY, -99) = DECODE(:B10 , NULL,

  NVL(WDD.SUBINVENTORY, -99), :B10 ) AND NVL(WDD.PROJECT_ID,0) =
DECODE(:B9 ,

  0, NVL(WDD.PROJECT_ID,0), :B9 ) AND NVL(WDD.TASK_ID,0) = DECODE(:B8 , 0,

  NVL(WDD.TASK_ID,0), :B8 ) AND NVL(WDD.SHIP_SET_ID,0) = DECODE(:B7 , 0,

  NVL(WDD.SHIP_SET_ID,0), :B7 ) AND NVL(WDD.SHIPMENT_PRIORITY_CODE,-99) =

  DECODE(:B6 , NULL, NVL(WDD.SHIPMENT_PRIORITY_CODE,-99), :B6 ) AND

  WDD.ORGANIZATION_ID = NVL(:B5 , WDD.ORGANIZATION_ID) AND

  WDD.SHIP_FROM_LOCATION_ID = DECODE(:B4 , -1, WDD.SHIP_FROM_LOCATION_ID,
:B4

  ) AND (( WDA.DELIVERY_ID IS NOT NULL AND ( :B3 <> 'N' OR WDA.DELIVERY_ID
=

  :B2 OR :B1 <> 0 ) ) OR ( WDA.DELIVERY_ID IS NULL AND :B2 = 0 AND :B1 = 0
) )



How can I enforce the *bad* execution plan from SQL*Plus? If I am able to
enforce it then I already have bind values from that run and I will be able
to reproduce the issue.



Thanks,

Amir







Other related posts: