Very unstable execution plan

  • From: Thomas Kellerer <thomas.kellerer@xxxxxxxxxx>
  • To: "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
  • Date: Wed, 05 Jun 2013 09:24:47 +0200

Hi,

we have an UPDATE statement that is run every night during the process of 
populating a test environment for our DWH. This is an automated build 
environment and therefor the tables are dropped and re-populated every night.

There is one UPDATE statement that is based on a largish view that shows a very 
erratic behaviour in terms of it's execution time.

During the build process the statement runs in about 30 seconds which is 
acceptable. When looking at the stored exeuction plan, we can tell that it is 
using dynamic sampling as no statistics have been gathered before running the 
update. As soon as the automatic statistics gathering kicks in, the runtime of 
the statement increases to about 25 minutes(!) - with exactly the same 
execution plan.

This is reproducable. If we manually delete the statistics for one of the 
tables involved, exeuction time is back to 30 seconds again. I tried various 
options for gather_table_stats() but none of them led to statistics that would 
let the statement run in an acceptable time.

The statement looks like this:

UPDATE CXT_FAKT_PROVISIONSBUCHUNG fakt
    SET (HAUPT_VU_SPARTE_SCHLUESSEL,HAUPT_VU_SPARTE_NAME) = (
   SELECT haupt_vu.HAUPT_VU_SPARTE_SCHLUESSEL,
          haupt_vu.HAUPT_VU_SPARTE_NAME
   FROM cxv_haupt_vu_sparte haupt_vu
   WHERE fakt.deklaration_ext_ref = haupt_vu.deklaration_ext_Ref
   AND   fakt.vertrag_ext_ref = haupt_vu.vertrag_ext_ref
   AND   fakt.buchung_pop_id = haupt_vu.pop_id
   AND   fakt.deklaration_gueltig_von = haupt_vu.deklaration_gueltig_von
   AND   fakt.deklaration_gueltig_bis = haupt_vu.deklaration_gueltig_bis);

cxv_haupt_vu_sparte is a view that incorporates several other tables, including 
TMP_VU_SPARTE which is the culprit for the bad behaviour.
TMP_VU_SPARTE is rather small with only 106332 rows (inside the view it is 
joined against itself though which - if no index is used - will yield quite a 
big temporary result)

This is the execution plan when there are no statistics on the table 
TMP_VU_SPARTE (and none for the used index).

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  
---------------------------------------------------
          0          0          0  UPDATE  CXT_FAKT_PROVISIONSBUCHUNG 
(cr=2039813 pr=3010 pw=0 time=47745718 us)
      15456      15456      15456   TABLE ACCESS FULL 
CXT_FAKT_PROVISIONSBUCHUNG (cr=1328 pr=1325 pw=0 time=40734 us cost=370 
size=880992 card=15456)
      11225      11225      11225   VIEW  CXV_HAUPT_VU_SPARTE (cr=2038477 
pr=1684 pw=0 time=47297497 us cost=10 size=4293 card=1)
      11225      11225      11225    SORT UNIQUE (cr=2038477 pr=1684 pw=0 
time=47284436 us cost=10 size=824 card=1)
     126457     126457     126457     VIEW  (cr=2038167 pr=1669 pw=0 
time=27753402 us cost=9 size=824 card=1)
     126457     126457     126457      WINDOW SORT (cr=2038167 pr=1669 pw=0 
time=27667835 us cost=9 size=853 card=1)
     126457     126457     126457       WINDOW SORT (cr=2038167 pr=1669 pw=0 
time=26884699 us cost=9 size=853 card=1)
     126457     126457     126457        NESTED LOOPS  (cr=2038167 pr=1669 pw=0 
time=26342292 us)
     126457     126457     126457         NESTED LOOPS  (cr=1995241 pr=1615 
pw=0 time=26192173 us cost=7 size=853 card=1)
     141581     141581     141581          NESTED LOOPS  (cr=642683 pr=1066 
pw=0 time=3039331 us cost=5 size=499 card=1)
      11225      11225      11225           TABLE ACCESS BY INDEX ROWID 
POP_INFO (cr=22473 pr=32 pw=0 time=109767 us cost=2 size=16 card=1)
      11225      11225      11225            INDEX UNIQUE SCAN PK_POP_INFO 
(cr=11248 pr=4 pw=0 time=51796 us cost=1 size=0 card=1)(object id 1790009)
     141581     141581     141581           TABLE ACCESS BY INDEX ROWID 
TMP_VU_SPARTE (cr=620210 pr=1034 pw=0 time=2889850 us cost=3 size=483 card=1)
    1732978    1732978    1732978            INDEX RANGE SCAN IDX_TMP_VU_SPARTE 
(cr=140952 pr=204 pw=0 time=2094982 us cost=2 size=0 card=1)(object id 1795724)
     126457     126457     126457          INDEX RANGE SCAN IDX_TMP_VU_SPARTE 
(cr=1352558 pr=549 pw=0 time=23078816 us cost=1 size=0 card=1)(object id 
1795724)
     126457     126457     126457         TABLE ACCESS BY INDEX ROWID 
TMP_VU_SPARTE (cr=42926 pr=54 pw=0 time=94791 us cost=2 size=354 card=1)

********************************************************************************

As soon as statistics are available the plan is basically the same, but the 
actual processing times are very different:

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  
---------------------------------------------------
          0          0          0  UPDATE  CXT_FAKT_PROVISIONSBUCHUNG 
(cr=89894995 pr=1701 pw=0 time=318766975 us)
      15456      15456      15456   TABLE ACCESS FULL 
CXT_FAKT_PROVISIONSBUCHUNG (cr=1328 pr=1031 pw=0 time=46975 us cost=370 
size=880992 card=15456)
      11225      11225      11225   VIEW  CXV_HAUPT_VU_SPARTE (cr=89893656 
pr=670 pw=0 time=1553653734 us cost=11 size=4293 card=1)
      11225      11225      11225    SORT UNIQUE (cr=89893656 pr=670 pw=0 
time=1553640071 us cost=11 size=419 card=1)
     126457     126457     126457     VIEW  (cr=89893656 pr=670 pw=0 
time=1533733864 us cost=10 size=419 card=1)
     126457     126457     126457      WINDOW SORT (cr=89893656 pr=670 pw=0 
time=1533646166 us cost=10 size=155 card=1)
     126457     126457     126457       WINDOW SORT (cr=89893656 pr=670 pw=0 
time=1532847028 us cost=10 size=155 card=1)
     126457     126457     126457        NESTED LOOPS  (cr=89893656 pr=670 pw=0 
time=1532238656 us)
    3501658    3501658    3501658         NESTED LOOPS  (cr=89167767 pr=665 
pw=0 time=1529652013 us cost=8 size=155 card=1)
    5300707    5300707    5300707          NESTED LOOPS  (cr=1339312 pr=480 
pw=0 time=9657987 us cost=5 size=81 card=1)
      11225      11225      11225           TABLE ACCESS BY INDEX ROWID 
POP_INFO (cr=22473 pr=32 pw=0 time=119070 us cost=2 size=16 card=1)
      11225      11225      11225            INDEX UNIQUE SCAN PK_POP_INFO 
(cr=11248 pr=3 pw=0 time=54707 us cost=1 size=0 card=1)(object id 1790009)
    5300707    5300707    5300707           TABLE ACCESS BY INDEX ROWID 
TMP_VU_SPARTE (cr=1316839 pr=448 pw=0 time=8359987 us cost=3 size=65 card=1)
    5300707    5300707    5300707            INDEX RANGE SCAN IDX_TMP_VU_SPARTE 
(cr=140971 pr=87 pw=0 time=3603882 us cost=2 size=0 card=1)(object id 1795724)
    3501658    3501658    3501658          INDEX RANGE SCAN IDX_TMP_VU_SPARTE 
(cr=87828455 pr=185 pw=0 time=1518016475 us cost=2 size=0 card=1)(object id 
1795724)
     126457     126457     126457         TABLE ACCESS BY INDEX ROWID 
TMP_VU_SPARTE (cr=725889 pr=5 pw=0 time=1829196 us cost=3 size=74 card=1)

********************************************************************************

Checking ALL_TABLES using

select table_name, num_rows, avg_row_len, blocks
from all_tables
where table_name = 'TMP_VU_SPARTE'

returns the following:

TABLE_NAME    | NUM_ROWS | AVG_ROW_LEN | BLOCKS
--------------+----------+-------------+-------
TMP_VU_SPARTE | 106332   | 79          | 1176

So the row estimate calculated by dbms_stats is accurate.


Can anyone think of a reason why the presence of the statistics lead to 
basically the same plan, but completely different runtimes?
The data in the tables did not change during the two runs of the statement.

Is there a way to get statistics on the table in such a way that the update 
statement will work even if they are present?
(I have never had a statement before that required the statistics to be absent 
in order to perform well)


As an additional pitfall the statement behaves differently when run from 
different clients. We narrowed this down to the JDBC driver initializing 
NLS_SORT differently depending on the language of the operating system. If 
NLS_SORT is initialized to BINARY things are fine (and the above plan is used, 
doing two index scans on IDX_TMP_VU_SPARTE)

But if the driver initializes NLS_SORT to "GERMAN", the plan (with updated 
statistics) changes to the following

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                       | 
Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                     |                            | 
15456 |   860K|    10M  (2)| 34:00:16 |
|   1 |  UPDATE                              | CXT_FAKT_PROVISIONSBUCHUNG |     
  |       |            |          |
|   2 |   TABLE ACCESS FULL                  | CXT_FAKT_PROVISIONSBUCHUNG | 
15456 |   860K|   370   (1)| 00:00:05 |
|   3 |   VIEW                               | CXV_HAUPT_VU_SPARTE        |     
1 |  4293 |   330   (2)| 00:00:04 |
|   4 |    SORT UNIQUE                       |                            |     
1 |   419 |   330   (2)| 00:00:04 |
|*  5 |     VIEW                             |                            |     
1 |   419 |   329   (2)| 00:00:04 |
|   6 |      WINDOW SORT                     |                            |     
1 |   155 |   329   (2)| 00:00:04 |
|   7 |       WINDOW SORT                    |                            |     
1 |   155 |   329   (2)| 00:00:04 |
|   8 |        NESTED LOOPS                  |                            |     
  |       |            |          |
|   9 |         NESTED LOOPS                 |                            |     
1 |   155 |   327   (1)| 00:00:04 |
|  10 |          NESTED LOOPS                |                            |     
1 |    90 |   324   (1)| 00:00:04 |
|  11 |           TABLE ACCESS BY INDEX ROWID| POP_INFO                   |     
1 |    16 |     2   (0)| 00:00:01 |
|* 12 |            INDEX UNIQUE SCAN         | PK_POP_INFO                |     
1 |       |     1   (0)| 00:00:01 |
|* 13 |           TABLE ACCESS FULL          | TMP_VU_SPARTE              |     
1 |    74 |   322   (1)| 00:00:04 |
|* 14 |          INDEX RANGE SCAN            | IDX_TMP_VU_SPARTE          |     
1 |       |     2   (0)| 00:00:01 |
|  15 |         TABLE ACCESS BY INDEX ROWID  | TMP_VU_SPARTE              |     
1 |    65 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------

Note the FTS in line 13 which makes the update tage ages (I killed it after 1.5 
hours)

The first column in IDX_TMP_VU_SPARTE is a VARCHAR2 column, and that columns is 
used in the self join of the view (which makes the index worthwhile).
The database uses the character set: WE8MSWIN1252. I'm suprised that the 
NLS_SORT setting would influence an equality condition on an indexed column.

Interesting enough the index is created during our build process by a 
Kettle/Pentaho process where the driver initializes NLS_SORT to GERMAN.
My understanding was, that this will not influence the way the index is built. 
And I did not expect NLS_SORT to have an impact on an equality condition.
The other columns in the index are DATE columns.


The environment:

Oracle 11.2.0.3 64bit
CentOS 6.2

Any input is very much appreciated.

Regards
Thomas
--
//www.freelists.org/webpage/oracle-l


Other related posts: