Completely insane execution plan with 11.2.0.3 but not with 11.2.0.2
- From: Thomas Kellerer <thomas.kellerer@xxxxxxxxxx>
- To: "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
- Date: Mon, 23 Jul 2012 17:01:59 +0200
Hello,
(I have also posted this to forums.oracle.com in case this sounds familiar to
someone).
we have a production server that is running Oracle 11.2.0.2 on Windows and a
Development/Test server
that is running 11.2.0.3 on Linux.
We have one statement that selects from a rather large view (that statement
does a simple select * from viewname).
The execution plan shows a cost of ~20k and a row estimate of ~800
On our test server which has similar data the optimizer completely
mis-estimating the cost for one join which
results in totally insane estimates higher up the chain. All other row
estimates are pretty close to reality.
After testing several things we discovered that setting
optimizer_features_enable = '11.2.0.2' things are fine on our test server.
The bad plan (optimizer_features_enable = '11.2.0.3')
--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows |
Bytes |TempSpc| Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 39P|
7042P| | 14T(100)|999:59:59 |
| 1 | VIEW | XV_POSITION | 39P|
7042P| | 14T(100)|999:59:59 |
| 2 | WINDOW SORT | | 39P|
7750P| 8784P| 14T(100)|999:59:59 |
|* 3 | VIEW | | 39P|
7750P| | 9614G(100)|999:59:59 |
| 4 | WINDOW SORT | | 39P|
3538P| 4026P| 9614G(100)|999:59:59 |
| 5 | WINDOW SORT | | 39P|
3538P| 4026P| 9614G(100)|999:59:59 |
|* 6 | HASH JOIN RIGHT OUTER | | 39P|
3538P| 10G| 189G(100)|999:59:59 |
| 7 | VIEW | V_POS_GUELTIGKEIT | 199M|
8377M| | 43M (1)|146:17:54 |
| 8 | HASH GROUP BY | | 199M|
121G| 126G| 43M (1)|146:17:54 |
|* 9 | HASH JOIN RIGHT OUTER | | 199M|
121G| | 6573K (2)| 21:54:40 |
| 10 | TABLE ACCESS FULL | VORGANG | 7698 |
135K| | 32 (0)| 00:00:01 |
|* 11 | HASH JOIN RIGHT OUTER | | 199M|
118G| | 6572K (2)| 21:54:28 |
| 12 | VIEW | V_ALLE_POSITIONEN | 25974 |
634K| | 6006 (2)| 00:01:13 |
.......
| 594 | NESTED LOOPS | | 199M|
113G| | 6564K (2)| 21:52:58 |
|*595 | HASH JOIN RIGHT OUTER | | 7686 |
315K| | 303 (1)| 00:00:04 |
| 596 | TABLE ACCESS FULL | VORGANG | 7698 |
135K| | 32 (0)| 00:00:01 |
| 597 | VIEW | | 7686 |
180K| | 270 (1)| 00:00:04 |
| 598 | UNION-ALL | | |
| | | |
|*599 | TABLE ACCESS FULL | VERTRAGSKOPF | 1089 |
30492 | | 37 (0)| 00:00:01 |
.......
The good plan: (optimizer_features_enable = '11.2.0.2')
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows |
Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 296 |
58904 | 20370 (2)| 00:04:05 |
| 1 | VIEW | XV_POSITION | 296 |
58904 | 20370 (2)| 00:04:05 |
| 2 | WINDOW SORT | | 296 |
64824 | 20370 (2)| 00:04:05 |
|* 3 | VIEW | | 296 |
64824 | 20369 (2)| 00:04:05 |
| 4 | WINDOW SORT | | 296 |
34928 | 20369 (2)| 00:04:05 |
| 5 | WINDOW SORT | | 296 |
34928 | 20369 (2)| 00:04:05 |
|* 6 | HASH JOIN RIGHT OUTER | | 296 |
34928 | 20367 (2)| 00:04:05 |
| 7 | VIEW | V_POS_GUELTIGKEIT | 296 |
13024 | 13983 (2)| 00:02:48 |
| 8 | HASH GROUP BY | | 296 |
254K| 13983 (2)| 00:02:48 |
|* 9 | HASH JOIN RIGHT OUTER | | 296 |
254K| 13982 (2)| 00:02:48 |
| 10 | TABLE ACCESS FULL | VORGANG | 7698 |
135K| 32 (0)| 00:00:01 |
|* 11 | HASH JOIN OUTER | | 296 |
249K| 13949 (2)| 00:02:48 |
|* 12 | HASH JOIN RIGHT OUTER | | 296 |
241K| 7943 (2)| 00:01:36 |
| 13 | TABLE ACCESS FULL | VORGANG | 7698 |
135K| 32 (0)| 00:00:01 |
|* 14 | HASH JOIN | | 296 |
236K| 7910 (2)| 00:01:35 |
.................
Looking at the trace file I could verify that it really does a lot of reads
when using the bad plan:
The 11.2.0.3 optimizer decides to use a nested loop (line 594 in the bad plan)
instead of the hash join (line 14 in the good plan).
The question is now, why is it doing that?
On OTN I was pointed to v$system_fix_control and I tried disabling the
following "features"
(as shown here
http://jonathanlewis.wordpress.com/2009/12/22/optimizer-features/)
9814067 Sanity check when estimating range inner join
11830663 disallow HASH GROUP BY for subquery (in SELECT) processing
9980661 sanity check when estimating range inner join cardinality
10038373 Fix usage of transitive join predicate list in subquery pruning
11699884 fix selectivity of Table Lookup By NL view and dimensions
12410972 push predicate with NLS_SORT in window function
8683604 allow NLJ if there is no sub-query pruning (absolute dimension)
but none of them changed anything.
A quick search on Metalink did not turn up anything useful either.
How can I find out which feature exactly causes this behaviour?
For the time being it's absolutely OK for us to set
optimizer_features_enable='11.2.0.2' but I'm curious which "pattern"
triggers this behaviour. If I can pinpoint that, I'm inclined to open a SR with
Oracle.
Thanks
Thomas
--
http://www.freelists.org/webpage/oracle-l
Other related posts: