Hi All,
I am having an issue of a query getting a bad plan -which I have narrowed down
to major mismatch between estimated and actual cardinality of one of the tables
involved in the query. I need
help from experts on understanding why the cardinality estimate is wrong and
what can be done to rectify. For this purpose I have created a test table with
required columns and tried to
create same situation (of cardinality mismatch) and following are the details
on it. (Oracle version is : 11.2.0.4 on Linux.)
Table structure: CREATE TABLE processes
(
process_id NUMBER (12) NOT NULL PRIMARY KEY,
clientid VARCHAR2 (20) NOT NULL,
evt_type_id NUMBER (12) NOT NULL,
status VARCHAR2 (20) NOT NULL
)
Following are the stats about the table (Thanks to the print_table utility of
Tom Kyte Sir - the below is output from "select * from all_tables where
table_name='PROCESSES'" from
which several default stuff is remove to keep this a little brief)
OWNER : TEST
TABLE_NAME : PROCESSES
PCT_FREE : 10
NUM_ROWS : 2231714
BLOCKS : 12137
AVG_ROW_LEN : 31
SAMPLE_SIZE : 2231714
Here is the query and test data :
variable v_coid varchar2(16);
variable v_status varchar2(16);
variable v_evt_type_id number;
exec :v_coid := 'G31MQQNM99ABCA9V';
exec :v_status := 'INP';
exec :v_event_type_id :=5014;
SQL> SELECT COUNT (*) FROM processes wpi WHERE WPI.CLIENTID = :v_coid
AND wpi.status = :v_status AND WPI.evt_type_id = :v_event_type_id;
COUNT(*)
----------
10404
1 row selected.
Following is the Execution plsn:
Plan hash value: 3304333532
------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time
| Buffers |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.08
| 11406 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.08
| 11406 |
|* 2 | TABLE ACCESS FULL| PROCESSES | 1 | 62 | 10404 |00:00:00.08
| 11406 |
------------------------------------------------------------------------------------------
Peeked Binds (identified by position):
--------------------------------------
1 - (VARCHAR2(30), CSID=873): 'G31MQQNM99ABCA9V'
2 - (VARCHAR2(30), CSID=873): 'INP'
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
3 - (NUMBER): 3004
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("WPI"."CLIENTID"=:V_COID AND "WPI"."STATUS"=:V_STATUS AND
"WPI"."EVT_TYPE_ID"=:V_EVT_TYPE_ID))
29 rows selected.
Here we can see the Huge difference - the estimated cardinality of just 62
versus the acutal cardinality of 10404. Note that all these 3 columns involved
in the query have frequency histograms
on them and following are details on it...and this is the issue - the
difference between estimated cardinality of 62 versus actual value of 10404
that I need help to sort out.
exec print_table('select * from all_tab_col_statistics where owner=''TEST'' AND
TABLE_NAME=''PROCESSES''');
OWNER : TEST -- same for all the columns below
TABLE_NAME : PROCESSES -- same for all the columns below
COLUMN_NAME : PROCESS_ID
NUM_DISTINCT : 2231714
LOW_VALUE : C50D01015C04
HIGH_VALUE : C5170B1A5218
DENSITY : .000000444209506971868
NUM_NULLS : 0
NUM_BUCKETS : 254
LAST_ANALYZED : 20-oct-2017 04:20:20 --for all columns below
too...the stats are current.
SAMPLE_SIZE : 5548
AVG_COL_LEN : 7
HISTOGRAM : HEIGHT BALANCED
-----------------
OWNER : TEST
TABLE_NAME : PROCESSES
COLUMN_NAME : CLIENTID
NUM_DISTINCT : 104
LOW_VALUE : 30324652425132433157563030303936
HIGH_VALUE : 47354D384A4D54394545484A51475859
DENSITY : .000000222104676853238
NUM_NULLS : 0
NUM_BUCKETS : 87
SAMPLE_SIZE : 5548
AVG_COL_LEN : 17
HISTOGRAM : FREQUENCY
-----------------
COLUMN_NAME : EVT_TYPE_ID
NUM_DISTINCT : 50
LOW_VALUE : C102
HIGH_VALUE : C21F07
DENSITY : .000000222104676853238
NUM_NULLS : 0
NUM_BUCKETS : 38
SAMPLE_SIZE : 5548
AVG_COL_LEN : 4
HISTOGRAM : FREQUENCY
-----------------
COLUMN_NAME : STATUS
NUM_DISTINCT : 3
LOW_VALUE : 434F4D
HIGH_VALUE : 494E50
DENSITY : .000000222104676853238
NUM_NULLS : 0
NUM_BUCKETS : 3
SAMPLE_SIZE : 5548
AVG_COL_LEN : 4
HISTOGRAM : FREQUENCY
-----------------
Please let me know if I should provide any other details. One thing I found was
that if I do dynamic sampling level 8 then the cardinality estimate improved to
nearly 7000 (don't recall that exact figure),but I would not like to depend on
it. The other thing is if I drop the histograms then the cardinality improved
slightly - to 143 as shown in the plan below but that is still way off the
actual value which is more than 10K (10404 to be exact).
------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time
| Buffers |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.08
| 11406 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.08
| 11406 |
|* 2 | TABLE ACCESS FULL| PROCESSES | 1 | 143 | 10404 |00:00:00.08
| 11406 |
------------------------------------------------------------------------------------------
Peeked Binds (identified by position):
--------------------------------------
1 - (VARCHAR2(30), CSID=873): 'G31MQQNMH3TW7A9V'
2 - (VARCHAR2(30), CSID=873): 'INP'
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
3 - (NUMBER): 3004
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("WPI"."CLIENTID"=:V_COID AND "WPI"."EVT_TYPE_ID"=:V_EVT_TYPE_ID
AND "WPI"."STATUS"=:V_STATUS))
Regards, Nirav