RE: How to analyze a odd column distribution

  • From: "Lex de Haan" <lex.de.haan@xxxxxxxxxxxxxx>
  • To: <lex.de.haan@xxxxxxxxxxxxxx>, <jonathan@xxxxxxxxxxxxxxxxxx>, <aleon68@xxxxxxxxx>, "'Christian Antognini'" <Christian.Antognini@xxxxxxxxxxxx>
  • Date: Fri, 21 Jan 2005 19:42:58 +0100

 
sorry -- the timestamp literal needs a time component too -- see correction
below...
 
Lex.
 
----------------------------------------------------------------
Tom Kyte Seminar: http://www.naturaljoin.nl/events/seminars.html
----------------------------------------------------------------
 
-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Lex de Haan
Sent: Friday, January 21, 2005 19:25
To: jonathan@xxxxxxxxxxxxxxxxxx; aleon68@xxxxxxxxx; 'Christian Antognini'
Cc: oracle-l@xxxxxxxxxxxxx
Subject: RE: How to analyze a odd column distribution
just to add to what Jonathan suspects, I think it is better to specify real
datetime literals in cases like this anyway. that is:
replace "FHASTA = TO_TIMESTAMP('31-DEC-2999','DD-MON-YYYY')"
with    "FHASTA = TIMESTAMP '2999-12-31 00:00:00'"

it reads much better, and Oracle for sure will interpret this as a literal,
not a function call.
hope this helps,
 
Lex.
 
----------------------------------------------------------------
Tom Kyte Seminar: http://www.naturaljoin.nl/events/seminars.html
----------------------------------------------------------------
 

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Jonathan Lewis
Sent: Friday, January 21, 2005 17:19
To: aleon68@xxxxxxxxx; Christian Antognini
Cc: oracle-l@xxxxxxxxxxxxx
Subject: Re: How to analyze a odd column distribution



If you check the 'wrong' plan, it is doing a KEY-KEY partition start/stop.
This means Oracle can see that a single partition will be visited, but
cannot identify which one, at parse time. Consequently it is probably using
the global level density to estimate the row count.  I suspect you have a
problem with the to_timestamp() call, that is making Oracle treat the result
as 'unknown' rather than 'constant'.


Regards

Jonathan Lewis

http://www.jlcomp.demon.co.uk/faq/ind_faq.html
The Co-operative Oracle Users' FAQ

http://www.jlcomp.demon.co.uk/seminar.html
Public Appearances - schedule updated Dec 23rd 2004






----- Original Message -----
From: "Alfonso León" <aleon68@xxxxxxxxx>
To: "Christian Antognini" <Christian.Antognini@xxxxxxxxxxxx>
Cc: <oracle-l@xxxxxxxxxxxxx>
Sent: Friday, January 21, 2005 3:44 PM
Subject: Re: How to analyze a odd column distribution


This is an example

first there is the statistics and histograms and at the end the explain plan
for the queries, as you can see if I specify the partitions then I get the
real number, but that is not the case when I don't specify the partitions.

SQL> select num_distinct,num_buckets,sample_size from
USER_tab_col_statistics
 2  where table_name='TCUENTA'
 3  and column_name='FHASTA'
 4  /

NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE
------------ ----------- -----------
    1333281          28     3668140

SQL> select endpoint_number,endpoint_value from user_tab_histograms
 2  where table_name='TCUENTA'
 3  AND COLUMN_NAME='FHASTA'
 4  /

ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------
             0     2453157,36
             1     2453168,08
             2     2453181,13
             3     2453194,03
             4        2453225
             5     2453231,64
             6     2453231,68
             7     2453231,72
             8     2453231,75
             9     2453266,01
            10     2453316,15
            11     2453321,86
            12     2453322,83
            13     2453323,06
            14     2453323,08
            15      2453323,1
            16     2453323,13
            17     2453323,15
            18     2453323,18
            19      2453323,2
            20     2453323,22
            21     2453323,24
            22     2453323,26
            23     2453329,59
            24     2453336,67
            25     2453374,59
            26     2453385,65
            27     2453385,67
            75        2816787

29 filas seleccionadas.

SQL> select partition_name,num_distinct,num_buckets,sample_size from
USER_part_col_statistics
 2  where table_name='TCUENTA'
 3  and column_name='FHASTA'
 4  /

PARTITION_NAME                 NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE
------------------------------ ------------ ----------- -----------
P1                                  1333280          75       50720
P2                                        1           1       65174

SQL> select Partition_name,bucket_number,endpoint_value from
user_PART_histograms
 2  where table_name='TCUENTA'
 3  AND COLUMN_NAME='FHASTA'
 4  /

PARTITION_NAME                 BUCKET_NUMBER ENDPOINT_VALUE
------------------------------ ------------- --------------
P1                                         0     2453157,42
P1                                         1     2453167,81
P1                                         2     2453167,86
P1                                         3     2453169,02
P1                                         4     2453178,56
P1                                         5     2453181,12
P1                                         6     2453181,15
P1                                         7     2453189,08
P1                                         8     2453194,03
P1                                         9     2453194,06
P1                                        10     2453194,09
P1                                        11      2453231,6
P1                                        12     2453231,61
P1                                        13     2453231,63
P1                                        14     2453231,65
P1                                        15     2453231,66
P1                                        16     2453231,68
P1                                        17     2453231,69
P1                                        18      2453231,7
P1                                        19     2453231,72
P1                                        20     2453231,73
P1                                        21     2453231,74
P1                                        22     2453231,76
P1                                        23     2453231,77
P1                                        24     2453242,67
P1                                        25        2453278
P1                                        26     2453292,84
P1                                        27     2453305,74
P1                                        28     2453321,81
P1                                        29     2453321,83
P1                                        30     2453321,86
P1                                        31     2453321,87
P1                                        32     2453322,82
P1                                        33     2453322,83
P1                                        34     2453323,05
P1                                        35     2453323,06
P1                                        36     2453323,06
P1                                        37     2453323,07
P1                                        38     2453323,08
P1                                        39     2453323,09
P1                                        40      2453323,1
P1                                        41      2453323,1
P1                                        42     2453323,11
P1                                        43     2453323,12
P1                                        44     2453323,13
P1                                        45     2453323,13
P1                                        46     2453323,14
P1                                        47     2453323,16
P1                                        48     2453323,17
P1                                        49     2453323,18
P1                                        50     2453323,18
P1                                        51     2453323,19
P1                                        52      2453323,2
P1                                        53     2453323,21
P1                                        54     2453323,21
P1                                        55     2453323,22
P1                                        56     2453323,23
P1                                        57     2453323,24
P1                                        58     2453323,25
P1                                        59     2453323,25
P1                                        60     2453323,26
P1                                        61     2453323,27
P1                                        62     2453323,27
P1                                        63     2453323,32
P1                                        64      2453329,8
P1                                        65     2453329,82
P1                                        66     2453336,67
P1                                        67      2453344,2
P1                                        68     2453368,54
P1                                        69     2453374,64
P1                                        70     2453385,64
P1                                        71     2453385,65
P1                                        72     2453385,66
P1                                        73     2453385,66
P1                                        74     2453385,67
P1                                        75     2453390,36
P2                                      6498        2816787

77 filas seleccionadas.

SQL> explain plan for
 2  SELECT * FROM TCUENTA
 3  WHERE FHASTA =TO_TIMESTAMP('31-DEC-2999','DD-MON-YYYY')
 4  /

----------------------------------------------------------------------------
-----------------------------------
| Id  | Operation                          |  Name               |
Rows  | Bytes | Cost (%CPU)| Pstart| Pstop |
----------------------------------------------------------------------------
-----------------------------------
|   0 | SELECT STATEMENT                   |                     |
3 |   399 |     5  (20)|       |       |
|   1 |  PARTITION RANGE SINGLE            |                     |
  |       |            |   KEY |   KEY |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| TCUENTA             |
3 |   399 |     5  (20)|   KEY |   KEY |
|*  3 |    INDEX RANGE SCAN                | ICTAFHASFDESPROSUB  |
3 |       |     4  (25)|   KEY |   KEY |
----------------------------------------------------------------------------
-----------------------------------

SQL> explain plan for
 2  SELECT * FROM TCUENTA PARTITION (P1)
 3  WHERE FHASTA =TO_TIMESTAMP('31-DEC-2999','DD-MON-YYYY')
 4  /

Explicado.

----------------------------------------------------------------------------
----------------------------------
| Id  | Operation                         |  Name               | Rows
 | Bytes | Cost (%CPU)| Pstart| Pstop |
----------------------------------------------------------------------------
----------------------------------
|   0 | SELECT STATEMENT                  |                     |
1 |   129 |     5  (20)|       |       |
|   1 |  TABLE ACCESS BY LOCAL INDEX ROWID| TCUENTA             |
1 |   129 |     5  (20)|     1 |     1 |
|*  2 |   INDEX RANGE SCAN                | ICTAFHASFDESPROSUB  |
1 |       |     4  (25)|     1 |     1 |
----------------------------------------------------------------------------
----------------------------------

SQL> explain plan for
 2  SELECT * FROM TCUENTA PARTITION (P2)
 3  WHERE FHASTA =TO_TIMESTAMP('31-DEC-2999','DD-MON-YYYY')
 4  /

Explicado.

----------------------------------------------------------------------------
-------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost
(%CPU)| Pstart| Pstop |
----------------------------------------------------------------------------
-------------
|   0 | SELECT STATEMENT     |             |  2334K|   300M| 14642
(55)|       |       |
|*  1 |  TABLE ACCESS FULL   | TCUENTA     |  2334K|   300M| 14642
(55)|     2 |     2 |
----------------------------------------------------------------------------
-------------




On Wed, 19 Jan 2005 20:19:17 +0100, Christian Antognini
<Christian.Antognini@xxxxxxxxxxxx> wrote:
> Hi Alfonso
>
> >I have a table where I have a date column with historic information 
> >in the same table, if date = '31-dec-2999' then is actual record if 
> >the date is different of 31-dec-2999 the is an historic record. Now 
> >there are about 5 million records and half of them are historic so 
> >there are
> >2.5 million records with 31-dec-2999 and it is confusing the 
> >optimizer.
>
> Exactly for this reason such thing should not be implemented.
>
> >Can anybody give me any suggestion to tell the optimizer this odd 
> >distribution? I already try using histograms and didn't get a good 
> >execution plan and I would not like to use hints.
>
> How did you create the histograms? Can you give us an example of a 
> query that doesn't work?
>
> Chris
>
>


--
Alfonso Leon
--
//www.freelists.org/webpage/oracle-l


--
//www.freelists.org/webpage/oracle-l


--
//www.freelists.org/webpage/oracle-l


--
//www.freelists.org/webpage/oracle-l

Other related posts: