RE: FW: Bitmap index not used when joining tables

  • From: Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>
  • To: Oracle-l List <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 2 Jun 2014 19:47:10 +0000


Gopal,

I should have looked in my book rather than trying to remember where I'd got 
the 80/20 from - here's how it appears on p.188:

"According to K. Gopalakrishnan et al. (Oracle Wait Interface: A Practical 
Guide to Oracle Performance Diagnostics and Tuning, Osborne McGraw-Hill, 2004), 
the optimizer assumes that 80% of the target data is tightly packed, and 20% of 
the target data is widely scattered."




Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on behalf 
of K Gopalakrishnan [kaygopal@xxxxxxxxx]
Sent: 02 June 2014 20:25
To: chris.saxon@xxxxxxxxx
Cc: Oracle-l List
Subject: Re: FW: Bitmap index not used when joining tables

Chris,

(From my Oracle Wait Interface Book: Page 278...Bitmap index costing, changing 
the costing methods,etc)

//www.freelists.org/post/oracle-l/BITMAP-index-cost-10053-trace,4

-Gopal




On Sun, Jun 1, 2014 at 4:32 AM, Chris Saxon 
<chris.saxon@xxxxxxxxx<mailto:chris.saxon@xxxxxxxxx>> wrote:
Thanks for the explanation Jonathan.

Could you explain where the 20% value comes from please? Is this a default for 
bitmap indexes, or something you've inferred from the info posted?


On 31 May 2014 20:24, Mark W. Farnham <mwf@xxxxxxxx<mailto:mwf@xxxxxxxx>> wrote:
hotel email bounce; trying again

From: Mark W. Farnham [mailto:mwf@xxxxxxxx<mailto:mwf@xxxxxxxx>]
Sent: Saturday, May 31, 2014 3:14 PM
To: 'jonathan@xxxxxxxxxxxxxxxxxx<mailto:jonathan@xxxxxxxxxxxxxxxxxx>'; 
'oracle-l@xxxxxxxxxxxxx<mailto:oracle-l@xxxxxxxxxxxxx>'
Subject: RE: Bitmap index not used when joining tables

I do wonder about what your results might be for these two queries if you 
modified your query slightly as:

select
--+ gather_plan_statistics
 s.* from sales_fact s
       where s.rowid in (
       select
       --+ no_merge
          s2.rowid from sales_fact s2
          where s2.date_id in (
          select
          --+ no_merge
             d.date_id from date_d d
             where calendar_date between date’2013-12-31’ and date’2014-01-01’
          )
       );

The idea being that if the only column it needs in the unmerged subquery 
(rowid) can be sourced from the index, this should be less total work.
Possibly this is also optimized if the subquery is ordered by rowid, but that’s 
not the curiosity at this point. Of course even if this works you would only 
want to form it up this way when you knew it was to your benefit (knowledge the 
CBO can only guess by way of the thumbrule quoted by JL, and some other 
thumbrule for a literal range.)

I just typed this in and didn’t test it, and working is release dependent, so 
you and I got lucky if this works via cut and paste. Once debugged I would sort 
of expect this to use the index in both cases to fetch back the rowid list.

mwf
From: oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx> 
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Jonathan Lewis
Sent: Saturday, May 31, 2014 1:11 PM
To: oracle-l@xxxxxxxxxxxxx<mailto:oracle-l@xxxxxxxxxxxxx>
Subject: RE: Bitmap index not used when joining tables


It's not surprising for the example you've given.

The basic principle is that the cost of a single table access via a bitmap 
index is likely to be different from the cost of access via the equivalent 
btree index because the bitmap index has no information about data clustering 
(i.e no clustering_factor) so it uses a guess.  Sometimes this means the cost 
of the bitmap will be higher, sometimes lower, sometimes the same as for a 
btree.  In your case the btree clustering factor will be very low (because of 
the order by in the CTAS) while the guess basically assumes that 20% of the 
data will be very widely scattered - in your case that probably means 400 
blocks (20% of 2000 rows => 400 blocks).




Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx> 
[oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>] on behalf 
of Chris Saxon [chris.saxon@xxxxxxxxx<mailto:chris.saxon@xxxxxxxxx>]
Sent: 31 May 2014 16:21
To: oracle-l@xxxxxxxxxxxxx<mailto:oracle-l@xxxxxxxxxxxxx>
Subject: Bitmap index not used when joining tables
Hi,

I've been testing using bitmap indexes on 11.2.0.2 EE. When joining two tables 
on a column with a BTree index, the index is used in the execution plan. If 
this index is changed to a bitmap index, Oracle no longer uses the index when 
executing the query! It assigns a higher cost to using the bitmap index when 
joining, despite this being a cheaper approach (in terms of consistent gets).

To see this, I created a date dimension table with 515 days (rows) and a sales 
fact table with 1,000 rows for each day:

create table date_d (
  date_id integer not null primary key,
  calendar_date date not null unique
);

create table sales_fact (
  date_id integer not null
    references date_d (date_id),
  quantity number not null,
  total_value number not null
);

insert into date_d
  select rownum, date'2013-01-01'-1+rownum
  from   dual
  connect by level <= sysdate - date'2013-01-01';

insert into sales_fact
with rws as (select * from dual connect by level <= 1000)
  select d.date_id, round(dbms_random.value(1, 20)), 
round(dbms_random.value(10, 100), 2)
  from   date_d d
  cross  join rws
  order  by d.date_id;

begin
  dbms_stats.gather_table_stats(user, 'sales_fact');
  dbms_stats.gather_table_stats(user, 'date_d');
end;
/

If I create a BTree index on SALES_FACT.DATE_ID, then join from the date dim to 
the fact table, restricting to two days, Oracle uses the index on the fact 
table as I would expect (as we're fetching 2,000 of 515,000 rows):

create index safa_date_id on sales_fact (date_id);

set autotrace trace
select s.* from sales_fact s join date_d d
on     d.date_id = s.date_id
where  calendar_date between date'2013-12-31' and date'2014-01-01';

set autotrace off

Execution Plan
----------------------------------------------------------
Plan hash value: 2189554905<tel:2189554905>

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost 
(%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |  3002 | 69046 |    21   
(0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |              |       |       |          
  |          |
|   2 |   NESTED LOOPS                |              |  3002 | 69046 |    21   
(0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| DATE_D       |     3 |    36 |     3   
(0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | SYS_C0037151 |     3 |       |     2   
(0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | SAFA_DATE_ID |  1000 |       |     3   
(0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | SALES_FACT   |  1000 | 11000 |     6   
(0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 
00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   5 - access("D"."DATE_ID"="S"."DATE_ID")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        290  consistent gets
          6  physical reads
          0  redo size
      36195  bytes sent via SQL*Net to client
       1839  bytes received via SQL*Net from client
        135  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2000  rows processed

However, if I drop the normal index and re-create it as a bitmap index the 
query above changes to a FTS on the SALES_FACT table:

set autotrace off
drop index safa_date_id;
create bitmap index safa_date_id on sales_fact (date_id);

set autotrace trace
select s.* from sales_fact s join date_d d
on     d.date_id = s.date_id
where  calendar_date between date'2013-12-31' and date'2014-01-01';

Execution Plan
----------------------------------------------------------
Plan hash value: 525754326

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost 
(%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |  3002 | 69046 |   315   
(2)| 00:00:04 |
|*  1 |  HASH JOIN                   |              |  3002 | 69046 |   315   
(2)| 00:00:04 |
|   2 |   TABLE ACCESS BY INDEX ROWID| DATE_D       |     3 |    36 |     3   
(0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0037151 |     3 |       |     2   
(0)| 00:00:01 |
|   4 |   TABLE ACCESS FULL          | SALES_FACT   |   515K|  5532K|   310   
(1)| 00:00:04 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("D"."DATE_ID"="S"."DATE_ID")
   3 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 
00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1267  consistent gets
          0  physical reads
          0  redo size
      36195  bytes sent via SQL*Net to client
       1839  bytes received via SQL*Net from client
        135  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2000  rows processed

If we hint the query to use the bitmap index, we can see it has a higher cost. 
The autotrace stats report significantly fewer consistent gets though:

Execution Plan
----------------------------------------------------------
Plan hash value: 1520624055

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost 
(%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |  3002 | 69046 |   416   
(0)| 00:00:05 |
|   1 |  NESTED LOOPS                 |              |       |       |          
  |          |
|   2 |   NESTED LOOPS                |              |  3002 | 69046 |   416   
(0)| 00:00:05 |
|   3 |    TABLE ACCESS BY INDEX ROWID| DATE_D       |     3 |    36 |     3   
(0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | SYS_C0037151 |     3 |       |     2   
(0)| 00:00:01 |
|   5 |    BITMAP CONVERSION TO ROWIDS|              |       |       |          
  |          |
|*  6 |     BITMAP INDEX SINGLE VALUE | SAFA_DATE_ID |       |       |          
  |          |
|   7 |   TABLE ACCESS BY INDEX ROWID | SALES_FACT   |  1000 | 11000 |   416   
(0)| 00:00:05 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 
00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   6 - access("D"."DATE_ID"="S"."DATE_ID")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        152  consistent gets
          1  physical reads
          0  redo size
      36195  bytes sent via SQL*Net to client
       1839  bytes received via SQL*Net from client
        135  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2000  rows processed

If you remove the join to the date dim and just use the date ids, Oracle uses 
the index as expected:

select * from sales_fact
where  date_id between 365 and 366;

Execution Plan
----------------------------------------------------------
Plan hash value: 2749560877

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost 
(%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |  3002 | 33022 |   221   
(0)| 00:00:03 |
|   1 |  TABLE ACCESS BY INDEX ROWID | SALES_FACT   |  3002 | 33022 |   221   
(0)| 00:00:03 |
|   2 |   BITMAP CONVERSION TO ROWIDS|              |       |       |           
 |          |
|*  3 |    BITMAP INDEX RANGE SCAN   | SAFA_DATE_ID |       |       |           
 |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("DATE_ID">=365 AND "DATE_ID"<=366)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        144  consistent gets
          0  physical reads
          0  redo size
      36195  bytes sent via SQL*Net to client
       1839  bytes received via SQL*Net from client
        135  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2000  rows processed

Looking at the 10053 trace, I can see this is because the bm index join cost is 
calculated as higher than the FTS of SALES_FACT:

NL Join
  Outer table: Card: 3.00  Cost: 3.00  Resp: 3.00  Degree: 1  Bytes: 12
Access path analysis for SALES_FACT
  Inner table: SALES_FACT  Alias: S
  Access Path: TableScan
    NL Join:  Cost: 929.82  Resp: 929.82  Degree: 1
      Cost_io: 920.00  Cost_cpu: 317629069
      Resp_io: 920.00  Resp_cpu: 317629069
  ****** trying bitmap/domain indexes ******
  Access Path: index (AllEqJoinGuess)
    Index: SAFA_DATE_ID
    resc_io: 1.00  resc_cpu: 8171
    ix_sel: 0.001942  ix_sel_with_filters: 0.001942
    NL Join : Cost: 6.00  Resp: 6.00  Degree: 1
      Cost_io: 6.00  Cost_cpu: 47359
      Resp_io: 6.00  Resp_cpu: 47359
  Bitmap nodes:
    Used SAFA_DATE_ID
      Cost = 6.001464, sel = 0.001942
  Access path: Bitmap index - accepted
    Cost: 416.121488 Cost_io: 415.999594 Cost_cpu: 3943626.145192 Sel: 0.001942
    Not Believed to be index-only
  ****** finished trying bitmap/domain indexes ******

  Best NL cost: 416.12
          resc: 416.12  resc_io: 416.00  resc_cpu: 3943626
          resp: 416.12  resp_io: 416.00  resc_cpu: 3943626

However, same section of the 10053 trace shows the following when using a BTree 
index on the SALES_FACT.DATE_ID column:

NL Join
  Outer table: Card: 3.00  Cost: 3.00  Resp: 3.00  Degree: 1  Bytes: 12
Access path analysis for SALES_FACT
  Inner table: SALES_FACT  Alias: S
  Access Path: TableScan
    NL Join:  Cost: 929.82  Resp: 929.82  Degree: 1
      Cost_io: 920.00  Cost_cpu: 317629069
      Resp_io: 920.00  Resp_cpu: 317629069
  Access Path: index (AllEqJoinGuess)
    Index: SAFA_DATE_ID
    resc_io: 6.00  resc_cpu: 433579
    ix_sel: 0.001942  ix_sel_with_filters: 0.001942
    NL Join : Cost: 21.04  Resp: 21.04  Degree: 1
      Cost_io: 21.00  Cost_cpu: 1323580
      Resp_io: 21.00  Resp_cpu: 1323580

  Best NL cost: 21.04
          resc: 21.04  resc_io: 21.00  resc_cpu: 1323580
          resp: 21.04  resp_io: 21.00  resc_cpu: 1323580

Why does this happen? Is this a bug or expected behaviour?

Thanks,
Chris
www.sqlfail.com<http://www.sqlfail.com>


Other related posts: