RE: Bitmap index not used when joining tables

  • From: Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>
  • To: "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
  • Date: Sat, 31 May 2014 17:11:11 +0000

It's not surprising for the example you've given.

The basic principle is that the cost of a single table access via a bitmap 
index is likely to be different from the cost of access via the equivalent 
btree index because the bitmap index has no information about data clustering 
(i.e no clustering_factor) so it uses a guess.  Sometimes this means the cost 
of the bitmap will be higher, sometimes lower, sometimes the same as for a 
btree.  In your case the btree clustering factor will be very low (because of 
the order by in the CTAS) while the guess basically assumes that 20% of the 
data will be very widely scattered - in your case that probably means 400 
blocks (20% of 2000 rows => 400 blocks).




Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on behalf 
of Chris Saxon [chris.saxon@xxxxxxxxx]
Sent: 31 May 2014 16:21
To: oracle-l@xxxxxxxxxxxxx
Subject: Bitmap index not used when joining tables

Hi,

I've been testing using bitmap indexes on 11.2.0.2 EE. When joining two tables 
on a column with a BTree index, the index is used in the execution plan. If 
this index is changed to a bitmap index, Oracle no longer uses the index when 
executing the query! It assigns a higher cost to using the bitmap index when 
joining, despite this being a cheaper approach (in terms of consistent gets).

To see this, I created a date dimension table with 515 days (rows) and a sales 
fact table with 1,000 rows for each day:

create table date_d (
  date_id integer not null primary key,
  calendar_date date not null unique
);

create table sales_fact (
  date_id integer not null
    references date_d (date_id),
  quantity number not null,
  total_value number not null
);

insert into date_d
  select rownum, date'2013-01-01'-1+rownum
  from   dual
  connect by level <= sysdate - date'2013-01-01';

insert into sales_fact
with rws as (select * from dual connect by level <= 1000)
  select d.date_id, round(dbms_random.value(1, 20)), 
round(dbms_random.value(10, 100), 2)
  from   date_d d
  cross  join rws
  order  by d.date_id;

begin
  dbms_stats.gather_table_stats(user, 'sales_fact');
  dbms_stats.gather_table_stats(user, 'date_d');
end;
/

If I create a BTree index on SALES_FACT.DATE_ID, then join from the date dim to 
the fact table, restricting to two days, Oracle uses the index on the fact 
table as I would expect (as we're fetching 2,000 of 515,000 rows):

create index safa_date_id on sales_fact (date_id);

set autotrace trace
select s.* from sales_fact s join date_d d
on     d.date_id = s.date_id
where  calendar_date between date'2013-12-31' and date'2014-01-01';

set autotrace off

Execution Plan
----------------------------------------------------------
Plan hash value: 2189554905

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost 
(%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |  3002 | 69046 |    21   
(0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |              |       |       |          
  |          |
|   2 |   NESTED LOOPS                |              |  3002 | 69046 |    21   
(0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| DATE_D       |     3 |    36 |     3   
(0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | SYS_C0037151 |     3 |       |     2   
(0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | SAFA_DATE_ID |  1000 |       |     3   
(0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | SALES_FACT   |  1000 | 11000 |     6   
(0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 
00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   5 - access("D"."DATE_ID"="S"."DATE_ID")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        290  consistent gets
          6  physical reads
          0  redo size
      36195  bytes sent via SQL*Net to client
       1839  bytes received via SQL*Net from client
        135  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2000  rows processed

However, if I drop the normal index and re-create it as a bitmap index the 
query above changes to a FTS on the SALES_FACT table:

set autotrace off
drop index safa_date_id;
create bitmap index safa_date_id on sales_fact (date_id);

set autotrace trace
select s.* from sales_fact s join date_d d
on     d.date_id = s.date_id
where  calendar_date between date'2013-12-31' and date'2014-01-01';

Execution Plan
----------------------------------------------------------
Plan hash value: 525754326

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost 
(%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |  3002 | 69046 |   315   
(2)| 00:00:04 |
|*  1 |  HASH JOIN                   |              |  3002 | 69046 |   315   
(2)| 00:00:04 |
|   2 |   TABLE ACCESS BY INDEX ROWID| DATE_D       |     3 |    36 |     3   
(0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0037151 |     3 |       |     2   
(0)| 00:00:01 |
|   4 |   TABLE ACCESS FULL          | SALES_FACT   |   515K|  5532K|   310   
(1)| 00:00:04 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("D"."DATE_ID"="S"."DATE_ID")
   3 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 
00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1267  consistent gets
          0  physical reads
          0  redo size
      36195  bytes sent via SQL*Net to client
       1839  bytes received via SQL*Net from client
        135  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2000  rows processed

If we hint the query to use the bitmap index, we can see it has a higher cost. 
The autotrace stats report significantly fewer consistent gets though:

Execution Plan
----------------------------------------------------------
Plan hash value: 1520624055

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost 
(%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |  3002 | 69046 |   416   
(0)| 00:00:05 |
|   1 |  NESTED LOOPS                 |              |       |       |          
  |          |
|   2 |   NESTED LOOPS                |              |  3002 | 69046 |   416   
(0)| 00:00:05 |
|   3 |    TABLE ACCESS BY INDEX ROWID| DATE_D       |     3 |    36 |     3   
(0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | SYS_C0037151 |     3 |       |     2   
(0)| 00:00:01 |
|   5 |    BITMAP CONVERSION TO ROWIDS|              |       |       |          
  |          |
|*  6 |     BITMAP INDEX SINGLE VALUE | SAFA_DATE_ID |       |       |          
  |          |
|   7 |   TABLE ACCESS BY INDEX ROWID | SALES_FACT   |  1000 | 11000 |   416   
(0)| 00:00:05 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 
00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   6 - access("D"."DATE_ID"="S"."DATE_ID")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        152  consistent gets
          1  physical reads
          0  redo size
      36195  bytes sent via SQL*Net to client
       1839  bytes received via SQL*Net from client
        135  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2000  rows processed

If you remove the join to the date dim and just use the date ids, Oracle uses 
the index as expected:

select * from sales_fact
where  date_id between 365 and 366;

Execution Plan
----------------------------------------------------------
Plan hash value: 2749560877

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost 
(%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |  3002 | 33022 |   221   
(0)| 00:00:03 |
|   1 |  TABLE ACCESS BY INDEX ROWID | SALES_FACT   |  3002 | 33022 |   221   
(0)| 00:00:03 |
|   2 |   BITMAP CONVERSION TO ROWIDS|              |       |       |           
 |          |
|*  3 |    BITMAP INDEX RANGE SCAN   | SAFA_DATE_ID |       |       |           
 |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("DATE_ID">=365 AND "DATE_ID"<=366)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        144  consistent gets
          0  physical reads
          0  redo size
      36195  bytes sent via SQL*Net to client
       1839  bytes received via SQL*Net from client
        135  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2000  rows processed

Looking at the 10053 trace, I can see this is because the bm index join cost is 
calculated as higher than the FTS of SALES_FACT:

NL Join
  Outer table: Card: 3.00  Cost: 3.00  Resp: 3.00  Degree: 1  Bytes: 12
Access path analysis for SALES_FACT
  Inner table: SALES_FACT  Alias: S
  Access Path: TableScan
    NL Join:  Cost: 929.82  Resp: 929.82  Degree: 1
      Cost_io: 920.00  Cost_cpu: 317629069
      Resp_io: 920.00  Resp_cpu: 317629069
  ****** trying bitmap/domain indexes ******
  Access Path: index (AllEqJoinGuess)
    Index: SAFA_DATE_ID
    resc_io: 1.00  resc_cpu: 8171
    ix_sel: 0.001942  ix_sel_with_filters: 0.001942
    NL Join : Cost: 6.00  Resp: 6.00  Degree: 1
      Cost_io: 6.00  Cost_cpu: 47359
      Resp_io: 6.00  Resp_cpu: 47359
  Bitmap nodes:
    Used SAFA_DATE_ID
      Cost = 6.001464, sel = 0.001942
  Access path: Bitmap index - accepted
    Cost: 416.121488 Cost_io: 415.999594 Cost_cpu: 3943626.145192 Sel: 0.001942
    Not Believed to be index-only
  ****** finished trying bitmap/domain indexes ******

  Best NL cost: 416.12
          resc: 416.12  resc_io: 416.00  resc_cpu: 3943626
          resp: 416.12  resp_io: 416.00  resc_cpu: 3943626

However, same section of the 10053 trace shows the following when using a BTree 
index on the SALES_FACT.DATE_ID column:

NL Join
  Outer table: Card: 3.00  Cost: 3.00  Resp: 3.00  Degree: 1  Bytes: 12
Access path analysis for SALES_FACT
  Inner table: SALES_FACT  Alias: S
  Access Path: TableScan
    NL Join:  Cost: 929.82  Resp: 929.82  Degree: 1
      Cost_io: 920.00  Cost_cpu: 317629069
      Resp_io: 920.00  Resp_cpu: 317629069
  Access Path: index (AllEqJoinGuess)
    Index: SAFA_DATE_ID
    resc_io: 6.00  resc_cpu: 433579
    ix_sel: 0.001942  ix_sel_with_filters: 0.001942
    NL Join : Cost: 21.04  Resp: 21.04  Degree: 1
      Cost_io: 21.00  Cost_cpu: 1323580
      Resp_io: 21.00  Resp_cpu: 1323580

  Best NL cost: 21.04
          resc: 21.04  resc_io: 21.00  resc_cpu: 1323580
          resp: 21.04  resp_io: 21.00  resc_cpu: 1323580

Why does this happen? Is this a bug or expected behaviour?

Thanks,
Chris
www.sqlfail.com<http://www.sqlfail.com>

Other related posts: