FW: Bitmap index not used when joining tables

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Sat, 31 May 2014 15:24:17 -0400

hotel email bounce; trying again

 

From: Mark W. Farnham [mailto:mwf@xxxxxxxx] 
Sent: Saturday, May 31, 2014 3:14 PM
To: 'jonathan@xxxxxxxxxxxxxxxxxx'; 'oracle-l@xxxxxxxxxxxxx'
Subject: RE: Bitmap index not used when joining tables

 

I do wonder about what your results might be for these two queries if you
modified your query slightly as:

 

select 

--+ gather_plan_statistics

 s.* from sales_fact s

       where s.rowid in (

       select

       --+ no_merge

          s2.rowid from sales_fact s2

          where s2.date_id in (

          select 

          --+ no_merge

             d.date_id from date_d d

             where calendar_date between date'2013-12-31' and
date'2014-01-01'

          )

       );

  

The idea being that if the only column it needs in the unmerged subquery
(rowid) can be sourced from the index, this should be less total work.

Possibly this is also optimized if the subquery is ordered by rowid, but
that's not the curiosity at this point. Of course even if this works you
would only want to form it up this way when you knew it was to your benefit
(knowledge the CBO can only guess by way of the thumbrule quoted by JL, and
some other thumbrule for a literal range.)

 

I just typed this in and didn't test it, and working is release dependent,
so you and I got lucky if this works via cut and paste. Once debugged I
would sort of expect this to use the index in both cases to fetch back the
rowid list.

 

mwf

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Jonathan Lewis
Sent: Saturday, May 31, 2014 1:11 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: RE: Bitmap index not used when joining tables

 

 

It's not surprising for the example you've given. 

 

The basic principle is that the cost of a single table access via a bitmap
index is likely to be different from the cost of access via the equivalent
btree index because the bitmap index has no information about data
clustering (i.e no clustering_factor) so it uses a guess.  Sometimes this
means the cost of the bitmap will be higher, sometimes lower, sometimes the
same as for a btree.  In your case the btree clustering factor will be very
low (because of the order by in the CTAS) while the guess basically assumes
that 20% of the data will be very widely scattered - in your case that
probably means 400 blocks (20% of 2000 rows => 400 blocks).

 

 

 

   
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle 

  _____  

From: oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on
behalf of Chris Saxon [chris.saxon@xxxxxxxxx]
Sent: 31 May 2014 16:21
To: oracle-l@xxxxxxxxxxxxx
Subject: Bitmap index not used when joining tables

Hi, 

 

I've been testing using bitmap indexes on 11.2.0.2 EE. When joining two
tables on a column with a BTree index, the index is used in the execution
plan. If this index is changed to a bitmap index, Oracle no longer uses the
index when executing the query! It assigns a higher cost to using the bitmap
index when joining, despite this being a cheaper approach (in terms of
consistent gets).

 

To see this, I created a date dimension table with 515 days (rows) and a
sales fact table with 1,000 rows for each day:

 

create table date_d (

  date_id integer not null primary key,

  calendar_date date not null unique

);

 

create table sales_fact (

  date_id integer not null 

    references date_d (date_id),

  quantity number not null,

  total_value number not null

);

 

insert into date_d

  select rownum, date'2013-01-01'-1+rownum

  from   dual

  connect by level <= sysdate - date'2013-01-01';

  

insert into sales_fact

with rws as (select * from dual connect by level <= 1000)

  select d.date_id, round(dbms_random.value(1, 20)),
round(dbms_random.value(10, 100), 2)

  from   date_d d

  cross  join rws

  order  by d.date_id;

 

begin

  dbms_stats.gather_table_stats(user, 'sales_fact');

  dbms_stats.gather_table_stats(user, 'date_d');

end;

/

 

If I create a BTree index on SALES_FACT.DATE_ID, then join from the date dim
to the fact table, restricting to two days, Oracle uses the index on the
fact table as I would expect (as we're fetching 2,000 of 515,000 rows):

 

create index safa_date_id on sales_fact (date_id);

 

set autotrace trace

select s.* from sales_fact s join date_d d

on     d.date_id = s.date_id

where  calendar_date between date'2013-12-31' and date'2014-01-01';

 

set autotrace off

 

Execution Plan

----------------------------------------------------------

Plan hash value: 2189554905

 

----------------------------------------------------------------------------
------------------

| Id  | Operation                     | Name         | Rows  | Bytes | Cost
(%CPU)| Time     |

----------------------------------------------------------------------------
------------------

|   0 | SELECT STATEMENT              |              |  3002 | 69046 |    21
(0)| 00:00:01 |

|   1 |  NESTED LOOPS                 |              |       |       |
|          |

|   2 |   NESTED LOOPS                |              |  3002 | 69046 |    21
(0)| 00:00:01 |

|   3 |    TABLE ACCESS BY INDEX ROWID| DATE_D       |     3 |    36 |     3
(0)| 00:00:01 |

|*  4 |     INDEX RANGE SCAN          | SYS_C0037151 |     3 |       |     2
(0)| 00:00:01 |

|*  5 |    INDEX RANGE SCAN           | SAFA_DATE_ID |  1000 |       |     3
(0)| 00:00:01 |

|   6 |   TABLE ACCESS BY INDEX ROWID | SALES_FACT   |  1000 | 11000 |     6
(0)| 00:00:01 |

----------------------------------------------------------------------------
------------------

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00',
'syyyy-mm-dd

              hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01
00:00:00', 'syyyy-mm-dd

              hh24:mi:ss'))

   5 - access("D"."DATE_ID"="S"."DATE_ID")

 

 

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets

        290  consistent gets

          6  physical reads

          0  redo size

      36195  bytes sent via SQL*Net to client

       1839  bytes received via SQL*Net from client

        135  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

       2000  rows processed

 

However, if I drop the normal index and re-create it as a bitmap index the
query above changes to a FTS on the SALES_FACT table:

 

set autotrace off

drop index safa_date_id;

create bitmap index safa_date_id on sales_fact (date_id);

 

set autotrace trace

select s.* from sales_fact s join date_d d

on     d.date_id = s.date_id

where  calendar_date between date'2013-12-31' and date'2014-01-01';

 

Execution Plan

----------------------------------------------------------

Plan hash value: 525754326

 

----------------------------------------------------------------------------
-----------------

| Id  | Operation                    | Name         | Rows  | Bytes | Cost
(%CPU)| Time     |

----------------------------------------------------------------------------
-----------------

|   0 | SELECT STATEMENT             |              |  3002 | 69046 |   315
(2)| 00:00:04 |

|*  1 |  HASH JOIN                   |              |  3002 | 69046 |   315
(2)| 00:00:04 |

|   2 |   TABLE ACCESS BY INDEX ROWID| DATE_D       |     3 |    36 |     3
(0)| 00:00:01 |

|*  3 |    INDEX RANGE SCAN          | SYS_C0037151 |     3 |       |     2
(0)| 00:00:01 |

|   4 |   TABLE ACCESS FULL          | SALES_FACT   |   515K|  5532K|   310
(1)| 00:00:04 |

----------------------------------------------------------------------------
-----------------

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   1 - access("D"."DATE_ID"="S"."DATE_ID")

   3 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00',
'syyyy-mm-dd

              hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01
00:00:00', 'syyyy-mm-dd

              hh24:mi:ss'))

 

 

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets

       1267  consistent gets

          0  physical reads

          0  redo size

      36195  bytes sent via SQL*Net to client

       1839  bytes received via SQL*Net from client

        135  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

       2000  rows processed

 

If we hint the query to use the bitmap index, we can see it has a higher
cost. The autotrace stats report significantly fewer consistent gets though:

 

Execution Plan

----------------------------------------------------------

Plan hash value: 1520624055

 

----------------------------------------------------------------------------
------------------

| Id  | Operation                     | Name         | Rows  | Bytes | Cost
(%CPU)| Time     |

----------------------------------------------------------------------------
------------------

|   0 | SELECT STATEMENT              |              |  3002 | 69046 |   416
(0)| 00:00:05 |

|   1 |  NESTED LOOPS                 |              |       |       |
|          |

|   2 |   NESTED LOOPS                |              |  3002 | 69046 |   416
(0)| 00:00:05 |

|   3 |    TABLE ACCESS BY INDEX ROWID| DATE_D       |     3 |    36 |     3
(0)| 00:00:01 |

|*  4 |     INDEX RANGE SCAN          | SYS_C0037151 |     3 |       |     2
(0)| 00:00:01 |

|   5 |    BITMAP CONVERSION TO ROWIDS|              |       |       |
|          |

|*  6 |     BITMAP INDEX SINGLE VALUE | SAFA_DATE_ID |       |       |
|          |

|   7 |   TABLE ACCESS BY INDEX ROWID | SALES_FACT   |  1000 | 11000 |   416
(0)| 00:00:05 |

----------------------------------------------------------------------------
------------------

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00',
'syyyy-mm-dd

              hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01
00:00:00', 'syyyy-mm-dd

              hh24:mi:ss'))

   6 - access("D"."DATE_ID"="S"."DATE_ID")

 

 

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets

        152  consistent gets

          1  physical reads

          0  redo size

      36195  bytes sent via SQL*Net to client

       1839  bytes received via SQL*Net from client

        135  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

       2000  rows processed

 

If you remove the join to the date dim and just use the date ids, Oracle
uses the index as expected:

 

select * from sales_fact

where  date_id between 365 and 366;

 

Execution Plan

----------------------------------------------------------

Plan hash value: 2749560877

 

----------------------------------------------------------------------------
-----------------

| Id  | Operation                    | Name         | Rows  | Bytes | Cost
(%CPU)| Time     |

----------------------------------------------------------------------------
-----------------

|   0 | SELECT STATEMENT             |              |  3002 | 33022 |   221
(0)| 00:00:03 |

|   1 |  TABLE ACCESS BY INDEX ROWID | SALES_FACT   |  3002 | 33022 |   221
(0)| 00:00:03 |

|   2 |   BITMAP CONVERSION TO ROWIDS|              |       |       |
|          |

|*  3 |    BITMAP INDEX RANGE SCAN   | SAFA_DATE_ID |       |       |
|          |

----------------------------------------------------------------------------
-----------------

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   3 - access("DATE_ID">=365 AND "DATE_ID"<=366)

 

 

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets

        144  consistent gets

          0  physical reads

          0  redo size

      36195  bytes sent via SQL*Net to client

       1839  bytes received via SQL*Net from client

        135  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

       2000  rows processed

 

Looking at the 10053 trace, I can see this is because the bm index join cost
is calculated as higher than the FTS of SALES_FACT:

 

NL Join

  Outer table: Card: 3.00  Cost: 3.00  Resp: 3.00  Degree: 1  Bytes: 12

Access path analysis for SALES_FACT

  Inner table: SALES_FACT  Alias: S

  Access Path: TableScan

    NL Join:  Cost: 929.82  Resp: 929.82  Degree: 1

      Cost_io: 920.00  Cost_cpu: 317629069

      Resp_io: 920.00  Resp_cpu: 317629069

  ****** trying bitmap/domain indexes ******

  Access Path: index (AllEqJoinGuess)

    Index: SAFA_DATE_ID

    resc_io: 1.00  resc_cpu: 8171

    ix_sel: 0.001942  ix_sel_with_filters: 0.001942 

    NL Join : Cost: 6.00  Resp: 6.00  Degree: 1

      Cost_io: 6.00  Cost_cpu: 47359

      Resp_io: 6.00  Resp_cpu: 47359

  Bitmap nodes:

    Used SAFA_DATE_ID

      Cost = 6.001464, sel = 0.001942

  Access path: Bitmap index - accepted

    Cost: 416.121488 Cost_io: 415.999594 Cost_cpu: 3943626.145192 Sel:
0.001942

    Not Believed to be index-only

  ****** finished trying bitmap/domain indexes ******

 

  Best NL cost: 416.12

          resc: 416.12  resc_io: 416.00  resc_cpu: 3943626

          resp: 416.12  resp_io: 416.00  resc_cpu: 3943626

 

However, same section of the 10053 trace shows the following when using a
BTree index on the SALES_FACT.DATE_ID column:

 

NL Join

  Outer table: Card: 3.00  Cost: 3.00  Resp: 3.00  Degree: 1  Bytes: 12

Access path analysis for SALES_FACT

  Inner table: SALES_FACT  Alias: S

  Access Path: TableScan

    NL Join:  Cost: 929.82  Resp: 929.82  Degree: 1

      Cost_io: 920.00  Cost_cpu: 317629069

      Resp_io: 920.00  Resp_cpu: 317629069

  Access Path: index (AllEqJoinGuess)

    Index: SAFA_DATE_ID

    resc_io: 6.00  resc_cpu: 433579

    ix_sel: 0.001942  ix_sel_with_filters: 0.001942 

    NL Join : Cost: 21.04  Resp: 21.04  Degree: 1

      Cost_io: 21.00  Cost_cpu: 1323580

      Resp_io: 21.00  Resp_cpu: 1323580

 

  Best NL cost: 21.04

          resc: 21.04  resc_io: 21.00  resc_cpu: 1323580

          resp: 21.04  resp_io: 21.00  resc_cpu: 1323580

 

Why does this happen? Is this a bug or expected behaviour?

 

Thanks,

Chris

www.sqlfail.com

Other related posts: