Chris, (From my Oracle Wait Interface Book: Page 278...Bitmap index costing, changing the costing methods,etc) //www.freelists.org/post/oracle-l/BITMAP-index-cost-10053-trace,4 -Gopal On Sun, Jun 1, 2014 at 4:32 AM, Chris Saxon <chris.saxon@xxxxxxxxx> wrote: > Thanks for the explanation Jonathan. > > Could you explain where the 20% value comes from please? Is this a default > for bitmap indexes, or something you've inferred from the info posted? > > > On 31 May 2014 20:24, Mark W. Farnham <mwf@xxxxxxxx> wrote: > >> hotel email bounce; trying again >> >> >> >> *From:* Mark W. Farnham [mailto:mwf@xxxxxxxx] >> *Sent:* Saturday, May 31, 2014 3:14 PM >> *To:* 'jonathan@xxxxxxxxxxxxxxxxxx'; 'oracle-l@xxxxxxxxxxxxx' >> *Subject:* RE: Bitmap index not used when joining tables >> >> >> >> I do wonder about what your results might be for these two queries if you >> modified your query slightly as: >> >> >> >> select >> >> --+ gather_plan_statistics >> >> s.* from sales_fact s >> >> where s.rowid in ( >> >> select >> >> --+ no_merge >> >> s2.rowid from sales_fact s2 >> >> where s2.date_id in ( >> >> select >> >> --+ no_merge >> >> d.date_id from date_d d >> >> where calendar_date between date’2013-12-31’ and >> date’2014-01-01’ >> >> ) >> >> ); >> >> >> >> The idea being that if the only column it needs in the unmerged subquery >> (rowid) can be sourced from the index, this should be less total work. >> >> Possibly this is also optimized if the subquery is ordered by rowid, but >> that’s not the curiosity at this point. Of course even if this works you >> would only want to form it up this way when you knew it was to your benefit >> (knowledge the CBO can only guess by way of the thumbrule quoted by JL, and >> some other thumbrule for a literal range.) >> >> >> >> I just typed this in and didn’t test it, and working is release >> dependent, so you and I got lucky if this works via cut and paste. Once >> debugged I would sort of expect this to use the index in both cases to >> fetch back the rowid list. >> >> >> >> mwf >> >> *From:* oracle-l-bounce@xxxxxxxxxxxxx [ >> mailto:oracle-l-bounce@xxxxxxxxxxxxx <oracle-l-bounce@xxxxxxxxxxxxx>] *On >> Behalf Of *Jonathan Lewis >> *Sent:* Saturday, May 31, 2014 1:11 PM >> *To:* oracle-l@xxxxxxxxxxxxx >> *Subject:* RE: Bitmap index not used when joining tables >> >> >> >> >> >> It's not surprising for the example you've given. >> >> >> >> The basic principle is that the cost of a single table access via a >> bitmap index is likely to be different from the cost of access via the >> equivalent btree index because the bitmap index has no information about >> data clustering (i.e no clustering_factor) so it uses a guess. Sometimes >> this means the cost of the bitmap will be higher, sometimes lower, >> sometimes the same as for a btree. In your case the btree clustering >> factor will be very low (because of the order by in the CTAS) while the >> guess basically assumes that 20% of the data will be very widely scattered >> - in your case that probably means 400 blocks (20% of 2000 rows => 400 >> blocks). >> >> >> >> >> >> >> >> >> Regards >> Jonathan Lewis >> http://jonathanlewis.wordpress.com >> @jloracle >> ------------------------------ >> >> *From:* oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on >> behalf of Chris Saxon [chris.saxon@xxxxxxxxx] >> *Sent:* 31 May 2014 16:21 >> *To:* oracle-l@xxxxxxxxxxxxx >> *Subject:* Bitmap index not used when joining tables >> >> Hi, >> >> >> >> I've been testing using bitmap indexes on 11.2.0.2 EE. When joining two >> tables on a column with a BTree index, the index is used in the execution >> plan. If this index is changed to a bitmap index, Oracle no longer uses the >> index when executing the query! It assigns a higher cost to using the >> bitmap index when joining, despite this being a cheaper approach (in terms >> of consistent gets). >> >> >> >> To see this, I created a date dimension table with 515 days (rows) and a >> sales fact table with 1,000 rows for each day: >> >> >> >> create table date_d ( >> >> date_id integer not null primary key, >> >> calendar_date date not null unique >> >> ); >> >> >> >> create table sales_fact ( >> >> date_id integer not null >> >> references date_d (date_id), >> >> quantity number not null, >> >> total_value number not null >> >> ); >> >> >> >> insert into date_d >> >> select rownum, date'2013-01-01'-1+rownum >> >> from dual >> >> connect by level <= sysdate - date'2013-01-01'; >> >> >> >> insert into sales_fact >> >> with rws as (select * from dual connect by level <= 1000) >> >> select d.date_id, round(dbms_random.value(1, 20)), >> round(dbms_random.value(10, 100), 2) >> >> from date_d d >> >> cross join rws >> >> order by d.date_id; >> >> >> >> begin >> >> dbms_stats.gather_table_stats(user, 'sales_fact'); >> >> dbms_stats.gather_table_stats(user, 'date_d'); >> >> end; >> >> / >> >> >> >> If I create a BTree index on SALES_FACT.DATE_ID, then join from the date >> dim to the fact table, restricting to two days, Oracle uses the index on >> the fact table as I would expect (as we're fetching 2,000 of 515,000 rows): >> >> >> >> create index safa_date_id on sales_fact (date_id); >> >> >> >> set autotrace trace >> >> select s.* from sales_fact s join date_d d >> >> on d.date_id = s.date_id >> >> where calendar_date between date'2013-12-31' and date'2014-01-01'; >> >> >> >> set autotrace off >> >> >> >> Execution Plan >> >> ---------------------------------------------------------- >> >> Plan hash value: 2189554905 >> >> >> >> >> ---------------------------------------------------------------------------------------------- >> >> | Id | Operation | Name | Rows | Bytes | >> Cost (%CPU)| Time | >> >> >> ---------------------------------------------------------------------------------------------- >> >> | 0 | SELECT STATEMENT | | 3002 | 69046 | >> 21 (0)| 00:00:01 | >> >> | 1 | NESTED LOOPS | | | | >> | | >> >> | 2 | NESTED LOOPS | | 3002 | 69046 | >> 21 (0)| 00:00:01 | >> >> | 3 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 | >> 3 (0)| 00:00:01 | >> >> |* 4 | INDEX RANGE SCAN | SYS_C0037151 | 3 | | >> 2 (0)| 00:00:01 | >> >> |* 5 | INDEX RANGE SCAN | SAFA_DATE_ID | 1000 | | >> 3 (0)| 00:00:01 | >> >> | 6 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 1000 | 11000 | >> 6 (0)| 00:00:01 | >> >> >> ---------------------------------------------------------------------------------------------- >> >> >> >> Predicate Information (identified by operation id): >> >> --------------------------------------------------- >> >> >> >> 4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', >> 'syyyy-mm-dd >> >> hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 >> 00:00:00', 'syyyy-mm-dd >> >> hh24:mi:ss')) >> >> 5 - access("D"."DATE_ID"="S"."DATE_ID") >> >> >> >> >> >> Statistics >> >> ---------------------------------------------------------- >> >> 1 recursive calls >> >> 0 db block gets >> >> 290 consistent gets >> >> 6 physical reads >> >> 0 redo size >> >> 36195 bytes sent via SQL*Net to client >> >> 1839 bytes received via SQL*Net from client >> >> 135 SQL*Net roundtrips to/from client >> >> 0 sorts (memory) >> >> 0 sorts (disk) >> >> 2000 rows processed >> >> >> >> However, if I drop the normal index and re-create it as a bitmap index >> the query above changes to a FTS on the SALES_FACT table: >> >> >> >> set autotrace off >> >> drop index safa_date_id; >> >> create bitmap index safa_date_id on sales_fact (date_id); >> >> >> >> set autotrace trace >> >> select s.* from sales_fact s join date_d d >> >> on d.date_id = s.date_id >> >> where calendar_date between date'2013-12-31' and date'2014-01-01'; >> >> >> >> Execution Plan >> >> ---------------------------------------------------------- >> >> Plan hash value: 525754326 >> >> >> >> >> --------------------------------------------------------------------------------------------- >> >> | Id | Operation | Name | Rows | Bytes | >> Cost (%CPU)| Time | >> >> >> --------------------------------------------------------------------------------------------- >> >> | 0 | SELECT STATEMENT | | 3002 | 69046 | >> 315 (2)| 00:00:04 | >> >> |* 1 | HASH JOIN | | 3002 | 69046 | >> 315 (2)| 00:00:04 | >> >> | 2 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 | >> 3 (0)| 00:00:01 | >> >> |* 3 | INDEX RANGE SCAN | SYS_C0037151 | 3 | | >> 2 (0)| 00:00:01 | >> >> | 4 | TABLE ACCESS FULL | SALES_FACT | 515K| 5532K| >> 310 (1)| 00:00:04 | >> >> >> --------------------------------------------------------------------------------------------- >> >> >> >> Predicate Information (identified by operation id): >> >> --------------------------------------------------- >> >> >> >> 1 - access("D"."DATE_ID"="S"."DATE_ID") >> >> 3 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', >> 'syyyy-mm-dd >> >> hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 >> 00:00:00', 'syyyy-mm-dd >> >> hh24:mi:ss')) >> >> >> >> >> >> Statistics >> >> ---------------------------------------------------------- >> >> 1 recursive calls >> >> 0 db block gets >> >> 1267 consistent gets >> >> 0 physical reads >> >> 0 redo size >> >> 36195 bytes sent via SQL*Net to client >> >> 1839 bytes received via SQL*Net from client >> >> 135 SQL*Net roundtrips to/from client >> >> 0 sorts (memory) >> >> 0 sorts (disk) >> >> 2000 rows processed >> >> >> >> If we hint the query to use the bitmap index, we can see it has a higher >> cost. The autotrace stats report significantly fewer consistent gets though: >> >> >> >> Execution Plan >> >> ---------------------------------------------------------- >> >> Plan hash value: 1520624055 >> >> >> >> >> ---------------------------------------------------------------------------------------------- >> >> | Id | Operation | Name | Rows | Bytes | >> Cost (%CPU)| Time | >> >> >> ---------------------------------------------------------------------------------------------- >> >> | 0 | SELECT STATEMENT | | 3002 | 69046 | >> 416 (0)| 00:00:05 | >> >> | 1 | NESTED LOOPS | | | | >> | | >> >> | 2 | NESTED LOOPS | | 3002 | 69046 | >> 416 (0)| 00:00:05 | >> >> | 3 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 | >> 3 (0)| 00:00:01 | >> >> |* 4 | INDEX RANGE SCAN | SYS_C0037151 | 3 | | >> 2 (0)| 00:00:01 | >> >> | 5 | BITMAP CONVERSION TO ROWIDS| | | | >> | | >> >> |* 6 | BITMAP INDEX SINGLE VALUE | SAFA_DATE_ID | | | >> | | >> >> | 7 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 1000 | 11000 | >> 416 (0)| 00:00:05 | >> >> >> ---------------------------------------------------------------------------------------------- >> >> >> >> Predicate Information (identified by operation id): >> >> --------------------------------------------------- >> >> >> >> 4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', >> 'syyyy-mm-dd >> >> hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 >> 00:00:00', 'syyyy-mm-dd >> >> hh24:mi:ss')) >> >> 6 - access("D"."DATE_ID"="S"."DATE_ID") >> >> >> >> >> >> Statistics >> >> ---------------------------------------------------------- >> >> 1 recursive calls >> >> 0 db block gets >> >> 152 consistent gets >> >> 1 physical reads >> >> 0 redo size >> >> 36195 bytes sent via SQL*Net to client >> >> 1839 bytes received via SQL*Net from client >> >> 135 SQL*Net roundtrips to/from client >> >> 0 sorts (memory) >> >> 0 sorts (disk) >> >> 2000 rows processed >> >> >> >> If you remove the join to the date dim and just use the date ids, Oracle >> uses the index as expected: >> >> >> >> select * from sales_fact >> >> where date_id between 365 and 366; >> >> >> >> Execution Plan >> >> ---------------------------------------------------------- >> >> Plan hash value: 2749560877 >> >> >> >> >> --------------------------------------------------------------------------------------------- >> >> | Id | Operation | Name | Rows | Bytes | >> Cost (%CPU)| Time | >> >> >> --------------------------------------------------------------------------------------------- >> >> | 0 | SELECT STATEMENT | | 3002 | 33022 | >> 221 (0)| 00:00:03 | >> >> | 1 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 3002 | 33022 | >> 221 (0)| 00:00:03 | >> >> | 2 | BITMAP CONVERSION TO ROWIDS| | | | >> | | >> >> |* 3 | BITMAP INDEX RANGE SCAN | SAFA_DATE_ID | | | >> | | >> >> >> --------------------------------------------------------------------------------------------- >> >> >> >> Predicate Information (identified by operation id): >> >> --------------------------------------------------- >> >> >> >> 3 - access("DATE_ID">=365 AND "DATE_ID"<=366) >> >> >> >> >> >> Statistics >> >> ---------------------------------------------------------- >> >> 1 recursive calls >> >> 0 db block gets >> >> 144 consistent gets >> >> 0 physical reads >> >> 0 redo size >> >> 36195 bytes sent via SQL*Net to client >> >> 1839 bytes received via SQL*Net from client >> >> 135 SQL*Net roundtrips to/from client >> >> 0 sorts (memory) >> >> 0 sorts (disk) >> >> 2000 rows processed >> >> >> >> Looking at the 10053 trace, I can see this is because the bm index join >> cost is calculated as higher than the FTS of SALES_FACT: >> >> >> >> NL Join >> >> Outer table: Card: 3.00 Cost: 3.00 Resp: 3.00 Degree: 1 Bytes: 12 >> >> Access path analysis for SALES_FACT >> >> Inner table: SALES_FACT Alias: S >> >> Access Path: TableScan >> >> NL Join: Cost: 929.82 Resp: 929.82 Degree: 1 >> >> Cost_io: 920.00 Cost_cpu: 317629069 >> >> Resp_io: 920.00 Resp_cpu: 317629069 >> >> ****** trying bitmap/domain indexes ****** >> >> Access Path: index (AllEqJoinGuess) >> >> Index: SAFA_DATE_ID >> >> resc_io: 1.00 resc_cpu: 8171 >> >> ix_sel: 0.001942 ix_sel_with_filters: 0.001942 >> >> NL Join : Cost: 6.00 Resp: 6.00 Degree: 1 >> >> Cost_io: 6.00 Cost_cpu: 47359 >> >> Resp_io: 6.00 Resp_cpu: 47359 >> >> Bitmap nodes: >> >> Used SAFA_DATE_ID >> >> Cost = 6.001464, sel = 0.001942 >> >> Access path: Bitmap index - accepted >> >> Cost: 416.121488 Cost_io: 415.999594 Cost_cpu: 3943626.145192 Sel: >> 0.001942 >> >> Not Believed to be index-only >> >> ****** finished trying bitmap/domain indexes ****** >> >> >> >> Best NL cost: 416.12 >> >> resc: 416.12 resc_io: 416.00 resc_cpu: 3943626 >> >> resp: 416.12 resp_io: 416.00 resc_cpu: 3943626 >> >> >> >> However, same section of the 10053 trace shows the following when using a >> BTree index on the SALES_FACT.DATE_ID column: >> >> >> >> NL Join >> >> Outer table: Card: 3.00 Cost: 3.00 Resp: 3.00 Degree: 1 Bytes: 12 >> >> Access path analysis for SALES_FACT >> >> Inner table: SALES_FACT Alias: S >> >> Access Path: TableScan >> >> NL Join: Cost: 929.82 Resp: 929.82 Degree: 1 >> >> Cost_io: 920.00 Cost_cpu: 317629069 >> >> Resp_io: 920.00 Resp_cpu: 317629069 >> >> Access Path: index (AllEqJoinGuess) >> >> Index: SAFA_DATE_ID >> >> resc_io: 6.00 resc_cpu: 433579 >> >> ix_sel: 0.001942 ix_sel_with_filters: 0.001942 >> >> NL Join : Cost: 21.04 Resp: 21.04 Degree: 1 >> >> Cost_io: 21.00 Cost_cpu: 1323580 >> >> Resp_io: 21.00 Resp_cpu: 1323580 >> >> >> >> Best NL cost: 21.04 >> >> resc: 21.04 resc_io: 21.00 resc_cpu: 1323580 >> >> resp: 21.04 resp_io: 21.00 resc_cpu: 1323580 >> >> >> >> Why does this happen? Is this a bug or expected behaviour? >> >> >> >> Thanks, >> >> Chris >> >> www.sqlfail.com >> > >