> I'm currently working on a datawarehouse project (~5 Tb) where the decision > has been made to avoid performing outer joins. > > The reasons given for this seem to be - > > 1. Simplifies user navigation of the structures - i.e. avoids outer > joins. Education will fix that. > 2. Outer joins are slow and should be avoided at all costs. Really? I would expect outer joins to be faster. Let's find out. First create the test data. All tables will have 10,000 rows OJ1 and OJ2 have identical rows. OJ2 and OJ3 have no rows in common OJ2 and OJ4 have 50% of rows in common. Here's the SQL to create the test. --------------------------------------------------------- drop table oj1 cascade constraints; drop table oj2 cascade constraints; drop table oj3 cascade constraints; drop table oj4 cascade constraints; create table oj1 as select owner city , object_name last_name , object_type first_name from dba_objects / create table oj2 as select * from oj1 where rownum <= 10000 / create table oj3 as select * from oj1 where (city, last_name, first_name) not in ( select city, last_name, first_name from oj2 ) and rownum <= 10000 / drop table oj1 cascade constraints; create table oj1 as select * from oj2 / create table oj4 as select * from oj2 where rownum <= 5000 / insert into oj4 select * from oj3 where rownum <= 5000 / create index oj1_idx on oj1(city,last_name, first_name); create index oj2_idx on oj2(city,last_name, first_name); create index oj3_idx on oj3(city,last_name, first_name); create index oj4_idx on oj4(city,last_name, first_name); exec dbms_stats.gather_table_stats('JS001292','OJ1') exec dbms_stats.gather_table_stats('JS001292','OJ2') exec dbms_stats.gather_table_stats('JS001292','OJ3') exec dbms_stats.gather_table_stats('JS001292','OJ4') select 'OJ1' table_name, count(*) from oj1; select 'OJ2' table_name, count(*) from oj2; select 'OJ3' table_name, count(*) from oj3; select 'OJ4' table_name, count(*) from oj4; --------------------------------------------------------- Here are the test queries: --------------------------------------------------------- set timing on select count(*) oj2_oj1_join_count from oj2, oj1 where oj2.city = oj1.city and oj2.last_name = oj1.last_name and oj2.first_name = oj1.first_name / select count(*) oj2_oj3_join_count from oj2, oj3 where oj2.city = oj3.city and oj2.last_name = oj3.last_name and oj2.first_name = oj3.first_name / select count(*) oj2_oj3_outer_join_count from oj2, oj3 where oj2.city = oj3.city(+) and oj2.last_name = oj3.last_name(+) and oj2.first_name = oj3.first_name(+) / select count(*) oj2_oj4_outer_join_count from oj2, oj4 where oj2.city = oj4.city(+) and oj2.last_name = oj4.last_name(+) and oj2.first_name = oj4.first_name(+) / --------------------------------------------------------- Here are the results: --------------------------------------------------------- OJ2_OJ1_JOIN_COUNT ------------------ 10012 1 row selected. Elapsed: 00:00:00.03 OJ2_OJ3_JOIN_COUNT ------------------ 0 1 row selected. Elapsed: 00:00:00.03 OJ2_OJ3_OUTER_JOIN_COUNT ------------------------ 10000 1 row selected. Elapsed: 00:00:00.03 OJ2_OJ4_OUTER_JOIN_COUNT ------------------------ 10000 1 row selected. Elapsed: 00:00:00.03 --------------------------------------------------------- I ran these several times. Sometimes there was 0.01 seconds difference in the time, sometimes not. Could be due to a number of things. Tables are already in the buffer cach, so that is not much of a factor. It turns out I was wrong. Outer joins were not faster, but they also were not slower. This kind of alleviates the need to respond to item # 3. Jared ---------------------------------------------------------------- Please see the official ORACLE-L FAQ: http://www.orafaq.com ---------------------------------------------------------------- To unsubscribe send email to: oracle-l-request@xxxxxxxxxxxxx put 'unsubscribe' in the subject line. -- Archives are at //www.freelists.org/archives/oracle-l/ FAQ is at //www.freelists.org/help/fom-serve/cache/1.html -----------------------------------------------------------------