Re: Outer Joins are Evil? - NO, with detailed discussion

  • From: Dan Tow <dantow@xxxxxxxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Tue, 31 Aug 2004 12:33:30 -0500

I strongly believe the whole business of outer joins being more expensive than
inner joins is simply a *myth*, albeit a myth based on a bit of mostly-outdated
truth behind it:

-It is certainly true that the now-outdated Oracle rule-based optimizer has at
least one important bug that makes it harder to get correct execution plans for
a (relatively small) subset of outer-join queries. Specifically, the RBO does
not attempt OR-decomposition in the same query block as an outer join, so, for
example

select ... from emp e, dept d where e.deptno=d.deptno(+) and e.empno in (1,2)

will not (on the RBO) follow the index into e.empno (OR-depcomposed into
e.empno=1 OR e.empno=2) followed by nested loops into dept on deptno. Where the
indexes are selective enough, and the tables big enough, this was a major
problem in the RBO, though it had workarounds.


-It is certainly true that a silly-and-unnecessary outer join can prevent a join
order that is best. For example,

select ... from emp e, dept d where e.deptno(+)=d.deptno and e.empno=1

will generally not reach emp first on the primary key empno, followed by a
nested loop to dept, since it is on the "(+)" side of the outer join. (The
optimizer could deduce that the outer join can be made inner, since the
condition "e.empno=1" can only be true in the inner case, anyway. I know that
older versions of Oracle couldn't do this, though. I confess I haven't tried it
on newer versions - it just hasn't come up - anyone out there know if the
optimizer automates this conversion, now?) This sort of outer join is generally
not necessary, though, since it is better to make the outer join an inner join
in a case like this where you clearly don't intend to get the outer case,
anyway. (I grant that this is inconvenient in the case of views, though - a
view that does the join "e.deptno(+)=d.deptno" will not be efficently usable in
a query on "empno=1" until this has been fixed in the optimizer (if it hasn't).
In such a case, you'd need to use a similar view with the join made inner, but
that would defeat some of the code-reduction goal of using views extensively.
(I point out in the book, though, that there are *many* performance penalties
to over-free use of views, especially complex views, so elimination of outer
joins still will not free you to use views indiscriminately.)

-DB2 had an even worse bug that resulted in super-long parses that finally
errored out after about a half hour when you had too many (more than about 15)
outer joins in the same query block. I expect this is fixed in more-recent
versions of DB2, but I haven't used these, so I'm not sure. I describe the
(rather complex) workaround for this bug on page 101 of my book (SQL Tuning,
O'Reilly).

I've never seen any evidence that outer joins perform badly except where they
were affected by at least one of the above issues. (I'm happy to hear of
measured, empirical evidence to prove me wrong, though. I think a sufficient
proof would be two queries that get essentially the *same* execution plans
(except for addition of "OUTER" to the join methods) against the same data,
where the outer-joined version is measurably, reproducibly slower in repeatable
tests that give no caching advantage to the inner-joined version.)

*In theory, at least*, there's not a single reason why outer joins should be
slower than inner joins with artificial default keys and rows for what would
have been the outer cases. *On the contrary*, the outer joins should be easier!
In particular, where a nested-loops join is best, the outer join will generally
find a NULL in the foreign key pointing toward the "(+)" side of the join, and
will not need any logical I/O at all to perform the join. In contrast, the
inner-join workaround design will find a real, non-null foreign key that points
to a real row of the other table (for example, a special
"This_employee_has_no_department" department created to avoid the outer case of
what you'd otherwise have to express as "e.deptno=d.deptno(+)"). That row will
surely be well-cached if the outer case is common, but you'll still need to do
logical I/O (with CPU costs) to both the index and the table to reach it,
repeatedly. Even worse, super-high-access blocks such as the index and table
blocks holding these default rows, where the outer case (which you are
eliminating) would have been common, can themselves be serialization points,
with bottlenecks on latches controlling access to the shared memory holding
these blocks. With hash and sort-merge joins (*where these are optimal*) the
difference between the inner and outer versions should virtually vanish - the
outer version must read a single extra does-not-apply default row for the table
on the "(+)" side, *once*, and that's unlikely to cost enough to measure, much
less to matter.

Even parses of outer-joined queries *should* be easier, correctly implemented,
because the optimizer can know with dead certainty that an outer join to a
primary key (followed by no filter on the outer-joined table, as is usual) will
result in *precisely* as many rows after the join as it had before the join,
and this greatly simplifies the question of where to put these joins in the
join order, and how many permutations in join-order need be examined.

Thanks,

Dan Tow
650-858-1557
www.singingsql.com


Quoting "Barr, Stephen" <Stephen.Barr@xxxxxxxxx>:

> I'm currently working on a datawarehouse project (~5 Tb) where the decision
> has been made to avoid performing outer joins.
>
> The reasons given for this seem to be -
>
> 1.    Simplifies user navigation of the structures - i.e. avoids outer
> joins.
> 2.    Outer joins are slow and should be avoided at all costs.
> 3.    If an FK is missing it is populated with a default value which will
> relate to an actual row in the target table, hence no rows will ever be
> dropped - again, supposedly this is to simply SQL and avoid outer joins.
>
> What they actually do is populate each table in the structure with three
> default rows with an SK of 0, 1 & 2. Any FK's which are missing, not
> applicable or invalid will point to one of these rows.
>
> Now....my question is....what is so inherently evil about outer joins that
> we go to this extreme to avoid them?
>
> AND...has anyone else seen something like this deployed in other places?
>
> Thanks,
>
> Steve.
>
>
>
>
> -----------------------------------------------------------------------
> Information in this email may be privileged, confidential and is
> intended exclusively for the addressee.  The views expressed may
> not be official policy, but the personal views of the originator.
> If you have received it in error, please notify the sender by return
> e-mail and delete it from your system.  You should not reproduce,
> distribute, store, retransmit, use or disclose its contents to anyone.
>
> Please note we reserve the right to monitor all e-mail
> communication through our internal and external networks.
> -----------------------------------------------------------------------
>
>
>
> ----------------------------------------------------------------
> Please see the official ORACLE-L FAQ: http://www.orafaq.com
> ----------------------------------------------------------------
> To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
> put 'unsubscribe' in the subject line.
> --
> Archives are at //www.freelists.org/archives/oracle-l/
> FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
> -----------------------------------------------------------------
>

----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Other related posts:

  • » Re: Outer Joins are Evil? - NO, with detailed discussion