RE: should one use ANSI join syntax when writing an Oracle application?

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <mnavickas@xxxxxxxxx>, <breitliw@xxxxxxxxxxxxx>, "'oracle-l'" <oracle-l@xxxxxxxxxxxxx>
  • Date: Wed, 18 Oct 2006 09:24:41 -0400

As we rejoin the essential stream of consciousness kicked off by the late,
great Dijkstra in his 1968 ACM article about the harmfulness of "GO TO" we
need to recognize what became apparent in building lexical parsers to
"repair" spaghetti code into structured code: You can write things that work
well but which are very difficult to read and understand in pretty much any
language system. With proper use "GO TO" can clearly present exactly the
same functional logic as IF-THEN-ELSE, DO-WHILE, and CASE. Dijkstra's great
point was that it was far too easy and common for "GO TO" to obfuscate the
meaning and intent of a program, while the so-called "structured" language
constructs promoted clarity. Many examples ensued (from both true believers
in the "GO TO" and from code comedians) of structured language programs that
were nearly impossible to read thereby "proving" that avoiding "GO TO" is
harmful.

So I suggest that rather than addressing the quasi religious debate
innocently set off by the OP, we focus on writing queries to be
intentionally clear. If a query author starts with a clear idea of what the
data sources are, what the join relationships are, and what the focusing
filters are the usual result is an easily readable query. Chapters of books
have been written on making your SQL readable. If you can more clearly state
your intention with "ANSI" join syntax, please do so. Likewise, if you can
state your intention more clearly with (+) <plus fill in any missing
partners with nulls>, please do so.

The result will be more readable code for people who follow after you. Which
syntax is "better" is dependent on both the language and the person using
it. I hope that this notion can be separated from the cost of execution.

Below, MN's example below demonstrates the power and clarity of a language
construct now available in "ANSI" (full outer join). I find it almost
disappointing to have the COST attached, because the point should be how
much more clear and concise the language of the query is. So in this
particular case for this particular purpose, I stand in praise of the ANSI
syntax. My guess is that it would not be too hard to do a non-outer equijoin
and slap that together with the pair of not-in or minuses and thereby
compete with ANSI's cost. Again, though, the COST competition is not the
point. Further, Wolfgang's point is not to be lost that (I paraphrase what I
think he's thinking) it is unfortunate when the architecture forces you into
outer joins at all. Without debating whether a full outer join should ever
be needed in a well designed system the ANSI syntax seems to most clearly
state the query intent by a huge margin in this case. Of course Oracle could
simply implement

select count(*) from 
(select * from test1 t1, test2 t2 
where t1.n (+) = t2.n (+));

as the same result set as
 
select count(*) from 
(select * from test1 t1 full outer join test2 t2 on t1.n = t2.n);

There is no ambiguity, but that syntax is not supported. Possibly it should
be, but I wouldn't burn any influence I have to suggest enhancements on that
one. My point here is that an intentionally missing feature (if I imagine
correctly this was omitted primarily because it would hide the potentially
horrible expense of how the engine of the time would have had to do that) in
the (+) syntax is what makes ANSI the clear winner in this case. It might
even be argued that having to type out the whole words "full outer join" is
a good thing to make you realize what you are requiring.

Even when COST becomes a real issue in test when you profile the performance
of an application, a query is identified as costly enough and frequently
enough executed to be worth looking at, and the resources spent to
accomplish the query appear to be out of whack with the amount of work it
seems should be done, I recommend retaining the original query's clear
statement of the intent as a comment if you need to carve up the query or
apply hints to overcome the apparent inability of some current generation of
CBO to get a better plan. (Yeah, there are a few presumptions in there,
including that the expense of the query was not due to unclear thinking in
the first place and that a clear iso-functional restatement "magically" gets
a great plan "naturally.")

I've taken far too long to try to make this point: Learn to read both. Write
in whichever one you can write in more effectively. This may vary from query
to query and as the calendar moves forward. I would not even straight jacket
the employees working on a single project to one or the other. If someone
who writes exclusively in the "other" syntax can't understand your query, it
probably needs improvement anyway.

Regards,

mwf 

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Mindaugas Navickas
Sent: Tuesday, October 17, 2006 11:53 PM
To: breitliw@xxxxxxxxxxxxx; 'oracle-l'
Cc: 'Mladen Gogala'; niall.litchfield@xxxxxxxxx
Subject: RE: should one use ANSI join syntax when writing an Oracle
application?

Hi,

I am afraid - I am not on "popular site of the table". 
Here I did a small test (Oracle 10.2.0.1.0):

create table test1 as 
with m as ( select /*+ materialize */ 1 from all_objects where rownum <=
1000) 
select TRUNC(dbms_random.value(400000,1000000)) n,dbms_random.string('A',10)
v
from m,m;
create table test2 as 
with m as ( select /*+ materialize */ 1 from all_objects where rownum <=
1000) 
select TRUNC(dbms_random.value(0,600000)) n,dbms_random.string('A',10) v
from m,m;
create index i_test1 on test1(n);
create index i_test2 on test2(n);
begin 
dbms_stats.gather_table_stats('MNTEST','TEST1',method_opt=>'FOR ALL COLUMNS
SIZE AUTO');
dbms_stats.gather_table_stats('MNTEST','TEST2',method_opt=>'FOR ALL COLUMNS
SIZE AUTO');
end;
/
select count(*) from 
(select * from test1 t1 full outer join test2 t2 on t1.n = t2.n);
-- COST by running autotrace = 6953 

select count(*) from 
(select * from test1 t1, test2 t2 
where t1.n (+) = t2.n 
union
select * from test1 t1, test2 t2
where t1.n = t2.n (+));
-- COST by running autotrace = 42375

Which syntax is more readable is matter of taste - isn't it?

Regards
Mindaugas Navickas


-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Wolfgang Breitling
Sent: October 17, 2006 10:55 PM
To: oracle-l
Cc: Mladen Gogala; niall.litchfield@xxxxxxxxx
Subject: Re: should one use ANSI join syntax when writing an Oracle
application?


I am on Mladen's side (at least as far as this topic is concerned) and I
prefer the "old", or rather "traditional" (sounds better, doesn't it) style.
I don't share Niall's opinion that a join predicate is different from an
"ordinary" predicate. 
I must confess my complete ignorance re ANSI syntax but regarding your
preference for the ANSI syntax because it separates the two types of
predicates. How would you express the following with ANSI style join syntax:

select ... from A, B where A.x < B.y

or 

select ... from A, B where A.x between B.y and B.z

or any of the other non-equi joins.

Provided it is possible I have the feeling that will be hardly as concise
and clear as the traditional syntax.

As far as the subtopic of outer joins is concerned, I don't like them in
either syntax. Technically they are not necessary and are just a shortcut
for a union (all) of two sql - or three for a full outer join (shudder).
Maybe if the shortcut didn't exist there would be fewer of them.

At 07:18 AM 10/17/2006, Niall Litchfield wrote:

On 10/17/06, Mladen Gogala <mgogala@xxxxxxxxxxx> wrote: 



On 10/17/2006 01:43:14 AM, Niall Litchfield wrote:

> My personal preference is for the ANSI standard syntax, I really find that


> seperating out the joins from the other conditions helps me understand
more

> easily what a query is trying to achieve. It also helps debugging IMO.


How, exactly are joins different, in your opinion? 


 
it seems to me that where emp.deptno = dept.deptno is an entirely different
fish from where emp.deptno=10, in the former case you are saying how
employees and departments are related, in the latter you are specifying what
criteria you are using to choose data. 
 

Regards

Wolfgang Breitling
Centrex Consulting Corporation
www.centrexcc.com 

--
//www.freelists.org/webpage/oracle-l



--
//www.freelists.org/webpage/oracle-l


Other related posts: