RE: Re[4]: to_number question

  • From: "Mercadante, Thomas F" <thomas.mercadante@xxxxxxxxxxxxxxxxx>
  • To: "'oracle-l@xxxxxxxxxxxxx'" <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 19 Jul 2004 08:24:02 -0400

Jonathan,

Please please please keep distracting us with your unease about how things
work.  I know I learn a lot reading every single stinkin word you put on
this list, as well as what you publish.

I know I am never to old to learn.

Thanks again.

Tom Mercadante
Oracle Certified Professional


-----Original Message-----
From: Jonathan Gennick [mailto:jonathan@xxxxxxxxxxx] 
Sent: Monday, July 19, 2004 8:01 AM
To: Niall Litchfield
Cc: oracle-l@xxxxxxxxxxxxx
Subject: Re[4]: to_number question


NL> Thank you for the post. It seems to me that we always get further 
NL> when we post what we understand, even when we are sure that our 
NL> understanding is incomplete/ subject to error, rather than being 
NL> afraid to expose our own ignorance.

Thanks Niall. I've been pressing the issue quite a bit actually, but only
because I want to work towards an explanation that satisfies my inner soul.
It's taken me awhile to get to that point. However, I had a bit of an
epiphany over the weekend, and now I've got a draft of an article that I
hope will make the whole issue clearer to at least a few people besides
myself.

NL> It seems to me that what you describe below is a perfectly normal 
NL> way of reading a given sql statement - it just isn't correct :) .
...
NL> It took me a long time (I don't have a comp.sci background) to 
NL> decide that a sql statement was not a set of instructions to the 
NL> database to tell it how to find the data I was looking for, but a 
NL> description of the set of data that I wanted back. This may or may 
NL> not help your reflection.

Yes. Someone else said something that got me to thinking
along those lines. What you say here is a key point. Unfortunately, I
believe the SQL language designers have not done us any favors when it comes
to implementing a language and syntax that reflects the proper way of
thinking about queries. Verbs such as SELECT and keywords such as FROM seem
to work against a correct interpretation of a query by humans.

Bear with me a moment, and let me repeat your two examples. There's more
from me down below.

NL>  SELECT to_number(a.a), b.b
NL>   FROM
NL> ( SELECT a
NL>          FROM a
NL>          WHERE a.id = 1 ),
NL>  b 
NL> where b.id=a.id;

NL> as compared with

NL> select to_number(a.a), b.b
NL> from a,b
NL> where a.id=b.id
NL> and a.id=1;

NL> I hope that you'll agree that the second query doesn't limit the 
NL> optimizer to going straight to a, finding all records with an id of 
NL> 1 and then doing some sort of join to b.

Nor does the first query make that limitation.

NL> Your mental model however imposes this execution order
NL> on the database for the first query.

No. Not at all, and this is where I think many people misunderstand me. I
expect (or used to expect) that the results would be "as if" the subquery
had been materialized first. However, I never required that the subquery
actually be materialized. Nor did I require anything else. I only required
that, no matter what the database really did, that the end results be the
same "as if" the outer query had never seen rows other than those returned
from the inner query.

Dan Tow pointed out in one of his earlier notes that it's at least possible
to conceive of ways in which the database might operate in the manner I've
just described.

The question seems to boil down to whether a row source
should be treated as opaque with respect to the rest of the query.
Interestingly, joins, as in your example, are part of the epiphany I had
over the weekend, which has led to me the "correct" way of thinking. We
allow the evaluation of WHERE clause predicates either before or after rows
have been joined. This is the same issue, really, as the subquery issue
we've been discussing.

If you can accept that a WHERE clause predicate can be
applied to a row before that same row is tossed out by a
JOIN predicate, and thus the WHERE clause predicate was
applied to a row not returned by the join, and thus not in
the row source for the query, then you must also accept that
a predicate outside an inline view may be applied to a row
that is not ultimately returned by that view.

NL> These two queries should be identical

I still don't quite accept this statement. The results are identical. The
queries themselves are not. I can only accept the rewritten query as a valid
interpretation of the first if I can also rationalize that the first query
might result in "where b.id=a.id" being applied before the subquery result
set is fully defined. Fortunately<grin>, I'm now able to accept this.

Let me try and be a bit more clear on what I just said:

Query_1 -> optimization -> execution plan

Query_1 -> transformed to query_2 -> execution plan

I can only accept the transformation to query_2 if the same, ultimate
execution plan can legitimately result from query_1. I don't like to think
in terms of queries being rewritten. This is probably another topic, but I
believe the intent of SQL has always been:

Query -> some optimization process -> execution plan

If it helps those who program the optimizer to manipulate
and rewrite the SQL in order to come up with various
execution plans, that's fine, but such rewriting is, to me,
an implementation detail. The execution plan in the cases
we've been discussing needs to be (and is, I now believe) reachable without
necessarily rewriting the SQL.

I hope I have not driven you all to distraction by the way I think about
things. Rest assured, I've managed to adjust my mental model enough to allow
for the behavior we've all been discussing.

Best regards,

Jonathan Gennick --- Brighten the corner where you are http://Gennick.com *
906.387.1698 * mailto:jonathan@xxxxxxxxxxx

Join the Oracle-article list and receive one
article on Oracle technologies per month by 
email. To join, visit
http://five.pairlist.net/mailman/listinfo/oracle-article, 
or send email to Oracle-article-request@xxxxxxxxxxx and 
include the word "subscribe" in either the subject or body.


Monday, July 19, 2004, 5:55:13 AM, Niall Litchfield
(niall.litchfield@xxxxxxxxx) wrote:
NL> Hi Jonathan

NL> Thank you for the post. It seems to me that we always get further 
NL> when we post what we understand, even when we are sure that our 
NL> understanding is incomplete/ subject to error, rather than being 
NL> afraid to expose our own ignorance. My ignorance will as usual be 
NL> interposed with yours.

NL> On Fri, 16 Jul 2004 15:54:09 -0400, Jonathan Gennick 
NL> <jonathan@xxxxxxxxxxx> wrote:
>> I've been on the fence about whether to post the note below. I think 
>> I'll go ahead and post it, because it sums up as best I can why I'm 
>> so flummoxed over this issue. It'll give you an idea of where my head 
>> is at, though you might feel that I have a rather strange way of 
>> looking at things.

NL> It seems to me that what you describe below is a perfectly normal 
NL> way of reading a given sql statement - it just isn't correct :) .

>> To get any resolution, I'm just going to have to go off in a corner 
>> and reflect for awhile.

NL> It took me a long time (I don't have a comp.sci background) to 
NL> decide that a sql statement was not a set of instructions to the 
NL> database to tell it how to find the data I was looking for, but a 
NL> description of the set of data that I wanted back. This may or may 
NL> not help your reflection.


>> Neither Stephen, nor I are trying to impose execution plans on the 
>> optimizer.

NL> I think you are - see below

>> 
>> With SQL you have the conceptual execution of a query versus the 
>> actual execution performed by the database. Having the proper 
>> conceptual model in mind is critical, because that's what you use 
>> when you actually write a SQL statement.
>> 
>> My mental model for subqueries in the FROM clause has
>> always been:
>> 
>>     The outer SELECT operates against the rows
>>     and columns returned by the subquery

NL> here is an attempt at an example

NL>  SELECT to_number(a.a), b.b
NL>   FROM
NL> ( SELECT a
NL>          FROM a
NL>          WHERE a.id = 1 ),
NL>  b 
NL> where b.id=a.id;

NL> as compared with

NL> select to_number(a.a), b.b
NL> from a,b
NL> where a.id=b.id
NL> and a.id=1;


NL> These two queries should be identical (if I haven't screwed up the 
NL> translation to a straight join). In both cases I'm trying to say 
NL> that I want records from a and b where the id's are identical (and  
NL> are equal to 1). I hope that you'll agree that the second query 
NL> doesn't limit the optimizer to going straight to a, finding all 
NL> records with an id of 1 and then doing some sort of join to b. Your 
NL> mental model however imposes this execution order on the database 
NL> for the first query.


>> Perhaps the correct mental model is:
>> 
>>     The outer SELECT operates against the rows
>>     and columns returned by the subquery, except
>>     when the optimizer surprises you by doing
>>     otherwise.
>> 
>> This is certainly not very satisfactory. At least, I'm not at all 
>> happy with it. Better might be:

NL> Indeed, how horrible.

>>     The outer SELECT and subquery together describe the state of
>>     the data to be returned by the statement, but to get that
>>     state the database will perform various operations in some
>>     indeterminate order.

NL> This I like a lot better (it being close to my understanding :) ). 
NL> Why have the clause about the database operations at all. What the 
NL> database does to return the data is an implementation detail 
NL> (admittedly its kind of an important detail, but consider issuing 
NL> the equivalent statement against - for example - sql server, or 
NL> informix. The available operations would be different, the set of 
NL> data to be returned would be the same.

 
>> This I like better, but still leaves me vaguely unsatisfied. For one 
>> thing, the "state of the data" cannot be properly understood without 
>> also visualizing the results from the subquery, so I'm left with a 
>> bit of a chicken/egg problem.

NL> That depends on the query. I think you can visualize my example 
NL> above with either a subquery, or not. In *most* cases the query 
NL> could be rewritten without a subquery.

----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx put
'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------
----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Other related posts: