RE: ridiculously time-consuming subqueries

  • From: "Ric Van Dyke" <ric.van.dyke@xxxxxxxxxx>
  • To: <William.Threlfall@xxxxxxxxxxxxxxxxxxxxxxxx>, "TJ Kiernan" <tkiernan@xxxxxxxxxxx>, "Kim Berg Hansen" <kibeha@xxxxxxxxx>
  • Date: Fri, 14 Mar 2014 14:00:21 -0500

Sounds like an outer join of some sort is needed if you're missing data,
I think:

 

DOCS.DOC_ID  = SV.DOC_ID(+)

 

Should do it. 

 

From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of William Threlfall
Sent: Friday, March 14, 2014 2:04 PM
To: TJ Kiernan; Kim Berg Hansen
Cc: Oracle-L@xxxxxxxxxxxxx
Subject: RE: ridiculously time-consuming subqueries

 

Another update:

 

I pulled the subqueries out and did them as a separate query, as
follows:

 

SELECT PT.PAT_ID, SV.RECIPIENT_CMV_IGG, SV.RECIPIENT_EBV

FROM   PAT_TRANSPLANT PT, OTTRSDF.SDF_SEROLOGY_VERIFICATION_TO SV,

       DOCUMENTS DOCS, TX_DOCUMENTS TXD, LAST_TRANSPLANT_VIEW LTV

WHERE  PT.PAT_ID        = DOCS.PAT_ID

AND    PT.PAT_ID        = LTV.PAT_ID

AND    PT.TRANSPLANT_ID = TXD.TRANSPLANT_ID

AND    PT.DATE_OF_TRANS = LTV.LAST_TRANS_DATE

AND    DOCS.DOC_ID      = TXD.DOC_ID

AND    DOCS.DOC_ID      = SV.DOC_ID

AND    DOCS.DOC_KIND_ID = 10108

AND   PT.PAT_ID IN (<list of patient ID's>)

 

This executed fine in less than 5 minutes.

But then I had to combine these results with the other results, in
Excel.

That's when I discovered that the above query, for some reason that I'm
not understanding right now, doesn't include pat_id's in the results set
where the CMV and EBV are (null) (null).

Even if those patients don't have any 10108 document or any SV record,
it should still return <patient_id> (null) (null), right?

 

I swear Oracle is trying to drive me insane.  LOL

 

Cheers, - Bill.

 

 

From: TJ Kiernan [mailto:tkiernan@xxxxxxxxxxx] 
Sent: Friday, March 14, 2014 10:42 AM
To: William Threlfall; Kim Berg Hansen
Cc: Oracle-L@xxxxxxxxxxxxx; TJ Kiernan
Subject: RE: ridiculously time-consuming subqueries

 

If you're expecting a large number of records back from a query, scalar
subqueries are not your friend, especially version 10, as there' no
scalar subquery caching, if I recall correctly (this means that the
subquery is executed once per row, so more rows = more time spent).
You'll probably be well served to rewrite the scalar subqueries as
inline views or factored subqueries.  

 

Your other potential problem is what I'm guessing is a view (LTV =
LAST_TRANSPLANT_VIEW).  A view that is (I'm further guessing) running
some pseudo-analytical subqueries is probably not going to get merged
into your main query.  I'd be concerned that the scalar subqueries are
being executed for records that will eventually be filtered out.  An
examination of the execution plan would probably tell  you more about
this.

 

As far as learning resources go, Tom Kyte's book is great if you're
learning about designing applications, but for running reports, I'd
suggest Pro Oracle SQL by Karen Morton et.al.  It assumes that you know
the basics of writing SQL (which appears to be the case for you), and
digs in to many of the more advanced features & extensions that Oracle
offers with lots of examples.  

 

HTH,

T. J.

 

 

From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of William Threlfall
Sent: Friday, March 14, 2014 10:10 AM
To: Kim Berg Hansen
Cc: Oracle-L@xxxxxxxxxxxxx
Subject: RE: ridiculously time-consuming subqueries

 

Kim, 

  

Yes, of course, that was it exactly. *sigh* 

Now I'm running it again, but now it is taking forever to run again.
*sigh* 

  

Do you have any suggestions as to general methodology for how I can get
query results in a better way?  Any recommendations for books or online
resources to better understand how to do the type of querying that I
need to do on a regular basis? 

  

The requests I get almost always require one row of
one-value-per-patient results, which is why I am doing them as several
scalar subqueries.  

However, the results I want are almost always buried behind several
layers of one-to-many table relationships (different ones for each
subquery, of course), which is why the scalar subquery method is
extremely inefficient, since many complex joins are involved for each
patient over and over again. 

I am constantly struggling to avoid that "single-row subquery returns
multiple rows" error. 

  

I got the book 'Expert Oracle' by Tom Kyte, as suggested by someone else
on this list, but only the chapter on Analytic Functions is of any use
to me, and even that chapter is not very in-depth.  It doesn't even talk
about the "KEEP" and "FIRST" keywords, for example, and doesn't really
explain very many of the analytic functions very well. 

  

The official Oracle SQL Reference Manual is not helpful at all in
understanding how to use various features, never mind how BEST to use
them. 

  

I am a smart guy, but Oracle is kicking my a$$ so far.  I need to find a
better way to get the results I want. 

  

Cheers, - Bill. 

  

  

  

From: Kim Berg Hansen [mailto:kibeha@xxxxxxxxx] 
Sent: Friday, March 14, 2014 1:50 AM
To: William Threlfall
Cc: Oracle-L@xxxxxxxxxxxxx
Subject: Re: ridiculously time-consuming subqueries 

  

Bill, 

  

Your modified subquery I notice you have added PATIENT_DEMO PD to the
FROM list. 

In your original post it is not present in the subquery FROM list, so it
looks like PD is supposed to be from the main query, right? 

  

So your new modified subquery does not actually contain any reference to
tables from the main query anymore, as far as I can tell. 

It is not correlated anymore - it evaluates the max expression on all
patients that have doc_kind_id = 10108 and returns that same max
expression for all rows in the main query. 

I think you get POSITIVE on all rows... 

I think you have been a little too quick in a cut-and-paste? ;-) 

  

  

On a side note - this is nonsense: 

  

ORDER BY PT.DATE_OF_TRANS     DESC NULLS LAST, 

                   SV.RECIPIENT_CMV_IGG DESC NULLS LAST, 

                   ROWNUM ASC 

  

ROWNUM is not numbering rows in a table - it is numbering the output
rows in whatever order they happen to be retrieved. 

In this case the output rows will be just one row - the MAX expression. 

That ROWNUM is always 1 in this case and makes no sense to put in that
ORDER BY expression. 

  

  

  

Regards 

  

  

Kim Berg Hansen 

  

http://dspsd.blogspot.com 

kibeha@xxxxxxxxx 

@kibeha 

  

  

  

On Thu, Mar 13, 2014 at 11:16 PM, William Threlfall
<William.Threlfall@xxxxxxxxxxxxxxxxxxxxxxxx> wrote: 

Update for this issue: 

To answer the question, it is Oracle 10g EE (10.2.0.3.0) 

The original explain plans were extremely complex and time-consuming.  

  

I modified the first subquery to use an analytic function, as follows: 

  

  (SELECT MAX(SV.RECIPIENT_CMV_IGG) 

          KEEP (DENSE_RANK FIRST 

          ORDER BY PT.DATE_OF_TRANS     DESC NULLS LAST, 

                   SV.RECIPIENT_CMV_IGG DESC NULLS LAST, 

                   ROWNUM ASC) 

    FROM PATIENT_DEMO PD, PAT_TRANSPLANT PT, DOCUMENTS DOCS, 

         TX_DOCUMENTS TXD, OTTRSDF.SDF_SEROLOGY_VERIFICATION_TO SV 

    WHERE PD.PAT_ID = PT.PAT_ID 

    AND   DOCS.DOC_ID = TXD.DOC_ID 

    AND   DOCS.DOC_ID = SV.DOC_ID 

    AND   PT.TRANSPLANT_ID = TXD.TRANSPLANT_ID 

    AND   DOCS.DOC_KIND_ID = 10108 

   ) AS CMV, 

  

This now runs in a reasonable amount of time. 

However, it is giving me incorrect results. 

For the last patient returned, it is returning "POSITIVE", but for that
patient there is only one transplant (so no multiples there), one
document 10108 (so no multiples there), and one SV record (so no
multiples there), and the only CMV result in the database for that
patient in the SV table is "NEGATIVE". 

I'm having trouble even imagining where the result "POSITIVE" is coming
from. 

  

Cheers, - Bill. 

 

________________________________

This message and any attached documents are only for the use of the
intended recipient(s), are confidential and may contain privileged
information. Any unauthorized review, use, retransmission, or other
disclosure is strictly prohibited. If you have received this message in
error, please notify the sender immediately, and then delete the
original message. Thank you.

Other related posts: