Re: scalar subqueries getting rewritten to use sort aggregate?

From: Toon Koppelaars <toon@xxxxxxxxxxx>
To: nkodner@xxxxxxxxx
Date: Tue, 28 Dec 2010 05:44:49 +0100
Your example doesn't exhibit anything strange to me.

On Mon, Dec 27, 2010 at 9:59 PM, Neil Kodner <nkodner@xxxxxxxxx> wrote:

> 10.2.0.4, default optimizer parameters.
>
> I've come across many cases where queries that call a function in the
> select statement executes the function for each row returned.  What I did
> not expect was Oracle not performing the same action when dealing with a
> scalar subquery.
>

In your example this does happen: the scalar subquery, as is the function
call, is executed four times.


>  According to the plans I'm receiving, the scalar subquery has been
> rewritten and merged into the main query.  I wasn't expecting this behavior
> and it's making a performance issue rather difficult to diagnose.
>

Scalar subqueries that appear in the SELECT clause just get executed as many
times as the FROW/WHERE part of the query returns. Your execution plan shows
this.


> Here's an oversimplified example of the behavior.  Function cube_salaries
> simply returns the sum of the cubed salaries.   My simplified query contains
> a scalar subquery and a function call.  I expected both the scalar subquery
> and the function's sql to be run once for each of the rows in dept.  I've
> been reading about subquery caching but I wouldn't expect this to be the
> case because I'm using a different value for deptno each time the scalar
> subquery executes.
>
>
> Obviously, the queries I'm trying to troubleshoot are far more complicated
> than this one but I think the code below explains the behavior I'm
> experiencing:
>
> create or replace function cube_salaries(in_deptno in number)
> return number
> is
>   cube_salary number := 0;
> begin
>   select  sum(power(sal,3))
>     into  cube_salary
>     from  emp e
>    where  e.deptno = in_deptno;
>   return cube_salary;
> end;
> /
>
> select  dname
>      ,  (select sum(sal) from emp z where z.deptno = a.deptno) sal_sum
>      ,  cube_salaries(deptno)
>   from  dept a;
>
>
> select  dname
>      ,  (select sum(sal) from emp z where z.deptno = a.deptno) sal_sum
>      ,  cube_salaries(deptno)
>   from  dept a
>
> call     count       cpu    elapsed       disk      query    current
>  rows
> ------- ------  -------- ---------- ---------- ---------- ----------
>  ----------
> Parse        1      0.01       0.00          0          0          0
>     0
> Execute      1      0.00       0.00          0          0          0
>     0
> Fetch        2      0.00       0.00          0         36          0
>     4
> ------- ------  -------- ---------- ---------- ---------- ----------
>  ----------
> total        4      0.01       0.01          0         36          0
>     4
>
> Misses in library cache during parse: 1
> Optimizer mode: ALL_ROWS
> Parsing user id: 54
>
> Rows     Row Source Operation
> -------  ---------------------------------------------------
>       4  SORT AGGREGATE (cr=28 pr=0 pw=0 time=354 us)
>      14   TABLE ACCESS FULL EMP (cr=28 pr=0 pw=0 time=279 us)
>       4  TABLE ACCESS FULL DEPT (cr=8 pr=0 pw=0 time=136 us)
>
>
Note that lines 1 and 2 (the plan for your scalar subquery) are at the same
'level' as line 3 (the plan for the FROW/WHERE part of your query): this
indicates in this example that lines 1/2 get executed for every row that
line 3 generates. Which is four times as indicated under the ROWS column in
the FETCH line.



>
> Elapsed times include waiting on following events:
>   Event waited on                             Times   Max. Wait  Total
> Waited
>   ----------------------------------------   Waited  ----------
>  ------------
>   SQL*Net message to client                       2        0.00
>  0.00
>   SQL*Net message from client                     2        0.00
>  0.00
>
> ********************************************************************************
>
> SELECT SUM(POWER(SAL,3))
> FROM
>  EMP E WHERE E.DEPTNO = :B1
>

And here is the query inside your function, that was invoked four times
also.


>
>
> call     count       cpu    elapsed       disk      query    current
>  rows
> ------- ------  -------- ---------- ---------- ---------- ----------
>  ----------
> Parse        1      0.00       0.00          0          0          0
>     0
> Execute      4      0.00       0.00          0          0          0
>     0
> Fetch        4      0.00       0.00          0         28          0
>     4
> ------- ------  -------- ---------- ---------- ---------- ----------
>  ----------
> total        9      0.00       0.00          0         28          0
>     4
>
> Misses in library cache during parse: 1
> Misses in library cache during execute: 1
> Optimizer mode: ALL_ROWS
> Parsing user id: 54     (recursive depth: 1)
>
> Rows     Row Source Operation
> -------  ---------------------------------------------------
>       4  SORT AGGREGATE (cr=28 pr=0 pw=0 time=448 us)
>      14   TABLE ACCESS FULL EMP (cr=28 pr=0 pw=0 time=317 us)
>
> Since a SORT AGGREGATE operation is performed, does that mean Oracle is
> rewriting the scalar subquery, minus the parameters(deptno) and then merging
> it into the query results, similar to a join?
>

A SORT AGGREGATE is performed because your scalar subquery does a SUM(sal).
It is not rewriting the query.
If you do an EXPLAIN PLAN of your query, you'll see where DEPTNO (the
paramter) is used for filtering out the right employees.
Also apparently you either do not have DEPTNO in EMP indexed, or the CBO
concluded that doing a Full Table Scan would be better than doing an Index
Range Scan for your subquery.


-- 
Toon Koppelaars
RuleGen BV
Toon.Koppelaars@xxxxxxxxxxx
www.RuleGen.com
TheHelsinkiDeclaration.blogspot.com

(co)Author: "Applied Mathematics for Database Professionals"
www.RuleGen.com/pls/apex/f?p=14265:13
References:
- scalar subqueries getting rewritten to use sort aggregate?
  - From: Neil Kodner
Re: scalar subqueries getting rewritten to use sort aggregate?

Other related posts: