Lag function problem was: Never ending activity in temp file

  • From: Gints Plivna <gints.plivna@xxxxxxxxx>
  • To: Edgar Chupit <chupit@xxxxxxxxx>
  • Date: Wed, 13 Apr 2005 11:13:39 +0300

So I'v dig down a bit more. The problem was that all query seemed to
work well until it get to some analytic functions i.e. all hash joins
and full table scans showed up in longops and ended in a minute or
half. And simple count(*) without where clause most probably worked
well because Oracle simply didn't perform these analytic functions,
because simple count(*) didn't need them.

The problem seems to be in lag function with varying second argument,
i.e. how many rows to look back. I'v created a small test case that
worked similarly on my home 1.8GHz Celeron with 500Mb RAM and 10g, as
well as on work box with parameters I'v described in the first mail.
I'v switched from automatic sort/hash sizing to manual and gave almost
400 M to session that performed select below. As you can see select
with lag(1) worked fine, but select with varying lag(x) I had to kill
after almost 2 hours. And the most frustrating thing is that lag had
to look back 1 or 0 row as you can see from table create script as
well as later. And then you can see that with constant lag(0) and
lag(1) also everything works fine.
And one more thing from these almost 400M sort area the bad query took
only ~80M and no more.

Here is the scenario:

21:50:55 cmis_db@> create table source as select rownum id, 'LVA' code, 1 r=
n
21:51:09   2  from inner_join1
21:51:18   3  where rownum < 1500000;

Table created.

Elapsed: 00:00:06.81
21:51:51 cmis_db@> insert into source  select rownum, 'NGA', 2
21:52:27   2  from inner_join1
21:52:43   3  where rownum < 1500000;

1499999 rows created.

Elapsed: 00:01:11.07
21:53:58 cmis_db@> commit;

Commit complete.

Elapsed: 00:00:00.00
21:54:12 cmis_db@> select count(*) from source;

       COUNT(*)
---------------
2999998                                                              =20
                                            1 row selected.

Elapsed: 00:00:02.75
21:54:45 cmis_db@> alter session set sort_area_size =3D 400000000;

Session altered.

Elapsed: 00:00:00.00
21:55:06 cmis_db@> show parameter workar%

NAME                                 TYPE        VALUE               =20
                                                                     =20
                      ------------------------------------ -----------
------------------------------workarea_size_policy               =20
string      MANUAL

21:55:39 cmis_db@> show parameter sort_area_size

NAME                                 TYPE        VALUE               =20
                                                                     =20
                      ------------------------------------ -----------
------------------------------sort_area_size                     =20
integer     400000000

21:55:51 cmis_db@> ed
Wrote file afiedt.buf

  1  SELECT COUNT(*), code
  2  FROM (
  3    SELECT
  4      id,
  5      code,
  6      CASE WHEN first_code =3D 'LVA' THEN 0
  7           ELSE 1
  8      END x
  9    FROM (
 10        SELECT
 11          id,
 12          code,
 13          lag(code, 1, 'AAA') over (order by id, rn) first_code
 14        FROM source
 15        WHERE rownum < 2000000
 16    )
 17  )
 18  WHERE x =3D 1
 19  GROUP BY code
 20* ORDER BY COUNT(*) desc, code
21:56:14 cmis_db@> /

       COUNT(*) COD                    =20
--------------- ---
         500001 LVA
1 row selected.

Elapsed: 00:00:08.46
21:56:24 cmis_db@> ed
Wrote file afiedt.buf

  1  SELECT COUNT(*), code
  2  FROM (
  3    SELECT
  4      id,
  5      code,
  6      CASE WHEN first_code =3D 'LVA' THEN 0
  7           ELSE 1
  8      END x
  9    FROM (
 10        SELECT
 11          id,
 12          code,
 13          lag(code, rn - 1, 'AAA') over (order by id, rn) first_code
 14        FROM source
 15        WHERE rownum < 2000000
 16    )
 17  )
 18  WHERE x =3D 1
 19  GROUP BY code
 20* ORDER BY COUNT(*) desc, code
21:57:23 cmis_db@> /
      FROM source
           *
ERROR at line 14:
ORA-00028: your session has been killed=20

Elapsed: 01:58:18.81
23:55:46 cmis_db@> connect cmis_db/cmis_db
Connected.
23:56:12 cmis_db@> alter session set sort_area_size =3D 400000000;

Session altered.

Elapsed: 00:00:00.00
23:56:19 cmis_db@> show parameter workar%

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------------------------=
----
workarea_size_policy     string    MANUAL

23:57:41 cmis_db@> select max(rn -1), min(rn-1) from source;

      MAX(RN-1)       MIN(RN-1)
--------------- ---------------
              1               0

1 row selected.

Elapsed: 00:00:03.40
23:58:57 cmis_db@> ed
Wrote file afiedt.buf

  1  SELECT COUNT(*), code
  2  FROM (
  3    SELECT
  4      id,
  5      code,
  6      CASE WHEN first_code =3D 'LVA' THEN 0
  7           WHEN the_same_code =3D 'LVA' THEN 1
  8           ELSE 1
  9      END x
 10    FROM (
 11        SELECT
 12          id,
 13          code,
 14          lag(code, 1, 'AAA') over (order by id, rn) first_code,
 15          lag(code, 0, 'AAA') over (order by id, rn) the_same_code
 16        FROM source
 17        WHERE rownum < 2000000
 18    )
 19  )
 20  WHERE x =3D 1
 21  GROUP BY code
 22* ORDER BY COUNT(*) desc, code
00:01:43 cmis_db@> /

       COUNT(*) COD
--------------- ---
         500001 LVA

1 row selected.

On 4/13/05, Edgar Chupit <chupit@xxxxxxxxx> wrote:
> Dear Gints,
> =20
> First of all nested loops aren't that bad, I would first gather statistic=
s
> for all the tables/indexes, remove hints and see what optimizer can sugge=
st,
> because in 99% of cases optimizer can get it correct, if not than I would
> think how can I improve results. The 'direct path read' is coming exactly
> from hash join, if you will switch from hash join, because hash join uses
> temp tablespace to store intermediate results (when they are bigger than
> hash_area_size). And off course you have to compare execution plans befor=
e
> inserting where flag =3D 1 and after inserting flag =3D 1. I know that th=
ose are
> only generic suggestions, but because you have quite complex query I can'=
t
> create test case to see exactly the same problem that you get. Hope that
> this will somehow help.=20
> =20
> Have a nice day.
> >   Edgar                =20
>
--
//www.freelists.org/webpage/oracle-l

Other related posts: