Re: Performance issue on Oracle 9i

From: Martin Berger <martin.a.berger@xxxxxxxxx>
To: sbecker6925@xxxxxxxxx
Date: Sat, 13 Jan 2024 19:57:47 +0100

Hi Sandy,

I understand you can not share any data from your company, so I decided to
create my own model and play with this one. Obviously my datas distribution
doesn't match yours - if it seems to be relevant somewhere you might be
able to improve the generator so we have a better understanding and you
never shared anything regarding your company. We are only playing with the
model.

All my tests here are done with 23c but with the hint
OPTIMIZER_FEATURES_ENABLE('9.2')  - so let's hope it comes close to your
env.
My sandbox is 23c free (23.2) on linux with 8k blocksize, default USERS
tablespace.

First the table:

CREATE TABLE txn_proc
(
  ID NUMBER NOT NULL
, TX_REF NUMBER NOT NULL
, SEND_DATE DATE
, EXP_CMNT VARCHAR2(20)
, PROJNO VARCHAR2(20)
, TNO VARCHAR2(20)
, PAD VARCHAR2(999)
);

INSERT INTO txn_proc
select level as id
     , trunc(level/10) as tx_ref
     , case when mod (level,29)=1 then sysdate else null end send_date
     , case mod (level, 11)
       when 1 then 'A'
       when 2 then 'B'
       when 3 then 'C'
       when 4 then 'D'
       when 5 then 'PR'
       else 'X'
      end as exp_cmnt
    , case when mod(level, 7) = 1 then 'PR' else '0' end  projno
    , case when mod(level,13) = 1 then 'PR' else '1' end tno
    , rpad ('X', 42, 'U') pad
from dual
connect by level < 380000
;

commit;
begin
    DBMS_STATS.GATHER_TABLE_STATS (ownname => '"BERX"', tabname =>
'"TXN_PROC"',
          estimate_percent => 100 );
end;
/

with these statistics:
(less important data removed)

TABLE: TXN_PROC
ROWS         :379999
SAMPLE SIZE  :379999
Columns
NAME       LOW_VALUE                 HIGH_VALUE               NUM_DISTINCT
ID            1                         379999                   379999

TX_REF        0                         37999                    38000

SEND_DATE     2024.01.13.18.29.25       2024.01.13.18.29.25      1

EXP_CMNT      A                         X                        6

PROJNO        0                         PR                       2

TNO           1                         PR                       2

PAD           XU....                    XUU....                  1

The +/- original statement
select /*+ qb_name(MAIN) OPTIMIZER_FEATURES_ENABLE('9.2')
DB_VERSION('9.2') */
-- BX1
    *
from txn_proc t1
where t1.tx_ref not in
        (select t.tx_ref
         from txn_proc t
         where t.send_date is null
           and t.exp_cmnt not like 'PR%'
           and t.exp_cmnt != projno||tno
        )
  and t1.send_date is null
/

generates an execution plan

------------------------------------------------------------------------------------
| Id  | Operation          | Name     | Starts | E-Rows | Cost  | A-Rows |
Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |      1 |        |  *1895 *|      0
|    *6938 *|
|*  1 |  HASH JOIN ANTI    |          |      1 |      1 |  1895 |      0 |
   6938 |
|*  2 |   TABLE ACCESS FULL| TXN_PROC |      1 |    366K|   536 |    366K|
   3469 |
|*  3 |   TABLE ACCESS FULL| TXN_PROC |      1 |    254K|   536 |    333K|
   3469 |
------------------------------------------------------------------------------------
with no index - nothing spectacular.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

with these 2 indices things gets more interesting:

create index txn_proc_bx_A on txn_proc( send_date, exp_cmnt, projno, tno,
tx_ref);
create index txn_proc_bx_B on txn_proc( send_date,  tx_ref);

leads to

--------------------------------------------------------------------------------------------
| Id  | Operation             | Name          | Starts | E-Rows | Cost  |
A-Rows | Buffers |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |               |      1 |        |  *1548 *|
     0 |    *3853 *|
|*  1 |  HASH JOIN ANTI       |               |      1 |      1 |  1548 |
   0 |    3853 |
|*  2 |   TABLE ACCESS FULL   | TXN_PROC      |      1 |    366K|   536 |
366K|    3469 |
|*  3 |   INDEX FAST FULL SCAN| TXN_PROC_BX_A |      1 |    254K|   189 |
99816 |     384 |
--------------------------------------------------------------------------------------------

A slightly decrease at Costs, but buffers went down by ~45%
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Playing with hints changes the situation a little bit:

select /*+ qb_name(MAIN) *INDEX(@"SEL$A3F38ADC" "T1"@"MAIN" txn_proc_bx_B )*
OPTIMIZER_FEATURES_ENABLE('9.2')  DB_VERSION('9.2')   */
-- BX3
    *
from txn_proc t1
where t1.tx_ref not in
        (select t.tx_ref
         from txn_proc t
         where t.send_date is null
           and t.exp_cmnt not like 'PR%'
           and t.exp_cmnt != projno||tno
        )
  and t1.send_date is null
/

---------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Starts | E-Rows |
Cost  | A-Rows | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |      1 |        |  *9066
*|      0 |    *5245 *|
|*  1 |  HASH JOIN ANTI              |               |      1 |      1 |
9066 |      0 |    5245 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TXN_PROC      |      1 |    366K|
8054 |    366K|    4861 |
|*  3 |    INDEX RANGE SCAN          | TXN_PROC_BX_B |      1 |    366K|
871 |    366K|     859 |
|*  4 |   INDEX FAST FULL SCAN       | TXN_PROC_BX_A |      1 |    254K|
189 |  99816 |     384 |
---------------------------------------------------------------------------------------------------

is worse in my env,

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

select /*+ qb_name(MAIN)
          * NO_UNNEST(@"SEL$1") INDEX(@"SEL$A3F38ADC" "T1"@"MAIN"
txn_proc_bx_B ) *
           OPTIMIZER_FEATURES_ENABLE('9.2')  DB_VERSION('9.2')   */
-- BX4
    *
from txn_proc t1
...

---------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Starts | E-Rows |
Cost  | A-Rows | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |      1 |        |   *541
*|      0 |     *166K*|
|*  1 |  FILTER                      |               |      1 |        |
    |      0 |     166K|
|*  2 |   TABLE ACCESS FULL          | TXN_PROC      |      1 |  18345 |
536 |    366K|    3469 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| TXN_PROC      |  40828 |      7 |
  5 |  40828 |     163K|
|*  4 |    INDEX SKIP SCAN           | TXN_PROC_BX_B |  40828 |     10 |
  4 |  44391 |     122K|
---------------------------------------------------------------------------------------------------

shows a very nice Cost but the Buffers are ... worse.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

another approach by rewriting the query slightly (it was just a test to see
if I can optimize the indexes)
SELECT /*+ QB_NAME ("MAIN")    OPTIMIZE R_FEATURES_ENABLE('9.2')  DB_V
ERSION('9.2')   */
-- BX6
      *
FROM
    TXN_PROC A1
Where A1.rowid = (
  select /*+ qb_name(RID) materialize */ rowid
  from TXN_PROC A2
WHERE
    "A2"."TX_REF" <> ALL (
        SELECT
            "A3"."TX_REF" "TX_REF"
        FROM
            TXN_PROC A3
        WHERE
            "A3"."SEND_DATE" IS NULL
            AND "A3"."EXP_CMNT" NOT LIKE 'PR%'
            AND "A3"."EXP_CMNT" <> "A3"."PROJNO" || "A3"."TNO"
    )
    AND "A2"."SEND_DATE" IS NULL)
;

leads to
------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name          | Starts |
E-Rows | Cost (%CPU)| A-Rows  | Buffers |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |               |      1 |
     |   307 (100)|      0  |     207K|
|   1 |  TABLE ACCESS BY USER ROWID           | TXN_PROC      |      1 |
   1 |     1   (0)|      0  |     207K|
|   2 |   NESTED LOOPS ANTI                   |               |      1 |
   1 |   306   (0)|      0  |     207K|
|*  3 |    INDEX RANGE SCAN                   | TXN_PROC_BX_A |      1 |
101 |     3   (0)|    366K |    1182 |
|*  4 |    TABLE ACCESS BY INDEX ROWID BATCHED| TXN_PROC      |    238K|
254K|     3   (0)|    238K |     206K|
|*  5 |     INDEX RANGE SCAN                  | TXN_PROC_BX_B |    238K|
  10 |     2   (0)|    259K |     148K|
------------------------------------------------------------------------------------------------------------------
. even better costs, but worse Buffers.

There are many more possibilities and most of them depend on the data
distribution you have.

My examples also show how different goals (cost? buffers? execution time?)
can lead to different plans - choose your goals according to your needs.

I generated my plans with DBMS_XPLAN.DISPLAY_CURSOR('<SQL_ID>', ...) - in
9.2 there is only DBMS_XPLAN.DISPLAY which required a kind of plan_table.
To mitigate this limitation, there is a very nice solution provided by Tom
Kyte:
https://asktom.oracle.com/ords/f?p=100:11:0::::P11_QUESTION_ID:230338600346782894
- I hope this helps in your investigations.

let's see where our investigations leads us,
Martin

Am Do., 11. Jan. 2024 um 14:57 Uhr schrieb Sandra Becker <
sbecker6925@xxxxxxxxx>:

Unfortunately, company policy prohibits me from providing the sql plan.  I
can tell you that pre-index the query was doing full table scans for both
the query and subquery.  Post-index build, it was using the index for
both.  I can create new indexes and/or rewrite the query.   The sub-query
returns 29k rows on average.    The t.exp_cmnt not like replacement since
it can be a combination of any two letters of the alphabet.  I can play
with that and see what I can come up with, possibly combining with the date
column?

Thanks for your assistance.

Sandy

On Thu, Jan 11, 2024 at 6:05 AM Jonathan Lewis <jlewisoracle@xxxxxxxxx>
wrote:

Are you allowed to rewrite the query or do you have to live with it and
tweak the database.
How long does it take to run (pre- and post- index creation), and what
did the two execution plans look like.
Was the index on Just the send_date, or was it on (send_date, some
non-null declared column).  If the former then it shouldn't have made any
difference to the plan (in the general case - but there are some "exotic"
anomalies) which would suggest that the improvement was from side effects
(like better statistics, or the effects of blocks being cleaned out during
the index build).

If you can rewrite the query there are several possibilities, but the
best strategy depends on know the effects of each of the data predicates
and combinations of predicates. For example, how many rows have a null
send_date, how many rows have exp_cmnt != projno||tno.  (It would be better
to include the "t." table alias in that predicate - even though it should
make no difference in this case (unless you've been getting the wrong
results because it was supposed to be a predicate correlating to "t1.".))

Example step - if rewrites are allowed and necessary:
    t.exp_cmnt not like 'PR%'
is equivalent to
    t.exp_cmnt >= 'PS' or t.exp_cmnt < 'PR'

(Note: greater than or equal to, strictly less than)
This means the subquery could be rewritten as a UNION ALL of two query
blocks that could operate through an index.

Regards
Jonathan Lewis

On Wed, 10 Jan 2024 at 22:58, Sandra Becker <sbecker6925@xxxxxxxxx>
wrote:

OS:   SunOS 5.8
DB:   Oracle 9.2.0.5

We're sitting on really old hardware with a really old version of
Oracle.  There is a project to migrate to another application, but they
estimate it will take another 18 months given the human resources that are
available.  That being said, this is a production financial application and
is performing extremely poorly for queries against a specific table.  It's
not a huge table, 388,000 rows, but given the age of the hardware/software,
I'm surprised we don't have more issues.

The query itself is poorly written, but I haven't figured out how to
make it more efficient.  I did manage to reduce the cost and execution time
of the query by 50% by creating an index on the SEND_TO_DATE column--which
can contain nulls--but it's still very slow.  I also set the degree on the
table to 8, which gave us a minor bump in performance.  Any suggestions
would be appreciated, specifically on how I can change the "not like" and
"!=" predicates.

select *
from aps1.txn_proc t1
where t1.tx_ref not in
        (select t.tx_ref
         from aps1.txn_proc t
         where t.send_date is null
           and t.exp_cmnt not like 'PR%'
           and t.exp_cmnt != projno||tno
        )
  and t1.send_date is null
/

Thank you,
--
Sandy B.

--
Sandy B.

--
Martin Berger Oracle ♠
martin.a.berger@xxxxxxxxx @martinberx <https://twitter.com/martinberx>
^∆x http://berxblog.blogspot.com

References:
- Performance issue on Oracle 9i
  - From: Sandra Becker
- Re: Performance issue on Oracle 9i
  - From: Jonathan Lewis
- Re: Performance issue on Oracle 9i
  - From: Sandra Becker

Re: Performance issue on Oracle 9i

Other related posts: