RE: Fixing Performance issue with less selective columns
- From: "Mark W. Farnham" <mwf@xxxxxxxx>
- To: "'Lok P'" <loknath.73@xxxxxxxxx>
- Date: Wed, 25 Aug 2021 13:19:53 -0400
unfortunately you keep nearly all the rows of both MA_FLG and D_UNMTCH, so this
query is the opposite of those indexes being useful.
IF you were looking for ‘Y’ instead of not ‘Y’ on either one it would be
extremely good. I didn’t see initially that these two columns are extremely
inclusive.
I think Sayan was checking that in his query request. MA_FLG could reject at
most about 6 million rows, so that’s pretty worthless.
From: Lok P [mailto:loknath.73@xxxxxxxxx] ;
Sent: Wednesday, August 25, 2021 1:09 PM
To: Mark W. Farnham
Cc: Sayan Malakshinov; Oracle L
Subject: Re: Fixing Performance issue with less selective columns
Thank You Mark.
I may be wrong but in this situation I was unable to think of any other way we
could make this query faster , so I was thinking of creating a new index. If
there exists any other way to make this query faster without creating any new
index that would really be helpful.
I am not able to get your point fully, If you can help me understand it a bit
more here please. Below is the data pattern for MA_FLG and D_UNMTCH.
Thus , in this query condition " NVL (I.MA_FLG, 'N') <> 'Y' results in
~105million and NVL (I.D_UNMTCH, 'N') <> 'Y' results in ~111million. So how
should I create index or modify code to make it the best access/filter criteria
so as to make the query faster?
MA_FLG
Count(*)
N
105228656
Y
6000938
643566
D_UNMTCH
Count(*)
Y
13715
111859445
On Wed, Aug 25, 2021 at 8:00 PM Mark W. Farnham <mwf@xxxxxxxx> wrote:
The other thing, for flag values like AND NVL (I.MA_FLG, 'N') <> 'Y' AND
NVL (I.D_UNMTCH, 'N') <> 'Y'
if you’re thinking about adding an index, and even if you need a virtual column
to do this because you have too much code depending on values ‘N’ and ‘Y’,
define the final status (the one where nearly all of them land) as NULL, being
the ones you are NOT interested in most of the time. In both these cases it
looks like ‘Y’ would then be NULL, so
i.ma_flg_v is defined decoding Y to NULL and anything else to N and your code
becomes and i.ma_flg_v = ‘N’ and you deal with variability in non-nulls that
are not ‘Y’ on the original,
or
i.ma_flg_v decodes Y to NULL, NULL to ‘N’ and anything else unchanged and your
code becomes i.ma_flg_v is NOT NULL,
or
you make a functional index on i.ma_flg that does the equivalent.
I can’t remember off the top of my head whether either way gives you a real
advantage over the other in stats collections and the CBO doing something smart
and that probably changed over the releases. That might be in one of my papers.
When you then index that column the nulls disappear, leaving you with a very
tiny index to prune your result set immediately to very small and you can
usually filter the rest fast without an index.
Remember, ORACLE cannot assign a value to NULL in anything they do. But YOU can.
When this is appropriate, it is one of the neatest and easiest “magic tricks”
in the Oracle kit.
Good luck,
mwf
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On ;
Behalf Of Sayan Malakshinov
Sent: Wednesday, August 25, 2021 9:40 AM
To: Lok P
Cc: Oracle L
Subject: Re: Fixing Performance issue with less selective columns
Hi Lok,
SUBSTR(:B8,0.50)
Looks like this query should be analyzed and tested better.
You haven't provided histograms and bind values statistics, so not enough info
to analyze it properly.
For now it looks like "I.WOF_DATE IS NULL" is one of the most selective
predicates - it gives only 83154 nulls.
In addition to histogram statistics(dba_tab_histograms) and most often binds
values, I would like also to see what does return this query:
select
NVL(I.MA_FLG, 'N'),NVL(I.D_UNMTCH, 'N'),I.DC_CODE,count(*)
FROM PP_IN_TAB I
group by NVL(I.MA_FLG, 'N'),NVL(I.D_UNMTCH, 'N'),I.DC_CODE;
On Wed, Aug 25, 2021 at 4:14 PM Lok P <loknath.73@xxxxxxxxx> wrote:
Hello , This database has version 11.2.0.4 of Oracle. We have the below query
which is executed thousands of times. It's used in a plsql function which in
turn gets called from a procedure. And this procedure gets called from java
thousands of times. And I see from dba_hist_sqlstat , for most of the runs
this below query results in zero rows. We see from the active session history
for the overall process this query is consuming most time/resources and making
the process run longer. So wanted to understand if we can make this individual
query execution faster which would eventually make the process faster?
The base table- PP_IN_TAB is holding ~111million rows and is ~43GB in size.
Column PP_ID is the primary key here. The filter predicates used in this query
are below. Many of them were not very selective in nature. So I am not able to
conclude if any composite index is going to help us here. Can you please guide
me , what is the correct approach to tune this process in such a scenario?
Below is the column data pattern used as filter predicate in this query. Most
of these are less selective in nature.
TABLE_NAME
COLUMN_NAME
NUM_DISTINCT
NUM_NULLS
PP_IN_TAB
EF_ID
39515
6151686
PP_IN_TAB
PE
103074806
647050
PP_IN_TAB
PT_Code
24
0
PP_IN_TAB
PT_MCODE
20
0
PP_IN_TAB
D_CUR_CODE
13
592784
PP_IN_TAB
ED_AMT
320892
6
PP_IN_TAB
WOF_DATE
2572
83154
PP_IN_TAB
PR_CTGRY
2
86
PP_IN_TAB
PDE_RSN_CAT
6
0
PP_IN_TAB
MA_FLG
2
648172
PP_IN_TAB
M_TXT
29460248
9118572
PP_IN_TAB
D_UNMTCH
1
111766716
SELECT NVL (I.PP_ID, 0)
FROM PP_IN_TAB I
WHERE TRIM(I.M_TXT) = TRIM (SUBSTR ( :B8, 0.50)) AND I.PT_Code = :B7
AND NVL ( :B6, I.PT_MCODE) = NVL ( :B6, :B5) AND I.DC_CODE = :B4
AND I.D_CUR_CODE = :B3 AND I.ED_AMT = :B2
AND I.PR_CTGRY = :B1 AND I.PE IS NOT NULL
AND I.EF_ID IS NULL AND I.WOF_DATE IS NULL
AND NVL (I.MA_FLG, 'N') <> 'Y' AND NVL (I.D_UNMTCH, 'N') <> 'Y'
AND ROWNUM = 1;
Global Information
------------------------------
Status : DONE (ALL ROWS)
Instance ID : 1
SQL Execution ID : 16777216
Execution Started : 08/25/2021 03:53:25
First Refresh Time : 08/25/2021 03:53:25
Last Refresh Time : 08/25/2021 03:53:28
Duration : 3s
Module/Action : SQL*Plus/-
Program : sqlplus.exe
Fetch Calls : 1
Global Stats
=========================================================================================
| Elapsed | Cpu | IO | Application | Fetch | Buffer | Read | Read |
Cell |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs | Bytes |
Offload |
=========================================================================================
| 3.30 | 1.15 | 2.15 | 0.00 | 1 | 6M | 44379 | 43GB |
99.99% |
=========================================================================================
SQL Plan Monitoring Details (Plan Hash Value=1096440065)
==========================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost |
Time | Start | Execs | Rows | Read | Read | Cell | Mem |
Activity | Activity Detail |
| | | | (Estim) | |
Active(s) | Active | | (Actual) | Reqs | Bytes | Offload | (Max) | (%)
| (# samples) |
==========================================================================================================================================================================================
| 0 | SELECT STATEMENT | | | |
| | 1 | | | | | | |
|
| 1 | COUNT STOPKEY | | | |
| | 1 | | | | | | |
|
| 2 | TABLE ACCESS STORAGE FULL | PP_IN_TAB | 1 | 128K |
3 | +2 | 1 | 0 | 44379 | 43GB | 99.99% | 6M | 100.00 |
cell smart table scan (3) |
==========================================================================================================================================================================================
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM=1)
2 - storage("I"."WOF_DATE" IS NULL AND "I"."EF_ID" IS NULL AND
"I"."PT_Code"=:B7 AND "I"."D_CUR_CODE"=:B3 AND "I"."PR_CTGRY"=:B1 AND
"I"."DC_CODE"=:B4 AND "I"."ED_AMT"=TO_NUMBER(:B2) AND
NVL(:B6,"I"."PT_MCODE")=NVL(:B6,:B5) AND
TRIM("I"."M_TXT")=TRIM(SUBSTR(:B8,0.50))
AND "I"."PE" IS NOT NULL AND NVL("I"."MA_FLG",'N')<>'Y' AND
NVL("I"."D_UNMTCH",'N')<>'Y')
filter("I"."WOF_DATE" IS NULL AND "I"."EF_ID" IS NULL AND
"I"."PT_Code"=:B7 AND "I"."D_CUR_CODE"=:B3 AND "I"."PR_CTGRY"=:B1 AND
"I"."DC_CODE"=:B4 AND "I"."ED_AMT"=TO_NUMBER(:B2) AND
NVL(:B6,"I"."PT_MCODE")=NVL(:B6,:B5) AND
TRIM("I"."M_TXT")=TRIM(SUBSTR(:B8,0.50))
AND "I"."PE" IS NOT NULL AND NVL("I"."MA_FLG",'N')<>'Y' AND
NVL("I"."D_UNMTCH",'N')<>'Y')
--
Best regards,
Sayan Malakshinov
Oracle performance tuning engineer
Oracle ACE Associate
http://orasql.org
Other related posts: