Rakesh,
You got pretty close - but you're right, you didn't create the index that would
allow Oracle to take full advantage of the two virtual columns.
You need an index on (not_deleted_group, not_deleted_user) --- probably
"compress 1", possibly just "compress" to compress on both columns unless you
have tens of millions of groups that have only one user.
You'll notice that your execution plan spent very little time on checking the
users table because it never visited the table, the benefit we hope to get from
the two-virtual-columns index on user_group has the same target - not visiting
the table, and making re-use of the smallest number of index leaf blocks to
acquire the necessary results.
I've given you two or three different strategies for addressing/investigating
the problem - they're not mutually compatible, so the point I'd emphasize about
this indexing attempt is: DON'T create histograms on the virtual columns,
Oracle will do a very bad job of gathering a histogram on the "group" column
because it will use a very small random sample on a column with a very large
number of distinct values and an extreme skew; the first target is to see if an
access path that uses just the two indexes can be fast enough for the worst
cases.
Regards
Jonathan Lewis
________________________________________
From: Rakesh Ra <rakeshra.tr@xxxxxxxxx>
Sent: 24 April 2019 08:08:41
To: Jonathan Lewis
Subject: Re: Query Performance Issue
I think I got it wrong while freatingvindex for user_group.. I will check and
update you..
Thanks,
RRA
On Wed, Apr 24, 2019, 11:55 Rakesh Ra
<rakeshra.tr@xxxxxxxxx<mailto:rakeshra.tr@xxxxxxxxx>> wrote:
Hi Jonathan,
I tried the virtual column way .. Details as below... The query ran for 51
seconds...
alter table TMS.USERS add (not_deleted_user generated always as (case when
delete_flag = 'N' then user_key end));
create index TMS.users_fbi1 on TMS.USERS(not_deleted_user);
alter table TMS.USER_GROUP add (not_deleted_group generated always as (case
when delete_flag = 'N' then GROUP_TREE_KEY end));
alter table TMS.USER_GROUP add (not_deleted_user generated always as (case when
delete_flag = 'N' then USER_KEY end));
The above columns are already indexed.
Collected stats
begin
dbms_stats.gather_table_stats(
ownname=> 'TMS',
tabname=> 'USER_GROUP' ,
estimate_percent=> DBMS_STATS.AUTO_SAMPLE_SIZE,
cascade=> DBMS_STATS.AUTO_CASCADE,
degree=> DBMS_STATS.AUTO_DEGREE,
no_invalidate=> false,
granularity=> 'AUTO',
method_opt=> 'FOR ALL COLUMNS SIZE REPEAT');
end;
/
begin
dbms_stats.gather_table_stats(
ownname=> 'TMS',
tabname=> 'USERS' ,
estimate_percent=> DBMS_STATS.AUTO_SAMPLE_SIZE,
cascade=> DBMS_STATS.AUTO_CASCADE,
degree=> DBMS_STATS.AUTO_DEGREE,
no_invalidate=> false,
granularity=> 'AUTO',
method_opt=> 'FOR ALL COLUMNS SIZE REPEAT');
end;
/
SELECT COUNT(DISTINCT U.not_deleted_user)
FROM TMS.USERS U
JOIN TMS.USER_GROUP UG ON U.not_deleted_USER = UG.not_deleted_USER
WHERE UG.not_deleted_GROUP = 'ia744d7790000015b347a4749dd5889b8';
COUNT(DISTINCTU.NOT_DELETED_USER)
---------------------------------
256828
Elapsed: 00:00:51.29
Plan hash value: 250484473
----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows |
A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1
|00:00:51.28 | 1574K| 849K| | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1
|00:00:51.28 | 1574K| 849K| | | |
| 2 | VIEW | VW_DAG_0 | 1 | 256 |
256K|00:00:51.27 | 1574K| 849K| | | |
| 3 | HASH GROUP BY | | 1 | 256 |
256K|00:00:51.24 | 1574K| 849K| 29M| 6062K| 19M (0)|
| 4 | NESTED LOOPS | | 1 | 256 |
389K|00:00:50.92 | 1574K| 849K| | | |
|* 5 | TABLE ACCESS FULL| USER_GROUP | 1 | 256 |
389K|00:00:49.25 | 849K| 849K| | | |
|* 6 | INDEX RANGE SCAN | USERS_FBI1 | 389K| 1 |
389K|00:00:01.53 | 725K| 0 | | | |
----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - filter(CASE "DELETE_FLAG" WHEN 'N' THEN "GROUP_TREE_KEY" END =:SYS_B_0)
6 - access("U"."NOT_DELETED_USER"=CASE "DELETE_FLAG" WHEN 'N' THEN
"USER_KEY" END )
On Tue, Apr 23, 2019 at 9:32 PM Rakesh Ra
<rakeshra.tr@xxxxxxxxx<mailto:rakeshra.tr@xxxxxxxxx>> wrote:
Hi Jonathan,
Thanks,.
Below is the data distribution for USER_GROUP and USERS table. I am working out
on your virtual column suggestion..
TMS.USER_GROUP
DE COUNT(*)
-- ----------
N 57725516
Y 485
Regards,
RRA
TMS.USERS
DE COUNT(*)
-- ----------
N 4052893
Y 9
On Tue, Apr 23, 2019 at 6:54 PM Jonathan Lewis
<jonathan@xxxxxxxxxxxxxxxxxx<mailto:jonathan@xxxxxxxxxxxxxxxxxx>> wrote:
First problem - you've enabled cursor sharing: is this set to FORCE or SIMILAR
? The answer doesn't really matter because you don't have any histograms in
place to encourage the optimizer into thinking it should investigate
adaptive_cursor sharing.
2nd problem - there's nothing to help the optimizer work out the impact of the
"delete_flag" or take advantage of it. What fraction of the user_group data has
delete_flag = 'N' (or nvl(delete_flag,'N') = 'N' as you have it in the query ?
What impact does the delete flag have on the worst cases of group_tree_key -
is it possible that by the time you've restricted yourself to the delete_flag =
'N' rows there are a relatively small number of rows left for even the worst
case group_tree_key.
If (after allowing for the effects of the delete_flag) you still have a massive
skew in the number of rows for a few group_tree_key values then you're going to
have to think about either using materialized views, or giving the optimizer a
histogram (or two) to work with and persuading it to use adaptive cursor
sharing.
Please tell us:
a) number of users where delete_flag (or the nvl() version in the plan) = 'N',
b) dittor for user_group
c) if the number of rows left in in user_group is significantly reduced, show
us the counts of the top 90 again after applying the predicate.
I wouldn't take the clustering strategy suggested by Mladen - too much risk of
a huge increase in the size of the data segments. But I would consider creating
an index on the user_group table that would allow the run-time engine to avoid
visiting the table for this query - in the simplest case
user_group(group_tree_key, delete_flag, user_key) - probably compress 2,
maybe compress 3.
users(user_key, delete_flag)
Note that you can use the second index to cover the primary key on the users
table in place of the xpkusers index
Note, also, that your "foreign key index" on the user_group table on (user_key)
is technically redundant as it is covered by the leading column of the primary
key.
Note, finally, that if you haven't enabled compression on any of your indexes
you should look for cases where it will apply.
For a more complex, but potentially much more efficient, indexing strategy - if
you find that a large fraction of the data is eliminated by your "delete_flag =
'N'" predicate then you could create function-based indexes (or, for easier
reading, virtual columns which you index) so that you create indexes only for
rows you are interested in, for example:
alter table users add not_deleted_user generated always as
case when delete_flag = 'N' then user_key end
;
create index users_fbi1 on users(not_deleted_user);
Similarly you could create a couple of virtual columns on user_group for
"not_deleted_group" and "not_deleted_user", then your query becomes:
SELECT
COUNT (DISTINCT U.not_deleted_user)
FROM
TMS.USERS U
JOIN
TMS.USER_GROUP UG
ON
U.not_deleted_USER = UG.not_deleted_USER
WHERE
UG.not_deleted_GROUP = 'ia744d7790000015b347a4749dd5889b8'
;
Regards
Jonathan Lewis
________________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>
<oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>> on behalf
of Rakesh Ra <rakeshra.tr@xxxxxxxxx<mailto:rakeshra.tr@xxxxxxxxx>>
Sent: 23 April 2019 13:14:43
To: Oracle-L Freelists
Cc: Rakesh RA
Subject: Re: Query Performance Issue
Sorry again...
Missed to give the the execution plan.
Plan hash value: 1838219834
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |
Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | |
774 (100)| |
| 1 | SORT AGGREGATE | | 1 | 68 |
| |
| 2 | VIEW | VW_DAG_0 | 318 | 21624 |
774 (1)| 00:00:10 |
| 3 | HASH GROUP BY | | 318 | 33708 |
774 (1)| 00:00:10 |
| 4 | NESTED LOOPS | | 318 | 33708 |
773 (0)| 00:00:10 |
| 5 | NESTED LOOPS | | 318 | 33708 |
773 (0)| 00:00:10 |
|* 6 | TABLE ACCESS BY INDEX ROWID| USER_GROUP | 318 | 22260 |
137 (0)| 00:00:02 |
|* 7 | INDEX RANGE SCAN | UG_FK_GROUPKEY | 635 | |
10 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | XPKUSER | 1 | |
1 (0)| 00:00:01 |
|* 9 | TABLE ACCESS BY INDEX ROWID | USERS | 1 | 36 |
2 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter(NVL("UG"."DELETE_FLAG",'N')=:SYS_B_0)
7 - access("UG"."GROUP_TREE_KEY"=:1)
8 - access("U"."USER_KEY"="UG"."USER_KEY")
9 - filter(NVL("U"."DELETE_FLAG",'N')=:SYS_B_1)
On Tue, Apr 23, 2019 at 5:40 PM Rakesh Ra
<rakeshra.tr@xxxxxxxxx<mailto:rakeshra.tr@xxxxxxxxx><mailto:rakeshra.tr@xxxxxxxxx<mailto:rakeshra.tr@xxxxxxxxx>>>
wrote:
BTW forgot to mention that the database is of version 11.2.0.4.1 running on OEL
6.3.
Thanks,
Rakesh RA
On Tue, Apr 23, 2019 at 5:38 PM Rakesh Ra
<rakeshra.tr@xxxxxxxxx<mailto:rakeshra.tr@xxxxxxxxx><mailto:rakeshra.tr@xxxxxxxxx<mailto:rakeshra.tr@xxxxxxxxx>>>
wrote:
Hi All,
I have a query where in application has a read timeout set for 20 seconds.
SELECT COUNT (DISTINCT U.USER_KEY) FROM TMS.USERS U JOIN
TMS.USER_GROUP UG ON U.USER_KEY = UG.USER_KEY WHERE
UG.GROUP_TREE_KEY = 'ia744d7790000015b347a4749dd5889b8' and UG.delete_flag ='N'
and U.delete_flag ='N';
With most bind values of group_tree_key the query runs within 20 seconds. But
with GROUP_TREE_KEY having more number of records in TMS.USER_GROUP the query
runs for more than 20 seconds and then application fails with "Socket Read
Timeout".
Below are few details related to table stats and column stats. There is huge
skewness in data for GROUP_TREE_KEY for the table TMS.USER_GROUP. I tried to
collect the column stats for the table TMS.USER_GROUP for the column
GROUP_TREE_KEY but that didn't have much effect. Is there any way we can
optimize the query to have a better run time?
TMS.USER_GROUP DETAILS
==========================
OWNER TABLE_NAME LAST_ANALYZED
NUM_ROWS
------------------------------ ------------------------------
----------------------------- ----------
TMS USER_GROUP 01-FEB-2019 23:21:25
55479760
OWNER TABLE_NAME COLUMN_NAME
NUM_DISTINCT NUM_BUCKETS HISTOGRAM
------------------------------ ------------------------------
------------------------------ ------------ ----------- ---------------
TMS USER_GROUP UPDATED_TIMESTAMP
2851921 1 NONE
TMS USER_GROUP STOP_DATE
4166 1 NONE
TMS USER_GROUP START_DATE
3301 1 NONE
TMS USER_GROUP GROUP_TREE_KEY
87351 1 NONE
TMS USER_GROUP USER_KEY
2112700 1 NONE
TMS USER_GROUP DELETE_FLAG
2 1 NONE
TABLE_NAME COLUMN_NAME INDEX_NAME
COLUMN_POSITION
--------------------------- ---------------------------
------------------------------------ ---------------
USER_GROUP USER_KEY XPKUSER_GROUP
1
USER_GROUP GROUP_TREE_KEY XPKUSER_GROUP
2
USER_GROUP START_DATE XPKUSER_GROUP
3
USER_GROUP STOP_DATE XPKUSER_GROUP
4
USER_GROUP GROUP_TREE_KEY UG_FK_GROUPKEY
1
USER_GROUP USER_KEY UG_USER_KEY
1
USER_GROUP UPDATED_TIMESTAMP USER_GROUP_IDX
1
7 rows selected.
OWNER INDEX_NAME
LAST_ANALYZED NUM_ROWS NUM_ROWS CLUSTERING_FACTOR
------------------------------ ------------------------------------
----------------------------- ---------- ---------- -----------------
TMS XPKUSER_GROUP 01-FEB-2019
23:23:32 55821380 55821380 50385340
TMS UG_FK_GROUPKEY 01-FEB-2019
23:24:36 56195400 56195400 11056540
TMS UG_USER_KEY 01-FEB-2019
23:25:55 55064620 55064620 45906520
TMS USER_GROUP_IDX 01-FEB-2019
23:26:54 55041170 55041170 15547360
TMS.USERS DETAILS
==================
OWNER TABLE_NAME LAST_ANALYZED
NUM_ROWS
------------------------------ ---------------------------
----------------------------- ----------
TMS USERS 16-APR-2019 23:23:18
4093040
OWNER TABLE_NAME COLUMN_NAME
NUM_DISTINCT NUM_BUCKETS HISTOGRAM
------------------------------ ---------------------------
--------------------------- ------------ ----------- ---------------
TMS USERS ENCRYPTED_PASSWORD
0 0 NONE
TMS USERS DISABLE_GRP_IPR
1 1 NONE
TMS USERS NEW_USER
2 1 NONE
TMS USERS IP_RESTRICTED
1 1 NONE
TMS USERS REASON_CODE_KEY
4 1 NONE
TMS USERS LOGIN_LIMIT
3 1 NONE
TMS USERS SHARED
1 1 NONE
TMS USERS USER_PWD_CLUE
0 0 NONE
TMS USERS STOP_DATE
111803 1 NONE
TMS USERS START_DATE
289426 1 NONE
TMS USERS INACTIVE_STATE
1 1 NONE
TMS USERS DOMAIN_KEY
1 1 NONE
TMS USERS FAILED_LOGIN_CNT
52 1 NONE
TMS USERS PWD_EXPIRED
1 1 NONE
TMS USERS USER_GUID
4093040 1 NONE
TMS USERS USER_CREATE_DATE
426536 1 NONE
TMS USERS USER_LASTMOD_DATE
596781 1 NONE
TMS USERS USER_PWD_RESPONSE
0 0 NONE
TMS USERS USER_PWD_CHALLENGE
0 0 NONE
TMS USERS USER_EMAIL
26162 1 NONE
TMS USERS USER_ID
4093040 1 NONE
TMS USERS USER_PASSWORD
2479152 1 NONE
TMS USERS USER_MIDDLE_NAME
0 0 NONE
TMS USERS USER_FIRST_NAME
11933 1 NONE
TMS USERS USER_LAST_NAME
33227 1 NONE
TMS USERS USER_KEY
4093040 1 NONE
TMS USERS DELETE_FLAG
2 1 NONE
TMS USERS SYS_NC00027$
1000 1 NONE
TABLE_NAME COLUMN_NAME INDEX_NAME
COLUMN_POSITION
--------------------------- ---------------------------
------------------------------------ ---------------
USERS USER_KEY XPKUSER
1
USERS USER_GUID PSU_USER_GUID_IDX
1
USERS USER_ID PSU_USER_ID_DOM_UQ_IDX
1
USERS DOMAIN_KEY PSU_USER_ID_DOM_UQ_IDX
2
USERS DOMAIN_KEY PSU_DKEY_EMAIL
1
USERS USER_EMAIL PSU_DKEY_EMAIL
2
USERS INACTIVE_STATE PSU_INACTIVE_STATE
1
USERS REASON_CODE_KEY XIF5USER
1
USERS SYS_NC00027$ MOD_IDX
1
9 rows selected.
OWNER INDEX_NAME
LAST_ANALYZED NUM_ROWS NUM_ROWS CLUSTERING_FACTOR
------------------------------ ------------------------------------
----------------------------- ---------- ---------- -----------------
TMS XPKUSER 16-APR-2019
23:23:18 4237790 4237790 1774550
TMS PSU_USER_GUID_IDX 16-APR-2019
23:23:19 4001180 4001180 1689690
TMS PSU_USER_ID_DOM_UQ_IDX 16-APR-2019
23:23:19 3900400 3900400 1994620
TMS PSU_DKEY_EMAIL 16-APR-2019
23:23:27 4268590 4268590 275130
TMS PSU_INACTIVE_STATE 16-APR-2019
23:23:27 8694 8694 6123
TMS XIF5USER 16-APR-2019
23:23:27 8840 8840 6337
TMS MOD_IDX 16-APR-2019
23:23:31 3972540 3972540 3902610
I understand all details are not supplied however, I am ready to get that if
asked for.,
Thanks,
Rakesh RA
--
//www.freelists.org/webpage/oracle-l