Kunwar,
I can't really give any concrete suggestions. Clearly something is happening
that should not be happening and Oracle thinks it's to do with CPU utilisation.
A couple of random thoughts - are you still seeing the same short stack when
you don't include the gather_plan_statistics hint. This may mean that someone
has set parameter statistics_level = all, or _rowsource_execution_statistics =
true so that the stats are still being collected.
Apart from that, the only other guess I can give is that there's something in
the memory configuration that makes passing data from one function call up the
chain to the parent, or makes following the linked list that corresponds to a
single hash bucket in the hash table very expensive. The only things I can
come up with there are page allocation size or something to do the NUMA.
You haven't said anything about the hardware or O/S, and whether the two
instances are on the same actual machine, whether virtual machines are
involved, or anything else that might give someone a clue about the odd CPU
consumption.
Regards
Jonathan Lewis
________________________________________
From: kunwar singh <krishsingh.111@xxxxxxxxx>
Sent: 26 January 2020 14:59
To: Jonathan Lewis
Cc: ORACLE-L
Subject: Re: Is hash join slower in 12.2 compared to 11.2.0.3 or am i hitting a
bug?
Thanks Jonathan.
yes i do see good performance with px_join_filter as well.
SQL Monitoring Report
SQL Text
------------------------------
select /*+ gather_plan_statistics monitor leading(a,b) use_hash(a,b) full(a)
full(b) px_join_filter(b) */ *from TMP_emp a, tmp_dba_objects b where
a.empno=b.object_id
Global Information
------------------------------
Status : DONE (ALL ROWS)
Instance ID : 1
Session : KUNWAR (32:40079)
SQL ID : f88k408dc9qn4
SQL Execution ID : 16777216
Execution Started : 01/26/2020 15:37:23
First Refresh Time : 01/26/2020 15:37:23
Last Refresh Time : 01/26/2020 15:37:23
Duration : .31806s
Module/Action : SQL*Plus/-
Service : SYS$USERS
Fetch Calls : 4
Global Stats
===========================================================================
| Elapsed | Cpu | IO | Other | Fetch | Buffer | Read | Read |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs | Bytes |
===========================================================================
| 0.32 | 0.16 | 0.14 | 0.02 | 4 | 8735 | 92 | 68MB |
===========================================================================
SQL Plan Monitoring Details (Plan Hash Value=1496359809)
=============================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time |
Start | Execs | Rows | Read | Read | Mem | Activity | Activity Detail |
| | | | (Estim) | | Active(s) |
Active | | (Actual) | Reqs | Bytes | (Max) | (%) | (# samples) |
=============================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 1 |
+0 | 1 | 32 | | | . | | |
| 1 | HASH JOIN | | 32 | 2482 | 1 |
+0 | 1 | 32 | | | 297KB | | |
| 2 | JOIN FILTER CREATE | :BF0000 | 1 | 3 | 1 |
+0 | 1 | 1 | | | . | | |
| 3 | TABLE ACCESS FULL | TMP_EMP | 1 | 3 | 1 |
+0 | 1 | 1 | 2 | 16384 | . | | |
| 4 | JOIN FILTER USE | :BF0000 | 640K | 2477 | 1 |
+0 | 1 | 32 | | | . | | |
| 5 | TABLE ACCESS FULL | TMP_DBA_OBJECTS | 640K | 2477 | 1 |
+0 | 1 | 32 | 81 | 68MB | . | | |
=============================================================================================================================================================
I will check into the OS & h/w difference. Is there a reference blog or step by
step checks that you can suggest?
Yes removing gather_plan_statistics didnt make a difference to the performance.
On Sun, Jan 26, 2020 at 5:03 AM Jonathan Lewis
<jonathan@xxxxxxxxxxxxxxxxxx<mailto:jonathan@xxxxxxxxxxxxxxxxxx>> wrote:
I've run up your test on 11.2.0.4 and 12.2.0.1 - there is basically no
difference between the two (though 12.2.0.1 is slightly faster, but the
variation by each version is similar to the difference between versions, so not
significant). CPU time is ca. 0.32 - 0.35 seconds.
Tested with and without direct path reads, with and without
gather_plan_statistics.
I would have expected the impact of gathering rowsource execution statistics to
be minimal in all your test cases because I believe the counters and timers are
checked at the start and end of each call to a child operation - and a hash
join calls it's child operations just once each.
I suspect any problem may relate to O/S or hardware differences, that's
resulting in a difference with memory access as rows are moved up from child to
parent. (This might explain why the parallel query with Bloom filter is faster
- very few rows get moved from child to parent - you could test this by adding
the hint px_join_filter(b) to the SQL to force the Bloom filter to appear in
the (12.2) serial query.)
Regards
Jonathan Lewis
________________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>
<oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>> on behalf
of kunwar singh <krishsingh.111@xxxxxxxxx<mailto:krishsingh.111@xxxxxxxxx>>
Sent: 26 January 2020 02:24
To: ORACLE-L
Subject: Is hash join slower in 12.2 compared to 11.2.0.3 or am i hitting a bug?
Hi Listers,
I am doing a testing of migrating app for a customer from 11.2.0.3 to 12.2 and
find the queries doing hash join are taking upto 4x more time. Hash join step
is the one taking the time , more cpu and more elapsed time.
Is this a known issue or any bug? I searched in MOS , no hits or maybe i am
not using correct keywords to match.
Note that i am doing testing on base releases of 11.2.0.3 & 12.2 for
simplifying my testcase.
1. Testcase
create table tmp_dba_objects AS SELECT *from dba_objects where rownum <20001;
insert into tmp_dba_objects select *from tmp_dba_objects;
insert into tmp_dba_objects select *from tmp_dba_objects;
insert into tmp_dba_objects select *from tmp_dba_objects;
insert into tmp_dba_objects select *from tmp_dba_objects;
insert into tmp_dba_objects select *from tmp_dba_objects;
create table TMP_EMP ( EMPNO NUMBER);
insert into TMP_EMP select min(object_id) from tmp_dba_objects;
commit;
alter system flush shared_pool;
alter system flush buffer_cache;
select /*+ gather_plan_statistics monitor leading(a,b) use_hash(a,b) full(a)
full(b) */ *from TMP_emp a, tmp_dba_objects b where a.empno=b.object_id;
set lin 300 pages 9999
select *from table(dbms_xplan.display_cursor(NULL,NULL,'IOSTATS LAST'));
2. SQL monitor report outputs
SQL Monitoring Report -12.2.0.1
SQL Text
------------------------------
select /*+ gather_plan_statistics monitor leading(a,b) use_hash(a,b) full(a)
full(b) */ *from TMP_emp a, tmp_dba_objects b where a.empno=b.object_id
Global Information
------------------------------
Status : DONE (ALL ROWS)
Instance ID : 1
Session : KUNWAR (40:7639)
SQL ID : 0257vnkax6zak
SQL Execution ID : 16777216
Execution Started : 01/26/2020 02:52:50
First Refresh Time : 01/26/2020 02:52:50
Last Refresh Time : 01/26/2020 02:52:58
Duration : 8s <<<<<<< 8
seconds
Module/Action : SQL*Plus/-
Service : SYS$USERS
Program : sqlplus@devserver1 (TNS V1-V3)
Fetch Calls : 4
Global Stats
=========================================================================================
| Elapsed | Cpu | IO | Application | Other | Fetch | Buffer | Read
| Read |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs
| Bytes |
=========================================================================================
| 7.83 | 5.18 | 2.46 | 0.01 | 0.18 | 4 | 8742 | 91
| 68MB |
=========================================================================================
SQL Plan Monitoring Details (Plan Hash Value=1398591261)
============================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time |
Start | Execs | Rows | Read | Read | Mem | Activity | Activity Detail |
| | | | (Estim) | | Active(s) |
Active | | (Actual) | Reqs | Bytes | (Max) | (%) | (# samples) |
============================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 8 |
+1 | 1 | 32 | | | . | | |
| 1 | HASH JOIN | | 32 | 2482 | 8 |
+1 | 1 | 32 | | | 708KB | | |
-- Just as a side note that ASH is not getting populated, due to
some other reason
| 2 | TABLE ACCESS FULL | TMP_EMP | 1 | 3 | 1 |
+1 | 1 | 1 | 2 | 16384 | . | | |
| 3 | TABLE ACCESS FULL | TMP_DBA_OBJECTS | 640K | 2477 | 8 |
+1 | 1 | 640K | 80 | 68MB | . | | |
============================================================================================================================================================
---
SQL Monitoring Report -11.2.0.3
SQL Text
------------------------------
select /*+ gather_plan_statistics monitor leading(a,b) use_hash(a,b) full(a)
full(b) */ *from TMP_emp a, tmp_dba_objects b where a.empno=b.object_id
Global Information
------------------------------
Status : DONE (ALL ROWS)
Instance ID : 1
Session : KUNWAR (147:18275)
SQL ID : 0257vnkax6zak
SQL Execution ID : 16777216
Execution Started : 01/26/2020 03:12:22
First Refresh Time : 01/26/2020 03:12:22
Last Refresh Time : 01/26/2020 03:12:24
Duration : 2s <<<<<<< 2
seconds
Module/Action : SQL*Plus/-
Service : SYS$USERS
Program :
sqlplus@xxxxxxxxxxxxxxxxxxxxxxxx<mailto:sqlplus@xxxxxxxxxxxxxxxxxxxxxxxx><mailto:sqlplus@xxxxxxxxxxxxxxxxxxxxxxxx<mailto:sqlplus@xxxxxxxxxxxxxxxxxxxxxxxx>>
(TNS V1-V3)
Fetch Calls : 4
Global Stats
================================================================
| Elapsed | Cpu | IO | Fetch | Buffer | Read | Read |
| Time(s) | Time(s) | Waits(s) | Calls | Gets | Reqs | Bytes |
================================================================
| 1.96 | 0.53 | 1.43 | 4 | 8844 | 83 | 69MB |
================================================================
SQL Plan Monitoring Details (Plan Hash Value=1398591261)
=================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time |
Start | Execs | Rows | Read | Read | Mem | Activity | Activity Detail
|
| | | | (Estim) | | Active(s) |
Active | | (Actual) | Reqs | Bytes | (Max) | (%) | (# samples)
|
=================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 3 |
+0 | 1 | 32 | | | | |
|
| 1 | HASH JOIN | | 32 | 2494 | 3 |
+0 | 1 | 32 | | | 734K | |
|
| 2 | TABLE ACCESS FULL | TMP_EMP | 1 | 3 | 1 |
+0 | 1 | 1 | 2 | 49152 | | |
|
| 3 | TABLE ACCESS FULL | TMP_DBA_OBJECTS | 640K | 2487 | 3 |
+0 | 1 | 640K | 81 | 69MB | | 50.00 | direct path read (1)
|
=================================================================================================================================================================
Not sure what diagnostic to dig deeper, strace/perf? i am not much familiar
with these.
oradebug oradebug short_stack in 12.2 shows lots of rows like below :
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-00000000392A8918<-6D70726B00796669
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-kpusattr()+58060<-6B6D70726B007966
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-kzftfpc()+108<-6B6D70726B007966
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-dmsglLRPersist()+1480<-6B6D70726B007966
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-0000000016646A58<-6B6D70726B007966
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-000000001CD11420<-6B6D70726B007966
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-000000002305FE48<-6B6D70726B007966
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-000000002928CA8A<-6B6D70726B007966
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-000000002F6031FE<-6B6D70726B007966
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-00000000359BB229<-6B6D70726B007966
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-00000000003FAF84<-736B6D70726B0079
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-kfclUnlinkBh()+1586<-736B6D70726B0079
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-kdstf111010100001000km()+14<-736B6D70726B0079
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-0000000012BCB837<-736B6D70726B0079
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-0000000018D8C1B8<-736B6D70726B0079
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-000000001EFA8E9F<-736B6D70726B0079
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-00000000252DCF2E<-736B6D70726B0079
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-000000002B574A8C<-736B6D70726B0079
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-00000000317E01AF<-736B6D70726B0079
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-0000000037C374DF<-736B6D70726B0079
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-kdiss_fetch()+4710<-76736B6D70726B00
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-qmxxtiAdjMKXTIArgs()+384<-76736B6D70726B00
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-qesgvslice_0_MIN_MO_DA_F()+6004<-76736B6D70726B00
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-0000000014D30516<-76736B6D70726B00
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-000000001B002C0A<-76736B6D70726B00
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-000000002125D372<-76736B6D70726B00
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-00000000276006C7<-76736B6D70726B00
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-000000002D93D64B<-76736B6D70726B00
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-0000000033C3E57D<-76736B6D70726B00
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-0000000039EC64C6<-76736B6D70726B00
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-clock_gettime()+331<-clock_gettime()+54<-xsspfc_Comp504()+575<-5F76736B6D70726B
I dont see these in 11.2
While researching i am saw some twitter conversation of Frank Pachot and
thought this link could help , but i am not familar with tmux/gdb , so still
reading :) :
https://github.com/FranckPachot/demo/blob/master/TVD-JoinMethods.sql
Maybe i will have to go to one of conferences like POUG for such advanced
techniques :D
--
Cheers,
Kunwar
--
Cheers,
Kunwar
--
//www.freelists.org/webpage/oracle-l