12.1 there is (was) a bug with a patch available where child parses can get a
different plan (multiple times), it doesn’t match, repeat, and hilarity ensues.
Sorry I don’t have any physical notes on that due to site security AND I just
don’t remember long strings like bug numbers and MOS notes any more.
I think that matches your symptoms, so worth a look. If memory serves the bug
look-up has Concurrency cursor: pin S wait on X in it.
Good luck!
mwf
PS: 12.2 seems a lot more reliable if you can get there (at least from my
experience, but I don’t have a large sample size of different sites.)
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On ;
Behalf Of k.hadd
Sent: Friday, June 07, 2019 7:39 PM
To: Oracle-L@xxxxxxxxxxxxx
Subject: Fwd: 12c Parallelism in windows potentially behind library cache lock/
PX Deq: Parse Reply/ PX Deq: Execution Msg
Hi Oracle mates,
I have been tracing sessions this week where a query with multiple unions
caused user sessions to freeze subsequently .There seemed to be a contention
in the library cache (locks) but -Oddly- when I checked the trace file report
using sqlt a huge time For these queries was unaccounted for or Idle Time (PX
Deq: Parse Reply )
Environment : OS : (VM) Windows server 2012 16 VCPU [ CPU and RAM aren't
reserved ]
DB : 12c Enterprise Edition Release 12.1.0.1.0 - 64bit
Also worth mentioning is that sometimes the optimizer has a knack for wrongly
choosing parallel over serial execution when the later is 50% faster.
Any clue or insight on best practices to tackle parallelism issues on VM server
(other than completely disabling parallelism )?
Note: OPTIMIZER_ADAPTIVE_FEATURES to FALSE didn't help But disabling
parallelism kept the execution in sequence . I don' know why the Optimizer
keeps choosing a PX plan .
Back when the application ran on 10g we used to have Px deq : credit wait
events a lot but it's other parallel events since the upgrade to 12c.
Really appreciate any input .
Thank you
• Context and findings :
Below are the informations collected so far between ORACLE-S-ASH views and the
sqlt (TRCANLZR) report on the trace File ( 30 minutes).
1- S-ASH Top events report for the related period (times are in Secs)
BEGIN_TIME END_TIME WAIT_CLASS EVENT
SQL_ID COUNT TIME_WAITED AVG_TIME_WAITED
-------------------- -------------------- ------------
---------------------------------------- ------------- ---------- -----------
---------------
31-may-2019 10:00:00 31-may-2019 10:14:59 Concurrency library cache lock
11wythn1bf8c5 190968 556700 2,91
31-may-2019 10:00:03 31-may-2019 10:14:29 Concurrency library cache: mutex X
11wythn1bf8c5 1690 15 0
31-may-2019 10:01:04 31-may-2019 10:01:07 Concurrency cursor: pin S wait on X
11wythn1bf8c5 4 10 2,5
2- The TraceFile for one of the users
Relevant Executions
Performance stats for for one of the executions . Both Idle (PX deq :
Parse Reply which isn't really idle time) and non accounted for times make for
almost 100% of the response time.
I suspect the freeze comes from a parallel CPU en-queue during library
cache access but don't know how I could avoid this .
Rank
Trace
ET
Pct1
Self
Elapsed
Time2
CPU Time
Non-Idle
Wait Time
Recursive
Elapsed
Time3
Exec
Count
User
Depth
SQL Text
Hash Value
SQL ID
Plan
Hash
Value
1:
79.1%
841.060
22.094
0.814
0.850
3
115
0
SELECT 1*1 TypeEnregistrement ,1* EMPLH.ID_R ID_
<https://mail.google.com/mail/u/0/#sql298> 45556101
11wythn1bf8c5
3867173512
Call
Response Time
Accounted-for
Elapsed
Time
CPU Time
Non-Idle
Wait Time
Elapsed Time
Unaccounted-for
Idle
Wait Time
Parse:
0.042
0.032
0.016
0.001
0.016
0.010
Execute:
7.153
7.153
6.906
0.123
0.124
0.000
Fetch:
826.337
413.562
0.703
0.308
412.551
412.775
Total:
833.532
420.747
7.625
0.432
412.690
412.785
Fetch
Call
Response Time
Accounted-for
Elapsed
Time
CPU Time
Non-Idle
Wait Time
Elapsed Time
Unaccounted-for
Idle
Wait Time
Call Timestamp
First:
826.080
413.810
0.984
0.306
412.529
412.541
MAY-31 10:44:33.191
Rank 1 : 11wythn1bf8c5 3867173512
Event Name
Wait Class
Non-Idle
Wait Time
Times
Waited
Non-Idle
Idle
Wait Time
Times
Waited
Idle
Average
Wait Time
Max
Wait Time
Blocks
Average
Blocks
PX Deq: Parse Reply:
Idle
412.533
64
6.445834
85.928264
enq: PS - contention:
Other
0.301
64
0.004699
0.008888
SQL*Net message from client:
Idle
0.246
56
0.004394
0.011639
db file sequential read:
User I/O
0.128
127
0.001011
0.014514
127
1
PX Deq: Join ACK:
Idle
0.004
127
0.000029
0.000645
PX Deq Credit: send blkd:
Idle
0.002
6
0.000392
0.000937
SQL*Net more data to client:
Network
0.002
65
0.000027
0.000272
log file sync:
Commit
0.001
1
0.000595
0.000595
SQL*Net message to client:
Network
0.000
57
0.000001
0.000003
latch: row cache objects:
Concurrency
0.000
4
0.000010
0.000020
SQL*Net more data from client:
Network
0.000
1
0.000013
0.000013
latch: shared pool:
Concurrency
0.000
1
0.000005
0.000005
Total:
0.432
320
412.785
253
Fetch
Call
Recursive
Call
Count
OS
Buffer Gets
(disk)
BG Consistent
Read Mode
(query)
BG Current
Mode
(current)
Rows
Processed
or Returned
Library
Cache
Misses
Times
Waited
Non-Idle
Times
Waited
Idle
Call Timestamp
First:
3000
10
198855
0
100
0
78
198
MAY-31 10:44:33.191
Second:
0
0
0
0
100
0
2
1
MAY-31 10:44:33.197
Third:
0
0
0
0
100
0
2
1
MAY-31 10:44:33.201
Last:
0
0
0
0
66
0
2
1
MAY-31 10:44:33.457
overview of the data exchange between the producers/consumer during the
execution
Username QC/Slave SlaveSet SID Slave INS STATE WAIT_EVENT
QC SID QC INS Req. DOP Actual DOP
------------ -------- -------- ------ --------- --------
------------------------------ ------ ------ -------- ----------
APP1 QC 596 1 WAIT PX Deq: Parse Reply
596
- p00k (Slave) 1 16 1 WAIT PX Deq: Execution Msg
596 1 32 32
- p006 (Slave) 1 62 1 WAIT PX Deq: Execution Msg
596 1 32 32
- p00l (Slave) 1 211 1 WAIT PX Deq: Execution Msg
596 1 32 32
- p00a (Slave) 1 224 1 WAIT PX Deq: Execution Msg
596 1 32 32
- p00m (Slave) 1 384 1 WAIT PX Deq: Execution Msg
596 1 32 32
- p00p (Slave) 1 405 1 WAIT library cache lock
596 1 32 32
- p008 (Slave) 1 421 1 WAIT cursor: pin S wait on
X 596 1 32 32
Explain Plan : did not match the the plan used for execution which was had
parallel processes (adaptive_plan ).
Again , I always wanted to ask the client to reserve esx cpu and the ram for
the vm but lacked concrete evidence of it's impact on Oracle parallelism to
back it up.
Thank you very much for reading
Kouss