Would suggest to check network latency on the private network
On Tue, 28 Jan 2020 at 9:40 PM, Chris Stephens <cstephens16@xxxxxxxxx>
wrote:
3-node Oracle 19.3 RAC
Centos 7
We have a SQLAlchemy/Python based application workload that is running the
exact same steps with widely varying response times which appear to be
related to varying "log file sync" wait times.
Here is a profile of a "fast" run:
CALL-NAME DURATION % CALLS MEAN
MIN MAX
------------------------------ --------- ------ ------ --------
-------- ---------
SQL*Net message from client 53.197782 91.8% 10,092 0.005271
0.000177 28.568493
EXEC 3.759177 6.5% 9,816 0.000383
0.000000 0.239592
row cache lock 0.233153 0.4% 541 0.000431
0.000113 0.000941
PARSE 0.140399 0.2% 4,867 0.000029
0.000000 0.006620
DLM cross inst call completion 0.137330 0.2% 956 0.000144
0.000004 0.000505
library cache lock 0.100171 0.2% 215 0.000466
0.000151 0.002133
library cache pin 0.079729 0.1% 216 0.000369
0.000056 0.000710
FETCH 0.058253 0.1% 1,062 0.000055
0.000000 0.004148
log file sync 0.048217 0.1% 149 0.000324
0.000259 0.000505
CLOSE 0.045416 0.1% 4,929 0.000009
0.000000 0.000073
20 others 0.135624 0.2% 11,854 0.000011
0.000000 0.001700
------------------------------ --------- ------ ------ --------
-------- ---------
TOTAL (30) 57.935251 100.0% 44,697 0.001296
0.000000 28.568493
Here is a profile of a "slow" run:
CALL-NAME DURATION % CALLS MEAN
MIN MAX
------------------------------ ---------- ------ ------ --------
-------- ----------
SQL*Net message from client 131.186118 61.0% 10,092 0.012999
0.000212 106.789360
log file sync 79.291166 36.8% 150 0.528608
0.000264 2.986575
EXEC 3.728402 1.7% 9,816 0.000380
0.000000 0.221403
row cache lock 0.248868 0.1% 542 0.000459
0.000111 0.001036
PARSE 0.164267 0.1% 4,867 0.000034
0.000000 0.004652
DLM cross inst call completion 0.146981 0.1% 957 0.000154
0.000005 0.001188
library cache lock 0.104354 0.0% 218 0.000479
0.000160 0.000728
library cache pin 0.082504 0.0% 202 0.000408
0.000157 0.000672
FETCH 0.056687 0.0% 1,062 0.000053
0.000000 0.003969
CLOSE 0.043590 0.0% 4,929 0.000009
0.000000 0.000180
20 others 0.142044 0.1% 11,866 0.000012
0.000000 0.001792
------------------------------ ---------- ------ ------ --------
-------- ----------
TOTAL (30) 215.194981 100.0% 44,701 0.004814
0.000000 106.789360
looking at even histogram for that event:
SQL> @evh "log file sync"
EVH_EVENT EVH_WAIT_TIME_MILLI WAIT_COUNT EVH_EST_TIME
LAST_UPDATE_TIME
________________ ______________________ _____________ _______________
______________________________________
log file sync < 1 200051 100.026
27-JAN-20 11.39.57.344734 PM -06:00
log file sync < 2 165 0.248
28-JAN-20 12.18.10.429089 AM -06:00
log file sync < 4 150 0.45
27-JAN-20 11.18.31.158102 PM -06:00
log file sync < 8 199 1.194
27-JAN-20 11.19.14.209947 PM -06:00
log file sync < 16 253 3.036
28-JAN-20 08.03.17.851328 AM -06:00
log file sync < 32 472 11.328
27-JAN-20 11.20.22.746033 PM -06:00
log file sync < 64 728 34.944
28-JAN-20 01.13.37.364541 AM -06:00
log file sync < 128 691 66.336
27-JAN-20 11.31.37.400504 PM -06:00
log file sync < 256 414 79.488
28-JAN-20 12.18.10.423987 AM -06:00
log file sync < 512 405 155.52
28-JAN-20 03.27.50.540383 AM -06:00
log file sync < 1024 459 352.512
27-JAN-20 11.35.14.378363 PM -06:00
log file sync < 2048 482 740.352
28-JAN-20 01.18.20.556248 AM -06:00
log file sync < 4096 576 1769.472
27-JAN-20 11.21.05.084998 PM -06:00
log file sync < 8192 89 546.816
27-JAN-20 11.57.36.436460 AM -06:00
log file sync < 16384 60 737.28
25-JAN-20 07.48.31.460408 AM -06:00
log file sync < 32768 39 958.464
27-JAN-20 11.59.09.869286 AM -06:00
log file sync < 65536 27 1327.104
25-JAN-20 09.49.13.856563 AM -06:00
The weird thing is that I don't see corresponding log I/O waits (awaits)
in iostat output.
I have a ticket open w/ oracle but does anyone have any suggestions to
discover root cause and/or solution?