Re: intermittent long "log file sync" waits

  • From: Purav Chovatia <puravc@xxxxxxxxx>
  • To: cstephens16@xxxxxxxxx
  • Date: Tue, 28 Jan 2020 21:57:24 +0530

Would suggest to check network latency on the private network

On Tue, 28 Jan 2020 at 9:40 PM, Chris Stephens <cstephens16@xxxxxxxxx>
wrote:

3-node Oracle 19.3 RAC
Centos 7

We have a SQLAlchemy/Python based application workload that is running the
exact same steps with widely varying response times which appear to be
related to varying "log file sync" wait times.

Here is a profile of a "fast" run:

CALL-NAME                        DURATION       %   CALLS      MEAN
MIN        MAX
------------------------------  ---------  ------  ------  --------
 --------  ---------
SQL*Net message from client     53.197782   91.8%  10,092  0.005271
 0.000177  28.568493
EXEC                             3.759177    6.5%   9,816  0.000383
 0.000000   0.239592
row cache lock                   0.233153    0.4%     541  0.000431
 0.000113   0.000941
PARSE                            0.140399    0.2%   4,867  0.000029
 0.000000   0.006620
DLM cross inst call completion   0.137330    0.2%     956  0.000144
 0.000004   0.000505
library cache lock               0.100171    0.2%     215  0.000466
 0.000151   0.002133
library cache pin                0.079729    0.1%     216  0.000369
 0.000056   0.000710
FETCH                            0.058253    0.1%   1,062  0.000055
 0.000000   0.004148
log file sync                    0.048217    0.1%     149  0.000324
 0.000259   0.000505
CLOSE                            0.045416    0.1%   4,929  0.000009
 0.000000   0.000073
20 others                        0.135624    0.2%  11,854  0.000011
 0.000000   0.001700
------------------------------  ---------  ------  ------  --------
 --------  ---------
TOTAL (30)                      57.935251  100.0%  44,697  0.001296
 0.000000  28.568493

Here is a profile of a "slow" run:

CALL-NAME                         DURATION       %   CALLS      MEAN
MIN         MAX
------------------------------  ----------  ------  ------  --------
 --------  ----------
SQL*Net message from client     131.186118   61.0%  10,092  0.012999
 0.000212  106.789360
log file sync                    79.291166   36.8%     150  0.528608
 0.000264    2.986575
EXEC                              3.728402    1.7%   9,816  0.000380
 0.000000    0.221403
row cache lock                    0.248868    0.1%     542  0.000459
 0.000111    0.001036
PARSE                             0.164267    0.1%   4,867  0.000034
 0.000000    0.004652
DLM cross inst call completion    0.146981    0.1%     957  0.000154
 0.000005    0.001188
library cache lock                0.104354    0.0%     218  0.000479
 0.000160    0.000728
library cache pin                 0.082504    0.0%     202  0.000408
 0.000157    0.000672
FETCH                             0.056687    0.0%   1,062  0.000053
 0.000000    0.003969
CLOSE                             0.043590    0.0%   4,929  0.000009
 0.000000    0.000180
20 others                         0.142044    0.1%  11,866  0.000012
 0.000000    0.001792
------------------------------  ----------  ------  ------  --------
 --------  ----------
TOTAL (30)                      215.194981  100.0%  44,701  0.004814
 0.000000  106.789360

looking at even histogram for that event:

SQL> @evh "log file sync"
       EVH_EVENT    EVH_WAIT_TIME_MILLI    WAIT_COUNT    EVH_EST_TIME
                  LAST_UPDATE_TIME
________________ ______________________ _____________ _______________
______________________________________
log file sync                < 1               200051         100.026
27-JAN-20 11.39.57.344734 PM -06:00
log file sync                < 2                  165           0.248
28-JAN-20 12.18.10.429089 AM -06:00
log file sync                < 4                  150            0.45
27-JAN-20 11.18.31.158102 PM -06:00
log file sync                < 8                  199           1.194
27-JAN-20 11.19.14.209947 PM -06:00
log file sync               < 16                  253           3.036
28-JAN-20 08.03.17.851328 AM -06:00
log file sync               < 32                  472          11.328
27-JAN-20 11.20.22.746033 PM -06:00
log file sync               < 64                  728          34.944
28-JAN-20 01.13.37.364541 AM -06:00
log file sync              < 128                  691          66.336
27-JAN-20 11.31.37.400504 PM -06:00
log file sync              < 256                  414          79.488
28-JAN-20 12.18.10.423987 AM -06:00
log file sync              < 512                  405          155.52
28-JAN-20 03.27.50.540383 AM -06:00
log file sync             < 1024                  459         352.512
27-JAN-20 11.35.14.378363 PM -06:00
log file sync             < 2048                  482         740.352
28-JAN-20 01.18.20.556248 AM -06:00
log file sync             < 4096                  576        1769.472
27-JAN-20 11.21.05.084998 PM -06:00
log file sync             < 8192                   89         546.816
27-JAN-20 11.57.36.436460 AM -06:00
log file sync            < 16384                   60          737.28
25-JAN-20 07.48.31.460408 AM -06:00
log file sync            < 32768                   39         958.464
27-JAN-20 11.59.09.869286 AM -06:00
log file sync            < 65536                   27        1327.104
25-JAN-20 09.49.13.856563 AM -06:00

The weird thing is that I don't see corresponding log I/O waits (awaits)
in iostat output.

I have a ticket open w/ oracle but does anyone have any suggestions to
discover root cause and/or solution?

Other related posts: