Cc'ing list now.

On Tue, Aug 2, 2011 at 2:07 PM, Stalin <stalinsk@xxxxxxxxx> wrote:
> Apparently we had an issue with the controller/array. Oracle finally agreed to
> the problem and provided the replacement.
>
> On Mon, Jul 25, 2011 at 2:10 PM, Stalin <stalinsk@xxxxxxxxx> wrote:
>
>> They are 15k RPM, 300G drives.
>>
>> Thanks Harel for the pointers. I will report back when I hear from the
>> storage vendor.
>>
>> On Mon, Jul 25, 2011 at 12:20 PM, Harel Safra <harel.safra@xxxxxxxxx> wrote:
>>
>>> Stalin,
>>> You haven't specified whether the drives are 15k or 10k RPM, or the size and
>>> configuration of the SAN cache, so let's assume 15k drives and a write-through
>>> cache, and do some back-of-the-napkin calculations:
>>> As a rule of thumb, a 15k RPM SAS drive can do about 180 IOPS. Since you
>>> have 22 drives in your array, the whole array can do 180*22 = 3960 IOPS;
>>> let's call that 4000 IOPS.
>>> Your array is RAID 1+0, so every database write IO means two write IOs on
>>> the drives; your 1769 writes/s therefore mean ~3500 IOPS to the array. Add
>>> the ~250 reads/s and you're indeed getting very close to the limit of the
>>> array.
>>> Even if the SAN is writing to cache only, if you're sustaining ~1750 w/s
>>> the cache quite possibly can't be flushed fast enough.
>>>
>>> Grill your storage vendor; they should have the metrics to test whether the
>>> array is reaching its limits.
>>>
>>> Harel Safra
>>>
>>> On Mon, Jul 25, 2011 at 8:34 PM, Stalin <stalinsk@xxxxxxxxx> wrote:
>>>
>>>> Well, this is a T5220 CoolThreads server, apparently good for OLTP-type
>>>> applications but not for batch or warehouse-type applications, unless
>>>> you use parallel query options.
>>>>
>>>> I got the iostat numbers during the slowness period, and they seem a
>>>> little puzzling to me.
>>>>                              extended device statistics
>>>>     r/s     w/s    kr/s     kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
>>>>   253.7  1769.0  2048.6  15844.8 222.5 253.2   110.0   125.2  94 100  /data
>>>>
>>>> With 16MB/s of writes, we are seeing a service time of 125ms. Also, looking
>>>> at the wait time in the queue, it seems like we are pushing the array to its
>>>> limits, which I can't believe. Is this normal for an array with 22 disks in
>>>> RAID 1+0 (300G SAS drives, FC attached, SAN StorageTek 2540)? We have a
>>>> ticket opened with Sun/Oracle, but no progress has been made thus far.
>>>>
>>>> We had a bad drive, but the spare kicked in and it is scheduled for
>>>> replacement. No errors are seen in the path to the array. Any clues what
>>>> might be happening?
>>>>
>>>> On Thu, Jul 21, 2011 at 8:47 PM, Chitale, Hemant Krishnarao <
>>>> Hemant.Chitale@xxxxxx> wrote:
>>>>
>>>>> This seems to be similar to this thread:
>>>>> http://forums.oracle.com/forums/thread.jspa?threadID=2256521&tstart=0
>>>>>
>>>>> 1.4 million commits and 1.4 million 'log file sync' waits of 3 seconds
>>>>> each?!!!
>>>>>
>>>>> Given that you have reported (from another email):
>>>>>
>>>>> Event                      Waits  <1ms  <2ms  <4ms  <8ms <16ms <32ms  <=1s   >1s
>>>>> -------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
>>>>> log file parallel write      38K  72.5  15.4   5.4   2.0    .8    .4   1.3   2.2
>>>>> log file sync               838K   2.9   1.0    .5   1.7   1.7    .8   7.6  83.8
>>>>>
>>>>> I would guess that there are certain very, very large spikes in I/O
>>>>> response times (or that there's a bug in timed_statistics).
>>>>>
>>>>> (A 64 CPU install without the Diagnostic Pack licence?)
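[A quick sanity check on the numbers quoted above — Harel's rule-of-thumb capacity figure and Stalin's iostat sample — can be sketched in a few lines of Python. The 180 IOPS per 15k spindle is Harel's rule of thumb, not a measured value:]

```python
# Sanity checks on the iostat sample and the rule-of-thumb capacity
# estimate quoted above. 180 IOPS per 15k RPM spindle is a rule of
# thumb, not a measured figure.
reads_per_s, writes_per_s = 253.7, 1769.0   # r/s, w/s from the iostat line
wait_q, actv_q = 222.5, 253.2               # queued and active I/Os (wait, actv)

# RAID 1+0 mirrors every write, so each host write costs two disk I/Os.
demanded_iops = reads_per_s + 2 * writes_per_s        # ~3792 IOPS
capacity_iops = 22 * 180                              # 3960, "call it 4000"
print(f"demand ~{demanded_iops:.0f} IOPS vs capacity ~{capacity_iops} IOPS")

# Little's law cross-check: time in system = queue length / throughput.
# This should roughly equal iostat's wsvc_t + asvc_t (110.0 + 125.2 ms).
iops = reads_per_s + writes_per_s
latency_ms = (wait_q + actv_q) / iops * 1000
print(f"implied latency ~{latency_ms:.0f} ms")        # ~235 ms, matching
```

[The Little's law figure agreeing with iostat's own service times suggests the sample is internally consistent — the latency really is queueing at the device, not a reporting glitch.]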
>>>>>
>>>>> Hemant K Chitale
>>>>>
>>>>> ________________________________________
>>>>> From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
>>>>> On Behalf Of Stalin
>>>>> Sent: Thursday, July 21, 2011 6:37 AM
>>>>> To: oracle-l
>>>>> Subject: Deadlock ITL Waits
>>>>>
>>>>> We have been seeing lots of deadlock errors lately in our load testing
>>>>> environments, and they have all been due to "enq: TX - allocate ITL entry".
>>>>> Reviewing the statspack reports for the periods of the deadlocks, I see
>>>>> that log file sync is the top wait consumer, with a terrible wait time.
>>>>> That makes me think the deadlocks are just a symptom of the high log file
>>>>> sync wait times. Below is the snippet from statspack; looking at these
>>>>> numbers, especially with the CPUs not being heavily loaded, I wonder if
>>>>> this could be a case of a storage issue. The sysadmins are checking the
>>>>> storage layer, but I thought I would check here for any opinions/feedback.
>>>>>
>>>>> Top 5 Timed Events                                                  Avg %Total
>>>>> ~~~~~~~~~~~~~~~~~~                                                 wait   Call
>>>>> Event                                         Waits    Time (s)    (ms)   Time
>>>>> ----------------------------------------- ------------ ----------- ------ ------
>>>>> log file sync                                1,400,773   4,357,902   3111   91.4
>>>>> db file sequential read                        457,568     334,834    732    7.0
>>>>> db file parallel write                         565,843      27,573     49     .6
>>>>> read by other session                           16,168       7,395    457     .2
>>>>> enq: TX - allocate ITL entry                       575       6,854  11919     .1
>>>>> -------------------------------------------------------------
>>>>> Host CPU  (CPUs: 64  Cores: 8  Sockets: 1)
>>>>> ~~~~~~~~              Load Average
>>>>>                       Begin     End    User  System    Idle     WIO    WCPU
>>>>>                     ------- ------- ------- ------- ------- ------- --------
>>>>>                        3.13    7.04    2.26    3.30   94.44    0.00    7.81
>>>>>
>>>>> Statistic                                      Total     per Second    per Trans
>>>>> --------------------------------- ------------------ -------------- ------------
>>>>> redo synch time                          435,852,302      120,969.3        309.7
>>>>> redo synch writes                          1,400,807          388.8          1.0
>>>>> redo wastage                               5,128,804        1,423.5          3.6
>>>>> redo write time                              357,414           99.2          0.3
>>>>> redo writes                                    9,935            2.8          0.0
>>>>> user commits                               1,400,619          388.7          1.0
>>>>>
>>>>> Environment: 11gR2 EE (11.2.0.1), Solaris 10 SPARC
>>>>>
>>>>> Thanks,
>>>>> Stalin
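[The redo statistics above are internally consistent with the 3111 ms 'log file sync' average from the Top 5 table, which can be verified with a little arithmetic — a sketch, assuming the standard centisecond units Oracle uses for these time statistics:]

```python
# Back-of-the-envelope check of the statspack numbers quoted above.
# Oracle reports these time statistics in centiseconds.
redo_synch_time_cs = 435_852_302   # total foreground wait for redo sync
redo_synch_writes  = 1_400_807     # roughly one per user commit
redo_write_time_cs = 357_414       # LGWR's own write time
redo_writes        = 9_935         # physical redo writes issued by LGWR

avg_commit_wait_s = redo_synch_time_cs / 100 / redo_synch_writes
avg_lgwr_write_s  = redo_write_time_cs / 100 / redo_writes
commits_per_write = redo_synch_writes / redo_writes

print(f"avg wait per commit : {avg_commit_wait_s:.2f} s")  # ~3.11 s, matching the 3111 ms average
print(f"avg LGWR write time : {avg_lgwr_write_s:.2f} s")   # ~0.36 s per physical redo write
print(f"commits per write   : {commits_per_write:.0f}")    # ~141, heavy group commit
```

[Two things fall out of this: LGWR itself is averaging ~360 ms per physical write — slow storage, consistent with the iostat service times — and with ~141 commits piggybacking on each write, every slow write stalls a large batch of sessions at once, which is how the per-commit wait balloons to seconds.]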
--
Thanks,

Stalin