Cc'ing list now.

On Tue, Aug 2, 2011 at 2:07 PM, Stalin <stalinsk@xxxxxxxxx> wrote:
> Apparently we had an issue with the controller/array. Oracle finally agreed to
> the problem and provided the replacement.
>
> On Mon, Jul 25, 2011 at 2:10 PM, Stalin <stalinsk@xxxxxxxxx> wrote:
>
>> They are 15k RPM, 300G drives.
>>
>> Thanks Harel for the pointers. I will report back when I hear from the
>> storage vendor.
>>
>> On Mon, Jul 25, 2011 at 12:20 PM, Harel Safra <harel.safra@xxxxxxxxx> wrote:
>>
>>> Stalin,
>>> You haven't specified whether the drives are 15k or 10k RPM, or the size and
>>> configuration of the SAN cache, so let's assume 15k drives and a write-through
>>> cache, and do some back-of-the-napkin calculations:
>>> As a rule of thumb, a 15k RPM SAS drive can do about 180 IOPS. Since you
>>> have 22 drives in your array, the whole array can do 180*22 = 3960 IOPS;
>>> let's call that 4000 IOPS.
>>> Your array is RAID 1+0, so every database write IO means two write IOs on
>>> the drives; your 1769 writes/s therefore mean ~3500 IOPS to the array. Add
>>> the ~250 reads/s and you're indeed getting very close to the limit of the
>>> array.
>>> Even if the SAN is writing to cache only, if you're sustaining ~1750 w/s
>>> the cache quite possibly can't be flushed fast enough.
>>>
>>> Grill your storage vendor; they should have the metrics to test whether the
>>> array is reaching its limits.
>>>
>>> Harel Safra
>>>
>>> On Mon, Jul 25, 2011 at 8:34 PM, Stalin <stalinsk@xxxxxxxxx> wrote:
>>>
>>>> Well, this is a T5220 CoolThreads server, apparently good for OLTP-type
>>>> applications but not for batch or warehouse-type applications, unless
>>>> you use parallel query options.
>>>>
>>>> I got the iostat numbers during the slowness period, and they seem a
>>>> little puzzling to me.
>>>>                              extended device statistics
>>>>     r/s     w/s    kr/s     kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
>>>>   253.7  1769.0  2048.6  15844.8 222.5 253.2   110.0   125.2  94 100  /data
>>>>
>>>> With 16MB/s of writes, we are seeing a service time of 125ms. Also, looking
>>>> at the wait time in the queue, it seems like we are pushing the array to its
>>>> limits, which I can't believe. Is this normal for an array with 22 disks in
>>>> RAID 1+0 (300G SAS drives, FC attached, SAN StorageTek 2540)? We have a
>>>> ticket opened with Sun/Oracle, but no progress has been made thus far.
>>>>
>>>> We had a bad drive, but the spare kicked in and it is scheduled for
>>>> replacement. No errors are seen in the path to the array. Any clues what
>>>> might be happening?
>>>>
>>>> On Thu, Jul 21, 2011 at 8:47 PM, Chitale, Hemant Krishnarao <
>>>> Hemant.Chitale@xxxxxx> wrote:
>>>>
>>>>> This seems to be similar to this thread:
>>>>> http://forums.oracle.com/forums/thread.jspa?threadID=2256521&tstart=0
>>>>>
>>>>> 1.4 million commits and 1.4 million 'log file sync' waits of 3 seconds
>>>>> each?!!!
>>>>>
>>>>> Given that you have reported (from another email):
>>>>>
>>>>> Event                      Waits  <1ms  <2ms  <4ms  <8ms <16ms <32ms  <=1s   >1s
>>>>> -------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
>>>>> log file parallel write      38K  72.5  15.4   5.4   2.0    .8    .4   1.3   2.2
>>>>> log file sync               838K   2.9   1.0    .5   1.7   1.7    .8   7.6  83.8
>>>>>
>>>>> I would guess that there are certain very, very large spikes in I/O
>>>>> response times (or that there's a bug in timed_statistics).
>>>>>
>>>>> (A 64 CPU install without the Diagnostic Pack licence?)
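[A quick sanity check on the numbers quoted above — Harel's rule-of-thumb capacity figure and Stalin's iostat sample — can be sketched in a few lines of Python. The 180 IOPS per 15k spindle is Harel's rule of thumb, not a measured value:]

```python
# Sanity checks on the iostat sample and the rule-of-thumb capacity
# estimate quoted above. 180 IOPS per 15k RPM spindle is a rule of
# thumb, not a measured figure.
reads_per_s, writes_per_s = 253.7, 1769.0   # r/s, w/s from the iostat line
wait_q, actv_q = 222.5, 253.2               # queued and active I/Os (wait, actv)

# RAID 1+0 mirrors every write, so each host write costs two disk I/Os.
demanded_iops = reads_per_s + 2 * writes_per_s        # ~3792 IOPS
capacity_iops = 22 * 180                              # 3960, "call it 4000"
print(f"demand ~{demanded_iops:.0f} IOPS vs capacity ~{capacity_iops} IOPS")

# Little's law cross-check: time in system = queue length / throughput.
# This should roughly equal iostat's wsvc_t + asvc_t (110.0 + 125.2 ms).
iops = reads_per_s + writes_per_s
latency_ms = (wait_q + actv_q) / iops * 1000
print(f"implied latency ~{latency_ms:.0f} ms")        # ~235 ms, matching
```

[The Little's law figure agreeing with iostat's own service times suggests the sample is internally consistent — the latency really is queueing at the device, not a reporting glitch.]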
>>>>>
>>>>> Hemant K Chitale
>>>>>
>>>>> ________________________________________
>>>>> From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
>>>>> On Behalf Of Stalin
>>>>> Sent: Thursday, July 21, 2011 6:37 AM
>>>>> To: oracle-l
>>>>> Subject: Deadlock ITL Waits
>>>>>
>>>>> We have been seeing lots of deadlock errors lately in our load testing
>>>>> environments, and they have all been due to "enq: TX - allocate ITL entry".
>>>>> Reviewing the statspack reports for the periods of the deadlocks, I see
>>>>> that log file sync is the top wait consumer, with a terrible wait time.
>>>>> That makes me think the deadlocks are just a symptom of the high log file
>>>>> sync wait times. Below is the snippet from statspack; looking at these
>>>>> numbers, especially with the CPUs not being heavily loaded, I wonder if
>>>>> this could be a case of a storage issue. The sysadmins are checking the
>>>>> storage layer, but I thought I would check here for any opinions/feedback.
>>>>>
>>>>> Top 5 Timed Events                                                  Avg %Total
>>>>> ~~~~~~~~~~~~~~~~~~                                                 wait   Call
>>>>> Event                                         Waits    Time (s)    (ms)   Time
>>>>> ----------------------------------------- ------------ ----------- ------ ------
>>>>> log file sync                                1,400,773   4,357,902   3111   91.4
>>>>> db file sequential read                        457,568     334,834    732    7.0
>>>>> db file parallel write                         565,843      27,573     49     .6
>>>>> read by other session                           16,168       7,395    457     .2
>>>>> enq: TX - allocate ITL entry                       575       6,854  11919     .1
>>>>> -------------------------------------------------------------
>>>>> Host CPU  (CPUs: 64  Cores: 8  Sockets: 1)
>>>>> ~~~~~~~~              Load Average
>>>>>                       Begin     End    User  System    Idle     WIO    WCPU
>>>>>                     ------- ------- ------- ------- ------- ------- --------
>>>>>                        3.13    7.04    2.26    3.30   94.44    0.00    7.81
>>>>>
>>>>> Statistic                                      Total     per Second    per Trans
>>>>> --------------------------------- ------------------ -------------- ------------
>>>>> redo synch time                          435,852,302      120,969.3        309.7
>>>>> redo synch writes                          1,400,807          388.8          1.0
>>>>> redo wastage                               5,128,804        1,423.5          3.6
>>>>> redo write time                              357,414           99.2          0.3
>>>>> redo writes                                    9,935            2.8          0.0
>>>>> user commits                               1,400,619          388.7          1.0
>>>>>
>>>>> Environment: 11gR2 EE (11.2.0.1), Solaris 10 SPARC
>>>>>
>>>>> Thanks,
>>>>> Stalin
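[The redo statistics above are internally consistent with the 3111 ms 'log file sync' average from the Top 5 table, which can be verified with a little arithmetic — a sketch, assuming the standard centisecond units Oracle uses for these time statistics:]

```python
# Back-of-the-envelope check of the statspack numbers quoted above.
# Oracle reports these time statistics in centiseconds.
redo_synch_time_cs = 435_852_302   # total foreground wait for redo sync
redo_synch_writes  = 1_400_807     # roughly one per user commit
redo_write_time_cs = 357_414       # LGWR's own write time
redo_writes        = 9_935         # physical redo writes issued by LGWR

avg_commit_wait_s = redo_synch_time_cs / 100 / redo_synch_writes
avg_lgwr_write_s  = redo_write_time_cs / 100 / redo_writes
commits_per_write = redo_synch_writes / redo_writes

print(f"avg wait per commit : {avg_commit_wait_s:.2f} s")  # ~3.11 s, matching the 3111 ms average
print(f"avg LGWR write time : {avg_lgwr_write_s:.2f} s")   # ~0.36 s per physical redo write
print(f"commits per write   : {commits_per_write:.0f}")    # ~141, heavy group commit
```

[Two things fall out of this: LGWR itself is averaging ~360 ms per physical write — slow storage, consistent with the iostat service times — and with ~141 commits piggybacking on each write, every slow write stalls a large batch of sessions at once, which is how the per-commit wait balloons to seconds.]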
--
Thanks,

Stalin