Re: CRS-1615:voting device hang at 50% fatal, termination in 99620 ms

  • From: Marko Sutic <marko.sutic@xxxxxxxxx>
  • To: "D'Hooge Freek" <Freek.DHooge@xxxxxxxxx>
  • Date: Thu, 25 Aug 2011 10:51:00 +0200

Errors messages from another node:

2011-08-25 10:38:33.563
[cssd(18117)]CRS-1612:node l01ora3 (1) at 50% heartbeat fatal, eviction in
14.000 seconds
2011-08-25 10:38:40.558
[cssd(18117)]CRS-1611:node l01ora3 (1) at 75% heartbeat fatal, eviction in
7.010 seconds
2011-08-25 10:38:41.560
[cssd(18117)]CRS-1611:node l01ora3 (1) at 75% heartbeat fatal, eviction in
6.010 seconds
2011-08-25 10:38:45.558
[cssd(18117)]CRS-1610:node l01ora3 (1) at 90% heartbeat fatal, eviction in
2.010 seconds
2011-08-25 10:38:46.560
[cssd(18117)]CRS-1610:node l01ora3 (1) at 90% heartbeat fatal, eviction in
1.010 seconds
2011-08-25 10:38:47.562
[cssd(18117)]CRS-1610:node l01ora3 (1) at 90% heartbeat fatal, eviction in
0.010 seconds
2011-08-25 10:38:47.574
[cssd(18117)]CRS-1607:CSSD evicting node l01ora3. Details in
/u01/app/crs/log/l01ora4/cssd/ocssd.log.
2011-08-25 10:39:01.579
[cssd(18117)]CRS-1601:CSSD Reconfiguration complete. Active nodes are
l01ora4 .


Regards,
Marko

On Thu, Aug 25, 2011 at 10:48 AM, Marko Sutic <marko.sutic@xxxxxxxxx> wrote:

> Statistics during import:
>
> [root@l01ora3 ~]# sar -u 2 15
> Linux 2.6.18-238.el5 (l01ora3.ot.hr)    08/25/2011
>
> 10:36:55 AM       CPU     %user     %nice   %system   %iowait    %steal
> %idle
> 10:36:57 AM       all      5.59      0.00      0.06      3.12      0.00
> 91.22
> 10:36:59 AM       all      1.84      0.00      0.09      4.43      0.00
> 93.63
> 10:37:01 AM       all      1.50      0.00      0.09      4.81      0.00
> 93.60
> 10:37:03 AM       all      1.44      0.00      0.09      4.78      0.00
> 93.69
> 10:37:05 AM       all      1.47      0.00      0.41      4.87      0.00
> 93.25
> 10:37:07 AM       all      1.47      0.00      0.06      4.75      0.00
> 93.72
> 10:37:09 AM       all      1.22      0.00      0.09      5.18      0.00
> 93.51
> 10:37:11 AM       all      0.22      0.00      0.03      6.15      0.00
> 93.60
> 10:37:13 AM       all      0.28      0.00      0.06      8.72      0.00
> 90.94
> 10:37:15 AM       all      1.53      0.00      0.19      4.93      0.00
> 93.35
> 10:37:17 AM       all      1.47      0.00      0.09      4.72      0.00
> 93.72
> 10:37:19 AM       all      6.28      0.00      0.06      0.00      0.00
> 93.66
> 10:37:21 AM       all      0.31      0.00      0.03      6.03      0.00
> 93.63
> 10:37:23 AM       all      0.00      0.00      0.03     11.31      0.00
> 88.66
> 10:37:25 AM       all      0.06      0.00      0.06     12.48      0.00
> 87.39
> Average:          all      1.65      0.00      0.10      5.75      0.00
> 92.50
>
>
> (I've excluded inactive devices from output)
> $ iostat -xd 5 3
> Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
> avgqu-sz   await  svctm  %util
> sda               0.00    21.40  0.00 20.80     0.00   337.60    16.23
> 3.47  167.02   3.59   7.46
> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 0.00    0.00   0.00   0.00
> sda2              0.00    21.40  0.00 20.80     0.00   337.60    16.23
> 3.47  167.02   3.59   7.46
> sdb               0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.17   1.17   0.14
> sdb1              0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.17   1.17   0.14
> sdc               0.00     0.00 23.80  0.20 24371.20     6.40  1015.73
> 0.17    7.28   4.46  10.70
> sdc1              0.00     0.00 23.80  0.20 24371.20     6.40  1015.73
> 0.17    7.28   4.46  10.70
> sde               0.00     0.00  0.40  0.60     0.80     0.80     1.60
> 0.00    1.40   1.40   0.14
> sde1              0.00     0.00  0.40  0.60     0.80     0.80     1.60
> 0.00    1.40   1.40   0.14
> sdo               0.00     0.00 22.80  0.00 23347.20     0.00  1024.00
> 0.17    7.36   4.46  10.16
> sdo1              0.00     0.00 22.80  0.00 23347.20     0.00  1024.00
> 0.17    7.36   4.46  10.16
> sdt               0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.50   1.50   0.18
> sdt1              0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.50   1.50   0.18
> sdu               0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    0.67   0.67   0.08
> sdu1              0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    0.67   0.67   0.08
> sdx               0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.33   1.33   0.16
> sdx1              0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.33   1.33   0.16
> sdaa              0.00     0.00 20.40  0.60 20691.20    19.20   986.21
> 0.14    6.56   4.00   8.40
> sdaa1             0.00     0.00 20.40  0.60 20691.20    19.20   986.21
> 0.14    6.56   4.00   8.40
> sdai              0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    0.67   0.67   0.08
> sdai1             0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    0.67   0.67   0.08
> sdam              0.00     0.00 23.60  0.00 24166.40     0.00  1024.00
> 0.17    7.22   4.45  10.50
> sdam1             0.00     0.00 23.60  0.00 24166.40     0.00  1024.00
> 0.17    7.22   4.45  10.50
> sday              0.00     0.00 24.00  0.00 24576.00     0.00  1024.00
> 0.17    7.19   4.43  10.64
> sday1             0.00     0.00 24.00  0.00 24576.00     0.00  1024.00
> 0.17    7.19   4.43  10.64
> sdaz              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.04
> sdaz1             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.04
> sdbb              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.04
> sdbb1             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.04
> sdbi              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.02
> sdbi1             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.02
> sdbk              0.00     0.00 24.00  0.00 24576.00     0.00  1024.00
> 1.16    6.85  41.68 100.02
> sdbk1             0.00     0.00 24.00  0.00 24576.00     0.00  1024.00
> 1.16    6.85  41.68 100.02
> sdbr              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 2.00    0.00   0.00 100.02
> sdbr1             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 2.00    0.00   0.00 100.02
> sdbw              0.00     0.00 24.00  0.00 24576.00     0.00  1024.00
> 0.16    6.78   4.24  10.18
> sdbw1             0.00     0.00 24.00  0.00 24576.00     0.00  1024.00
> 0.16    6.78   4.24  10.18
> sdcd              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.02
> sdcd1             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.02
> sdci              0.00     0.00 23.60  0.40 23968.00    12.80   999.20
> 0.17    7.14   4.38  10.52
> sdci1             0.00     0.00 23.60  0.40 23968.00    12.80   999.20
> 0.17    7.14   4.38  10.52
> sdcm              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 3.00    0.00   0.00 100.02
> sdcm1             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 3.00    0.00   0.00 100.02
> dm-0              0.00     0.00  0.00 42.20     0.00   337.60     8.00
> 4.93  116.84   1.77   7.46
> dm-2              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.02
> dm-3              0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.33   1.33   0.16
> dm-4              0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    0.67   0.67   0.08
> dm-5              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 3.00    0.00   0.00 100.02
> dm-6              0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    0.67   0.67   0.08
> dm-7              0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.50   1.50   0.18
> dm-8              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 3.00    0.00   0.00 100.02
> dm-9              0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.02
> dm-10             0.00     0.00  0.60  0.80     1.20     1.00     1.57
> 0.00    1.29   1.29   0.18
> dm-11             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.02
> dm-12             0.00     0.00 186.40  1.20 190476.80    38.40  1015.54
>   2.32    7.04   5.33 100.02
> dm-13             0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.17   1.17   0.14
> dm-14             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 3.00    0.00   0.00 100.02
> dm-15             0.00     0.00 186.40  1.20 190476.80    38.40  1015.54
>   2.32    7.04   5.33 100.02
> dm-16             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 3.00    0.00   0.00 100.02
> dm-17             0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.17   1.17   0.14
> dm-18             0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.33   1.33   0.16
> dm-19             0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    1.50   1.50   0.18
> dm-20             0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    0.67   0.67   0.08
> dm-21             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.02
> dm-22             0.00     0.00  0.60  0.60     1.20     0.60     1.50
> 0.00    0.67   0.67   0.08
> dm-23             0.00     0.00  0.60  0.80     1.20     1.00     1.57
> 0.00    1.29   1.29   0.18
> dm-24             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.02
> dm-25             0.00     0.00  0.00  0.00     0.00     0.00     0.00
> 1.00    0.00   0.00 100.02
>
>
> Hm... disk utilization is 100% for several devices.
>
> "dm-12" and "dm-12" are devices with database files.
>
>
> FC:
> 24:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI
> Express HBA (rev 02)
> 24:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI
> Express HBA (rev 02)
>
>
> Regards,
> Marko
>
>
> 2011/8/25 D'Hooge Freek <Freek.DHooge@xxxxxxxxx>
>
>> Marco,
>>
>> Did you notice a high io wait during the import?
>> Do you see the same error message on the other node (on which the import
>> is not running)?
>>
>>
>> Freek D'Hooge
>> Uptime
>> Oracle Database Administrator
>> email: freek.dhooge@xxxxxxxxx
>> tel +32(0)3 451 23 82
>> http://www.uptime.be
>> disclaimer: www.uptime.be/disclaimer
>> ----
>> From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
>> On Behalf Of Marko Sutic
>> Sent: donderdag 25 augustus 2011 9:08
>> To: oracle-l@xxxxxxxxxxxxx
>> Subject: CRS-1615:voting device hang at 50% fatal, termination in 99620 ms
>>
>> Hello all,
>>
>> I have a problem with diagnosing "CRS-1615 voting device hang" error.
>>
>> OS: OEL 5.6 (64-bit)
>> DB: Oracle 11.1.0.7
>> Storage: IBM XIV
>> HW: Blade server 8 cores with Hyper-Threading enabled (64 GB RAM).
>>
>> This is active/passive failover cluster with OCFS2 used as cluster file
>> system.
>> OCR and VotingDisks are placed on OCFS2 mounts.
>>
>>
>> Problem occurs during "impdp" import of 700GB database from NFS.
>> First part of an import, loading data, is done without any problem.
>>
>> During an index creation, after hour or two, node restarts due to the
>> problem with availability of voting disks:
>> "[cssd(28868)]CRS-1615:voting device hang at 50% fatal, termination in
>> 99620 ms, disk (0//ocfs2/voting1/votingdisk)"
>>
>> Impdp parameter file:
>> --
>> userid=xxx/yyy
>> directory=export_dir
>> dumpfile=exportdb_%U.dmp
>> logfile=import.log
>> parallel=16
>> exclude=statistics
>> schemas=(..list of shemas...)
>> exclude=DB_LINK
>> --
>>
>> Database is in noarchivelog mode with "memory_target=7621050368" parameter
>> specified.
>> I've noticed that utilization of the server is not significant during
>> import.
>>
>> As this error is reproducible I'm trying to find how to efficiently
>> diagnose this problem and to trace the cause.
>>
>>
>> If you have any suggestions I would appreciate any help.
>>
>>
>> Regards,
>> Marko
>>
>> --
>> Marko Sutic, dipl.ing.rač.
>> My LinkedIn Profile
>>
>
>
>
> --
> Marko Sutic, dipl.ing.rač.
> My LinkedIn Profile <http://hr.linkedin.com/in/markosutic>
>
>


-- 
Marko Sutic, dipl.ing.rač.
My LinkedIn Profile <http://hr.linkedin.com/in/markosutic>

Other related posts: