[Linuxtrent] S.M.A.R.T. warning: dying disk?

  • From: Emanuele Olivetti <emanuele@xxxxxxxxxxxxxx>
  • To: linuxtrent@xxxxxxxxxxxxx
  • Date: Sun, 15 Nov 2009 00:29:28 +0100


On a laptop I use, I installed smart-notifier to get notifications
about the disk's health via S.M.A.R.T.

As soon as I had configured it and restarted the smartmontools daemon,
I received two notifications:
This email was generated by the smartd daemon running on:

  host name: portatile
 DNS domain: [Unknown]
 NIS domain: (none)

The following warning/error was logged by the smartd daemon:

Device: /dev/sda, 234001649 Currently unreadable (pending) sectors

For details see host's SYSLOG (default: /var/log/messages).

You can also use the smartctl utility for further investigation.
No additional email messages about this problem will be sent.
This email was generated by the smartd daemon running on:

  host name: portatile
 DNS domain: [Unknown]
 NIS domain: (none)

The following warning/error was logged by the smartd daemon:

Device: /dev/sda, 23406950 Offline uncorrectable sectors

For details see host's SYSLOG (default: /var/log/messages).

You can also use the smartctl utility for further investigation.
No additional email messages about this problem will be sent.

Neither message looks good to me. In short:
1) Device: /dev/sda, 234001649 Currently unreadable (pending) sectors
2) Device: /dev/sda, 23406950 Offline uncorrectable sectors

Also, every time I restart the smartmontools daemon, more notifications arrive.
If I understand correctly, this means that the number of "unreadable"
and "uncorrectable" sectors keeps increasing...
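
To put a number on "keeps increasing", here is a minimal sketch that compares the raw values from the two notifications above with the ones in the smartctl output attached at the end of this message. The dictionary layout and the function name are just illustrative choices, not part of smartmontools.

```python
# Raw values copied from this message: first the smartd notifications,
# then the later "smartctl -a" output attached below.
readings = {
    "Current_Pending_Sector": (234001649, 234006505),
    "Offline_Uncorrectable": (23406950, 23407161),
}


def raw_value_deltas(readings):
    """Return {attribute: later - earlier} for each pair of raw readings."""
    return {name: later - earlier for name, (earlier, later) in readings.items()}


deltas = raw_value_deltas(readings)
for name, delta in deltas.items():
    # A positive delta means the reported raw value grew between readings.
    print(f"{name}: {delta:+d}")
```

Both deltas come out positive, which matches the impression that the reported counts grow between readings. Whether the raw values are real sector counts is a separate question.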

Now there are two possibilities:
A) I am misunderstanding what is going on and/or smartmontools is reporting nonsense.
B) the disk is dying.

Tell me it's not B...
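
One rough plausibility check for option A is to compare the reported pending count with the total number of sectors on the disk. The sketch below assumes 512-byte sectors (my assumption; the smartctl output does not state the sector size) and takes the capacity and the raw Current_Pending_Sector value from the smartctl output attached below.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def total_sectors(capacity_bytes, sector_size=SECTOR_SIZE):
    """Number of whole sectors that fit in the reported capacity."""
    return capacity_bytes // sector_size


def pending_fraction(raw_pending, capacity_bytes, sector_size=SECTOR_SIZE):
    """Reported pending sectors as a fraction of all sectors on the disk."""
    return raw_pending / total_sectors(capacity_bytes, sector_size)


capacity = 120_034_123_776  # "User Capacity" reported by smartctl, in bytes
pending = 234_006_505       # raw value of Current_Pending_Sector

print("total sectors:", total_sectors(capacity))
print(f"pending fraction: {pending_fraction(pending, capacity):.1%}")
```

Under that assumption the raw value would cover nearly every sector on the disk, which seems hard to reconcile with a system that still boots. So the raw value may not be a plain sector count here, but this is only a sanity check, not a diagnosis.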


P.S.: I am attaching the detailed output of "smartctl -F samsung2 -a /dev/sda"

smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device Model:     SAMSUNG HS122JC
Serial Number:    S18GJ16Q218038
Firmware Version: GQ100-01
User Capacity:    120,034,123,776 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 1
Local Time is:    Sun Nov 15 00:24:38 2009 CET

==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                 (   0) seconds.
Offline data collection
capabilities:                    (0x59) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  62) minutes.

SMART Attributes Data Structure revision number: 4
Vendor Specific SMART Attributes with Thresholds:
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   091   091   025    Pre-fail  Always       -       2281
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       12113
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       2845
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       730
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       4317
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       768
194 Temperature_Celsius     0x0022   130   070   000    Old_age   Always       -       36 (Lifetime Min/Max 10/56)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       2845
197 Current_Pending_Sector  0x0032   014   014   000    Old_age   Always       -       234006505
198 Offline_Uncorrectable   0x0032   092   092   000    Old_age   Always       -       23407161
199 UDMA_CRC_Error_Count    0x0032   097   097   000    Old_age   Always       -       9500054
200 Multi_Zone_Error_Rate   0x0032   092   092   000    Old_age   Always       -       23555200

SMART Error Log Version: 1
ATA Error Count: 7941 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 7941 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was active or 

  After command completion occurred, registers were:
  -- -- -- -- -- -- --
  00 00 00 00 00 00 00  

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 00 01 01 00 00 a0 0a      00:00:02.316  NOP [Abort queued commands]
  ef fc f7 7f f6 4b 5b bd  49d+16:45:10.223  SET FEATURES [Reserved for CFA]
  ff 9f 4f fe fd ff f6 f6  43d+11:51:38.417  [VENDOR SPECIFIC]
  ff bf 9e f7 3d df 3b 7c  49d+16:14:43.667  [VENDOR SPECIFIC]
  f7 7f fe fb 47 ff 3f ef  49d+15:38:57.407  [VENDOR SPECIFIC]

Error 7940 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was active or 

  After command completion occurred, registers were:
  -- -- -- -- -- -- --
  00 00 00 00 00 00 00  

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 00 01 01 00 00 a0 0a      00:00:02.323  NOP [Abort queued commands]
  ef fc f7 3f f6 0b 5b 9d  49d+16:40:48.079  SET FEATURES [Reserved for CFA]
  ff 9f 4f fe f9 ff f6 f6  43d+11:51:38.417  [VENDOR SPECIFIC]
  ff bf 9e f7 3c df 3b 7c  49d+16:14:43.667  [VENDOR SPECIFIC]
  f6 7f fe e9 47 ff 3f ee  43d+10:31:02.379  SECURITY DISABLE PASSWORD

Error 7939 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was active or 

  After command completion occurred, registers were:
  -- -- -- -- -- -- --
  00 00 00 00 00 00 00  

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 00 01 01 00 00 a0 0a      00:00:01.990  NOP [Abort queued commands]
  ca 00 08 07 02 58 e0 08      00:01:52.088  WRITE DMA
  ca 00 08 8f 01 58 e0 08      00:01:52.088  WRITE DMA
  ca 00 10 8f 00 58 e0 08      00:01:52.087  WRITE DMA
  ca 00 10 3f 00 58 e0 08      00:01:52.087  WRITE DMA

Error 7938 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was active or 

  After command completion occurred, registers were:
  -- -- -- -- -- -- --
  00 00 00 00 00 00 00  

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 00 01 01 00 00 a0 0a      00:00:02.325  NOP [Abort queued commands]
  ef fc f7 3f e6 0b 5b bd  49d+16:40:48.079  SET FEATURES [Reserved for CFA]
  ff 9f 4f fe f9 ff f6 f6  43d+11:51:38.417  [VENDOR SPECIFIC]
  ff bf 9e f7 3c df 3b 7c  49d+16:14:43.667  [VENDOR SPECIFIC]
  f6 7f fe e9 47 ff 3f ee  42d+15:17:36.361  SECURITY DISABLE PASSWORD

Error 7937 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was active or 

  After command completion occurred, registers were:
  -- -- -- -- -- -- --
  00 50 3f ba cd 04 e1  

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 ff 01 01 00 00 a0 0a      00:00:16.871  NOP [Reserved subcommand]
  c8 ff 3f ba cd 04 e1 08      00:00:10.406  READ DMA
  c8 ff 3f 7b cd 04 e1 08      00:00:10.405  READ DMA
  c8 ff 3f 3c cd 04 e1 08      00:00:10.405  READ DMA
  c8 ff 3f fd cc 04 e1 08      00:00:10.392  READ DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  
# 1  Short offline       Completed without error       00%      2845         -
# 2  Short offline       Interrupted (host reset)      90%      1720         -
# 3  Extended offline    Completed without error       00%      1190         -
# 4  Extended offline    Completed without error       00%       336         -
# 5  Short offline       Completed without error       00%         2         -
# 6  Short offline       Interrupted (host reset)      80%         2         -
# 7  Short offline       Completed without error       00%         1         -
# 8  Short offline       Interrupted (host reset)      80%         0         -

SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1
SMART Selective self-test log data structure revision number 0
Warning: ATA Specification requires selective self-test log data structure revision number = 1
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

