I have never had a good experience with S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology). It usually reports that a drive is about to fail only after the drive has already failed and is completely unreadable. My latest drive failure is yet another case of this, but an interesting one, because the failure seems to have been quite easy to predict. The output of the S.M.A.R.T. monitoring system made it obvious that the drive was failing: 197 entries in the grown defect list, and more than 50 uncorrected errors each for reads and writes. Yet the software in the drive still reports SMART Health Status: OK.
Device: SEAGATE ST336607LW Version: 0007
Serial number: XXXXXXXXXXXXXXXXX
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Thu Feb 1 21:25:04 2007 EST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
Current Drive Temperature: 28 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 197
Vendor (Seagate) cache information
Blocks sent to initiator = 183468359
Blocks received from initiator = 3071358557
Blocks read from cache and sent to initiator = 53688899
Number of read and write commands whose size <= segment size = 918209810
Number of read and write commands whose size > segment size = 319145
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 18768.45
number of minutes until next internal SMART test = 102
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:      62105       40         6     62151      73380        661.309          56
write:         0        0        31        31       2263        224.643          55
Non-medium error count: 302
SMART Self-test log
Num  Test              Status                     segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                                  number   (hours)
# 1  Background long   Completed, segment failed        -     18716              - [0x4 0x3e 0x3]
# 2  Background long   Completed, segment failed        -     18716              - [-   -    -]
# 3  Background short  Completed                        -         0              - [-   -    -]
# 4  Background short  Completed                        -         0              - [-   -    -]
Long (extended) Self Test duration: 768 seconds [12.8 minutes]
The S.M.A.R.T. system was useful in showing that the drive hardware itself, and not some component at a higher level, was failing. But as this case demonstrates, never trust the S.M.A.R.T. health status on its own. Look into the details and verify your data.
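Checking the details can be automated to some degree. The following is only a rough sketch, assuming the report has already been captured as text in the same layout as the output above. The function name `check_report`, the zero thresholds, and the abbreviated `SAMPLE_REPORT` are illustrative choices of mine, not part of any tool. It ignores the health status line and looks at the grown defect list, the uncorrected error totals, and the self-test log instead.

```python
import re

# Abbreviated sample in the same layout as the report above.
SAMPLE_REPORT = """\
SMART Health Status: OK
Elements in grown defect list: 197
Error counter log:
read: 62105 40 6 62151 73380 661.309 56
write: 0 0 31 31 2263 224.643 55
Non-medium error count: 302
SMART Self-test log
# 1 Background long Completed, segment failed - 18716 - [0x4 0x3e 0x3]
# 2 Background long Completed, segment failed - 18716 - [- - -]
# 3 Background short Completed - 0 - [- - -]
"""


def check_report(report, max_grown_defects=0, max_uncorrected=0):
    """Return a list of warnings found in a SCSI-style SMART text report.

    The thresholds are hypothetical defaults; pick values that suit
    your own tolerance for risk.
    """
    warnings = []

    # Grown defects are sectors the drive has remapped since manufacture.
    m = re.search(r"Elements in grown defect list:[ \t]*(\d+)", report)
    if m and int(m.group(1)) > max_grown_defects:
        warnings.append(f"grown defect list has {m.group(1)} entries")

    # Error counter rows: the last field is the total of uncorrected errors.
    for m in re.finditer(r"^[ \t]*(read|write|verify):[ \t]+(.*)$",
                         report, re.MULTILINE):
        fields = m.group(2).split()
        if fields and fields[-1].isdigit() and int(fields[-1]) > max_uncorrected:
            warnings.append(f"{m.group(1)}: {fields[-1]} uncorrected errors")

    # Self-test log rows start with "# <number>".
    for m in re.finditer(r"^[ \t]*#[ \t]*(\d+)[ \t]+(.*)$", report, re.MULTILINE):
        if "failed" in m.group(2).lower():
            warnings.append(f"self-test #{m.group(1)} failed")

    return warnings


# Show what the drive claims next to what the details say.
health = re.search(r"SMART Health Status:[ \t]*(.+)", SAMPLE_REPORT)
print("Drive says:", health.group(1) if health else "unknown")
for warning in check_report(SAMPLE_REPORT):
    print("WARNING:", warning)
```

On the sample, the drive's own verdict is OK while the script raises five warnings, which is the mismatch described in this post. A text scrape like this is fragile if the report layout changes, so treat it as a starting point rather than a monitoring solution.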