[SGVLUG] hd errors in /var/log/message

Wed May 14 16:48:37 PDT 2008

On Wed, May 14, 2008 at 7:04 AM, James Neff <jneff at tethyshealth.com> wrote:
> Greetings,
>
> I am seeing this in my /var/log/message:
>
> May 11 04:05:59 private-gateway kernel: hda: dma_intr: status=0x51 {
> DriveReady SeekComplete Error }
> May 11 04:05:59 private-gateway kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=79957639, sector=79957632
> May 11 04:05:59 private-gateway kernel: ide: failed opcode was: unknown
> May 11 04:05:59 private-gateway kernel: end_request: I/O error, dev hda,
> sector 79957632
>
>
> It only occurs at the time 4:05am when my daily crontab runs.  Also, it is
> always the same sector.
>
> Should I be concerned or take any action at this time?
> Thanks in advance,
> James

Have you checked the SMART attributes using smartctl or an equivalent
tool?  Have you run any self-tests?

See http://www.ntfs.com/disk-monitor-smart-attributes.htm

Here's what I get on my machine here at work:

smartctl --health /dev/sda
smartctl version 5.37 [i586-mandriva-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Have you done a quick google search on "DriveReady SeekComplete Error"?

Aren't SMART drives supposed to remap a sector when it goes bad?
According to a few sites I looked at, it might require a reformat or
something similar to force the remapping.
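
To see whether that sector is still unreadable without waiting for the
4:05am cron run, one option is a read-only dd over the region named in
the kernel log.  This is only a sketch: the helper name sector_read_cmd
is made up, it assumes GNU dd (for iflag=direct) and 512-byte sectors,
and it only prints the command so you can look it over before running
it as root.  Starting at 79957632 and reading 8 sectors covers both
sector numbers from the log (79957632 and 79957639).

```shell
# Print (do not run) a read-only dd command that reads COUNT 512-byte
# sectors from DEV starting at sector FIRST, bypassing the page cache.
# sector_read_cmd is a hypothetical helper name, not a standard tool.
sector_read_cmd() {
    dev=$1
    first=$2
    count=$3
    printf 'dd if=%s of=/dev/null bs=512 skip=%s count=%s iflag=direct\n' \
        "$dev" "$first" "$count"
}

# Device and sector numbers taken from the log lines quoted above.
sector_read_cmd /dev/hda 79957632 8
```

If the printed command fails with an I/O error and the same lines show
up in the log again, the sector is still bad.  Nothing here writes to
the disk.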

One place suggests checking Current_Pending_Sector to see if it's
anything but zero:

smartctl --all /dev/sda | grep Sector
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age  Always       -       0
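
If you want just the number rather than eyeballing the table, something
like the following could pull out the raw value.  It is a sketch that
assumes the usual ten-column attribute layout shown above, with the raw
value in the last column; pending_sectors is a name I made up.

```shell
# Print the raw value (10th column) of SMART attribute 197 from
# `smartctl -A` style output read on stdin.  Prints nothing if the
# attribute is not present.
pending_sectors() {
    awk '$1 == 197 && $2 == "Current_Pending_Sector" { print $10 }'
}

# On a live system (needs root):
#   smartctl -A /dev/sda | pending_sectors
# Here it is fed the attribute line from the output above:
printf '%s\n' \
    '197 Current_Pending_Sector  0x0012   100   100   000    Old_age Always       -       0' \
    | pending_sectors
```

Anything other than 0 would mean the drive has sectors it could not
read and has not yet remapped.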

Check here for how to run self tests:
http://www.captain.at/howto-linux-smartmontools-smartctl.php

Be careful, though.  I once ran some of these when I was tracking
down a bad cable, and boy did that make a mess of things...

smartctl -t short  /dev/sda
Please wait 2 minutes for test to complete.
Test will complete after Wed May 14 16:24:05 2008
...
smartctl -l selftest  /dev/sda
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     22494         -
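
For checking that result from a script, a rough sketch could look like
this.  It assumes the log format shown above, where the newest entry is
the line starting with "# 1"; latest_selftest_ok is a made-up name.

```shell
# Exit 0 if the most recent self-test entry ("# 1") in
# `smartctl -l selftest` style output on stdin completed without
# error; exit 1 if it reported anything else or no entry was found.
latest_selftest_ok() {
    awk '$1 == "#" && $2 == "1" { found = 1; ok = /Completed without error/ }
         END { exit !(found && ok) }'
}

# On a live system (needs root):
#   smartctl -l selftest /dev/sda | latest_selftest_ok
# Here it is fed the log line from the output above:
if printf '%s\n' \
    '# 1  Short offline       Completed without error       00%     22494         -' \
    | latest_selftest_ok
then
    echo "latest self-test passed"
else
    echo "latest self-test failed or no log entry found"
fi
```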

Hey, does anyone use smartd and have an example configuration file in
use?  I'm a little confused about the use of the DEVICESCAN option.
Should it be used or commented out?

btw, do you monitor the temperature of the drive to see if it's
getting significantly warmer during this time?  At home, two of the
drives climb a few degrees but the one at the top jumps about 4
degrees for a few hours while it's running msec and stuff.
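
One way to watch for that is to log the temperature with a timestamp
and compare the readings around 4am.  This sketch assumes the drive
reports attribute 194 (Temperature_Celsius), which not every drive
does, and that the raw value is in the 10th column; log_temp and the
log file path in the comment are made up.

```shell
# Read `smartctl -A` style output on stdin and print one log line:
# a timestamp followed by the raw value of attribute 194.
log_temp() {
    temp=$(awk '$1 == 194 { print $10 }')
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$temp"
}

# On a live system (needs root), e.g. run every few minutes from cron:
#   smartctl -A /dev/sda | log_temp >> /var/log/drive-temp.log
# Demo with a made-up attribute line (not real output from any drive):
printf '%s\n' \
    '194 Temperature_Celsius     0x0022   036   045   000    Old_age Always       -       36' \
    | log_temp
```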