[SGVLUG] hd errors in /var/log/message
Claude Felizardo
cafelizardo at gmail.com
Wed May 14 16:48:37 PDT 2008
On Wed, May 14, 2008 at 7:04 AM, James Neff <jneff at tethyshealth.com> wrote:
> Greetings,
>
> I am seeing this in my /var/log/message:
>
> May 11 04:05:59 private-gateway kernel: hda: dma_intr: status=0x51 {
> DriveReady SeekComplete Error }
> May 11 04:05:59 private-gateway kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=79957639, sector=79957632
> May 11 04:05:59 private-gateway kernel: ide: failed opcode was: unknown
> May 11 04:05:59 private-gateway kernel: end_request: I/O error, dev hda,
> sector 79957632
>
>
> It only occurs at the time 4:05am when my daily crontab runs. Also, it is
> always the same sector.
>
> Should I be concerned or take any action at this time?
> Thanks in advance,
> James
Have you checked the SMART attributes using smartctl or an equivalent
tool? Have you run any self-tests?
See http://www.ntfs.com/disk-monitor-smart-attributes.htm
Here's what I get on my machine at work:
smartctl --health /dev/sda
smartctl version 5.37 [i586-mandriva-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Have you done a quick google search on "DriveReady SeekComplete Error"?
Aren't SMART drives supposed to remap when you get a bad sector?
According to a few sites I looked at, it might require a reformat or
something to force the remapping.
One place suggests checking Current_Pending_Sector to see if it's
anything but zero:
smartctl --all /dev/sda | grep Sector
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
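If you'd rather script that check than eyeball it, here is a minimal
sketch in Python. It assumes the attribute rows have the ten-column
layout shown above; the name parse_smart_attributes and the sample text
are made up for illustration, and in real use the text would come from
running smartctl --all /dev/sda as root.

```python
import re

# One row of the smartctl attribute table:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for every line that looks like a row."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            # Group 2 is the attribute name, group 9 the leading raw value.
            attrs[m.group(2)] = int(m.group(9))
    return attrs

# Sample text for illustration only (the two rows from the output above).
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

attrs = parse_smart_attributes(SAMPLE)
for name in ("Reallocated_Sector_Ct", "Current_Pending_Sector"):
    raw = attrs.get(name)
    status = "ok" if raw == 0 else "needs a closer look"
    print(name, raw, status)
```

A nonzero raw value on either attribute is the thing to look for; zero
on both, as here, says the drive has not flagged any sectors so far.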
Check here for how to run self tests:
http://www.captain.at/howto-linux-smartmontools-smartctl.php
Be careful, though. I once ran some of these when I was tracking
down a bad cable and boy did that make a mess of things...
smartctl -t short /dev/sda
Please wait 2 minutes for test to complete.
Test will complete after Wed May 14 16:24:05 2008
...
smartctl -l selftest /dev/sda
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     22494         -
Hey, does anyone use smartd and have an example configuration file
they could share? I'm a little confused about the DEVICESCAN option.
Should it be used or commented out?
btw, do you monitor the temperature of the drive to see if it's
getting significantly warmer during this time? At home, two of the
drives climb a few degrees but the one at the top jumps about 4
degrees for a few hours while it's running msec and stuff.
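
To put numbers on that, the temperature can be pulled out of the
attribute table and logged before and during the 4:05 job. A rough
sketch in Python, assuming the drive reports an attribute named
Temperature_Celsius and that its raw value starts with the temperature
in degrees C (not every drive does either); read_temperature and the
sample row are made up for illustration.

```python
import re

def read_temperature(smartctl_output):
    """Return the leading integer of the Temperature_Celsius raw value, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns; the raw value starts at column 10.
        if len(fields) >= 10 and fields[1] == "Temperature_Celsius":
            m = re.match(r"\d+", fields[9])
            return int(m.group()) if m else None
    return None

# Hypothetical row for illustration; not taken from a real drive.
SAMPLE = (
    "194 Temperature_Celsius     0x0022   034   045   000    "
    "Old_age   Always       -       34 (Min/Max 20/45)\n"
)

print(read_temperature(SAMPLE))
```

Feed it the text from smartctl -A /dev/hda every few minutes (from cron,
say) and compare the readings around 4:05 with the rest of the day.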