[SGVLUG] Recommendation of an open source hardware diagnostic tool
Scott Packard
spackard at gmail.com
Mon Mar 3 08:44:36 PST 2014
http://unix.stackexchange.com/q/117742?atw=1
Given the above diagram I doubt there's one diagnostic tool that does it
all.
Regards, Scott
On Sunday, March 2, 2014, Matthew Campbell <dvdmatt at gmail.com> wrote:
> Does anyone have a hardware diagnostic tool they like, preferably open
> source? I have been fighting a host for two weeks now and, after finding
> and submitting 2 kernel bugs, have begun to suspect that the problems I am
> running into are being exposed by a hardware failure.
>
> The system appears to be running fine, but every 10-15 seconds it will
> zone out for a couple of seconds. At first I thought it was a BTRFS bug,
> and the errors I was seeing turned out to be just that.
>
> Once they were fixed the freezing continued. Further poking uncovered an
> NFS bug in its interaction with the underlying filesystem, but even after
> patching the kernel for that, the poor performance continues.
>
> Now I'm starting to see errors of this sort in my syslog:
>
> 2014-03-02T22:39:00.262Z cpu6:34527)WARNING: LinScsi:
> SCSILinuxQueueCommand:1207: queuecommand failed with status = 0x1056
> Unknown status vmhba0:0:0:0 (driver name: ahci) - Message repeated 4 times
> 2014-03-02T22:39:00.262Z cpu2:32791)ScsiDeviceIO: 2324:
> Cmd(0x412e8088eac0) 0x4d, CmdSN 0x784 from world 0 to dev
> "t10.ATA_____INTEL_SSDSC2BW240A4_____________________CVDA341000752403GN__"
> failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.
> 2014-03-02T22:39:00.275Z cpu2:32784)ScsiDeviceIO: 2324:
> Cmd(0x412e80842b00) 0x28, CmdSN 0x51c3 from world 32878 to dev
> "t10.ATA_____INTEL_SSDSC2BW240A4_____________________CVDA341000752403GN__"
> failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.
>
> Could my SSD be failing? But I just replaced the previous boot disk as it
> looked like it was failing...
>
> Device sense code D:0x8 equates to 08h BUSY according to these docs:
>
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=289902
>
> It could be a MOBO issue with the SATA port or even the CPU or RAM. Ugh.
>
> I tried memtest86 and all passed...
>
> Any suggestions on a full-system hardware test suite would be much
> appreciated.
>
> Matt
>
>
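For reading the H:/D:/P: triplets in the quoted log lines, a small decoder
can help. The following is a minimal sketch, not a vetted tool. It assumes
the wrapped log entry has been joined back into a single line, that H, D
and P are the host, device and plugin status fields, and that the device
status follows the standard SCSI status codes. Only 0x8 = BUSY is confirmed
in the message above; the function name and the rest of the table are
illustrative.

```python
import re

# SCSI device status codes from the SCSI Architecture Model.
# Only 0x08 BUSY is confirmed by the quoted message; the other entries
# are included here as an assumption about what else may show up.
SCSI_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}

# Matches the "H:0x0 D:0x8 P:0x0" triplet in a ScsiDeviceIO log line.
STATUS_RE = re.compile(
    r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)"
)


def decode_status(line):
    """Return the host/device/plugin codes found in one log line, or None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    host, device, plugin = (int(group, 16) for group in match.groups())
    return {
        "host": host,
        "device": device,
        "plugin": plugin,
        "device_name": SCSI_STATUS.get(device, "UNKNOWN"),
    }


# Tail of one of the quoted log entries, joined into a single line.
sample = "failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0."
print(decode_status(sample))
```

Feeding every "failed H:..." line from the syslog through decode_status and
counting the device_name values should show whether BUSY is the only status
the drive ever returns.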