[SGVLUG] Recommendation of an open source hardware diagnostic tool

Matthew Campbell dvdmatt at gmail.com
Sun Mar 2 15:23:56 PST 2014


Does anyone have a hardware diagnostic tool they like, preferably open
source?  I have been fighting a host for two weeks now, and after finding
and submitting two kernel bugs I have begun to suspect that the problems I
am running into are being exposed by a hardware failure.

The system appears to be running fine, but every 10-15 seconds it zones
out for a couple of seconds.  At first I thought it was a BTRFS bug, and
the errors I was seeing turned out to be just that.

Once those were fixed the freezing continued.  Further poking uncovered an
NFS bug in its interaction with the underlying filesystem, but even with
the kernel patched for that, the poor performance continues.

Now I'm starting to see errors of this sort in my syslog:

2014-03-02T22:39:00.262Z cpu6:34527)WARNING: LinScsi:
SCSILinuxQueueCommand:1207: queuecommand failed with status = 0x1056
Unknown status vmhba0:0:0:0 (driver name: ahci) - Message repeated 4 times
2014-03-02T22:39:00.262Z cpu2:32791)ScsiDeviceIO: 2324: Cmd(0x412e8088eac0)
0x4d, CmdSN 0x784 from world 0 to dev
"t10.ATA_____INTEL_SSDSC2BW240A4_____________________CVDA341000752403GN__"
failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.
2014-03-02T22:39:00.275Z cpu2:32784)ScsiDeviceIO: 2324: Cmd(0x412e80842b00)
0x28, CmdSN 0x51c3 from world 32878 to dev
"t10.ATA_____INTEL_SSDSC2BW240A4_____________________CVDA341000752403GN__"
failed H:0x0 D:0x8 P:0x0 Possible sense data: 0x0 0x0 0x0.

Could my SSD be failing?  But I just replaced the previous boot disk as it
looked like it was failing...

Device status D:0x8 equates to 08h BUSY according to these docs:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=289902
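For anyone who wants to pull those fields out of the log in bulk, here is a
rough sketch in Python.  It assumes the log entries look like the ones
above (a Cmd(...) opcode followed by H:/D:/P: fields) and uses the standard
SCSI status and opcode values.  The function name decode_scsi_entry and the
lookup tables are my own and only cover a handful of codes; this is not a
VMware tool.

```python
import re

# Standard SCSI status byte values (the "D:" device status field).
SCSI_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x04: "CONDITION MET",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
    0x30: "ACA ACTIVE",
    0x40: "TASK ABORTED",
}

# A few common SCSI opcodes, enough to cover the commands in the log above.
SCSI_OPCODE = {0x28: "READ(10)", 0x2A: "WRITE(10)", 0x4D: "LOG SENSE"}

# Assumed entry layout: Cmd(<addr>) <opcode>, ... H:<hex> D:<hex> P:<hex>
# DOTALL lets the match span entries that were wrapped across lines.
ENTRY_RE = re.compile(
    r"Cmd\(0x[0-9a-fA-F]+\)\s+(0x[0-9a-fA-F]+),.*?"
    r"H:(0x[0-9a-fA-F]+)\s+D:(0x[0-9a-fA-F]+)\s+P:(0x[0-9a-fA-F]+)",
    re.DOTALL,
)


def decode_scsi_entry(entry):
    """Return the decoded opcode and status fields, or None if no match."""
    match = ENTRY_RE.search(entry)
    if match is None:
        return None
    opcode, host, device, plugin = (int(g, 16) for g in match.groups())
    return {
        "opcode": SCSI_OPCODE.get(opcode, hex(opcode)),
        "host_status": host,
        "device_status": SCSI_STATUS.get(device, hex(device)),
        "plugin_status": plugin,
    }
```

Feeding it the first ScsiDeviceIO entry above should report a LOG SENSE
command (0x4d) that came back with device status BUSY; the second entry is
a READ(10) (0x28) with the same status.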

It could be a MOBO issue with the SATA port or even the CPU or RAM.  Ugh.

I ran memtest86 and all tests passed...

Any suggestions on a full-system hardware test suite would be much
appreciated.

Matt