Noticed yesterday that one of my 500GB Western Digital drives wasn't showing as mounted on the server. As it's full of films for the kids, and they're currently on holiday, I decided to investigate earlier today. dmesg was showing lots of messages about sense errors, and the kernel appeared to be stepping down the SATA link speed to try to communicate with the drive. There was also a stack of messages like 'ata4.00: status: { DRDY } ata4: hard resetting link ata4'.
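If you want a quick feel for which port is misbehaving and how often, something like the following could tally those messages from saved dmesg output. It's only a rough sketch: the function name, the sample log lines and the two phrases it looks for ("hard resetting link" and "limiting SATA link speed") are my own assumptions modelled on the messages above, and the exact wording may differ between kernel versions.

```python
import re
from collections import Counter

# Matches the port prefix of an ATA kernel message, e.g. "ata4:" or "ata4.00:".
ATA_PREFIX = re.compile(r"\b(ata\d+)(?:\.\d+)?: (.*)")


def summarise_ata_messages(log_text):
    """Count hard link resets and link speed step-downs per ATA port."""
    resets = Counter()
    slowdowns = Counter()
    for line in log_text.splitlines():
        match = ATA_PREFIX.search(line)
        if not match:
            continue  # not an ATA message
        port, message = match.groups()
        if "hard resetting link" in message:
            resets[port] += 1
        elif "limiting SATA link speed" in message:
            slowdowns[port] += 1
    return resets, slowdowns


# Illustrative sample only, not a real log from the server.
SAMPLE = """\
[ 101.1] ata4.00: status: { DRDY }
[ 101.2] ata4: hard resetting link
[ 106.9] ata4: limiting SATA link speed to 1.5 Gbps
[ 107.0] ata4: hard resetting link
[ 200.0] ata2: SATA link up 3.0 Gbps
"""

if __name__ == "__main__":
    resets, slowdowns = summarise_ata_messages(SAMPLE)
    for port in sorted(set(resets) | set(slowdowns)):
        print(port, "resets:", resets[port], "speed step-downs:", slowdowns[port])
```

To use it on a real machine you would feed it the text from dmesg (saved to a file, say) instead of the sample. A single port racking up resets points at that drive or its cabling; resets across several ports that share a power feed would be a hint to look at the power side.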
Shut down the server and pulled the drive. Not easy because of a) the location of the server, and b) the graphics card being in the way of getting it out! Plonked it into an IDE/SATA -> USB caddy and plugged it into the laptop. Disk tests fine. All the S.M.A.R.T. figures look fine too, so I put it back in the server.
Server won't boot
Hangs after detecting all the drives, which incidentally all show up fine. With the side off the server, I can hear an all too familiar clicking noise. The sound of disk heads swinging around aimlessly. Anyone who's heard a failed drive will know exactly what I mean. It wasn't the WD that was 'ticking' though, it was the boot disk.
This was somewhat disconcerting, as the server serves TV, films and music all around the house and the prospect of having to set all that up again was not appealing.
I know this issue can sometimes be caused by bad SATA cables, so I swapped the cables around, with no better results. I was just on the verge of ordering a new drive when I noticed that the boot drive and the WD that had earlier appeared to have failed were both on a SATA power cable coming from a splitter. Although I couldn't see anything wrong with the splitter, I swapped out both it and the power cable.
Server booted at the next attempt, all drives report no issues and all is good. BUT, it goes to show that even though I would have sworn from the 'ticking' that the drive had failed, it actually hadn't, and it's well worth checking power and data cables for issues before spending cash!