Ever get the feeling things are conspiring against you?
Situation - HP Microserver running CentOS, fulfils various light duties such as NAS, source control, and also MythTV recording.
Sequence of events as follows...
1. MythTV falls into disuse after my Freeview HD TV tuner dies. Not trivial to fix as that tuner is no longer available, and more recent tuners aren't compatible with my version of MythTV. Goes on the ToDo list.
2. A few weeks later, MythTV still out of service, a 2TB disk becomes full so I replace it with a 4TB, a decent WD Red NAS. SMART status looks fine, and a long SMART test yields no errors. All seems well, and server ploughs on happily.
3. A couple of months later, I get around to fixing MythTV. Feeling chuffed as I had to make some source code edits to add support for the new tuner, manually backporting a patch that's not released for my version. But it all looked straightforward and the new tuner works. However, playback freezes and crashes if it coincides with other recordings.
Problem turns out to be the disk added at (2). MythTV can generate quite heavy disk activity, recording up to 5 channels in full HD at the same time. And when under this stress, the kernel log shows bursts of errors of the form "failed command: WRITE FPDMA QUEUED", after which i/o seems to cease for a short time. MythTV playback, starved of data, throws a wobbly. I gather these errors generally suggest a problem with the SATA interface rather than the disk itself, so I tried moving the disk to a different backplane slot. Same result, so finger of suspicion points at that nearly new disk.
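For anyone wanting to tally these errors rather than eyeball them, here is a rough Python sketch. It assumes the kernel log lines look something like "ata3.00: failed command: WRITE FPDMA QUEUED" (the "ata3.00" prefix is my assumption about the usual libata format, and the function name is just made up for illustration).

```python
import re
from collections import Counter

# Assumed line shape: "... ata3.00: failed command: WRITE FPDMA QUEUED"
FAILED_CMD = re.compile(r"(ata\d+(?:\.\d+)?): failed command: (.+?)\s*$")


def count_failed_commands(lines):
    """Count 'failed command' entries per (ATA port, command) pair."""
    counts = Counter()
    for line in lines:
        match = FAILED_CMD.search(line)
        if match:
            port, command = match.group(1), match.group(2)
            counts[(port, command)] += 1
    return counts
```

It takes any iterable of lines, so an open log file or the captured output of dmesg should both work. A count that follows the disk from one backplane slot to another would back up the suspicion that the disk, not the slot, is at fault.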
So yesterday's job... another new 4TB disk, Seagate this time, and fingers crossed, it seems ok! I know I'm tempting fate saying so, it's early days, but it's been playing for about 20 mins whereas before it'd freeze within a minute or so. Nothing bad in kernel log.
I'll need to give it a week or two to make certain all is ok. Wish me luck, perhaps?
But even if the server does now behave itself, I'll be stuck with a nearly new disk that it's too late to return to the seller. It'll need to be RMA'd, and they might give me a hard time as I've very little evidence, other than a kernel logfile, to show that anything is amiss. SMART status still looks perfect.
And if it still acts up I'm even worse off, having an even more obscure fault, and no excuse to return either HDD.
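One way to gather a bit more evidence might be to reproduce the heavy write load without MythTV in the picture. The Python sketch below is only an approximation of that idea: it writes several files in parallel as fast as it can, which is not the same as paced HD recordings. The function names, the default of 5 streams (matching the 5 channels mentioned above) and the 1 GiB per stream are all my own assumptions, not anything MythTV does.

```python
import os
import threading


def write_stream(path, total_bytes, chunk_bytes=1 << 20):
    """Write total_bytes of zeros to path in chunks, then force them to disk."""
    chunk = b"\0" * chunk_bytes
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # make sure the data really reaches the disk
    return written


def concurrent_write_test(directory, streams=5, bytes_per_stream=1 << 30):
    """Run several writers in parallel; return the list of files written."""
    paths = [os.path.join(directory, f"stream_{i}.bin") for i in range(streams)]
    threads = [
        threading.Thread(target=write_stream, args=(path, bytes_per_stream))
        for path in paths
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return paths
```

Point it at a scratch directory on the suspect disk, watch the kernel log while it runs, and delete the files afterwards. If the same errors turn up under this load, that is at least something repeatable to show alongside the logfile.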