Kitz Forum

Chat => Tech Chat => Topic started by: sevenlayermuddle on November 21, 2019, 11:07:49 AM

Title: Really annoying HDD fault
Post by: sevenlayermuddle on November 21, 2019, 11:07:49 AM
Ever get the feeling things are conspiring against you?

Situation - HP Microserver running Centos, fulfils various light duties such as NAS, source control, and also MythTV recording.

Sequence of events as follows...

1. MythTV falls into disuse after my Freeview HD TV tuner dies.  Not trivial to fix as that tuner is no longer available, and more recent tuners are not compatible with my version of MythTV.  Goes on the ToDo list.

2. Few weeks later, MythTV still out of service, a 2TB disk becomes full so I replace it with a 4TB, a decent WD Red NAS.  SMART status looks fine, and a long SMART test yields no errors.  All seems well, and server ploughs on happily.

3. Couple months later, I get around to fixing MythTV.  Feeling chuffed as I had to make some source code edits to add support for the new tuner, manually back porting a patch that's not released for my version.  But it all looked straightforward and the new tuner works.  However playback freezes and crashes if co-incident with other recordings.  :(

Problem turns out to be the disk added at (2). MythTV can generate quite heavy disk activity, recording up to 5 channels in full HD at the same time.  And when under this stress, the kernel log shows groups of errors of the form "failed command: WRITE FPDMA QUEUED", after which i/o seems to cease for a short time.  MythTV playback, starved of data, throws a wobbly.   I gather these log entries generally suggest a problem with the SATA interface rather than the disk itself, so I tried moving the disk to a different backplane slot.  Same result, so finger of suspicion points at that nearly new disk.  :'(
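
For anyone wanting to check their own kernel log for the same symptom, here is a minimal Python sketch that counts these errors per SATA port. Only the "failed command: WRITE FPDMA QUEUED" text comes from the post above; the "ataN.00:" prefix handling, the function name and the sample lines are assumptions made up for illustration, so adjust the pattern to whatever your own log actually looks like.

```python
import re
from collections import Counter

# Matches the "failed command" text quoted in the post. The "ataN.NN:" prefix
# in front of it is an assumption about how the kernel labels the port.
FAILED_CMD = re.compile(r"(ata\d+)(?:\.\d+)?: failed command: (.+)$")


def count_failed_commands(log_lines):
    """Count failed ATA commands per (port, command) pair."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_CMD.search(line)
        if match:
            counts[(match.group(1), match.group(2).strip())] += 1
    return counts


# Hypothetical sample lines, not copied from any real log.
sample = [
    "[100.000] ata3.00: failed command: WRITE FPDMA QUEUED",
    "[100.001] some unrelated kernel message",
    "[160.500] ata3.00: failed command: WRITE FPDMA QUEUED",
]

for (port, command), hits in count_failed_commands(sample).items():
    print(port, command, hits)
```

To use it on a real system, pass it the lines of saved `dmesg` output or of the kernel log file instead of `sample`.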

So yesterday's job... another new 4TB disk, Seagate this time, and fingers crossed, it seems ok!   I know I'm tempting fate saying so, it's early days, but it's been playing for about 20 mins whereas before it'd freeze within a minute or so.  Nothing bad in kernel log.

I'll need to give it a week or two to make certain all is ok.  Wish me luck, perhaps?  :fingers:

But even if the server does now behave itself, I'll be stuck with a nearly new disk that's too late to return to seller.  It'll need to be RMA'd and they might give me a hard time as I've very little evidence, other than a kernel logfile, to show that anything is amiss.  SMART status still looks perfect.    >:(

And if it still acts up I'm even worse off, having an even more obscure fault, and no excuse to return either HDD.   ::)
Title: Re: Really annoying HDD fault
Post by: chenks on November 21, 2019, 11:14:46 AM
when did you buy the HDD?
if it's under 12 months then point of return would be the seller (they might argue the case, but they would be wrong).
Title: Re: Really annoying HDD fault
Post by: sevenlayermuddle on November 21, 2019, 11:25:11 AM
The disk's a couple months old.

I got it from BT shop, whose policy seems to be that if a product develops a fault (as opposed to being faulty on delivery), you need to take it up with the manufacturer.   If that fails to yield a result, it is returnable to BT shop.

I know their interpretation of the law might seem weak but these days, I resist challenging these things.   Even if I win the argument, by the time I have won, I invariably feel the worse for the stress endured. :'(

I know, that probably makes me a wimp.  But I'm a wimp whose blood pressure has much improved from the days when I was more argumentative.
Title: Re: Really annoying HDD fault
Post by: chenks on November 21, 2019, 01:29:34 PM
their policy can say anything it wants, but you have trading laws on your side that allow you to return it to the point of purchase.
you don't have the automatic right to replacement, but you do have the right for them to offer a repair or replacement (whichever they choose).
Title: Re: Really annoying HDD fault
Post by: sevenlayermuddle on November 21, 2019, 02:00:41 PM
That’s the other thing, I don’t want a like for like replacement or repair as I’d have a lingering suspicion that, rather than a fault, it might just be some specific incompatibility with my system.  So whilst I’ve had perfectly good WD disks before, I’ve lost faith in that particular model.

But especially if dealing with manufacturer, and to some extent even the retailer, a like-for-like replacement is what I’d likely get. :(

I suppose I could always just take it apart and use the nice shiny platters for Christmas tree decorations?

Istr that the head control magnets are fun to play with too, but beware of getting fingers and thumbs squashed as they snap together.
Title: Re: Really annoying HDD fault
Post by: chenks on November 21, 2019, 02:06:49 PM
well i don't think you are entitled to a refund, from either the seller or manufacturer.
unless, of course, you can prove an inherent fault that existed at the point of purchase, but the onus is on you to prove that.
Title: Re: Really annoying HDD fault
Post by: sevenlayermuddle on November 21, 2019, 02:28:32 PM
I suspect you’re right.

The only retailer I know to offer genuine and unlimited refunds is Costco.   It’s worked to my advantage, to the point of being embarrassing, at least twice.  One was a camcorder a few months old, another was a shredder a few years old.  Both were genuinely faulty and in each case I simply turned up with the old device, which was refunded in full without question.  And in each case, I then walked over to the shelf and picked up an identical new one, now discounted.  And having paid at checkout, home I went, with a few more £££ in my pocket and a brand new replacement.

Unfortunately Costco don’t sell SATA disks and even if they did, their returns policy for computer gear is slightly more restrictive. :)
Title: Re: Really annoying HDD fault
Post by: vic0239 on November 21, 2019, 02:54:55 PM
Quote
I suppose I could always just take it apart and use the nice shiny platters for Christmas tree decorations?

Those things are lethal! I dismantled an old HDD and one of the platters shattered with only the slightest pressure. A shard of glass pierced my finger, result - gushing blood everywhere and a lasting sensation that there is still something embedded in said finger!  :'(
Title: Re: Really annoying HDD fault
Post by: sevenlayermuddle on November 21, 2019, 03:00:41 PM
Quote
Those things are lethal! I dismantled an old HDD and one of the platters shattered with only the slightest pressure. A shard of glass pierced my finger, result - gushing blood everywhere and a lasting sensation that there is still something embedded in said finger!  :'(

I’ve made the same mistake and narrowly escaped injury, with a 2.5 inch laptop disk, with glass platters.

This however is a 3.5 inch drive and I have never known a 3.5 inch to have glass platters, always alloy in my experience.

But you are certainly correct to suggest caution. :)

Title: Re: Really annoying HDD fault
Post by: parkdale on November 21, 2019, 05:04:35 PM
The few times I've had data errors on SATA disks were mostly down to the data cable connections, or when using removable drive cages.. :-X
Title: Re: Really annoying HDD fault
Post by: sevenlayermuddle on November 21, 2019, 06:22:19 PM
Quote
The few times I've had data errors on SATA disks were mostly down to the data cable connections, or when using removable drive cages.. :-X

Same for me but this time the evidence pointed squarely towards the disk itself. 
Title: Re: Really annoying HDD fault
Post by: Chrysalis on November 22, 2019, 05:28:46 PM
WD seem to be very good at RMAs.

I had a drive out of warranty, not only did they say no problem, but I got a brand new drive back, when many companies try to give you used stuff.

I wonder also if head parking was your issue, there is a tool where you can disable it or at least adjust it so it's less aggressive on WD drives.

Bear in mind tho my drive failed the quick test in the official WD diag tool.  If your drive passes that test, then you may have a hard time.
Title: Re: Really annoying HDD fault
Post by: sevenlayermuddle on November 22, 2019, 06:29:07 PM
Main thing is, I seem to have got the system working properly again.   A model of perfection for a day and a half now. It had also been raising lots of errors during my overnight cron job, but that was silent last night.

Attributing the fault to the disk itself seemed like a long shot, but that’s what the evidence suggested. And splashing out on yet another new drive, maybe against the odds, might easily have proven to be good money after bad.  Thankfully, though, it looks like the right decision, regardless of where I go from here.

Just so annoying, the sequence of events.  MythTV dies, then new HDD all good, then MythTV fixed, then new HDD not good after all.    As said, sometimes life just conspires against us. :D

Title: Re: Really annoying HDD fault
Post by: sevenlayermuddle on November 22, 2019, 06:38:57 PM
Quote
I wonder also if head parking was your issue, there is a tool where you can disable it or at least adjust it so it's less aggressive on WD drives.

Sorry Chrysalis, I missed that point.   I’ve heard rumours of head parking issues but it doesn’t fit the symptoms.

Just reading the disk, or just writing it, all was well.   But with a mixture of reading and writing, the errors started.   I even managed to reproduce it a couple of times by just repeatedly copying a big 10GB file to another file on same partition, within a ‘while’ loop.
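
A rough sketch of that kind of copy loop, written in Python here rather than as the shell ‘while’ loop mentioned above. The function name, the ".copy" suffix and the tiny demo file are all hypothetical; the actual test used a file of about 10GB on the suspect disk, and a file as small as the one in this demo will probably be served largely from cache and so put very little stress on the drive.

```python
import shutil
import tempfile
from pathlib import Path


def copy_loop(source, iterations):
    """Copy `source` to a sibling file `iterations` times; return bytes copied."""
    source = Path(source)
    # Sibling file in the same directory, so reads and writes share one disk.
    target = source.with_name(source.name + ".copy")
    total = 0
    try:
        for _ in range(iterations):
            shutil.copyfile(source, target)
            total += target.stat().st_size
    finally:
        # Clean up the copy even if an I/O error interrupts the loop.
        target.unlink(missing_ok=True)
    return total


# Small self-contained demo using a 1 MiB file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "big.bin"
    demo.write_bytes(b"\0" * 1024 * 1024)
    copied = copy_loop(demo, iterations=3)
    print(copied, "bytes copied")
```

While something like this runs against a large file on the disk under test, watch the kernel log in another terminal for new errors.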

Title: Re: Really annoying HDD fault
Post by: Chrysalis on November 23, 2019, 07:01:48 AM
Well I actually did a drive swap in my STB because of head parking. Unparking a head adds a delay, and in old kernels I noticed that can even translate to an i/o error in the logs.  Given it's a new drive I thought I would mention it as a possible reason for the errors on the new drive, but of course I was probably wrong, was just trying to offer a possible diagnosis.

With read/write mixed, and getting errors, then yeah that's probably not head parking as it won't park the head during tasks.

Have you used the wd diag tool yet?
Title: Re: Really annoying HDD fault
Post by: sevenlayermuddle on November 23, 2019, 09:21:17 AM

Is that their ‘data lifeguard diagnostics’?   The server is Linux, which doesn’t seem to be supported by that tool.  I think they publish instructions to create and run from a bootable USB but even that would be a pain as the server runs headless, no console or keyboard normally attached.

But looking at the spec for the Windows version of the tool, it doesn’t seem to do anything that smartctl, combined with other basic linux utilities, won’t do....

https://support.wdc.com/downloads.aspx?lang=en
Quote
QUICK TEST - performs SMART drive quick self-test to gather and verify the Data Lifeguard information contained on the drive.
EXTENDED TEST - performs a Full Media Scan to detect bad sectors. Test may take several hours to complete depending on the size of the drive.
ERASE - writes zeros to the drive with options of Full Erase and Quick Erase. File system and data will be lost.
VIEW TEST RESULT - displays the latest test results.

...All these can be done from linux command line with smartctl, and I have already confirmed SMART attributes wrongly indicate that the disk is perfectly ok.  That’s one of the things that made it unlikely (but not impossible) that the disk itself was the culprit.
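
As a rough illustration of that mapping, here is a small Python sketch that turns the tool's menu items into the nearest smartctl command lines. The pairing is an assumption about approximate equivalence, not something WD documents; the function name and the "/dev/sdX" placeholder are hypothetical. ERASE is left out on purpose, since zero-filling a disk is destructive and is a job for a different utility.

```python
# Approximate smartctl equivalents for the WD tool's menu items quoted above.
# The pairing is an assumption; smartctl's -t and -l options are standard.
SMARTCTL_OPTIONS = {
    "quick test": ["-t", "short"],           # start a short SMART self-test
    "extended test": ["-t", "long"],         # start a long SMART self-test
    "view test result": ["-l", "selftest"],  # show the self-test log
}


def smartctl_args(action, device):
    """Build the smartctl argument list for one menu item (nothing is run)."""
    return ["smartctl", *SMARTCTL_OPTIONS[action.lower()], device]


# "/dev/sdX" is a placeholder; substitute the real device node.
for item in ("QUICK TEST", "EXTENDED TEST", "VIEW TEST RESULT"):
    print(" ".join(smartctl_args(item, "/dev/sdX")))
```

The sketch only prints the command lines, so nothing touches a disk until you run one yourself as root.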

Grateful as I am for the assistance, I’m not going to get too stressed over this.    If I’m stuck with a disk that I can’t return because it’s too much bother then I’ll do just that    ...not bother, and move on with life.
Title: Re: Really annoying HDD fault
Post by: Chrysalis on November 23, 2019, 09:30:17 AM
yeah but it's a corporate, they have policies and all ;)

But maybe they would accept without you running the tool. :)