Re: impact of one slow drive?

David Lethe wrote:
-----Original Message-----<edited for brevity>

From:  "Miles Fidelman"<mfidelman@xxxxxxxxxxxxxxxx>


I've been slowly tracking down a problem on one of my servers - after
prolonged periods of high disk activity, iowait goes to 100% and things
slow down to a crawl.

I recall from previous experience, with a failing drive, that the drive
also tested fine - but seemed to have very high access delays as
compared to other drives (I can't remember what tool I used to measure
it) -- which led me to surmise that something, between the disk platter,
and higher level software, was exhibiting one of two failure modes:
a) very high delay, or,
b) required multiple retries, but ultimately came back with a proper
response
Either way, the performance of one drive dragged down the entire system.

Based on the symptoms, your disk is frequently in a deep recovery cycle, i.e., it tries for 5-20+ seconds to recover data.

The real fix is to replace the drive with an enterprise drive, which doesn't have this problem to begin with. A stopgap is to run full media reads often, to force remapping of hard-to-read blocks.
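For illustration only, a full media read that also times each chunk might look like the sketch below. It is not from the original exchange: the function name, chunk size and one-second threshold are assumptions picked for the example. It reads a device or file sequentially and records the offsets where a single read took unusually long, which is roughly what a drive stuck in long recovery cycles would look like from user space. Reading a raw device needs root, an unreadable sector surfaces as an OSError rather than a slow read, and data already in the page cache will come back fast regardless of the drive.

```python
import time


def scan_for_slow_reads(path, chunk_size=1024 * 1024, slow_threshold_s=1.0):
    """Read `path` sequentially from start to end.

    Returns (total_bytes_read, slow_chunks), where slow_chunks is a list of
    (offset, seconds) for every read that took at least `slow_threshold_s`.
    The threshold and chunk size are illustrative defaults, not recommendations.
    """
    slow_chunks = []
    offset = 0
    # buffering=0 gives unbuffered reads, so each read() maps to one request
    # of at most chunk_size bytes.
    with open(path, "rb", buffering=0) as dev:
        while True:
            start = time.monotonic()
            data = dev.read(chunk_size)
            elapsed = time.monotonic() - start
            if not data:  # end of device or file
                break
            if elapsed >= slow_threshold_s:
                slow_chunks.append((offset, elapsed))
            offset += len(data)
    return offset, slow_chunks
```

Calling scan_for_slow_reads("/dev/sdX") as root (with /dev/sdX a placeholder for the suspect drive) would return the number of bytes read and the list of slow offsets; running the same scan against each member of the array makes the slow drive stand out by comparison.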
By "enterprise drives", are you suggesting drives with TLER? From what I've been reading, TLER seems designed to keep disks from dropping out of RAID arrays, rather than what I'm looking for, i.e., having the drive drop out if it starts exhibiting high delays.

I notice that FreeBSD's GEOM raid drivers are tunable - there's a parameter "kern.geom.mirror.timeout" that lets you set a timeout condition for dropping a disk from a RAID array. Is there an equivalent in md?
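As a hedged aside on that question: I don't know of an md tunable that corresponds directly to kern.geom.mirror.timeout. The nearest knob I'm aware of sits below md, in the SCSI layer: each SCSI/SATA disk exposes its command timeout, in seconds, at /sys/block/<dev>/device/timeout. The sketch below only reads that value for a list of devices; the function name and the sysfs_root parameter are assumptions made for the example, and whether tuning this value gives the "drop the slow drive" behaviour is untested here.

```python
import os


def read_command_timeouts(devices, sysfs_root="/sys/block"):
    """Return {device_name: timeout_in_seconds or None} for each device.

    None means the timeout file was missing or unreadable (for example,
    the device is not handled by the SCSI layer).
    """
    timeouts = {}
    for dev in devices:
        path = os.path.join(sysfs_root, dev, "device", "timeout")
        try:
            with open(path) as f:
                timeouts[dev] = int(f.read().strip())
        except (OSError, ValueError):
            timeouts[dev] = None
    return timeouts
```

For example, read_command_timeouts(["sda", "sdb"]) (placeholder device names) would show whether the array members share the same timeout. Writing a new number to the same file as root changes the value until the next reboot.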

Miles


--

In theory, there is no difference between theory and practice.
In<fnord>  practice, there is.   .... Yogi Berra

