Chris Murphy | 19 Aug 2012 00:40

md raid behavior, bad sector uncorrectable read error

I'm not experiencing this at the moment, but I'm curious about what happens if it were to happen.

A drive detects a sector error, but can't correct it, and returns an 0x40 error to the system.

Presumably md raid (any RAID I would think) becomes aware of this error, as the chunk is incomplete without
the sector. Next, does md raid rebuild the affected chunk from parity on-the-fly? Or is the entire drive
dropped from the array and placed into degraded mode?

If the affected chunk is rebuilt from parity (or copy in case of mirror), does md raid issue a write command to
write the rebuilt chunk back to the disk at the same LBAs? If so, shouldn't this force the drive to determine
if the sector error is transient or persistent, and if persistent the disk firmware will remap the bad
sector to a reserve sector?

Thanks,

Chris Murphy

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Albert Pauw | 19 Aug 2012 10:54

Re: md raid behavior, bad sector uncorrectable read error

From what I know (and please correct me if I'm wrong), the drive
happily remaps the sector to a different, spare, location. Only when
it can't do that it throws an error status back. Mind you, more than
10 years ago I had a scsi disk in a linux machine which suddenly broke
down. Checking the logs I saw a lot of remap messages in the log in
the weeks before. So it may be that the drive gives warnings back up
the controller chain.
If this is true, what is md doing with this? If the remaps increase on
that particular drive, does it throw it out of the raid set?

Curious about what actually happens; interesting question.

Just my two cents,

Albert

On 19 August 2012 00:40, Chris Murphy <lists <at> colorremedies.com> wrote:
> I'm not experiencing this at the moment, but I'm curious about what happens if it were to happen.
>
> A drive detects a sector error, but can't correct it, and returns an 0x40 error to the system.
>
> Presumably md raid (any RAID I would think) becomes aware of this error, as the chunk is incomplete without
> the sector. Next, does md raid rebuild the affected chunk from parity on-the-fly? Or is the entire drive
> dropped from the array and placed into degraded mode?
>
> If the affected chunk is rebuilt from parity (or copy in case of mirror), does md raid issue a write command
> to write the rebuilt chunk back to the disk at the same LBAs? If so, shouldn't this force the drive to
> determine if the sector error is transient or persistent, and if persistent the disk firmware will remap
> the bad sector to a reserve sector?
>

Chris Murphy | 19 Aug 2012 20:12

Re: md raid behavior, bad sector uncorrectable read error


On Aug 19, 2012, at 2:54 AM, Albert Pauw wrote:

> From what I know (and please correct me if I'm wrong), the drive
> happily remaps the sector to a different, spare, location.

Yes. Although there might be a difference between SATA and SAS handling, as SAS ECC is better by a lot. SATA
ECC can return false corrections. Small problem.

For this thread, the context I'm curious about is the detected but uncorrectable sector error.

My understanding is that since the error is not corrected, the firmware won't remap the sector, but marks it
as pending remap until it receives a write command for that LBA. If the write is successful, the
pending status is removed. If the write persistently fails, the data is written to a reserved (good)
sector, which then takes over that LBA, and the bad sector is retired with no LBA. Since the LBA is the same,
neither md nor the file system needs to be informed of anything.
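
To make the sequence above concrete, here is a toy model in Python. It is only a sketch of the behaviour as described in this message, not real firmware logic; the class name ToyDrive, its fields, and the single-spare default are all made up for illustration.

```python
class UncorrectableReadError(Exception):
    """Raised when the toy drive cannot return data for an LBA."""


class WriteError(Exception):
    """Raised when a write fails and no spare sector is left."""


class ToyDrive:
    """Hypothetical model of pending-sector handling; not real firmware."""

    def __init__(self, spares=1):
        self.data = {}           # LBA -> payload currently stored
        self.unreadable = set()  # LBAs that currently fail on read
        self.media_bad = set()   # LBAs whose physical sector is persistently bad
        self.pending = set()     # LBAs flagged as "pending" after a failed read
        self.remapped = set()    # LBAs now served from a spare sector
        self.spares = spares

    def damage(self, lba, persistent=False):
        """Simulate a sector going bad, transiently or persistently."""
        self.unreadable.add(lba)
        if persistent:
            self.media_bad.add(lba)

    def read(self, lba):
        if lba in self.unreadable:
            # Detected but uncorrectable: flag as pending, do not remap yet.
            self.pending.add(lba)
            raise UncorrectableReadError(lba)
        return self.data[lba]

    def write(self, lba, payload):
        if lba in self.media_bad:
            # The write to the original sector keeps failing: use a spare.
            if self.spares == 0:
                raise WriteError(lba)
            self.spares -= 1
            self.media_bad.discard(lba)
            self.remapped.add(lba)
        # Either way the LBA is readable again and no longer pending.
        self.unreadable.discard(lba)
        self.pending.discard(lba)
        self.data[lba] = payload
```

In this model the LBA never changes from the caller's point of view, which is why nothing above the drive has to be told about a remap.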

> Only when
> it can't do that it throws an error status back. Mind you, more than
> 10 years ago I had a scsi disk in a linux machine which suddenly broke
> down. Checking the logs I saw a lot of remap messages in the log in
> the weeks before. So it may be that the drive gives warnings back up
> the controller chain.
> If this is true, what is md doing with this? If the remaps increase on
> that particular drive, does it throw it out of the raid set?

What I'd like to see happen is that a read error causes the chunk to be reconstructed from parity (or the
mirrored copy) and rewritten to disk. If it's just on-the-fly correction, without rewriting the chunk, then
we'd need to run 'echo repair > /sys/block/mdX/md/sync_action', which would take an awfully long time in comparison.
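
For anyone who wants to script that, a minimal Python sketch of the same sysfs write follows. The function names and the sysfs_root parameter are my own, added so the helper can be pointed at a scratch directory; on a real system it needs root, and the array name is whatever mdX actually is.

```python
from pathlib import Path

# Actions this sketch allows; md accepts others, these are the ones relevant here.
ALLOWED_ACTIONS = {"check", "repair", "idle"}


def request_sync_action(md_name, action="repair", sysfs_root="/sys"):
    """Write an action to /sys/block/<md_name>/md/sync_action.

    sysfs_root is a hypothetical parameter for testing against a fake tree.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    path = Path(sysfs_root) / "block" / md_name / "md" / "sync_action"
    path.write_text(action + "\n")
    return path


def read_mismatch_cnt(md_name, sysfs_root="/sys"):
    """Return the integer in /sys/block/<md_name>/md/mismatch_cnt."""
    path = Path(sysfs_root) / "block" / md_name / "md" / "mismatch_cnt"
    return int(path.read_text().strip())
```

Calling request_sync_action("md0") would be the equivalent of the echo above for an array named md0 (the name is just an example).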


NeilBrown | 20 Aug 2012 00:19

Re: md raid behavior, bad sector uncorrectable read error

On Sun, 19 Aug 2012 12:12:38 -0600 Chris Murphy <lists <at> colorremedies.com>
wrote:

> On Aug 19, 2012, at 3:47 AM, Mikael Abrahamsson wrote:
> > If I remember correctly from what has been described here before, a read error will cause a re-write with
> > information created from parity
>
> That's the thing I'd like to get a definitive answer on. If only it were easy to simulate bad sectors in VMs
> I could just test it!

If only it were easy to read the documentation, and if only the documentation
were sufficiently complete.....

% man 4 md

search for 'read error'

NeilBrown

Chris Murphy | 20 Aug 2012 04:06

Re: md raid behavior, bad sector uncorrectable read error


On Aug 19, 2012, at 4:19 PM, NeilBrown wrote:

> On Sun, 19 Aug 2012 12:12:38 -0600 Chris Murphy <lists <at> colorremedies.com>
> wrote:
> 
>> On Aug 19, 2012, at 3:47 AM, Mikael Abrahamsson wrote:
>>> If I remember correctly from what has been described here before, a read error will cause a re-write with
>>> information created from parity
>>
>> That's the thing I'd like to get a definitive answer on. If only it were easy to simulate bad sectors in
>> VMs I could just test it!
> 
> If only it were easy to read the documentation, and if only the documentation
> were sufficiently complete.....
> 
> % man 4 md
> 
> search for 'read error'

The well deserved and diplomatic variant of RTFM. So the summary is: a write error on RAID 1/4/5/6 causes md to
mark the device as faulty. A read error (with recent kernels) causes md to overwrite the bad block with
correct data *and* re-read it; if either the write or the re-read fails, the device is marked faulty. Very cool.
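
The summary above, written out as a small Python sketch of the decision flow. This only mirrors the wording of the summary, not the kernel's actual code; the function names and the three callables are hypothetical stand-ins for what md does internally.

```python
def handle_write_error():
    """Per the summary: a write error marks the device faulty."""
    return "faulty"


def recover_from_read_error(reconstruct, rewrite, reread):
    """Sketch of the read-error path on recent kernels.

    reconstruct() returns correct data from parity or a mirror,
    rewrite(data) overwrites the bad block, reread() reads it back.
    rewrite and reread signal failure by raising OSError.
    """
    data = reconstruct()
    try:
        rewrite(data)   # overwrite the bad block with correct data
        reread()        # then try to read it back again
    except OSError:
        # Either step failing is treated like a write error.
        return handle_write_error()
    return "corrected"
```

A caller would pass in whatever performs the I/O; in the sketch only the outcome ("corrected" or "faulty") matters.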

BTW, there's a lot of detail nuggets in man md that I'm not finding documented anywhere else. Nor have I found
in other documentation a reference to man md until now.

Chris Murphy

Phil Turmel | 20 Aug 2012 14:36

Re: md raid behavior, bad sector uncorrectable read error

On 08/19/2012 10:06 PM, Chris Murphy wrote:

> BTW, there's a lot of detail nuggets in man md that I'm not finding
> documented anywhere else. Nor have I found in other documentation a
> reference to man md until now.

Seriously?  Every other man page in my mdadm installation lists "md(4)"
in the "See Also" section.

Or were you simply not aware that man pages are the *primary*
form of documentation for most core software in *nix environments?
You should consider reviewing "man man".

Phil

David Brown | 20 Aug 2012 08:36

Re: md raid behavior, bad sector uncorrectable read error

On 19/08/2012 20:12, Chris Murphy wrote:

>
> On Aug 19, 2012, at 3:47 AM, Mikael Abrahamsson wrote:
>> If I remember correctly from what has been described here before, a
>> read error will cause a re-write with information created from
>> parity
>
> That's the thing I'd like to get a definitive answer on. If only it
> were easy to simulate bad sectors in VMs I could just test it!
>

I wonder why no one has thought of this idea before...  Wait a minute, 
they have!  When you read the "man 4 md" or "man mdadm" pages, read 
about the "faulty" module.  You don't have to use a virtual machine for 
this, it has been used for testing md raid from long before virtual 
machines were practical in the PC world.

(Sorry for the sarcasm - it was just too tempting in light of Neil's 
recommendation to read the "man 4 md" page!)

Have fun playing with these things - it's good practice so you know what 
to expect if a real failure occurs.

>> If it throws a write error, the drive is kicked from the array
>> (because a drive with write errors is clearly defective).
>
> That makes complete sense.
>

Mikael Abrahamsson | 19 Aug 2012 11:47

Re: md raid behavior, bad sector uncorrectable read error

On Sat, 18 Aug 2012, Chris Murphy wrote:

> If the affected chunk is rebuilt from parity (or copy in case of 
> mirror), does md raid issue a write command to write the rebuilt chunk 
> back to the disk at the same LBAs? If so, shouldn't this force the drive 
> to determine if the sector error is transient or persistent, and if 
> persistent the disk firmware will remap the bad sector to a reserve 
> sector?

If I remember correctly from what has been described here before, a read
error will cause a re-write with information created from parity, and the
drive should figure it out: either it successfully writes the new
information to the existing sector, or if it can't, it writes it to a
spare sector (re-map). If it throws a write error, the drive is kicked
from the array (because a drive with write errors is clearly defective).
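
As an illustration of "information created from parity", here is a minimal Python sketch of single-parity (RAID 5 style) XOR reconstruction. It assumes equal-sized blocks and exactly one missing block per stripe; RAID 6 uses a second, different syndrome and a mirror simply copies, so neither is covered. The function names are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_blocks):
    """Parity block for one stripe: XOR of all data blocks."""
    return xor_blocks(data_blocks)


def reconstruct_missing(surviving_blocks):
    """Rebuild the one missing block from all surviving blocks.

    surviving_blocks must contain every other block in the stripe,
    parity included; XOR of those equals the missing block.
    """
    return xor_blocks(surviving_blocks)
```

The rebuilt block is what would then be written back over the sector that failed to read.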

So yes, you're correct in your assumption from what I know.

-- 
Mikael Abrahamsson    email: swmike <at> swm.pp.se

Oliver Schinagl | 19 Aug 2012 12:39

Re: md raid behavior, bad sector uncorrectable read error

On 08/19/12 11:47, Mikael Abrahamsson wrote:
> On Sat, 18 Aug 2012, Chris Murphy wrote:
>
>> If the affected chunk is rebuilt from parity (or copy in case of
>> mirror), does md raid issue a write command to write the rebuilt chunk
>> back to the disk at the same LBAs? If so, shouldn't this force the
>> drive to determine if the sector error is transient or persistent, and
>> if persistent the disk firmware will remap the bad sector to a reserve
>> sector?
>
> If I remember correctly from what has been described here before, a read
> error will cause a re-write with information created from parity, and
> the drive should figure it out (either it successfully writes the new
> information to the existing sector, or if it can't, it writes it to a
> spare sector (re-map)). If it throws a write error, the drive is kicked
> from the array (because a drive with write errors is clearly defective).
>
> So yes, you're correct in your assumption from what I know.
>
Also, smartd would report errors when monitoring such a drive when a
re-map occurs; mdadm may not even notice this if the remap happens
successfully.

Richard Scobie | 20 Aug 2012 06:53

Re: md raid behavior, bad sector uncorrectable read error

Chris Murphy wrote:

 > Nor have I found in other documentation a reference to man md until
 > now.

The command

man -k KEYWORD

will search the whatis database for man pages containing KEYWORD
(assuming you have run "makewhatis" to generate a current database).

man -k md

or

man -k RAID

both return the md(4) man page.

Regards,

Richard

