Stefan Ring | 15 Jun 2012 10:54

Re: Recovery of RAIDZ with broken label(s)

>> Have you also mounted the broken image as /dev/lofi/2?
>
> Yep.

Wouldn't it be better to just remove the corrupted device? This worked
just fine in my case.
Scott Aitken | 16 Jun 2012 06:10

Re: Recovery of RAIDZ with broken label(s)

On Fri, Jun 15, 2012 at 10:54:34AM +0200, Stefan Ring wrote:
> >> Have you also mounted the broken image as /dev/lofi/2?
> >
> > Yep.
> 
> Wouldn't it be better to just remove the corrupted device? This worked
> just fine in my case.
>

Hi Stefan,

when you say remove the device, I assume you mean simply make it unavailable
for import (I can't remove it from the vdev).

This is what happens (lofi/2 is the drive which ZFS thinks has corrupted
data):

root@openindiana-01:/mnt# zpool import -d /dev/lofi
  pool: ZP-8T-RZ1-01
    id: 9952605666247778346
 state: FAULTED
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
config:

        ZP-8T-RZ1-01              FAULTED  corrupted data
          raidz1-0                ONLINE
            12339070507640025002  UNAVAIL  corrupted data
            /dev/lofi/5           ONLINE

Stefan Ring | 16 Jun 2012 08:54

Re: Recovery of RAIDZ with broken label(s)

> when you say remove the device, I assume you mean simply make it unavailable
> for import (I can't remove it from the vdev).

Yes, that's what I meant.

> root@openindiana-01:/mnt# zpool import -d /dev/lofi
>  pool: ZP-8T-RZ1-01
>    id: 9952605666247778346
>  state: FAULTED
> status: One or more devices are missing from the system.
> action: The pool cannot be imported. Attach the missing
>        devices and try again.
>   see: http://www.sun.com/msg/ZFS-8000-3C
> config:
>
>        ZP-8T-RZ1-01              FAULTED  corrupted data
>          raidz1-0                DEGRADED
>            12339070507640025002  UNAVAIL  cannot open
>            /dev/lofi/5           ONLINE
>            /dev/lofi/4           ONLINE
>            /dev/lofi/3           ONLINE
>            /dev/lofi/1           ONLINE
>
> It's interesting that even though 4 of the 5 disks are available, it still
> can't import it as DEGRADED.

I agree that it's "interesting". Now someone really knowledgeable will
need to have a look at this. I can only imagine that somehow the
devices contain data from different points in time, and that it's too
far apart for the aggressive txg rollback that was added in PSARC

Scott Aitken | 16 Jun 2012 09:02

Re: Recovery of RAIDZ with broken label(s)

On Sat, Jun 16, 2012 at 08:54:05AM +0200, Stefan Ring wrote:
> > when you say remove the device, I assume you mean simply make it unavailable
> > for import (I can't remove it from the vdev).
> 
> Yes, that's what I meant.
> 
> > root@openindiana-01:/mnt# zpool import -d /dev/lofi
> >  pool: ZP-8T-RZ1-01
> >    id: 9952605666247778346
> >  state: FAULTED
> > status: One or more devices are missing from the system.
> > action: The pool cannot be imported. Attach the missing
> >        devices and try again.
> >   see: http://www.sun.com/msg/ZFS-8000-3C
> > config:
> >
> >        ZP-8T-RZ1-01              FAULTED  corrupted data
> >          raidz1-0                DEGRADED
> >            12339070507640025002  UNAVAIL  cannot open
> >            /dev/lofi/5           ONLINE
> >            /dev/lofi/4           ONLINE
> >            /dev/lofi/3           ONLINE
> >            /dev/lofi/1           ONLINE
> >
> > It's interesting that even though 4 of the 5 disks are available, it still
> > can't import it as DEGRADED.
> 
> I agree that it's "interesting". Now someone really knowledgeable will
> need to have a look at this. I can only imagine that somehow the
> devices contain data from different points in time, and that it's too

Gregg Wonderly | 16 Jun 2012 16:09

Re: Recovery of RAIDZ with broken label(s)

Use 'dd' to replicate as much of lofi/2 as you can onto another device, and then 
cable that into place?

It looks like you just need to put a functioning, working, but not correct 
device, in that slot so that it will import and then you can 'zpool replace' the 
new disk into the pool perhaps?
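
A minimal sketch of the 'dd' idea above, written in Python so the behaviour is explicit. It copies an image block by block and zero-fills any block that raises a read error, which is roughly what dd's conv=noerror,sync does. The function name rescue_copy, the 64 KB block size and the file paths are assumptions for illustration; none of them come from this thread.

```python
import os


def rescue_copy(src_path, dst_path, block_size=64 * 1024):
    """Copy src_path to dst_path, zero-filling blocks that cannot be read.

    Returns the number of blocks whose read raised an error.
    Assumes src_path is a regular image file whose size os.path.getsize()
    can report (a raw device node would need a different size lookup).
    """
    bad_blocks = 0
    size = os.path.getsize(src_path)
    fd = os.open(src_path, os.O_RDONLY)
    try:
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < size:
                want = min(block_size, size - offset)
                try:
                    chunk = os.pread(fd, want, offset)
                except OSError:
                    # Unreadable block: keep going and substitute zeros.
                    chunk = b""
                    bad_blocks += 1
                # Pad failed or short reads so offsets stay aligned.
                if len(chunk) < want:
                    chunk += b"\0" * (want - len(chunk))
                dst.write(chunk)
                offset += want
    finally:
        os.close(fd)
    return bad_blocks
```

For example, rescue_copy("disk2.img", "disk2-copy.img") would return how many blocks had to be zero-filled; both file names are hypothetical.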

Gregg Wonderly

On 6/16/2012 2:02 AM, Scott Aitken wrote:
> On Sat, Jun 16, 2012 at 08:54:05AM +0200, Stefan Ring wrote:
>>> when you say remove the device, I assume you mean simply make it unavailable
>>> for import (I can't remove it from the vdev).
>> Yes, that's what I meant.
>>
>>> root@openindiana-01:/mnt# zpool import -d /dev/lofi
>>>  pool: ZP-8T-RZ1-01
>>>    id: 9952605666247778346
>>>  state: FAULTED
>>> status: One or more devices are missing from the system.
>>> action: The pool cannot be imported. Attach the missing
>>>        devices and try again.
>>>   see: http://www.sun.com/msg/ZFS-8000-3C
>>> config:
>>>
>>>        ZP-8T-RZ1-01              FAULTED  corrupted data
>>>          raidz1-0                DEGRADED
>>>            12339070507640025002  UNAVAIL  cannot open
>>>            /dev/lofi/5           ONLINE
>>>            /dev/lofi/4           ONLINE

Scott Aitken | 16 Jun 2012 16:49

Re: Recovery of RAIDZ with broken label(s)

On Sat, Jun 16, 2012 at 09:09:53AM -0500, Gregg Wonderly wrote:
> Use 'dd' to replicate as much of lofi/2 as you can onto another device, and then 
> cable that into place?
> 
> It looks like you just need to put a functioning, working, but not correct 
> device, in that slot so that it will import and then you can 'zpool replace' the 
> new disk into the pool perhaps?
> 
> Gregg Wonderly
> 
> On 6/16/2012 2:02 AM, Scott Aitken wrote:
> > On Sat, Jun 16, 2012 at 08:54:05AM +0200, Stefan Ring wrote:
> >>> when you say remove the device, I assume you mean simply make it unavailable
> >>> for import (I can't remove it from the vdev).
> >> Yes, that's what I meant.
> >>
> >>> root@openindiana-01:/mnt# zpool import -d /dev/lofi
> >>>  pool: ZP-8T-RZ1-01
> >>>    id: 9952605666247778346
> >>>  state: FAULTED
> >>> status: One or more devices are missing from the system.
> >>> action: The pool cannot be imported. Attach the missing
> >>>        devices and try again.
> >>>   see: http://www.sun.com/msg/ZFS-8000-3C
> >>> config:
> >>>
> >>>        ZP-8T-RZ1-01              FAULTED  corrupted data
> >>>          raidz1-0                DEGRADED
> >>>            12339070507640025002  UNAVAIL  cannot open
> >>>            /dev/lofi/5           ONLINE

Gregg Wonderly | 16 Jun 2012 16:58

Re: Recovery of RAIDZ with broken label(s)


On Jun 16, 2012, at 9:49 AM, Scott Aitken wrote:

> On Sat, Jun 16, 2012 at 09:09:53AM -0500, Gregg Wonderly wrote:
>> Use 'dd' to replicate as much of lofi/2 as you can onto another device, and then 
>> cable that into place?
>> 
>> It looks like you just need to put a functioning, working, but not correct 
>> device, in that slot so that it will import and then you can 'zpool replace' the 
>> new disk into the pool perhaps?
>> 
>> Gregg Wonderly
>> 
>> On 6/16/2012 2:02 AM, Scott Aitken wrote:
>>> On Sat, Jun 16, 2012 at 08:54:05AM +0200, Stefan Ring wrote:
>>>>> when you say remove the device, I assume you mean simply make it unavailable
>>>>> for import (I can't remove it from the vdev).
>>>> Yes, that's what I meant.
>>>> 
>>>>> root@openindiana-01:/mnt# zpool import -d /dev/lofi
>>>>>  pool: ZP-8T-RZ1-01
>>>>>    id: 9952605666247778346
>>>>>  state: FAULTED
>>>>> status: One or more devices are missing from the system.
>>>>> action: The pool cannot be imported. Attach the missing
>>>>>        devices and try again.
>>>>>   see: http://www.sun.com/msg/ZFS-8000-3C
>>>>> config:
>>>>> 
>>>>>        ZP-8T-RZ1-01              FAULTED  corrupted data

Scott Aitken | 16 Jun 2012 17:13

Re: Recovery of RAIDZ with broken label(s)

On Sat, Jun 16, 2012 at 09:58:40AM -0500, Gregg Wonderly wrote:
> 
> On Jun 16, 2012, at 9:49 AM, Scott Aitken wrote:
> 
> > On Sat, Jun 16, 2012 at 09:09:53AM -0500, Gregg Wonderly wrote:
> >> Use 'dd' to replicate as much of lofi/2 as you can onto another device, and then 
> >> cable that into place?
> >> 
> >> It looks like you just need to put a functioning, working, but not correct 
> >> device, in that slot so that it will import and then you can 'zpool replace' the 
> >> new disk into the pool perhaps?
> >> 
> >> Gregg Wonderly
> >> 
> >> On 6/16/2012 2:02 AM, Scott Aitken wrote:
> >>> On Sat, Jun 16, 2012 at 08:54:05AM +0200, Stefan Ring wrote:
> >>>>> when you say remove the device, I assume you mean simply make it unavailable
> >>>>> for import (I can't remove it from the vdev).
> >>>> Yes, that's what I meant.
> >>>> 
> >>>>> root@openindiana-01:/mnt# zpool import -d /dev/lofi
> >>>>>  pool: ZP-8T-RZ1-01
> >>>>>    id: 9952605666247778346
> >>>>>  state: FAULTED
> >>>>> status: One or more devices are missing from the system.
> >>>>> action: The pool cannot be imported. Attach the missing
> >>>>>        devices and try again.
> >>>>>   see: http://www.sun.com/msg/ZFS-8000-3C
> >>>>> config:
> >>>>> 

Jim Klimov | 16 Jun 2012 17:46

Re: Recovery of RAIDZ with broken label(s)

2012-06-16 19:13, Scott Aitken wrote:
> Given I am working with images, it's hard to put just anything "in place" of
> lofi/2.  ZFS scans all of the files in the directory for ZFS labels, so just
> replacing lofi/2 with an empty file (for example) just means ZFS skips it,
> which is the same result as deleting lofi/2 altogether.  I did this, but to
> no avail.  ZFS complains about having insufficient replicas.

I've seen your post that a scrub doesn't start.
Did you try replacing the faulted device with an empty one after
the zpool import - i.e. with "zpool replace ZP-8T-RZ1-01 lofi/2 lofi/6"?

Also, maybe I missed it - did you run "zdb -l" on your pool components
(the 5 lofi devices) to inspect and compare their ZFS labels?
There should be 4 per device, including the list of device
GUIDs for this TLVDEV and a "last known up-to-date TXG number"
for this disk, and you can see if they differ a lot as you
suspect, or if they don't.
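
One way to do the comparison suggested above is to save the "zdb -l" output of each device to a text file and pull out the txg values. The sketch below assumes each label section contains a line of the form "txg: N" (nested keys such as create_txg are deliberately not matched). The function names label_txgs and txg_spread are made up for illustration.

```python
import re

# Matches a line that consists only of "txg: <number>", ignoring indentation.
# Keys like "create_txg:" do not match because the line must start with "txg".
TXG_LINE = re.compile(r"^\s*txg:\s*(\d+)\s*$", re.MULTILINE)


def label_txgs(zdb_output):
    """Return every top-level txg value found in one device's saved output."""
    return [int(value) for value in TXG_LINE.findall(zdb_output)]


def txg_spread(outputs):
    """Given {device_name: saved_output}, return newest txg minus oldest txg.

    Only the highest txg seen on each device is considered.
    """
    newest_per_device = []
    for text in outputs.values():
        txgs = label_txgs(text)
        if txgs:
            newest_per_device.append(max(txgs))
    if not newest_per_device:
        raise ValueError("no txg lines found in any output")
    return max(newest_per_device) - min(newest_per_device)
```

A spread of zero would mean all devices agree on the last written transaction group; a large spread would support the theory that the devices hold data from different points in time.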

When crafting an empty replacement lofi device, you can also
try to clone or forge a ZFS label for that disk, by dd'ing
the labels from lofi/2 to new empty lofi/6. There are 4 labels
each sized 256KB, two at the head of the drive (0..512KB) and
two at the very end (SZ-512KB..SZ).
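
The label layout described above can be sketched as follows. The code computes the four label offsets for an image of a given size and copies just those regions from one image file to another. It assumes plain image files of equal size, and it assumes the trailing labels sit at the end of the size rounded down to a multiple of 256KB; the names label_offsets and copy_labels are hypothetical. Note that cloned labels carry the source device's identity, so this is for experimentation on copies only.

```python
import os

LABEL_SIZE = 256 * 1024  # 256KB per label, as stated in the message


def label_offsets(device_size):
    """Return the byte offsets of the four labels: two at head, two at tail."""
    # Assumption: the usable size is rounded down to a multiple of LABEL_SIZE.
    usable = device_size - (device_size % LABEL_SIZE)
    if usable < 4 * LABEL_SIZE:
        raise ValueError("device too small to hold four labels")
    return [0, LABEL_SIZE, usable - 2 * LABEL_SIZE, usable - LABEL_SIZE]


def copy_labels(src_path, dst_path):
    """Copy the four label regions from src_path into an existing dst_path."""
    size = os.path.getsize(src_path)
    if os.path.getsize(dst_path) != size:
        raise ValueError("source and destination images differ in size")
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        for offset in label_offsets(size):
            src.seek(offset)
            data = src.read(LABEL_SIZE)
            dst.seek(offset)
            dst.write(data)  # everything outside the labels is left untouched
```

For a 2MB image, label_offsets(2 * 1024 * 1024) gives [0, 262144, 1572864, 1835008], matching the 0..512KB and SZ-512KB..SZ ranges described above.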

If you haven't browsed the "zfs on-disk spec", it may be also
helpful (though outdated in regard to current features):
* http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf

HTH,

Gregg Wonderly | 16 Jun 2012 17:38

Re: Recovery of RAIDZ with broken label(s)


On Jun 16, 2012, at 10:13 AM, Scott Aitken wrote:

> On Sat, Jun 16, 2012 at 09:58:40AM -0500, Gregg Wonderly wrote:
>> 
>> On Jun 16, 2012, at 9:49 AM, Scott Aitken wrote:
>> 
>>> On Sat, Jun 16, 2012 at 09:09:53AM -0500, Gregg Wonderly wrote:
>>>> Use 'dd' to replicate as much of lofi/2 as you can onto another device, and then 
>>>> cable that into place?
>>>> 
>>>> It looks like you just need to put a functioning, working, but not correct 
>>>> device, in that slot so that it will import and then you can 'zpool replace' the 
>>>> new disk into the pool perhaps?
>>>> 
>>>> Gregg Wonderly
>>>> 
>>>> On 6/16/2012 2:02 AM, Scott Aitken wrote:
>>>>> On Sat, Jun 16, 2012 at 08:54:05AM +0200, Stefan Ring wrote:
>>>>>>> when you say remove the device, I assume you mean simply make it unavailable
>>>>>>> for import (I can't remove it from the vdev).
>>>>>> Yes, that's what I meant.
>>>>>> 
>>>>>>> root@openindiana-01:/mnt# zpool import -d /dev/lofi
>>>>>>>  pool: ZP-8T-RZ1-01
>>>>>>>    id: 9952605666247778346
>>>>>>>  state: FAULTED
>>>>>>> status: One or more devices are missing from the system.
>>>>>>> action: The pool cannot be imported. Attach the missing
>>>>>>>        devices and try again.


Gmane