Michael Rendell | 18 Dec 2008 02:45

mkfs.ext3 taking long time on raid1 with nda disk

Hi Peter,

  I have noticed that creating a file system on a raid1 device
that uses an enbd device sometimes runs into problems: it looks
like it is getting errors writing to the server and taking a
while to recover.  I am wondering if something is wrong in the
configuration?

The set up is as follows:

/etc/enbd.conf:
    module 0 show_errs=1
    client mDisklessVars /dev/nda nmhd-bs2 1090 -e -n4 -q -t 0

(pwprog is configured but disabled)

cat /proc/nbdinfo
    Device a:       Open
    [a] State:      verify, rw, enabled, validated, show_errs, last error 0, lives 0, bp 0
    [a] Queued:     +0R/0W curr (check 0R/0W) +0R/2W max
    [a] Buffersize: 266240  (sectors=520, blocks=65)
    [a] Blocksize:  4096    (log=12)
    [a] Size:       3387392KB
    [a] Blocks:     846848
    [a] Sockets:    4       (+)     (+)     (*)     (+)
    [a] Requested:  0       (0)     (0)     (0)     (0)     0R/0W   max 0
    [a] Despatched: 0       (0)     (0)     (0)     (0)     0R/0W   md5 0W (0 eq, 0 ne, 0 dn)
    [a] Errored:    0       (0)     (0)     (0)     (0)     0+0

Peter T. Breuer | 18 Dec 2008 11:13

Re: mkfs.ext3 taking long time on raid1 with nda disk

"Also sprach Michael Rendell:"
>   Have noticed that creating a file system on a raid1 device
> that uses an enbd device sometimes runs into problems: looks

Which enbd? You are using enbd 2.4.35 aren't you? That's the stable
version.

I've also just noticed that since about kernel 2.6.12, the kernel's
32 bit jiffies rollover happens 5 minutes after boot! That probably
causes huge instability for the first five minutes in enbd, since
most timeouts will be the other side of int_max from start time. I've
rapidly done a hardening against rollovers myself. 
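
To make the five-minute figure concrete, here is a small sketch in C. It assumes the kernel starts its 32-bit tick counter at -300*HZ (the INITIAL_JIFFIES definition in include/linux/jiffies.h); the helper names initial_jiffies32 and seconds_until_wrap are made up for illustration and are not kernel or enbd functions.

```c
#include <stdint.h>

/* Hypothetical helper: the assumed 32-bit starting value of the tick
 * counter, i.e. -300*HZ reduced modulo 2^32. */
static uint32_t initial_jiffies32(uint32_t hz)
{
    return (uint32_t)0 - 300u * hz;
}

/* Hypothetical helper: seconds of uptime before the 32-bit counter
 * wraps past zero, given the tick rate hz. */
static uint32_t seconds_until_wrap(uint32_t hz)
{
    /* Ticks remaining before the counter reaches 2^32 and wraps to 0. */
    uint32_t ticks_left = (uint32_t)0 - initial_jiffies32(hz);
    return ticks_left / hz;
}
```

Under that assumption the result is 300 seconds for any of the usual HZ settings (100, 250, 1000), so any code that mishandles the wrap gets exercised within the first five minutes of uptime.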

You sound like you're efficient enough to get things working within
five minutes!

> like it is getting errors writing to the server and taking a
> while to recover.  Wondering if something is wrong in the
> configuration?

Nothing can be wrong :).

>  # raid now created; mkfs running...
> 
>     Dec 17 17:18:01 nmhd-bs1 kernel: ENBD #1629[0]: enbd_rollback (0): erroring too old (delay 18s >= timeout 90s) req cb2f9078 id 0x4745ca1f!

Yes, this is a symptom of having crossed the 32 bit jiffies rollover boundary.

It is not normally the case that 18 is greater than 90!
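
For illustration only, this is roughly how a timeout test can misfire across the rollover and how a wrap-safe version avoids it. It is a generic sketch in C, not the actual enbd code or the hardening mentioned above; the function names and the 1000 ticks-per-second rate used in the note below are assumptions.

```c
#include <stdint.h>

/* Naive test (hypothetical): compares absolute tick values. If
 * start + timeout wraps past 2^32, the deadline becomes a small number
 * and every not-yet-wrapped "now" appears to be past it. */
static int deadline_passed_naive(uint32_t now, uint32_t start, uint32_t timeout)
{
    uint32_t deadline = (uint32_t)(start + timeout);
    return now >= deadline;
}

/* Wrap-safe test (hypothetical): unsigned subtraction is modulo 2^32,
 * so now - start is the true elapsed tick count as long as less than
 * 2^32 ticks have really passed. */
static int deadline_passed_safe(uint32_t now, uint32_t start, uint32_t timeout)
{
    uint32_t elapsed = (uint32_t)(now - start);
    return elapsed >= timeout;
}
```

Take a request started 20 seconds before the wrap (start = 4294947296 at 1000 ticks per second) with a 90 second timeout (90000 ticks), checked 18 seconds later (now = 4294965296). The naive test reports the deadline as passed, because the deadline has wrapped to 70000; the wrap-safe test sees 18000 elapsed ticks and reports that it has not. The kernel's own time_after() family of macros is built on the same idea of comparing differences rather than absolute values.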