David Arendt | 3 May 2009 00:55

nilfs_cpfile_delete_checkpoints: cannot delete block

Hi,

Until now nilfs-2.0.12 has been very stable, with no data corruption.
However, on one partition (600 GB) I got the following errors while
running the cleaner:

nilfs_cpfile_delete_checkpoints: cannot delete block
NILFS: GC failed during preparation: cannot delete checkpoints: err=-2

This partition mainly holds large temporary render files (up to
25 GB per file). There are currently 132702 snapshots.

As this partition will not be used during the next few days, I will
leave it in its failed state, so if you would like me to test anything
further, please let me know.

Bye,
David Arendt
Ryusuke Konishi | 3 May 2009 10:08

Re: nilfs_cpfile_delete_checkpoints: cannot delete block

Hi David,

I have reviewed the function in question, but could not find any
likely problems.

Could you try the following patch?

It's applicable to v2.0.12.

I have some pending patches newer than 2.0.12, but they seem to be
unrelated to your problem.

David Arendt | 4 May 2009 06:16

Re: nilfs_cpfile_delete_checkpoints: cannot delete block

Hi,

Last night I had lots of:

nilfs_btree_propagate: key = 67, level == 0

on the partition where cleanerd failed.

An attempt to unmount it resulted in a hang with the following message:

NILFS warning (device sda10): nilfs_segctor_destroy: dirty file(s) after 
the final construction

Bye,
David Arendt


Ryusuke Konishi | 5 May 2009 13:23

Re: nilfs_cpfile_delete_checkpoints: cannot delete block

Hi David,
On Mon, 04 May 2009 06:16:24 +0200, David Arendt wrote:
> Hi,
> 
> Last night I had lots of:
> 
> nilfs_btree_propagate: key = 67, level == 0
> 
> on the partition where cleanerd failed.

This error is related to the GC failure.

Both logs indicate that btree look-up of the 67th block on the
checkpoint file failed.
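
The key in the nilfs_btree_propagate message and the cno in the more
verbose GC message (cno=1407) appear to point at the same block. Below is
a minimal sketch of that arithmetic. The 4096-byte block size, the
192-byte checkpoint entry size, and the header occupying the first entry
slot are all assumptions, since none of them is stated in this thread,
and the function name is hypothetical:

```python
def cpfile_block_offset(cno, block_size=4096, entry_size=192,
                        first_entry_offset=1):
    """Return the checkpoint-file block index that would hold checkpoint `cno`.

    The default sizes are illustrative assumptions, not values from the thread.
    """
    # Number of checkpoint entries that fit in one block (21 with the defaults).
    entries_per_block = block_size // entry_size
    # Checkpoint numbers start at 1; the first slot is assumed to hold the header.
    return (cno + first_entry_offset - 1) // entries_per_block


print(cpfile_block_offset(1407))  # 67
```

Under those assumptions, checkpoint 1407 is the first entry of block 67,
which matches "key = 67" in the log.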

I suspect an inconsistency between the page cache and the btree: the
block was removed from the btree but remained in the page cache.

Could you try the following bugfix patch?

The patch makes sure the dirty state of the page and buffer is cleared
after a block is removed, which should prevent the inconsistency.
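
The suspected failure mode can be pictured with a toy model. This is only
an illustration, not the NILFS code and not the patch itself: the class
and method names are hypothetical, a dict stands in for the btree, and a
second dict stands in for the page cache.

```python
import errno


class ToyMetadataFile:
    """Toy model: a block index (stand-in for the btree) plus cached buffers."""

    def __init__(self):
        self.index = {}  # block offset -> placeholder for the on-disk mapping
        self.cache = {}  # block offset -> {"dirty": bool}

    def write_block(self, blkoff):
        """Map the block in the index and leave a dirty buffer in the cache."""
        self.index[blkoff] = "allocated"
        self.cache[blkoff] = {"dirty": True}

    def delete_block(self, blkoff, clear_dirty):
        """Unmap the block; drop the cached buffer only if clear_dirty is set."""
        del self.index[blkoff]
        if clear_dirty:
            self.cache.pop(blkoff, None)

    def propagate_dirty(self):
        """Look up each dirty cached block in the index; return the failures."""
        return [(blkoff, -errno.ENOENT)
                for blkoff, buf in self.cache.items()
                if buf["dirty"] and blkoff not in self.index]


stale = ToyMetadataFile()
stale.write_block(67)
stale.delete_block(67, clear_dirty=False)
print(stale.propagate_dirty())  # [(67, -2)]

clean = ToyMetadataFile()
clean.write_block(67)
clean.delete_block(67, clear_dirty=True)
print(clean.propagate_dirty())  # []
```

In the first case the dirty buffer outlives its mapping, so the later
look-up fails with -ENOENT (-2), the same error number as in the GC log.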

Thanks in advance,
Ryusuke Konishi
--
diff --git a/fs/btnode.c b/fs/btnode.c
index 5e83c60..11a7305 100644
--- a/fs/btnode.c
+++ b/fs/btnode.c

David Arendt | 3 May 2009 11:26

Re: nilfs_cpfile_delete_checkpoints: cannot delete block

Hi,

I have tried your patch.

The more verbose error message is:

nilfs_cpfile_delete_checkpoints: cannot delete block: cno=1407, range = 
[11, 75990)
NILFS: GC failed during preparation: cannot delete checkpoints: err=-2

Bye,
David Arendt


Ryusuke Konishi | 3 May 2009 11:44

Re: nilfs_cpfile_delete_checkpoints: cannot delete block

Hi!
On Sun, 03 May 2009 11:26:49 +0200, David Arendt wrote:
> Hi,
> 
> I have tried your patch.
> 
> The more verbose error message is:
> 
> nilfs_cpfile_delete_checkpoints: cannot delete block: cno=1407, range = 
> [11, 75990)
> NILFS: GC failed during preparation: cannot delete checkpoints: err=-2

Did you see any DAT warnings?

If not, does the range of checkpoints being deleted
(i.e. 11 through 75989) look right to you?

What does the output of lscp show?

Ryusuke Konishi


David Arendt | 3 May 2009 12:06

Re: nilfs_cpfile_delete_checkpoints: cannot delete block

Hi,

I didn't see any DAT warnings.

Using lscp I see that the first entry is

1428  2009-03-30 02:13:06   cp    -        259      74436

The last entry is

134128  2009-05-03 00:04:28   cp    i      81813        876

If you want the full output of lscp, please tell me and I will send it
to you directly rather than to the mailing list, as the file is 10 MB.
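
For reference, the numbers above can be set against the earlier error
message (cno=1407, range = [11, 75990)). The sketch below only compares
the figures quoted in this thread and does not establish a cause; the
function name and the returned keys are hypothetical.

```python
def describe_failure(failing_cno, delete_range, first_listed, last_listed):
    """Relate the failing checkpoint number to the delete range and lscp bounds."""
    start, end = delete_range  # half-open interval: [start, end)
    return {
        # Is the failing checkpoint one of those the cleaner tried to delete?
        "in_delete_range": start <= failing_cno < end,
        # Does it come before the first checkpoint that lscp still lists?
        "below_first_listed": failing_cno < first_listed,
        # Does the delete range reach into the checkpoints that lscp lists?
        "range_overlaps_listing": start <= last_listed and first_listed < end,
    }


print(describe_failure(1407, (11, 75990), 1428, 134128))
```

With these inputs all three values are True: 1407 lies inside the delete
range but just below 1428, the first checkpoint that lscp reports.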

Bye,
David Arendt


