Andreas Dilger | 13 Jun 2003 09:35

Re: Race with inodes in I_FREEING state

On Jun 13, 2003  15:02 +1000, Neil Brown wrote:
> >   I'm developing a file system for Linux (I'm currently only using the
> > 2.4 tree), and I seem to have hit a small race with the VFS code
> > starting to iget()  an inode while it's being  freed, which is causing
> > my code to panic.
> > 
> >   The race occurs in the following scenario:
> > 
> > 1) prune_icache() is called, and inode $x$ (ino = $z$) is removed from
> >    the inode hash.
> > 
> > 2) dispose_list() is called, but is preempted/scheduled.
> > 
> > 3) Another task  calls iget() for inode $y$ (ino  also = $z$), doesn't
> >    find it in the hash, and reads the inode (read_inode()). 
> > 
> > 4) dispose_list() wakes up, and finally calls FS-specific clear_inode()
> >    operation on inode $x$.
> > 
> >   It _is_ true that $x$ on steps 1 and 4 is a different inode than $y$
> > in step 3. However, my FS  has some hashed/shared data, kept in 'union
> > u', which is  deleted when clear_inode() is called. So,  in the end of
> > step 4, inode $y$ has a broken 'u' field, pointing to deleted memory.
> > 
> >   After looking around in the  archive, I believe this race is similar
> > to the one described here, by Neil Brown:
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=105235852013658&w=2
> > 
> >   Does this not also happen in  version 2.4.20? Can anybody tell me if
> > my logic is  wrong, or if I'm just plain doing  something stupid in my

Livio Baldini Soares | 13 Jun 2003 15:00

Re: Race with inodes in I_FREEING state

  Hey!

Andreas Dilger writes:
> On Jun 13, 2003  15:02 +1000, Neil Brown wrote:
> > > On Friday June 13, livio <at> ime.usp.br wrote:
[...]

> > >   Does this not also happen in  version 2.4.20? Can anybody tell me if
> > > my logic is  wrong, or if I'm just plain doing  something stupid in my
> > > FS?
> > 
> > Yep.  It sounds like the same race.  I wasn't going to submit a 2.4
> > patch until the 2.5 one went in.  I hope to submit the 2.4 equivalent
> > when 2.4.22-pre opens up.

  Ah, _great_! Thanks a lot, Neil.

> Sigh, we've just spent a week chasing exactly this same race in Lustre
> on 2.4.  It also stores pointers to shared data in the inode (DLM locks)
> which are freed when clear_inode() is called.  We fixed it only a few
> hours ago by not matching our hashed locks if they are not for the same
> inode _pointer_ instead of just for the same inode _number_/generation,
> which is what the distributed lock name is.

  Hmm... interesting workaround. Except that in my FS, the hashed data
I have in the inode's private parts has no idea that an inode even
_exists_. Guess I'll have to start keeping a back pointer to the
inode until this is fixed. Darn.
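
  For illustration, here is a minimal userspace sketch of the
back-pointer idea, and of matching on the inode _pointer_ rather than
only the inode _number_. Everything in it is hypothetical: fake_inode,
shared_data, shared_attach(), shared_find() and shared_release() are
made-up stand-ins, not kernel or Lustre interfaces, and the fixed-size
table merely plays the role of the FS's private hash. It assumes each
in-core inode gets its own entry and ignores locking entirely.

```c
/* Userspace sketch only: none of these names are kernel or Lustre APIs. */
#include <assert.h>
#include <stddef.h>

struct fake_inode {                 /* hypothetical stand-in for struct inode */
    unsigned long ino;
};

struct shared_data {                /* hypothetical FS-private hashed data */
    unsigned long ino;              /* key: the inode number */
    struct fake_inode *owner;       /* back pointer to the owning inode */
    int in_use;
};

#define NSLOTS 8
static struct shared_data table[NSLOTS];   /* toy replacement for the hash */

/* Attach an entry to an inode, roughly what read_inode() would trigger. */
static struct shared_data *shared_attach(struct fake_inode *inode)
{
    assert(inode != NULL);
    for (size_t i = 0; i < NSLOTS; i++) {
        if (!table[i].in_use) {
            table[i].ino = inode->ino;
            table[i].owner = inode;
            table[i].in_use = 1;
            return &table[i];
        }
    }
    return NULL;                    /* table full */
}

/* Match on the inode number AND the inode pointer, not the number alone. */
static struct shared_data *shared_find(const struct fake_inode *inode)
{
    for (size_t i = 0; i < NSLOTS; i++) {
        if (table[i].in_use &&
            table[i].ino == inode->ino &&
            table[i].owner == inode)
            return &table[i];
    }
    return NULL;
}

/*
 * Release an entry, roughly what clear_inode() would trigger.
 * Only the entry owned by this exact inode is dropped, so a late
 * release for the old inode cannot free the new inode's data.
 * Returns the number of entries released (0 or 1).
 */
static int shared_release(const struct fake_inode *inode)
{
    struct shared_data *d = shared_find(inode);

    if (d == NULL)
        return 0;
    d->in_use = 0;
    d->owner = NULL;
    return 1;
}
```

  Replaying the race with this sketch: attach data for the old inode,
attach data for a new inode with the same number, then release the old
one late. With a lookup keyed on the number alone, the late release
could hit the new inode's entry; with the pointer check it should not.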

> If only Livio had posted this email last week ;-).