Dieter Stüken | 8 Jan 16:58 2007
Picon

mtab locking

Hi,

after I switched to an SMP system now, I sporadically got a broken /etc/mtab
again. I unanalyzed the actual code and came down to lock_mtab() within
mount/fstab.c. This function was patched/improved/repaired over and over again 
and still today each distribution supplies its own patches to get it fixed.

After some days of analyzing it, I came to the conclusion, that there is a 
principal problem to get the locking work safely with the current concept (or
I do not understand how it is supposed to work, and someone can explain it to me).

The strategy seems to place an F_WRLCK on mtab~ to proceed modifying mtab itself.
If no current ~mtab was found, a new one is created and finally removed again.
Thus the creation/deletion are independent of the flock itself.

I think this is the root of all problems: either the placement of the flock
is the semaphore; in this case the lock file should never be deleted. Or the
creation/existence of the lockfile is the semaphore. In this case the flock
helps to notify waiting mount processes, but it is not the semaphore itself. 

There was a major patch around util-linux-2.9i to improve this, but besides
a lot of very confusing if/else/while(ntry++<5) constructs, it does not solve
the problem in principal. In addition there are a couple of patches around
to handle additional races, but they are making things even more complicated.
This broken concept also went into other packages i.e. mount.nfs. 

For all solutions currently around, the possibility remains, to place
an flock on a foreign mtab~ file, which may get removed by someone else
before the update if mtab itself was finished :-(

(Continue reading)

Karel Zak | 18 Jan 20:14 2007
Picon

Re: mtab locking

On Mon, Jan 08, 2007 at 04:58:10PM +0100, Dieter Stüken wrote:
> 
> after I switched to an SMP system now, I sporadically got a broken /etc/mtab
> again. I unanalyzed the actual code and came down to lock_mtab() within

 More questions come in mind:

 * which distribution are you using?
 * is there any distribution specific patch to lock_mtab() ?
 * is there really only the mount command that lock mtab in your sustem?
   (amd, autofs, mount.nfs, mount.cifs, mount.XXX)

> After some days of analyzing it, I came to the conclusion, that there is a 
> principal problem to get the locking work safely with the current concept (or

 I don't thing so ;-) There is one bug only.

> I do not understand how it is supposed to work, and someone can explain it to me).
> 
> The strategy seems to place an F_WRLCK on mtab~ to proceed modifying mtab itself.
> If no current ~mtab was found, a new one is created and finally removed again.
> Thus the creation/deletion are independent of the flock itself.

 No. The core of the strategy is not F_WRLCK , but the link().

 Step by step the locking:

 1) mount creates unique file mtab~<pid>

 2) mount tries create hardlink from the unique file to generic mtab~
(Continue reading)

Dieter Stüken | 23 Jan 18:02 2007
Picon

Re: mtab locking

The process of creating the mtab~ lock itself and
placing the primary F_SETLK may be improved slightly.

I'll try to implement this and send out a small patch, soon.
Which version should I use to start, util-linux-2.13-pre7?

Karel Zak wrote:
>  Step by step the locking:
> 
>  1) mount creates unique file mtab~<pid>

create the F_SETLK now! This ensures our lock is placed on this inode before
any hard link is created. Keep this open lockfile_fd until we finally finished.
> 
>  2) mount tries create hardlink from the unique file to generic mtab~
> 
>    -- this is important step. You cannot create the link if there is
>    already any other mtab~. The link() syscall is atomic operation.
> 
>  3) 
>     a) the mount can't create the hardlink

open the already exiting mtab~ with a separate temporary fd.

>        -- well, let's use F_SETLKW as a conditional wait on foreign
>        mtab~ (= wait until there is not any other process with locked mtab~))
> 
>        when the mtab~ is unlocked by a concurrent process the mount
>        tries everything __again__ from step 1)

(Continue reading)

Karel Zak | 24 Jan 11:19 2007
Picon

Re: mtab locking

On Tue, Jan 23, 2007 at 06:02:04PM +0100, Dieter Stüken wrote:
> The process of creating the mtab~ lock itself and
> placing the primary F_SETLK may be improved slightly.
> 
> I'll try to implement this and send out a small patch, soon.
> Which version should I use to start, util-linux-2.13-pre7?
> 
> Karel Zak wrote:
> >  Step by step the locking:
> > 
> >  1) mount creates unique file mtab~<pid>
> 
> create the F_SETLK now! This ensures our lock is placed on this inode before
> any hard link is created. Keep this open lockfile_fd until we finally finished.

 You don't understand. Sorry, but I don't want to do any change or
 discuss about a "bug fix" if I don't see any problem with the current
 code. 

 Please, show me a scenario when the current code doesn't work
 correctly. 

    Karel

> > 
> >  2) mount tries create hardlink from the unique file to generic mtab~
> > 
> >    -- this is important step. You cannot create the link if there is
> >    already any other mtab~. The link() syscall is atomic operation.
> > 
(Continue reading)

Ian Kent | 24 Jan 14:25 2007
Picon

Re: mtab locking

On Wed, 2007-01-24 at 11:19 +0100, Karel Zak wrote:
> On Tue, Jan 23, 2007 at 06:02:04PM +0100, Dieter Stüken wrote:
> > The process of creating the mtab~ lock itself and
> > placing the primary F_SETLK may be improved slightly.
> > 
> > I'll try to implement this and send out a small patch, soon.
> > Which version should I use to start, util-linux-2.13-pre7?
> > 
> > Karel Zak wrote:
> > >  Step by step the locking:
> > > 
> > >  1) mount creates unique file mtab~<pid>
> > 
> > create the F_SETLK now! This ensures our lock is placed on this inode before
> > any hard link is created. Keep this open lockfile_fd until we finally finished.
> 
>  You don't understand. Sorry, but I don't want to do any change or
>  discuss about a "bug fix" if I don't see any problem with the current
>  code. 
>  
>  Please, show me a scenario when the current code doesn't work
>  correctly. 

Yes.
That may have been missed.
It's impossible to know if a problem has been resolved if it's not
identified in the beginning. And sometimes additional problems can be
introduced without realizing it.

Did you identify what the actual problem was when you started on this?
(Continue reading)

Dieter Stüken | 24 Jan 17:53 2007
Picon

Re: mtab locking

Karel Zak wrote:
>  You don't understand. Sorry, but I don't want to do any change or
>  discuss about a "bug fix" if I don't see any problem with the current
>  code. 

You are absolutely right, but I don't understand why you are upset.

As I pointed out before, I finally did NOT find the bug within your
code. Instead I found a bug introduced by some stupid SuSE people.

It was not my intend to blame you for any bug. My remaining effort
was just to make the code a bit cleaner, to avoid such bugs in future.

If you think my discussion is counterproductive, please let me know,
and I will withdraw any further postings and release this list.

>  Please, show me a scenario when the current code doesn't work
>  correctly. 

So I will try again to explain the scenario which makes me headache:

  i = open (linktargetfile, O_WRONLY|O_CREAT, 0);
...
  close(i);
...
  j = link(linktargetfile, MOUNTED_LOCK);
...
  fd = open (MOUNTED_LOCK, O_WRONLY);

This is processed by two processes in parallel.
(Continue reading)

Karel Zak | 24 Jan 19:30 2007
Picon

Re: mtab locking

On Wed, Jan 24, 2007 at 05:53:51PM +0100, Dieter Stüken wrote:
> Karel Zak wrote:
> >  You don't understand. Sorry, but I don't want to do any change or
> >  discuss about a "bug fix" if I don't see any problem with the current
> >  code. 
> 
> You are absolutely right, but I don't understand why you are upset.

 Ah.. it's probably my poor English, but I'm really not upset :-) 

> >  Please, show me a scenario when the current code doesn't work
> >  correctly. 
> 
> So I will try again to explain the scenario which makes me headache:
> 
>   i = open (linktargetfile, O_WRONLY|O_CREAT, 0);
> ...
>   close(i);
> ...
>   j = link(linktargetfile, MOUNTED_LOCK);
> ...
>   fd = open (MOUNTED_LOCK, O_WRONLY);
> 
> This is processed by two processes in parallel.
> Only one of them (A) was able to make the link, good!
> The other one (B) failed with j=-1 and has to wait.
> Both open the mtab~ lockfile.
> (A) creates the F_SETLK now and proceeds.
> (B) tries an F_SETLKW and will wait until (A) is ready.
> perfect!
(Continue reading)

Dieter Stüken | 25 Jan 09:03 2007
Picon

Re: mtab locking

Karel Zak wrote:
>> You are absolutely right, but I don't understand why you are upset.
> 
>  Ah.. it's probably my poor English, but I'm really not upset :-)

But this is my problem, too :-)

>  (You can theoretically see unsuccessful end of process B, because it
>  doesn't wait on F_SETLKW -- but it's really only theory, because
>  there is also sleep() for process B and it should be enough.)

before it's (B) first attempt to make its F_SETLKW to admit
(A) making the primary F_SETLK? This might hep a bit....
But the problem is not just theoretical. On my SMP system
I observe a substantial probability for (A) abd (B) to act
in the unexpected order.

>  The bad thing is that every change to this lock code we have to also
>  make to many others utils like am-utils, mount.nfs, ...

Absolutely! This is my motivation to take a change to clean up
the code before it distributes into other packages.

>  or you can use 2.13-pre7 which is same like version in the GIT
>  repository.

I'll try that.

Dieter.

(Continue reading)

Dieter Stüken | 25 Jan 15:10 2007
Picon

Re: mtab locking

Karel Zak wrote:
>  (You can theoretically see unsuccessful end of process B, because it
>  doesn't wait on F_SETLKW -- but it's really only theory, because
>  there is also sleep() for process B and it should be enough.)

I performed some tests of 1000000 turns and up to 10 concurrent
processes. On a single processor system the SETLK problem occurred
about 20:1000000 times, on my SMP system the "fault" rate raised to
400:1000000. If the locking took place via NFS it happened with 30:100
which is about 30%!

>  It means we have to be sure that we really need it. Maybe we can wait
>  (1 year?) for libmount.

There is no need to change it urgently, as it seems not to cause
serious problems currently (except if you fall victim to SuSE patches :-)
But it may be a good starting point towards a libmount.

OK, here comes an alternate patch again.

This time I tried to focus solely on the SETLK problem.
But even so, a lot of lines are affected, so it might be
unclear what it does by just analyzing the patch.
It currently bases on util-linux-2.13-pre7. 

Summary:

The general concept of locking with a link() and 
using a SETLK for synchronizing remains exactly the same. 

(Continue reading)

Karel Zak | 8 Jan 22:31 2007
Picon

Re: mtab locking

On Mon, Jan 08, 2007 at 04:58:10PM +0100, Dieter Stüken wrote:
> Hi,
> 
> after I switched to an SMP system now, I sporadically got a broken /etc/mtab

 Yes, I know about this problem (mostly on systems with heavy load).

 Some people from Red Had worked on this issue two years ago. Now I
 don't have any feedback about any problem (it doesn't mean that RH
 patch is perfect :-).

 This issue is definitely somethig what should be fixed in a next
 release. But don't forget that the mount command is not alone, there
 is more utils that modify /etc/mtab.

 There is also other issue -- some people look for way how support
 read-only root fs and how move the /etc/mtab~ lock file to /var/lock.

> again. I unanalyzed the actual code and came down to lock_mtab() within
> mount/fstab.c. This function was patched/improved/repaired over and over again 
> and still today each distribution supplies its own patches to get it fixed.

 Red Hat patch:

  http://people.redhat.com/kzak/util-linux/util-linux-2.12p-mtab-lock.patch

 Suse patch:

  http://people.redhat.com/kzak/util-linux/util-linux-2.12h-mtab-lock-suse.diff

(Continue reading)

Dieter Stüken | 10 Jan 18:57 2007
Picon

Re: mtab locking

After some attempt to repair the current code with minimal changes
I started over with my chainsaw and cut lock_mtab() into peaces.
There are too many aspects tried to be handled simultaneously
with the current lock_mtab() implementation. This seems extremely
error prone.

So I tried to split it up into separate small procedures. Each of them
fits on a single screen and performs an obvious task. This should help
to avoid unexpected side effects. I also tried to minimize the communication
via static variables. Instead of publishing a long patch, I directly post a
cut from my current code to discuss, if it is usable this way.

Dieter.

/* Updating mtab ----------------------------------------------*/

/* This should go to path.h. Unfortunately I can't see how to 
 * construct it from MOUNTED, to get it to the same directory
 */
static const char* MOUNTLOCK_LINKTARGET = ".mtab.lock";

/* Record created lockfile name and open file handle */
static const char* lock_file_created = NULL;
static int lock_file_desc = -1;

/* Ensure that the lock is released if we are interrupted.  */
extern char *strsignal(int sig);	/* not always in <string.h> */

static void
handler (int sig) {
(Continue reading)

Ian Kent | 11 Jan 05:39 2007
Picon

Re: mtab locking

On Wed, 2007-01-10 at 18:57 +0100, Dieter Stüken wrote:
> After some attempt to repair the current code with minimal changes
> I started over with my chainsaw and cut lock_mtab() into peaces.
> There are too many aspects tried to be handled simultaneously
> with the current lock_mtab() implementation. This seems extremely
> error prone.
> 
> So I tried to split it up into separate small procedures. Each of them
> fits on a single screen and performs an obvious task. This should help
> to avoid unexpected side effects. I also tried to minimize the communication
> via static variables. Instead of publishing a long patch, I directly post a
> cut from my current code to discuss, if it is usable this way.
> 
> Dieter.

I'll look closer at it when I get a chance but, after a quick look,
there's a couple of thing to consider. See below.

I don't think this will work on an NFS mounted filesystem. See comments
in open(2) regarding O_EXCL.

> 
> /* Updating mtab ----------------------------------------------*/
> 
> /* This should go to path.h. Unfortunately I can't see how to 
>  * construct it from MOUNTED, to get it to the same directory
>  */
> static const char* MOUNTLOCK_LINKTARGET = ".mtab.lock";

This needs to be unique, so maybe add the pid.
(Continue reading)

Ian Kent | 11 Jan 05:43 2007
Picon

Re: mtab locking

On Wed, 2007-01-10 at 18:57 +0100, Dieter Stüken wrote:
> After some attempt to repair the current code with minimal changes
> I started over with my chainsaw and cut lock_mtab() into peaces.
> There are too many aspects tried to be handled simultaneously
> with the current lock_mtab() implementation. This seems extremely
> error prone.
> 
> So I tried to split it up into separate small procedures. Each of them
> fits on a single screen and performs an obvious task. This should help
> to avoid unexpected side effects. I also tried to minimize the communication
> via static variables. Instead of publishing a long patch, I directly post a
> cut from my current code to discuss, if it is usable this way.

I really like to see patches because it gives context about the existing
implementation.

But for now the code will be good enough to start things going.

Ian

-
To unsubscribe from this list: send the line "unsubscribe util-linux-ng" in
the body of a message to majordomo@...
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dieter Stüken | 11 Jan 13:13 2007
Picon

Re: mtab locking

Ian Kent wrote:
> I really like to see patches because it gives context about the existing
> implementation.

Until now this peace of code is part of a tiny testing program, only.
If this solution will some day be developed towards a separate
libmount it may be a good idea to even separate it from the current
mount tool context....

I'm currently running some torture tests on my 64-bit SMP 
machine, including the NFS case, too. Running 10 concurrent
tasks trying to perform the locking 10000 times seems to work well
without any problems so far. My next step will be to build a new
mount command using this code and install it on my server. If this
works over the weekend without problems, I will send out a patch.

> I don't think this will work on an NFS mounted filesystem. See comments
> in open(2) regarding O_EXCL.
...
>> /* Updating mtab ----------------------------------------------*/
>> > 
>> > /* This should go to path.h. Unfortunately I can't see how to 
>> >  * construct it from MOUNTED, to get it to the same directory
>> >  */
>> > static const char* MOUNTLOCK_LINKTARGET = ".mtab.lock";
> 
> This needs to be unique, so maybe add the pid.
...
> > 	int fd = open(MOUNTLOCK_LINKTARGET, O_WRONLY|O_CREAT, 0600);

(Continue reading)

Dieter Stüken | 15 Jan 15:06 2007
Picon

Re: mtab locking

Dieter Stüken wrote:
> Ian Kent wrote:
>> I really like to see patches because it gives context about the existing
>> implementation.

here comes the patch against my util-linux-2.13-pre7.
This seems to fix the current mount utility.
But I don't know, how this will solve mtab problems
with external tools like mount.nfs. It should all go
towards a more general solution like a common library.... 

Dieter.

--- fstab.c.ORIG	2005-10-15 00:10:56.000000000 +0200
+++ fstab.c	2007-01-15 14:17:09.000000000 +0100
 <at>  <at>  -388,15 +388,18  <at>  <at> 
 	return NULL;
 }

 /* Updating mtab ----------------------------------------------*/

-/* Flag for already existing lock file. */
-static int we_created_lockfile = 0;
-
-/* Flag to indicate that signals have been set up. */
-static int signals_have_been_setup = 0;
+/* This should go to path.h. Unfortunately I can't see how to 
+ * construct it from MOUNTED, to get it to the same directory
+ */
+static const char* MOUNTLOCK_LINKTARGET = "/etc/.mtab.lock";
(Continue reading)

Ian Kent | 9 Jan 04:31 2007
Picon

Re: mtab locking

On Mon, 2007-01-08 at 16:58 +0100, Dieter Stüken wrote:
> Hi,
> 
> after I switched to an SMP system now, I sporadically got a broken /etc/mtab
> again. I unanalyzed the actual code and came down to lock_mtab() within
> mount/fstab.c. This function was patched/improved/repaired over and over again 
> and still today each distribution supplies its own patches to get it fixed.
> 
> After some days of analyzing it, I came to the conclusion, that there is a 
> principal problem to get the locking work safely with the current concept (or
> I do not understand how it is supposed to work, and someone can explain it to me).

It's been a long time since I looked at this code.
Last time I thought my hair was going to fall out after I tried to work
out what was going on.

> 
> The strategy seems to place an F_WRLCK on mtab~ to proceed modifying mtab itself.
> If no current ~mtab was found, a new one is created and finally removed again.
> Thus the creation/deletion are independent of the flock itself.
> 
> I think this is the root of all problems: either the placement of the flock
> is the semaphore; in this case the lock file should never be deleted. Or the
> creation/existence of the lockfile is the semaphore. In this case the flock
> helps to notify waiting mount processes, but it is not the semaphore itself. 
> 
> There was a major patch around util-linux-2.9i to improve this, but besides
> a lot of very confusing if/else/while(ntry++<5) constructs, it does not solve
> the problem in principal. In addition there are a couple of patches around
> to handle additional races, but they are making things even more complicated.
(Continue reading)


Gmane