andros | 12 Jun 2012 16:39

[PATCH 0/1] NFSv4.1 Fix umount when filelayout DS is also the MDS

From: Andy Adamson <andros@...>

At the Bakeathon, Jorge noted that when the MDS and DS use the same struct
nfs_client, umount would fail to free the nfs_client struct and the
keep-alive SEQUENCE compound would continue ad infinitum.

This happened because the DS reference to the MDS nfs_client prevented the
umount from dropping the cl_count to zero. Moreover, the DS reference is
dropped only by the deviceid dereference, which is called only by
nfs4_deviceid_purge_client from nfs_free_client, that is, only once the
nfs_client cl_count is already zero.
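
The cycle described above can be sketched with a small userspace model. This
is only an illustration, not kernel code: struct old_client, old_put_client
and old_free_client are hypothetical names, and the single ds_ref_held flag
stands in for the deviceid's reference to the same nfs_client.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the pre-patch behaviour. */
struct old_client {
	int  cl_count;     /* one reference per user of the client */
	bool ds_ref_held;  /* a deviceid still references this client as a DS */
	bool freed;
};

/* Stands in for nfs_free_client: the only place the deviceid purge
 * (and so the drop of the DS reference) is reached in this model. */
void old_free_client(struct old_client *clp)
{
	clp->ds_ref_held = false;
	clp->freed = true;
}

/* Drop one reference; the free path runs only when the count hits zero. */
void old_put_client(struct old_client *clp)
{
	assert(clp->cl_count > 0);
	if (--clp->cl_count == 0)
		old_free_client(clp);
}
```

Starting from cl_count == 2 (one reference for the mount, one for the DS
role), the single put done at umount leaves the count at 1, so the free path
that would release the DS reference is never reached.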

See the patch comments for the solution description.

I've tested this solution against a 2-node C-mode filer where one node is an
MDS/DS and the other node is a standalone MDS that uses the first node as a DS.

I tested all combinations, including mounting the MDS/DS node and using pNFS
for I/O, then mounting the solo MDS node and using pNFS for I/O (which uses
the MDS/DS node as a DS), then umounting one of the mount points, doing pNFS
I/O, and then umounting the other.

In all cases, umount destroyed the appropriate nfs_client, and associated
session.

Andy Adamson (1):
  NFSv4.1 Fix umount when filelayout DS is also the MDS

 fs/nfs/client.c            |   71 ++++++++++++++++++++++++++++++++++++++++++--
 fs/nfs/internal.h          |    1 +

andros | 12 Jun 2012 16:39

[PATCH 1/1] NFSv4.1 Fix umount when filelayout DS is also the MDS

From: Andy Adamson <andros@...>

Add a secondary creation count to struct nfs_client to handle the corner case
when a file layout data server is also the mounted MDS, and shares the
cl_session. In this case, a umount of the MDS should not destroy the nfs_client
as it could still be in use as a DS (only) for another deviceid/MDS.

Currently there is a 'chicken and egg' issue when the DS is also the mounted
MDS. The nfs_match_client() reference from nfs4_set_ds_client bumps the
cl_count, the nfs_client is not freed at umount, and nfs4_deviceid_purge_client
is not called to dereference the MDS usage of a deviceid which holds a
reference to the DS nfs_client. The result is that the umount program returns,
but the nfs_client is not freed and the cl_session heartbeat continues.

The MDS (and all other nfs mounts) lose their last nfs_client reference in
nfs_free_server when the last nfs_server (fsid) is umounted.
A file layout DS loses its last nfs_client reference in destroy_ds, which is
called when the last deviceid referencing the data server is put. This is
triggered by a call to nfs4_deviceid_purge_client, which removes references
to a pNFS deviceid used by an MDS mount.

The new cl_ds_count is an additional 'creation' reference for a file layout
data server struct nfs_client. When an nfs_client is a DS,
the cl_ds_count is incremented from 0 to 1, and it is decremented from 1 to 0
in destroy_ds, which is called on the last deviceid reference to the data
server.

Both the cl_count and the cl_ds_count must be zero to free the nfs_client.
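
As a rough illustration of this rule, the following userspace sketch models
the two counters. Only the names cl_count and cl_ds_count come from the
description above; struct model_client and the model_* helpers are
hypothetical, and the real patch may order or lock these steps differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the two-counter rule. */
struct model_client {
	int  cl_count;     /* references held by mounts (nfs_server) */
	int  cl_ds_count;  /* 0 or 1: 'creation' reference for the DS role */
	bool freed;
};

/* Free only when both counts have reached zero. */
void model_try_free(struct model_client *clp)
{
	if (clp->cl_count == 0 && clp->cl_ds_count == 0)
		clp->freed = true;
}

void model_mount(struct model_client *clp)
{
	clp->cl_count++;
}

void model_umount(struct model_client *clp)
{
	assert(clp->cl_count > 0);
	clp->cl_count--;
	model_try_free(clp);
}

/* The client starts acting as a DS: 0 -> 1. */
void model_ds_get(struct model_client *clp)
{
	assert(clp->cl_ds_count == 0);
	clp->cl_ds_count = 1;
}

/* Stands in for destroy_ds on the last deviceid reference: 1 -> 0. */
void model_ds_put(struct model_client *clp)
{
	assert(clp->cl_ds_count == 1);
	clp->cl_ds_count = 0;
	model_try_free(clp);
}
```

In this model, a umount while the client is still a DS leaves it allocated,
and the later model_ds_put frees it; the reverse order works the same way.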

With the cl_ds_count, when the DS is also an MDS, the nfs_match_client
reference from nfs4_set_ds_client triggered by the DS finding the existing