david.noveck | 11 Jun 2012 20:16

Re: RFC 5661 LAYOUTRETURN clarification.

Tweet version: "Andy's right."

Some details below:

> I contend that sending the LAYOUTRETURN in this error case does not
> violate the two sections of RFC 5661 below, as the client has stopped
> sending any I/O requests using the returned layout.

That takes care of 13.6 only.  That says you "MUST NOT send an IO"
and you aren't.

As regards 18.44.3, things are more "interesting".

> Others contend that since the in-flight RPCs reference the returned
> layout, the client is still 'using' the layout with these in-flight
> requests, and can not call LAYOUTRETURN until all in-flight RPCs
> return, with or without an error.

They are right that there is no assurance that the layout is
not being used.  Whether the 'client' is using it is a more
metaphysical question and that's trouble.

Luckily, we're saved by the final infinitive phrase in "MUST NOT use the
returned layout(s) and the associated storage protocol to access the
file data".  If he is using it, he is not using it to access file data.

In Andy's example the client has already made the transition to using the 
MDS to access the file data.


Boaz Harrosh | 11 Jun 2012 20:40

Re: RFC 5661 LAYOUTRETURN clarification.

On 06/11/2012 07:01 PM, Andy Adamson wrote:

> I'm coding file layout data server recovery for the Linux NFS client,
> and came across an issue with LAYOUTRETURN that
> could use some comment from the list.
> 
> The error case I'm handling is an RPC layer dis-connection error
> during heavy WRITE i/o to a file layout data server. Our response is
> to internally mark the deviceid as invalid which prevents all pNFS
> calls using the deviceid - e.g. no new I/O using any layout that uses
> the invalid deviceid, and to redirect all I/O to the MDS (any queued
> RPC request that has not been sent is redirected to the MDS).
> 
> Plus - and here is where the clarification is needed - we immediately
> send a LAYOUTRETURN for any layout with in-flight requests to the
> dis-connected data server.  By in-flight I mean transmitted WRT the
> RPC layer.  The purpose of this LAYOUTRETURN is to notify the file
> layout MDS to fence the DS for the specified LAYOUTs, as the WRITEs
> will also be sent to the MDS.
> 
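
The recovery steps Andy describes above can be sketched roughly as follows. This is only an illustrative model, not the Linux client code: the `Layout` and `Client` classes, their fields, and `on_ds_disconnect` are all hypothetical names, and "sending" a LAYOUTRETURN is modelled by appending to a list.

```python
from dataclasses import dataclass, field


@dataclass
class Layout:
    layout_id: str
    deviceid: str
    in_flight: int = 0      # RPCs already transmitted to the DS (hypothetical counter)
    returned: bool = False  # True once LAYOUTRETURN has been sent for this layout


@dataclass
class Client:
    layouts: list
    invalid_deviceids: set = field(default_factory=set)
    queued: list = field(default_factory=list)         # (layout_id, payload) not yet sent
    mds_io: list = field(default_factory=list)         # payloads redirected to the MDS
    layoutreturns: list = field(default_factory=list)  # layout ids returned to the MDS

    def on_ds_disconnect(self, deviceid):
        # Step 1: mark the deviceid invalid so no new pNFS I/O uses it.
        self.invalid_deviceids.add(deviceid)
        affected = {l.layout_id for l in self.layouts if l.deviceid == deviceid}

        # Step 2: redirect queued (not yet transmitted) requests to the MDS.
        still_queued = []
        for layout_id, payload in self.queued:
            if layout_id in affected:
                self.mds_io.append(payload)
            else:
                still_queued.append((layout_id, payload))
        self.queued = still_queued

        # Step 3: immediately send LAYOUTRETURN for layouts with in-flight
        # requests, without waiting for those requests to complete.
        for l in self.layouts:
            if l.deviceid == deviceid and l.in_flight > 0 and not l.returned:
                l.returned = True
                self.layoutreturns.append(l.layout_id)

    def may_use_pnfs(self, layout):
        return layout.deviceid not in self.invalid_deviceids and not layout.returned
```

Step 3 is the point under discussion in this thread: the LAYOUTRETURN goes out while requests to the DS may still be outstanding.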

I do not disagree with this completely. The point here is very
fine-grained and should be specified explicitly. I would like to see
text along these lines:

There are 3 types of in-flight RPC/IO:
1. Client has sent RPC header + all of associated data and is waiting
   for DS WRITE/READ_DONE reply.

   (For me this case can be, client may return LAYOUTRETURN as your

Myklebust, Trond | 11 Jun 2012 20:59

Re: RFC 5661 LAYOUTRETURN clarification.

On Mon, 2012-06-11 at 21:40 +0300, Boaz Harrosh wrote:

> And again, please explain why do you want it. What is wrong with the
> case we all agree with? ie: "Client can not call LAYOUTRETURN until
> all in-flight RPCs return, with or without an error"

Who "agreed" to this? This would mean that if the DS goes down, we can't
ever send LAYOUTRETURN which is patently wrong.

--

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust <at> netapp.com
www.netapp.com

_______________________________________________
nfsv4 mailing list
nfsv4 <at> ietf.org
https://www.ietf.org/mailman/listinfo/nfsv4

Boaz Harrosh | 12 Jun 2012 00:18

Re: RFC 5661 LAYOUTRETURN clarification.

On 06/11/2012 09:59 PM, Myklebust, Trond wrote:

> On Mon, 2012-06-11 at 21:40 +0300, Boaz Harrosh wrote:
> 
>> And again, please explain why do you want it. What is wrong with the
>> case we all agree with? ie: "Client can not call LAYOUTRETURN until
>> all in-flight RPCs return, with or without an error"
> 
> Who "agreed" to this? This would mean that if the DS goes down, we can't
> ever send LAYOUTRETURN which is patently wrong.
> 

"DS goes down" falls under the above "RPC returns an error" case; the error
condition of an RPC is well defined.

Which of my words led you to understand that I said
	"we can't ever send a LAYOUTRETURN"?

If my English is poorly worded, which is perfectly possible, please correct
me so I can learn. Did you honestly think that's what I meant?

I meant that we all agree that this case is covered by the RFC. That is, no one
would accuse a client that does this of violating the RFC.

And again my question. The motivation?

Thanks
Boaz

Myklebust, Trond | 12 Jun 2012 00:31

Re: RFC 5661 LAYOUTRETURN clarification.

On Tue, 2012-06-12 at 01:18 +0300, Boaz Harrosh wrote:
> On 06/11/2012 09:59 PM, Myklebust, Trond wrote:
> 
> > On Mon, 2012-06-11 at 21:40 +0300, Boaz Harrosh wrote:
> > 
> >> And again, please explain why do you want it. What is wrong with the
> >> case we all agree with? ie: "Client can not call LAYOUTRETURN until
> >> all in-flight RPCs return, with or without an error"
> > 
> > Who "agreed" to this? This would mean that if the DS goes down, we can't
> > ever send LAYOUTRETURN which is patently wrong.
> > 
> 
> 
> "DS goes down" is under the above "RPC return an error" the error condition
> of an RPC is well defined.

???? Now you have me extremely confused. How does the client distinguish
between ETIMEDOUT-because-DS-went-down and
ETIMEDOUT-but-in-flight-RPCs-will-eventually-succeed-so-please-hold-that-LAYOUTRETURN?

> From what of my words did you understand that I said
> 	"we can't ever send a LAYOUTRETURN"
> 
> If my English is wrongly worded. Which is perfectly possible. Please correct
> me so I can learn. Did you honestly think that's what I meant? 
> 
> I meant we all agree, that this case is covered by RFC. That is  - no one would
> accuse a client who does that, as violating the RFC.
> 

Welch, Brent | 12 Jun 2012 18:34

Re: RFC 5661 LAYOUTRETURN clarification.

This is why the objects layout communicates errors upon layout return.

When a client suffers write timeouts (or any other errors) with a DS, it promptly returns the layout with an
indication of what went wrong.  This should be the intent of whatever words are in the RFC.

--
Brent

-----Original Message-----
From: nfsv4-bounces <at> ietf.org [mailto:nfsv4-bounces <at> ietf.org] On Behalf Of Myklebust, Trond
Sent: Monday, June 11, 2012 11:59 AM
To: Harrosh, Boaz
Cc: NFS list; Adamson, Andy; Andy Adamson; NFSv4
Subject: Re: [nfsv4] RFC 5661 LAYOUTRETURN clarification.

On Mon, 2012-06-11 at 21:40 +0300, Boaz Harrosh wrote:

> And again, please explain why do you want it. What is wrong with the
> case we all agree with? ie: "Client can not call LAYOUTRETURN until
> all in-flight RPCs return, with or without an error"

Who "agreed" to this? This would mean that if the DS goes down, we can't
ever send LAYOUTRETURN which is patently wrong.

--

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust <at> netapp.com

david.noveck | 11 Jun 2012 21:02

Re: RFC 5661 LAYOUTRETURN clarification.

> And again, please explain why do you want it. What is wrong with the
> case we all agree with? ie: "Client can not call LAYOUTRETURN until
> all in-flight RPCs return, with or without an error"

It's a recipe for data corruption.  If, as Andy explained, he starts doing
IO's (let's suppose WRITEs) to the MDS, any lingering WRITEs to the DS,
since they reflect an earlier state of affairs, can cause data corruption.

There are three ways to prevent those lingering DS writes from corrupting 
data:

1) Doing a LAYOUTRETURN
2) waiting until the IO's return.
3) "magically plugging the network interface".

Since there is no way to do 3), saying that you can only do 1) after
2) is done essentially means:

a) that it may take a very long time;
b) that you will only do it when it is no longer useful.

If you do 1) asap, then the lingering DS write problem is gone sooner,
and that's a good thing. 
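
To make the ordering argument concrete, here is a toy simulation, offered only as a sketch. It assumes that the MDS reacts to LAYOUTRETURN by fencing the DS for that layout (the behaviour Andy's scheme relies on); the class names, the shared `storage` dict, and `final_contents` are invented for illustration.

```python
class DataServer:
    """Toy DS: writes to shared storage unless the layout has been fenced."""

    def __init__(self, storage):
        self.storage = storage
        self.fenced_layouts = set()

    def write(self, layout_id, offset, data):
        if layout_id in self.fenced_layouts:
            return False  # lingering WRITE is rejected
        self.storage[offset] = data
        return True


class MetadataServer:
    """Toy MDS: fences the DS on LAYOUTRETURN (an assumption of this sketch)."""

    def __init__(self, storage, ds):
        self.storage = storage
        self.ds = ds

    def layoutreturn(self, layout_id):
        self.ds.fenced_layouts.add(layout_id)

    def write(self, offset, data):
        self.storage[offset] = data


def final_contents(layoutreturn_before_mds_write):
    storage = {}
    ds = DataServer(storage)
    mds = MetadataServer(storage, ds)

    # A WRITE that was transmitted to the DS but is delayed in the network.
    lingering = ("layout-1", 0, b"old")

    if layoutreturn_before_mds_write:
        mds.layoutreturn("layout-1")  # option 1), done as soon as possible

    mds.write(0, b"new")              # client has fallen back to the MDS
    ds.write(*lingering)              # the delayed DS WRITE finally arrives
    return storage[0]
```

In this model `final_contents(True)` leaves the newer data in place, while `final_contents(False)` lets the stale DS write overwrite it.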

-----Original Message-----
From: nfsv4-bounces <at> ietf.org [mailto:nfsv4-bounces <at> ietf.org] On Behalf Of Boaz Harrosh
Sent: Monday, June 11, 2012 2:41 PM
To: Andy Adamson
Cc: Andy Adamson; NFS list; Trond Myklebust; NFSv4
Subject: Re: [nfsv4] RFC 5661 LAYOUTRETURN clarification.

Myklebust, Trond | 11 Jun 2012 21:43

Re: RFC 5661 LAYOUTRETURN clarification.


The _only_ reason why a pNFS files client would ever want to send a
LAYOUTRETURN is in order to have the MDS take action to fence off any
outstanding writes to the DS.

The _only_ case where that is actually an important issue is when
something happens to the DS which forces the client to fall back to
writing through the MDS.

_ALL_ other cases are trivially covered by the existing NFSv4 state
model in that when the client unlocks and/or closes the file, then the
lock/open stateids that are used in the READ and WRITE operations will
be updated, and will cause those operations to be rejected with a
BAD_STATEID error. This fencing model is irrespective of whether or not
a layout is held, and is irrespective of whether the READ/WRITE was sent
to the MDS or the DS.

IOW: if pNFS files servers don't want to do this kind of fencing, then I
suggest we file an erratum that labels the LAYOUTRETURN operation as
mandatory to not implement for those servers.
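
A minimal sketch of the stateid-based fencing described above, under simplifying assumptions: one toy `Server` class stands in for both MDS and DS (the check is the same on either path), stateids are plain strings, and CLOSE is modelled as simply invalidating the stateid. The class and method names are hypothetical.

```python
class Server:
    """Toy server: a WRITE is accepted only if its stateid is still valid."""

    def __init__(self):
        self._counter = 0
        self._valid = set()
        self.data = []

    def open(self):
        self._counter += 1
        stateid = f"open-{self._counter}"
        self._valid.add(stateid)
        return stateid

    def close(self, stateid):
        # After CLOSE the old stateid no longer matches any server state.
        self._valid.discard(stateid)

    def write(self, stateid, payload, via="MDS"):
        # The same check applies whether the WRITE arrived via the MDS or a DS,
        # and whether or not a layout is held.
        if stateid not in self._valid:
            return "NFS4ERR_BAD_STATEID"
        self.data.append((via, payload))
        return "NFS4_OK"
```

For example, a WRITE sent with a stateid obtained from `open()` succeeds, and the same WRITE replayed after `close()` is rejected regardless of the `via` path.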

On Mon, 2012-06-11 at 15:02 -0400, david.noveck <at> emc.com wrote:
> > And again, please explain why do you want it. What is wrong with the
> > case we all agree with? ie: "Client can not call LAYOUTRETURN until
> > all in-flight RPCs return, with or without an error"
> 
> It's a recipe for data corruption.  If, as Andy explained, he starts doing
> IO's (let's suppose WRITEs) to the MDS any lingering WRITEs to the DS
> since they reflect an earlier state of affairs can cause data corruption.
> 

Myklebust, Trond | 11 Jun 2012 22:11

Re: [nfsv4] RFC 5661 write to DS clarification. (was: [nfsv4] RFC 5661 LAYOUTRETURN clarification)

BTW: 

I forgot to add that the fencing issues are also the reason why the
Linux client is unlikely to comply any time soon with the request in
RFC 5661 Section 13.9.1 that we prefer use of the OPEN stateid over the
LOCK stateid when talking to the DS.
If the server revokes the lock, or if the client calls LOCKU, all WRITEs
that were made under that lock need to be fenced off. Unless mandatory
locking is in effect, that won't happen if the WRITE ops were sent using
the OPEN stateid.

This is also why I believe we should revisit the rule that the client
should only send stateids with a zero seqid to the DS.
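
The concern can be illustrated with a toy model, which is a sketch only. It assumes a stateid is an ("other", seqid) pair, that LOCKU bumps the lock stateid's seqid, that a WRITE carrying an out-of-date seqid is rejected, and that a seqid of zero is treated as "whatever is current". `Stateid`, `LockServer`, and `accepts_write` are hypothetical names, and mandatory locking is assumed not to be in effect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stateid:
    other: str   # identifies the open or lock state ("open" / "lock" here)
    seqid: int   # 0 is modelled as "use the current seqid"


class LockServer:
    """Toy server tracking the current seqid of one OPEN and one LOCK stateid."""

    def __init__(self):
        self.current = {"open": 1, "lock": 1}

    def locku(self):
        # Unlocking changes the lock state, so its seqid advances.
        self.current["lock"] += 1
        return Stateid("lock", self.current["lock"])

    def accepts_write(self, stateid):
        if stateid.seqid == 0:
            return True  # a zero seqid cannot be told apart from the current one
        return stateid.seqid == self.current[stateid.other]
```

In this model, after `locku()` a WRITE carrying the pre-unlock LOCK stateid is fenced, but a WRITE carrying the OPEN stateid, or a LOCK stateid with a zero seqid, is still accepted.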

Cheers
  Trond

On Mon, 2012-06-11 at 15:43 -0400, Trond Myklebust wrote:
> The _only_ reason why a pNFS files client would ever want to send a
> LAYOUTRETURN is in order to have the MDS take action to fence off any
> outstanding writes to the DS.
> 
> The _only_ case where that is actually an important issue is when
> something happens to the DS which forces the client to fall back to
> writing through the MDS.
> 
> _ALL_ other cases are trivially covered by the existing NFSv4 state
> model in that when the client unlocks and/or closes the file, then the
> lock/open stateids that are used in the READ and WRITE operations will
> be updated, and will cause those operations to be rejected with a

