David Xu | 1 Aug 2012 04:49
Picon
Favicon

short read/write and error code

POSIX requires write() to return actually bytes written, same rule is
applied to read().

http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
> ETURN VALUE
>
> Upon successful completion, write() [XSI]   and pwrite()  shall
 > return the number of bytes actually written to the file associated
> with fildes. This number shall never be greater than nbyte.
 > Otherwise, -1 shall be returned and errno set to indicate the error.

http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
> RETURN VALUE
>
> Upon successful completion, read() [XSI]   and pread()  shall return
 > a non-negative integer indicating the number of bytes actually read.
 > Otherwise, the functions shall return -1 and set errno to indicate
 > the error.

I have following patch to fix our code to be compatible with POSIX:

Index: sys_generic.c
===================================================================
--- sys_generic.c	(revision 238927)
+++ sys_generic.c	(working copy)
 <at>  <at>  -333,8 +333,7  <at>  <at> 
  #endif
  	cnt = auio->uio_resid;
  	if ((error = fo_read(fp, auio, td->td_ucred, flags, td))) {
-		if (auio->uio_resid != cnt && (error == ERESTART ||
(Continue reading)

Alfred Perlstein | 1 Aug 2012 06:11
Picon
Favicon
Gravatar

Re: short read/write and error code

This is cool, is there a test case that you can run
on Linux/Solaris to compare expected vs actual
behavior?

* David Xu <davidxu <at> freebsd.org> [120731 19:49] wrote:
> POSIX requires write() to return actually bytes written, same rule is
> applied to read().
> 
> http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
> >ETURN VALUE
> >
> >Upon successful completion, write() [XSI]   and pwrite()  shall
> > return the number of bytes actually written to the file associated
> >with fildes. This number shall never be greater than nbyte.
> > Otherwise, -1 shall be returned and errno set to indicate the error.
> 
> 
> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
> >RETURN VALUE
> >
> >Upon successful completion, read() [XSI]   and pread()  shall return
> > a non-negative integer indicating the number of bytes actually read.
> > Otherwise, the functions shall return -1 and set errno to indicate
> > the error.
> 
> I have following patch to fix our code to be compatible with POSIX:
> 
> Index: sys_generic.c
> ===================================================================
> --- sys_generic.c	(revision 238927)
(Continue reading)

David Xu | 1 Aug 2012 07:09
Picon

Re: short read/write and error code

On 2012/8/1 12:11, Alfred Perlstein wrote:
> This is cool, is there a test case that you can run
> on Linux/Solaris to compare expected vs actual
> behavior?

I have written a simple test program:
http://people.freebsd.org/~davidxu/patch/fifopipe/fifotest.c 
<http://people.freebsd.org/%7Edavidxu/patch/fifopipe/fifotest.c>

and results are:
http://people.freebsd.org/~davidxu/patch/fifopipe/typescript.linux.txt
http://people.freebsd.org/~davidxu/patch/fifopipe/typescript.fbsd9.txt
http://people.freebsd.org/~davidxu/patch/fifopipe/typescript.mycur.txt

_______________________________________________
freebsd-arch <at> freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe <at> freebsd.org"

Konstantin Belousov | 1 Aug 2012 09:19
Picon

Re: short read/write and error code

On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
> POSIX requires write() to return actually bytes written, same rule is
> applied to read().
> 
> http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
> >ETURN VALUE
> >
> >Upon successful completion, write() [XSI]   and pwrite()  shall
> > return the number of bytes actually written to the file associated
> >with fildes. This number shall never be greater than nbyte.
> > Otherwise, -1 shall be returned and errno set to indicate the error.
> 
> 
> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
> >RETURN VALUE
> >
> >Upon successful completion, read() [XSI]   and pread()  shall return
> > a non-negative integer indicating the number of bytes actually read.
> > Otherwise, the functions shall return -1 and set errno to indicate
> > the error.
Note that the wording is only about successful return, not for the case
when error occured. I do think that if fo_read() returned an error, and
error is not of the kind 'interruption', then the error shall be returned
as is.

> 
> I have following patch to fix our code to be compatible with POSIX:
...

> -current only resets error code to zero for short write when code is
(Continue reading)

David Xu | 1 Aug 2012 09:59
Picon

Re: short read/write and error code

On 2012/8/1 15:19, Konstantin Belousov wrote:
> On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
>> POSIX requires write() to return actually bytes written, same rule is
>> applied to read().
>>
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
>>> ETURN VALUE
>>>
>>> Upon successful completion, write() [XSI]   and pwrite()  shall
>>> return the number of bytes actually written to the file associated
>>> with fildes. This number shall never be greater than nbyte.
>>> Otherwise, -1 shall be returned and errno set to indicate the error.
>>
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
>>> RETURN VALUE
>>>
>>> Upon successful completion, read() [XSI]   and pread()  shall return
>>> a non-negative integer indicating the number of bytes actually read.
>>> Otherwise, the functions shall return -1 and set errno to indicate
>>> the error.
> Note that the wording is only about successful return, not for the case
> when error occured. I do think that if fo_read() returned an error, and
> error is not of the kind 'interruption', then the error shall be returned
> as is.
I do think data is more important than error code.  Do you think if a 
512 bytes block is bad,
all bytes in the block should be thrown away while you could really get 
some bytes from it,
this might be very important to someone, such as a password or a bank 
account,  this
(Continue reading)

Warner Losh | 1 Aug 2012 16:12

Re: short read/write and error code


On Aug 1, 2012, at 1:59 AM, David Xu wrote:

> On 2012/8/1 15:19, Konstantin Belousov wrote:
>> On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
>>> POSIX requires write() to return actually bytes written, same rule is
>>> applied to read().
>>> 
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
>>>> ETURN VALUE
>>>> 
>>>> Upon successful completion, write() [XSI]   and pwrite()  shall
>>>> return the number of bytes actually written to the file associated
>>>> with fildes. This number shall never be greater than nbyte.
>>>> Otherwise, -1 shall be returned and errno set to indicate the error.
>>> 
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
>>>> RETURN VALUE
>>>> 
>>>> Upon successful completion, read() [XSI]   and pread()  shall return
>>>> a non-negative integer indicating the number of bytes actually read.
>>>> Otherwise, the functions shall return -1 and set errno to indicate
>>>> the error.
>> Note that the wording is only about successful return, not for the case
>> when error occured. I do think that if fo_read() returned an error, and
>> error is not of the kind 'interruption', then the error shall be returned
>> as is.
> I do think data is more important than error code.  Do you think if a 512 bytes block is bad,
> all bytes in the block should be thrown away while you could really get some bytes from it,
> this might be very important to someone, such as a password or a bank account,  this
(Continue reading)

Bruce Evans | 1 Aug 2012 20:04
Picon

Re: short read/write and error code

On Wed, 1 Aug 2012, Warner Losh wrote:

> On Aug 1, 2012, at 1:59 AM, David Xu wrote:
>
>> On 2012/8/1 15:19, Konstantin Belousov wrote:
>>> On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:

Please trim quotes.

>>>> ...[some trimmed]

>>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
>>>>> RETURN VALUE
>>>>>
>>>>> Upon successful completion, read() [XSI]   and pread()  shall return
>>>>> a non-negative integer indicating the number of bytes actually read.
>>>>> Otherwise, the functions shall return -1 and set errno to indicate
>>>>> the error.
>>> Note that the wording is only about successful return, not for the case
>>> when error occured. I do think that if fo_read() returned an error, and
>>> error is not of the kind 'interruption', then the error shall be returned
>>> as is.
>> I do think data is more important than error code.  Do you think if a 512 bytes block is bad,
>> all bytes in the block should be thrown away while you could really get some bytes from it,
>> this might be very important to someone, such as a password or a bank account,  this
>> is just an example, whether filesystem works in this way is irrelevant.
>
> You do know that with disk drives it is an all or nothing sort of thing at the sector level.  Either you get the
whole thing, or you get none of it.  There's no partial sector reads, and there's no way to get the data
generally.  Some drives sometimes allow you to access raw tracks, but those interfaces are never
(Continue reading)

David Xu | 2 Aug 2012 00:02
Picon

Re: short read/write and error code

On 2012/8/1 22:12, Warner Losh wrote:
> On Aug 1, 2012, at 1:59 AM, David Xu wrote:
>
>> On 2012/8/1 15:19, Konstantin Belousov wrote:
>>> On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
>>>> POSIX requires write() to return actually bytes written, same rule is
>>>> applied to read().
>>>>
>>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
>>>>> ETURN VALUE
>>>>>
>>>>> Upon successful completion, write() [XSI]   and pwrite()  shall
>>>>> return the number of bytes actually written to the file associated
>>>>> with fildes. This number shall never be greater than nbyte.
>>>>> Otherwise, -1 shall be returned and errno set to indicate the error.
>>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
>>>>> RETURN VALUE
>>>>>
>>>>> Upon successful completion, read() [XSI]   and pread()  shall return
>>>>> a non-negative integer indicating the number of bytes actually read.
>>>>> Otherwise, the functions shall return -1 and set errno to indicate
>>>>> the error.
>>> Note that the wording is only about successful return, not for the case
>>> when error occured. I do think that if fo_read() returned an error, and
>>> error is not of the kind 'interruption', then the error shall be returned
>>> as is.
>> I do think data is more important than error code.  Do you think if a 512 bytes block is bad,
>> all bytes in the block should be thrown away while you could really get some bytes from it,
>> this might be very important to someone, such as a password or a bank account,  this
>> is just an example, whether filesystem works in this way is irrelevant.
(Continue reading)

Bruce Evans | 2 Aug 2012 07:53
Picon

Re: short read/write and error code

On Thu, 2 Aug 2012, David Xu wrote:

> On 2012/8/1 22:12, Warner Losh wrote:
>> On Aug 1, 2012, at 1:59 AM, David Xu wrote:

Please trim quotes!

>[>>>>...] trimmed

>> You do know that with disk drives it is an all or nothing sort of thing at 
>> the sector level.  Either you get the whole thing, or you get none of it. 
>> There's no partial sector reads, and there's no way to get the data 
>> generally.  Some drives sometimes allow you to access raw tracks, but those 
>> interfaces are never connected to read, but usually an ioctl that issues 
>> the special command and returns the results.  And even then, it returns 
>> everything (perhaps including the ECC bytes)
> Sorry, my example is not precise, see blow.

> Unfortunately, the dofileread and dofilewrite are very high level API, it 
> does not device,
> it is file oriented API, not device oriented. Let me interpret what's wrong 
> in their code.
> dofileread requests fo_read to read back data into user space buffer, the 
> user space buffer
> can be very large.  fo_read is an intermediate layer, assume it supports 
> large buffer size,
> for example, it is file system's  interface to read data, or it can be some 
> intermediate code
> which also supports very large buffer size until max-value of SSIZE_T,  at 
> lowest level, they
(Continue reading)

Bruce Evans | 1 Aug 2012 11:23
Picon

Re: short read/write and error code

On Wed, 1 Aug 2012, Konstantin Belousov wrote:

> On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
>> POSIX requires write() to return actually bytes written, same rule is
>> applied to read().
>>
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
>>> ETURN VALUE
>>>
>>> Upon successful completion, write() [XSI]   and pwrite()  shall
>>> return the number of bytes actually written to the file associated
>>> with fildes. This number shall never be greater than nbyte.
>>> Otherwise, -1 shall be returned and errno set to indicate the error.
>>
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
>>> RETURN VALUE
>>>
>>> Upon successful completion, read() [XSI]   and pread()  shall return
>>> a non-negative integer indicating the number of bytes actually read.
>>> Otherwise, the functions shall return -1 and set errno to indicate
>>> the error.
> Note that the wording is only about successful return, not for the case
> when error occured. I do think that if fo_read() returned an error, and
> error is not of the kind 'interruption', then the error shall be returned
> as is.

That is clearly not what is intended.  write() is unusable if it won't
tell you how many bytes it wrote.  According to your interpretation,
recalcitrantix would conform to POSIX if all it writes wrote whatever
they could and then returned -1 after detecting the error EPOSIXFUZZY.
(Continue reading)

David Xu | 1 Aug 2012 11:57
Picon

Re: short read/write and error code

On 2012/8/1 17:23, Bruce Evans wrote:
> On Wed, 1 Aug 2012, Konstantin Belousov wrote:
>
>> On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
>>> POSIX requires write() to return actually bytes written, same rule is
>>> applied to read().
>>>
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
>>>> ETURN VALUE
>>>>
>>>> Upon successful completion, write() [XSI]   and pwrite() shall
>>>> return the number of bytes actually written to the file associated
>>>> with fildes. This number shall never be greater than nbyte.
>>>> Otherwise, -1 shall be returned and errno set to indicate the error.
>>>
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
>>>> RETURN VALUE
>>>>
>>>> Upon successful completion, read() [XSI]   and pread() shall return
>>>> a non-negative integer indicating the number of bytes actually read.
>>>> Otherwise, the functions shall return -1 and set errno to indicate
>>>> the error.
>> Note that the wording is only about successful return, not for the case
>> when error occured. I do think that if fo_read() returned an error, and
>> error is not of the kind 'interruption', then the error shall be 
>> returned
>> as is.
>
> That is clearly not what is intended.  write() is unusable if it won't
> tell you how many bytes it wrote.  According to your interpretation,
(Continue reading)

Konstantin Belousov | 1 Aug 2012 18:28
Picon

Re: short read/write and error code

On Wed, Aug 01, 2012 at 07:23:09PM +1000, Bruce Evans wrote:
> On Wed, 1 Aug 2012, Konstantin Belousov wrote:
> 
> >On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
> >>POSIX requires write() to return actually bytes written, same rule is
> >>applied to read().
> >>
> >>http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
> >>>ETURN VALUE
> >>>
> >>>Upon successful completion, write() [XSI]   and pwrite()  shall
> >>>return the number of bytes actually written to the file associated
> >>>with fildes. This number shall never be greater than nbyte.
> >>>Otherwise, -1 shall be returned and errno set to indicate the error.
> >>
> >>http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
> >>>RETURN VALUE
> >>>
> >>>Upon successful completion, read() [XSI]   and pread()  shall return
> >>>a non-negative integer indicating the number of bytes actually read.
> >>>Otherwise, the functions shall return -1 and set errno to indicate
> >>>the error.
> >Note that the wording is only about successful return, not for the case
> >when error occured. I do think that if fo_read() returned an error, and
> >error is not of the kind 'interruption', then the error shall be returned
> >as is.
> 
> That is clearly not what is intended.  write() is unusable if it won't
> tell you how many bytes it wrote.  According to your interpretation,
> recalcitrantix would conform to POSIX if all it writes wrote whatever
(Continue reading)

Bruce Evans | 1 Aug 2012 20:58
Picon

Re: short read/write and error code

On Wed, 1 Aug 2012, Konstantin Belousov wrote:

> On Wed, Aug 01, 2012 at 07:23:09PM +1000, Bruce Evans wrote:
>> On Wed, 1 Aug 2012, Konstantin Belousov wrote:
>>
>>> On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
> ...
>> The usability is specified for signals.  From an old POSIX draft:
>>
>> % 51235              If write( ) is interrupted by a signal before it
>> writes any data, it shall return -1 with errno set to
>> % 51236              [EINTR].
>> % 51237              If write( ) is interrupted by a signal after it
>> successfully writes some data, it shall return the
>> % 51238              number of bytes written.
> This is exactly what existing code does.

Yes, this is one of the few cases that is completely specified (when
complications for SA_RESTART are added).

>> POSIX formally defines "Successfully Transferred", mainly for aio.  I
>> couldn't find any formal definition of "successfully writes", but clearly
>> it is nonsense for a write to be unsuccessful if a reader on the local
>> system or on an external system has successfully read some of the data
>> written by the write.
>>
>> FreeBSD does try to convert EINTR to 0 after some data has been written,
>> to conform to the above.  SIGPIPE should return EINTR to be returned to
>> dofilewrite(), so there should be no problem for SIGPIPE.  But we were
>> reminded of this old FreeBSD bug by probelms with SIGPIPE.
(Continue reading)

David Xu | 2 Aug 2012 01:31
Picon

Re: short read/write and error code

On 2012/8/2 2:58, Bruce Evans wrote:
> On Wed, 1 Aug 2012, Konstantin Belousov wrote:
>
>> As I said, patch behaviour in regard of SIGPIPE is just wrong.
>
> It is too simple.

The pipe code already reset error code zero if the writer have written 
all bytes
to pipe buffer, it does not return EPIPE even if reader closed the pipe, so
SIGPIPE is not sent in the case.
The SIGPIPE is only generated for short write, this is an intention, it 
is for
applications which takes no notice of short write, such an application 
does not
check return value from write(), when kernel generates SIGPIPE, the 
default action
is to kill the application, if application setup a sigaction for 
SIGPIPE, this implies
that application knows short write. So the patch is correct when it sees 
EPIPE code.

Regards,
David Xu

_______________________________________________
freebsd-arch <at> freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe <at> freebsd.org"

(Continue reading)

Konstantin Belousov | 2 Aug 2012 12:02
Picon

Re: short read/write and error code

On Thu, Aug 02, 2012 at 04:58:49AM +1000, Bruce Evans wrote:
> On Wed, 1 Aug 2012, Konstantin Belousov wrote:
> 
> >On Wed, Aug 01, 2012 at 07:23:09PM +1000, Bruce Evans wrote:
> >>On Wed, 1 Aug 2012, Konstantin Belousov wrote:
> >>
> >>>On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
> >...
> >>The usability is specified for signals.  From an old POSIX draft:
> >>
> >>% 51235              If write( ) is interrupted by a signal before it
> >>writes any data, it shall return -1 with errno set to
> >>% 51236              [EINTR].
> >>% 51237              If write( ) is interrupted by a signal after it
> >>successfully writes some data, it shall return the
> >>% 51238              number of bytes written.
> >This is exactly what existing code does.
> 
> Yes, this is one of the few cases that is completely specified (when
> complications for SA_RESTART are added).
> 
> >>POSIX formally defines "Successfully Transferred", mainly for aio.  I
> >>couldn't find any formal definition of "successfully writes", but clearly
> >>it is nonsense for a write to be unsuccessful if a reader on the local
> >>system or on an external system has successfully read some of the data
> >>written by the write.
> >>
> >>FreeBSD does try to convert EINTR to 0 after some data has been written,
> >>to conform to the above.  SIGPIPE should return EINTR to be returned to
> >>dofilewrite(), so there should be no problem for SIGPIPE.  But we were
(Continue reading)

David Xu | 2 Aug 2012 13:52
Picon

Re: short read/write and error code

On 2012/8/2 18:02, Konstantin Belousov wrote:
> diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
> index 338256c..1a61b93 100644
> --- a/sys/kern/sys_pipe.c
> +++ b/sys/kern/sys_pipe.c
>  <at>  <at>  -1286,13 +1286,13  <at>  <at>  pipe_write(fp, uio, active_cred, flags, td)
>   	}
>   
>   	/*
> -	 * Don't return EPIPE if I/O was successful
> +	 * Don't return EPIPE if any byte was written.
> +	 * EINTR and other interrupts are handled by generic I/O layer.
> +	 * Do not pretend that I/O succeeded for obvious user error
> +	 * like EFAULT.
>   	 */
> -	if ((wpipe->pipe_buffer.cnt == 0) &&
> -	    (uio->uio_resid == 0) &&
> -	    (error == EPIPE)) {
> +	if (uio->uio_resid != orig_resid && error == EPIPE)
>   		error = 0;
> -	}
>   
>   	if (error == 0)
>   		vfs_timestamp(&wpipe->pipe_mtime);

I dislike the patch, if I thought it is right, I would have already 
submit such
a patch. Now let us see why your patch is wore than my version (attached):
-current:
     short write is done, some bytes were sent
(Continue reading)

David Xu | 2 Aug 2012 13:53
Picon

Re: short read/write and error code

On 2012/8/2 19:52, David Xu wrote:
> On 2012/8/2 18:02, Konstantin Belousov wrote:
>> diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
>> index 338256c..1a61b93 100644
>> --- a/sys/kern/sys_pipe.c
>> +++ b/sys/kern/sys_pipe.c
>>  <at>  <at>  -1286,13 +1286,13  <at>  <at>  pipe_write(fp, uio, active_cred, flags, td)
>>       }
>>         /*
>> -     * Don't return EPIPE if I/O was successful
>> +     * Don't return EPIPE if any byte was written.
>> +     * EINTR and other interrupts are handled by generic I/O layer.
>> +     * Do not pretend that I/O succeeded for obvious user error
>> +     * like EFAULT.
>>        */
>> -    if ((wpipe->pipe_buffer.cnt == 0) &&
>> -        (uio->uio_resid == 0) &&
>> -        (error == EPIPE)) {
>> +    if (uio->uio_resid != orig_resid && error == EPIPE)
>>           error = 0;
>> -    }
>>         if (error == 0)
>>           vfs_timestamp(&wpipe->pipe_mtime);
>
> I dislike the patch, if I thought it is right, I would have already 
> submit such
> a patch. Now let us see why your patch is wore than my version 
> (attached):
> -current:
>     short write is done, some bytes were sent
(Continue reading)

Konstantin Belousov | 2 Aug 2012 15:51
Picon

Re: short read/write and error code

On Thu, Aug 02, 2012 at 07:52:11PM +0800, David Xu wrote:
> On 2012/8/2 18:02, Konstantin Belousov wrote:
> >diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
> >index 338256c..1a61b93 100644
> >--- a/sys/kern/sys_pipe.c
> >+++ b/sys/kern/sys_pipe.c
> > <at>  <at>  -1286,13 +1286,13  <at>  <at>  pipe_write(fp, uio, active_cred, flags, td)
> >  	}
> >  
> >  	/*
> >-	 * Don't return EPIPE if I/O was successful
> >+	 * Don't return EPIPE if any byte was written.
> >+	 * EINTR and other interrupts are handled by generic I/O layer.
> >+	 * Do not pretend that I/O succeeded for obvious user error
> >+	 * like EFAULT.
> >  	 */
> >-	if ((wpipe->pipe_buffer.cnt == 0) &&
> >-	    (uio->uio_resid == 0) &&
> >-	    (error == EPIPE)) {
> >+	if (uio->uio_resid != orig_resid && error == EPIPE)
> >  		error = 0;
> >-	}
> >  
> >  	if (error == 0)
> >  		vfs_timestamp(&wpipe->pipe_mtime);
> 
> I dislike the patch, if I thought it is right, I would have already 
> submit such
> a patch. Now let us see why your patch is wore than my version (attached):
> -current:
(Continue reading)

David Xu | 2 Aug 2012 16:17
Picon

Re: short read/write and error code

On 2012/8/2 21:51, Konstantin Belousov wrote:
> On Thu, Aug 02, 2012 at 07:52:11PM +0800, David Xu wrote:
>> On 2012/8/2 18:02, Konstantin Belousov wrote:
>>> diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
>>> index 338256c..1a61b93 100644
>>> --- a/sys/kern/sys_pipe.c
>>> +++ b/sys/kern/sys_pipe.c
>>>  <at>  <at>  -1286,13 +1286,13  <at>  <at>  pipe_write(fp, uio, active_cred, flags, td)
>>>   	}
>>>   
>>>   	/*
>>> -	 * Don't return EPIPE if I/O was successful
>>> +	 * Don't return EPIPE if any byte was written.
>>> +	 * EINTR and other interrupts are handled by generic I/O layer.
>>> +	 * Do not pretend that I/O succeeded for obvious user error
>>> +	 * like EFAULT.
>>>   	 */
>>> -	if ((wpipe->pipe_buffer.cnt == 0) &&
>>> -	    (uio->uio_resid == 0) &&
>>> -	    (error == EPIPE)) {
>>> +	if (uio->uio_resid != orig_resid && error == EPIPE)
>>>   		error = 0;
>>> -	}
>>>   
>>>   	if (error == 0)
>>>   		vfs_timestamp(&wpipe->pipe_mtime);
>> I dislike the patch, if I thought it is right, I would have already
>> submit such
>> a patch. Now let us see why your patch is wore than my version (attached):
>> -current:
(Continue reading)

Bruce Evans | 2 Aug 2012 15:54
Picon

Re: short read/write and error code

Please trime quotes!!

On Thu, 2 Aug 2012, Konstantin Belousov wrote:

> On Thu, Aug 02, 2012 at 04:58:49AM +1000, Bruce Evans wrote:
>> On Wed, 1 Aug 2012, Konstantin Belousov wrote:
>>
>>> On Wed, Aug 01, 2012 at 07:23:09PM +1000, Bruce Evans wrote:
>>>[>] ...
>>>> FreeBSD does try to convert EINTR to 0 after some data has been written,
>>>> to conform to the above.  SIGPIPE should return EINTR to be returned to
>>>> dofilewrite(), so there should be no problem for SIGPIPE.  But we were
>>>> reminded of this old FreeBSD bug by probelms with SIGPIPE.
>>> Sorry, I do not understand this, esp. second sentence.
>>
>> Oops, the second sentence has bad grammar.  I meant "generation of
>> SIGPIPE should cause EINTR to be returned to dofilewrite(), so...".
>> What actually happens is that lower-level code returns EPIPE, without
>> generating EINTR, to dofilewrite() (except socket code generates
>> SIGPIPE directly unless this is disabled by a socket option).  Then
>> dofilewrite() first fails to see the EINTR because the return code
>> is EPIPE (this happens even for the socket case).  Then dofilewrite()
>> generates SIGPIPE, again without changing the error code to EINTR or
>> looping back to pick up the special EINTR handling.  This is actually
>> correct -- the error is specified to be EPIPE, not EINTR.  (This is
>> similar to what happens for the internally-generated SIGXFSZ -- there
>> is a not-so-special error code for that.)
> Um, no, I do not see how EINTR could be referenced in this context at all.
> SUSv4 says quite unambigous that "[EPIPE] An attempt is made to
> write to a pipe or FIFO that is not open for reading by any process, or
(Continue reading)

Konstantin Belousov | 2 Aug 2012 19:05
Picon

Re: short read/write and error code

On Thu, Aug 02, 2012 at 11:54:43PM +1000, Bruce Evans wrote:
> Please trime quotes!!
Should this request be trimmed ?

> 
> On Thu, 2 Aug 2012, Konstantin Belousov wrote:
> 
> >On Thu, Aug 02, 2012 at 04:58:49AM +1000, Bruce Evans wrote:
> >>On Wed, 1 Aug 2012, Konstantin Belousov wrote:
> We must return a short write with no SIGPIPE, then SIGPIPE and EPIPE for
> the next write (without writing anything).
Exactly. I really tired arguing about this point with David.
He stopped providing any technical reasoning in later conversation,
so I stopped replying to him.

> 
> >For naive programs, which are not aware that stdout can be
> >pipe (i.e. the original target audience of SIGPIPE) not delivering the
> >signal on write(2) which write anything is just fine, since we can
> >make a valid assumption that they would repeat the write, and then
> >get EPIPE/SIGPIPE as intended.
I decided not to trim the paragraph above, possibly getting a reprimand
for excessive quoting :). It is too useful for the context.
> 
> This is correct for non-naive programs too, but the assumption isn't
> valid.  Non-naive programs don't understand short writes and typically
> treat them as errors.  Except ones that use stdio -- stdio doesn't
> repeat the whole write, but continues from the point that was
> successfully written up to.
Non-naive programs do understand short writes when they expect the file
(Continue reading)

David Xu | 3 Aug 2012 03:51
Picon

Re: short read/write and error code

On 2012/8/3 1:05, Konstantin Belousov wrote:
> On Thu, Aug 02, 2012 at 11:54:43PM +1000, Bruce Evans wrote:
>> Please trime quotes!!
> Should this request be trimmed ?
>
>> On Thu, 2 Aug 2012, Konstantin Belousov wrote:
>>
>>> On Thu, Aug 02, 2012 at 04:58:49AM +1000, Bruce Evans wrote:
>>>> On Wed, 1 Aug 2012, Konstantin Belousov wrote:
>> We must return a short write with no SIGPIPE, then SIGPIPE and EPIPE for
>> the next write (without writing anything).
> Exactly. I really tired arguing about this point with David.
> He stopped providing any technical reasoning in later conversation,
> so I stopped replying to him.
I found you were always ignoring what I said about the technical reasoning.
You are too arrogant. This is harmful to the project, may drive people away.
I agree that you may be good at some programing skill, but you are not
God. "generic layer, dont change it,..." or very generalized response, but
what can not be changed if it is wrong ? should the bug be kept there 
forever ? :-(

_______________________________________________
freebsd-arch <at> freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe <at> freebsd.org"

Bruce Evans | 4 Aug 2012 08:32
Picon

Re: short read/write and error code

On Thu, 2 Aug 2012, Konstantin Belousov wrote:

I plan to reply to this in more detail, but don't have time and want
to slow down this thread.

> ...
> So the bugs with losing data shall be fixed in the filesystems.
> Otherwise well-behaving filesystems which do return errors only when
> it is proper to return error are punished.
> ...

My main point is that this has nothing to do with file systems.  It
is special files, fifos and many other non-regular files that are
the problem.  File systems like ffs already reset uio when the return
an error.  Thus there is (should be) no problem for file systems with
clearing `error' at the dofile*() level if uio shows i/o.  For special
files, it more usually correct to not reset uio, and then it is
convenient to not clear `error' until the dofile*() level.

I see a problem with this mainly for (seekable r/w) direct access
devices (DASDs).  These are similar to regular files in a critical
respect: suppose you write in the middle of a disk or file and get an
i/o error somewhere.  Then sometimes, even after writing parts
successfully, you (you == the system) have no idea where the error
was.  The best handling is as in ffs: fail the whole i/o, and reset
uio completely.  If you know where the error was, and that it didn't
corrupt the file (say for EFAULT or ENOSPC), then you can do better
and return a short count with no error.  ffs doesn't try to do better.
When you don't do better, the application has the burden of figuring
out where the error was and how much of the file was corrupted.  Even
(Continue reading)


Gmane