Ben Gamari | 5 Jul 2012 10:57

System.Posix.IO.ByteString types?


Recently, I spent a fair amount of time tracking down an encoding issue[1]
in the `serialport` package. As is often the case, `serialport` used
Strings, which blew up quite nicely when working with a binary
protocol. Delving into the code it eventually became fairly obvious that
the problem was due to encoding and could be fixed by completely moving
to ByteStrings.
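
For illustration, here is a minimal sketch of how a binary frame can get mangled once it passes through String. It assumes the String is encoded as UTF-8 on the way out (for example under a UTF-8 locale); whether that is exactly what happened in `serialport` is not confirmed here. The names `utf8Char`, `viaString` and `viaBytes` are hypothetical helpers, not part of any library.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: UTF-8 encode one Char. Only code points below 0x800
-- are handled, which covers every value a single byte can hold.
utf8Char :: Char -> [Word8]
utf8Char c
  | n < 0x80  = [fromIntegral n]
  | n < 0x800 = [ fromIntegral (0xC0 .|. (n `shiftR` 6))
                , fromIntegral (0x80 .|. (n .&. 0x3F)) ]
  | otherwise = error "utf8Char: code point out of range for this sketch"
  where n = ord c

-- A binary frame carried as a String (one Char per byte), then encoded
-- the way a UTF-8 locale would encode it.
viaString :: [Word8] -> B.ByteString
viaString = B.pack . concatMap (utf8Char . chr . fromIntegral)

-- The same frame kept as raw bytes from start to finish.
viaBytes :: [Word8] -> B.ByteString
viaBytes = B.pack
```

Under these assumptions any byte of 0x80 or above turns into two bytes on the wire, so a frame such as `[0x01, 0xFF]` no longer has the length the device expects.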

Unfortunately, while the unix package contains a
System.Posix.IO.ByteString module, all of the functions within still
appear to work with Strings[2]. It appears that paths are handled
a bit differently in the ByteString variant, but this seems like a
fairly small consideration compared to the data itself.

Have I stumbled across a bug? Is there any way, short of the non-Haskell
Platform unix-bytestring package, to get ByteStrings out of fds? If not,
it seems this is a large hole in the interfaces exposed by unix. Is
there any plan to remedy this?

Cheers,

- Ben

[1] https://github.com/jputcu/serialport/issues/12
[2] http://hackage.haskell.org/packages/archive/unix/2.5.1.1/doc/html/System-Posix-IO-ByteString.html
Simon Marlow | 5 Jul 2012 13:38

Re: System.Posix.IO.ByteString types?

On 05/07/2012 09:57, Ben Gamari wrote:
>
> Recently, I spent a fair amount of time tracking down an encoding issue[1]
> in the `serialport` package. As is often the case, `serialport` used
> Strings, which blew up quite nicely when working with a binary
> protocol. Delving into the code it eventually became fairly obvious that
> the problem was due to encoding and could be fixed by completely moving
> to ByteStrings.
>
> Unfortunately, while the unix package contains a
> System.Posix.IO.ByteString module, all of the functions within still
> appear to work with Strings[2]. It appears that paths are handled
> a bit differently in the ByteString variant, but this seems like a
> fairly small consideration compared to the data itself.
>
> Have I stumbled across a bug? Is there any way, short of the non-Haskell
> Platform unix-bytestring package, to get ByteStrings out of fds? If not,
> it seems this is a large hole in the interfaces exposed by unix. Is
> there any plan to remedy this?

The .ByteString variants of the modules in System.Posix were added 
primarily to address the problem that FilePath was represented as String 
in the existing System.Posix APIs, but POSIX file paths are really 
sequences of bytes.
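
As a small sketch of what that means in practice: the .ByteString modules take a `RawFilePath`, which is a synonym for `ByteString`, so a path is passed to the OS byte for byte. The names `oddPath` and `rootExists` are made up for this example, and the code assumes a unix version that ships the .ByteString modules.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import System.Posix.ByteString.FilePath (RawFilePath)
import System.Posix.Files.ByteString (fileExist)

-- A path is just bytes. 0xFF is not valid UTF-8, yet a POSIX file name
-- may contain it (only '/' and NUL are excluded).
oddPath :: RawFilePath
oddPath = BC.pack "/tmp/" `B.append` B.pack [0xFF, 0x2E, 0x6C, 0x6F, 0x67]

-- fileExist from the .ByteString module takes the raw bytes directly,
-- with no locale-dependent encoding step in between.
rootExists :: IO Bool
rootExists = fileExist (BC.pack "/")
```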

Making ByteString versions of fdRead and fdWrite is a separate issue, 
but undoubtedly an important one.  If there's general agreement on what 
the API should be, and if someone writes a patch, I'd be happy to apply it.
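
To make the idea concrete, here is one possible shape for such functions, built on the buffer-based `fdReadBuf` and `fdWriteBuf` that System.Posix.IO already exports. This is only a sketch: the names `fdReadBS`, `fdWriteBS` and `roundTrip` are hypothetical, it is not the API proposed in this thread or the one used by unix-bytestring, and it makes no attempt to retry short writes.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Internal as BI
import qualified Data.ByteString.Unsafe as BU
import Foreign.Ptr (castPtr)
import System.Posix.IO (closeFd, createPipe, fdReadBuf, fdWriteBuf)
import System.Posix.Types (ByteCount, Fd)

-- Hypothetical name: read at most n bytes from fd, with no decoding.
-- An empty ByteString means end of file (fdReadBuf returned 0).
fdReadBS :: Fd -> ByteCount -> IO B.ByteString
fdReadBS fd n =
  BI.createAndTrim (fromIntegral n) $ \buf ->
    fromIntegral <$> fdReadBuf fd buf n

-- Hypothetical name: write the bytes as they are and return how many
-- were actually written (this may be fewer than the full length).
fdWriteBS :: Fd -> B.ByteString -> IO ByteCount
fdWriteBS fd bs =
  BU.unsafeUseAsCStringLen bs $ \(ptr, len) ->
    fdWriteBuf fd (castPtr ptr) (fromIntegral len)

-- Usage sketch: push a small frame through a pipe and read it back.
roundTrip :: B.ByteString -> IO B.ByteString
roundTrip frame = do
  (r, w) <- createPipe
  _ <- fdWriteBS w frame
  out <- fdReadBS r (fromIntegral (B.length frame))
  closeFd r >> closeFd w
  pure out
```

Because nothing is ever converted to Char, bytes such as 0x00 or 0xFF should come back unchanged from `roundTrip`.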

Cheers,

wren ng thornton | 10 Jul 2012 09:00

Re: System.Posix.IO.ByteString types?

On 7/5/12 4:57 AM, Ben Gamari wrote:
> Recently, I spent a fair amount of time tracking down an encoding issue[1]
> in the `serialport` package. As is often the case, `serialport` used
> Strings, which blew up quite nicely when working with a binary
> protocol. Delving into the code it eventually became fairly obvious that
> the problem was due to encoding and could be fixed by completely moving
> to ByteStrings.
>
> Unfortunately, while the unix package contains a
> System.Posix.IO.ByteString module, all of the functions within still
> appear to work with Strings[2]. It appears that paths are handled
> a bit differently in the ByteString variant, but this seems like a
> fairly small consideration compared to the data itself.

It is indeed a small consideration, IMO. In order to use ByteStrings as 
the medium of the data being exchanged, you want to use the 
unix-bytestring library[1]. Unfortunately that may be taxing on newer 
GHCs since the new version of unix stole the namespaces from 
unix-bytestring.

I had tried to get unix-bytestring incorporated into unix previously 
(before unix's .ByteString modules existed), but made the unfortunate 
decision to emulate the String-based API, and no one liked the old API 
to begin with, so the proposal was rejected. I've since modified the API 
to what I think is a nice design. It closely mirrors the official POSIX 
and XPG standards, and has been battle-tested by myself and other folks 
who were eschewing the unix package and doing their own ByteString-based 
operations. I'd love for unix-bytestring to be folded into unix itself, 
but have no time to make another proposal. If you'd like to do so on my 
behalf, feel free.

Bas van Dijk | 11 Jul 2012 20:00

Re: System.Posix.IO.ByteString types?

On 10 July 2012 09:00, wren ng thornton <wren <at> freegeek.org> wrote:
> I'd love for unix-bytestring to be folded into unix itself, but have no time
> to make another proposal. If you'd like to do so on my behalf, feel free.

You can already write down my +1!

Bas
