Niklas Hambüchen | 21 Aug 04:29 2013

ANNOUNCE: posix-paths, for faster file system operations

John Lato and I would like to announce our posix-paths package.

    https://github.com/JohnLato/posix-paths

It implements a large portion of System.Posix.FilePath using ByteString
based RawFilePaths instead of String based FilePaths, and on top of that
provides a Traversal module with a fast replacement for
`getDirectoryContents` and a recursive `allDirectoryContents`.
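
To illustrate the general idea (this is not the package's actual code), here is
a minimal sketch of a directory listing that stays in RawFilePath throughout,
using only the `unix` and `bytestring` packages. The names `listDir`,
`joinPath` and `listRecursive` are made up for this example, and the real
Traversal module may be implemented quite differently.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Control.Exception (bracket)
import qualified Data.ByteString.Char8 as B
import           System.Posix.ByteString.FilePath (RawFilePath)
import           System.Posix.Directory.ByteString
                   (closeDirStream, openDirStream, readDirStream)
import           System.Posix.Files.ByteString
                   (getSymbolicLinkStatus, isDirectory)

-- | Entries of one directory, without "." and "..".
-- (Hypothetical helper, not part of posix-paths.)
listDir :: RawFilePath -> IO [RawFilePath]
listDir dir = bracket (openDirStream dir) closeDirStream go
  where
    go ds = do
      name <- readDirStream ds   -- an empty ByteString marks the end
      if B.null name
        then return []
        else do
          rest <- go ds
          return (if name `elem` [".", ".."] then rest else name : rest)

-- | Join two path components with a single '/'.
joinPath :: RawFilePath -> RawFilePath -> RawFilePath
joinPath a b
  | B.null a        = b
  | B.last a == '/' = a `B.append` b
  | otherwise       = B.concat [a, "/", b]

-- | All paths below a directory (directories included), depth first.
-- Symlinks are not followed, because lstat is used to inspect entries.
listRecursive :: RawFilePath -> IO [RawFilePath]
listRecursive dir = do
  names <- listDir dir
  concat <$> mapM visit names
  where
    visit name = do
      let path = joinPath dir name
      st <- getSymbolicLinkStatus path
      if isDirectory st
        then (path :) <$> listRecursive path
        else return [path]
```

In GHCi, `listRecursive "." >>= mapM_ B.putStrLn` prints every path below the
current directory. No String is ever built for a path; the bytes returned by
readdir are used as they are.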

`getDirectoryContents` is (unsurprisingly?) really slow.
Our replacement is 11 times faster in the recursive use case [1], and
only 20% slower than `find`.

Benchmarks are at [2], code is at [3].

We hope that these improvements will eventually make it into base.

Until then, we propose our package as a base for discussion and further
improvements.

Contributions are welcome:
Some FilePath operations are not in it yet (especially the Windows /
drive related ones), and our traversals might not work on Windows.
We would also appreciate a thorough review of their low-level
implementations.
If you find our benchmarks against getDirectoryContents unfair or would
like to add another one, please send a pull request.

We have been running this on Linux production machines for a few months
now, and are pleased by the speed-up.

John Wiegley | 21 Aug 09:21 2013

Re: [Haskell-cafe] ANNOUNCE: posix-paths, for faster file system operations

>>>>> Niklas Hambüchen <mail <at> nh2.me> writes:

> It implements a large portion of System.Posix.FilePath using ByteString
> based RawFilePaths instead of String based FilePaths

Was there a reason you didn't base your work on the already existing
system-filepath library?  Did it have deficiencies that your rendition
corrects?

-- 
John Wiegley
FP Complete                         Haskell tools, training and consulting
http://fpcomplete.com               johnw on #haskell/irc.freenode.net

_______________________________________________
Haskell mailing list
Haskell <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell

John Lato | 21 Aug 09:49 2013

Re: [Haskell-cafe] ANNOUNCE: posix-paths, for faster file system operations

On Wed, Aug 21, 2013 at 2:21 AM, John Wiegley <johnw <at> fpcomplete.com> wrote:
> >>>>> Niklas Hambüchen <mail <at> nh2.me> writes:
>
> > It implements a large portion of System.Posix.FilePath using ByteString
> > based RawFilePaths instead of String based FilePaths
>
> Was there a reason you didn't base your work on the already existing
> system-filepath library?  Did it have deficiencies that your rendition
> corrects?

posix-paths exists in part to provide a convenient API for working with
the RawFilePath type provided by the "unix" package.  system-filepath is
predominantly String-based, which makes it incompatible with POSIX system
calls (unless you perform extra marshaling).  Also, its FilePath type has
significant space overhead (since it's String-based), which can be an
issue if you actually need to maintain large collections of paths.

I wouldn't necessarily call these deficiencies of system-filepath, since
they provide certain advantages for many uses.  But the performance
overhead is significant.
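
As a rough illustration of the marshaling point, here is a small sketch that
hands the same path to a C function from a String and from a ByteString. It
uses `strlen` as a stand-in for a real system call; `strlenString` and
`strlenRaw` are names invented for this example. The String version has to
walk a linked list of Chars and encode each one into a fresh buffer, while the
ByteString version copies its bytes as they are and appends a NUL.

```haskell
import qualified Data.ByteString.Char8 as B
import           Foreign.C.String (CString, withCString)
import           Foreign.C.Types (CSize (..))

-- Stand-in for a POSIX call that takes a NUL-terminated path.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- | String path: every Char is encoded (using the current locale)
-- into a temporary C buffer before the call.
strlenString :: String -> IO Int
strlenString p = fromIntegral <$> withCString p c_strlen

-- | ByteString path: the bytes are copied unchanged into a
-- NUL-terminated buffer; no per-character decoding or encoding.
strlenRaw :: B.ByteString -> IO Int
strlenRaw p = fromIntegral <$> B.useAsCString p c_strlen
```

Both helpers return the same length for an ASCII path. The difference is in
how much work and allocation happens on the Haskell side before the C call.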