Mike Coleman | 7 Oct 04:05

newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

Hi,

I could use a little help.  I was looking through the Real World
Haskell book and came across a trivial program for summing numbers in
a file.  They mentioned that that implementation was very slow, as
it's based on String's, so I thought I'd try my hand at converting it
to use lazy ByteString's.  I've made some progress, but now I'm a
little stuck because that module doesn't seem to have a 'read' method.

There's a readInt method, which I guess I could use, but it returns a
Maybe, and I don't see how I can easily strip that off.

So:

1.  Is there an easy way to strip off the Maybe that would allow an
equivalently concise definition for sumFile?  I can probably figure
out how to do it with pattern matching and a separate function--I'm
just wondering if there's a more concise way.

2.  Why doesn't ByteString implement 'read'?  Is it just that this
function (like 'input' in Python) isn't really very useful for real
programs?

3.  Why doesn't ByteString implement 'readDouble', etc.?  That is, why
are Int and Integer treated specially?  Do I not need readDouble?

Thanks,
Mike

-- lazy version (doesn't compile)
(Continue reading)

Jason Dagit | 7 Oct 04:21

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)



On Mon, Oct 6, 2008 at 7:06 PM, Mike Coleman <tutufan <at> gmail.com> wrote:
Hi,

I could use a little help.  I was looking through the Real World
Haskell book and came across a trivial program for summing numbers in
a file.  They mentioned that that implementation was very slow, as
it's based on String's, so I thought I'd try my hand at converting it
to use lazy ByteString's.  I've made some progress, but now I'm a
little stuck because that module doesn't seem to have a 'read' method.

There's a readInt method, which I guess I could use, but it returns a
Maybe, and I don't see how I can easily strip that off.

So:

1.  Is there an easy way to strip off the Maybe that would allow an
equivalently concise definition for sumFile?  I can probably figure
out how to do it with pattern matching and a separate function--I'm
just wondering if there's a more concise way.

I'm not a ByteString expert, but there should be an easy way to solve this issue of Maybe.

If you go to hoogle (http://www.haskell.org/hoogle/) and type this:
[Maybe a] -> [a]
it says:
Data.Maybe.catMaybes :: [Maybe a] -> [a]

As the top search result.

This means that you can convert any list of maybes into a list of what you want.  It just tosses out the Nothings.



2.  Why doesn't ByteString implement 'read'?  Is it just that this
function (like 'input' in Python) isn't really very useful for real
programs?

I think probably for things more complex than parsing ints it's best to make your own parser?  I seem to recall that someone was working on a library of parsing functions based on bytestring?  Maybe someone else can comment?

3.  Why doesn't ByteString implement 'readDouble', etc.?  That is, why
are Int and Integer treated specially?  Do I not need readDouble?

I think readInt was mostly implemented because integer reading was needed a lot for benchmarks and programming challenge sites and people noticed it was way slower than needed so someone put in the effort it optimize it.  Once it was optimized, that must have satisfied the need for fast number reading.

I would agree that at least for Prelude types it would be nice to have efficient bytestring based parsers.  Do we have Read/Show classes specifically for working in bytestrings?  Maybe that's what the world needs in the bytestring api.  Then again, I'm not really qualified to comment :)  For all I know it already exists.

Jason

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Don Stewart | 7 Oct 04:31
Gravatar

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

dagit:
>    As the top search result.
> 
>    This means that you can convert any list of maybes into a list of what you
>    want.  It just tosses out the Nothings.
> 
>      2.  Why doesn't ByteString implement 'read'?  Is it just that this
>      function (like 'input' in Python) isn't really very useful for real
>      programs?
> 
>    I think probably for things more complex than parsing ints it's best to
>    make your own parser?  I seem to recall that someone was working on a
>    library of parsing functions based on bytestring?  Maybe someone else can
>    comment?
> 
>      3.  Why doesn't ByteString implement 'readDouble', etc.?  That is, why
>      are Int and Integer treated specially?  Do I not need readDouble?
> 
>    I think readInt was mostly implemented because integer reading was needed
>    a lot for benchmarks and programming challenge sites and people noticed it
>    was way slower than needed so someone put in the effort it optimize it. 
>    Once it was optimized, that must have satisfied the need for fast number
>    reading.

There's readInt, readInteger and readDouble now, as primitives, because
people ask for them.

>    I would agree that at least for Prelude types it would be nice to have
>    efficient bytestring based parsers.  Do we have Read/Show classes
>    specifically for working in bytestrings?  Maybe that's what the world
>    needs in the bytestring api.  Then again, I'm not really qualified to
>    comment :)  For all I know it already exists.
> 

Data.Binary is the generic parser for bytestrings. A higher level one
for parsing text (e.g. atto-parsec), might be worthwhile.

-- Don
Daniel Fischer | 7 Oct 04:35

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

Am Dienstag, 7. Oktober 2008 04:21 schrieb Jason Dagit:
> On Mon, Oct 6, 2008 at 7:06 PM, Mike Coleman <tutufan <at> gmail.com> wrote:
> > Hi,
> >
> > I could use a little help.  I was looking through the Real World
> > Haskell book and came across a trivial program for summing numbers in
> > a file.  They mentioned that that implementation was very slow, as
> > it's based on String's, so I thought I'd try my hand at converting it
> > to use lazy ByteString's.  I've made some progress, but now I'm a
> > little stuck because that module doesn't seem to have a 'read' method.
> >
> > There's a readInt method, which I guess I could use, but it returns a
> > Maybe, and I don't see how I can easily strip that off.
> >
> > So:
> >
> > 1.  Is there an easy way to strip off the Maybe that would allow an
> > equivalently concise definition for sumFile?  I can probably figure
> > out how to do it with pattern matching and a separate function--I'm
> > just wondering if there's a more concise way.
>
> I'm not a ByteString expert, but there should be an easy way to solve this
> issue of Maybe.
>
> If you go to hoogle (http://www.haskell.org/hoogle/) and type this:
> [Maybe a] -> [a]
> it says:
> Data.Maybe<http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Mayb
>e.html>
> .catMaybes<http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Mayb
>e.html#v:catMaybes>:: [Maybe
> a] ->
> [a]<http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Maybe.html#
>v:catMaybes>
>
> As the top search result.
>
> This means that you can convert any list of maybes into a list of what you
> want.  It just tosses out the Nothings.

Since readInt returns a Maybe (Int,ByteString), Data.List.unfoldr would be a 
better fit for his needs.

The bytestring-lexing package

(http://hackage.haskell.org/packages/archive/bytestring-lexing/0.1.2/doc/html/Data-ByteString-Lex-Double.html) 
provides readDouble, which is also pretty fast, I think.

>
> > 2.  Why doesn't ByteString implement 'read'?  Is it just that this
> > function (like 'input' in Python) isn't really very useful for real
> > programs?
>
> I think probably for things more complex than parsing ints it's best to
> make your own parser?  I seem to recall that someone was working on a
> library of parsing functions based on bytestring?  Maybe someone else can
> comment?

At least parsec 3.0.0 has ByteString parsing modules (I've never used it, so I 
don't know how well they work).
IIRC, there's a plan to expand the bytestring-lexing package, too.

>
> 3.  Why doesn't ByteString implement 'readDouble', etc.?  That is, why
>
> > are Int and Integer treated specially?  Do I not need readDouble?
>
> I think readInt was mostly implemented because integer reading was needed a
> lot for benchmarks and programming challenge sites and people noticed it
> was way slower than needed so someone put in the effort it optimize it. 
> Once it was optimized, that must have satisfied the need for fast number
> reading.

More's underway, readDouble already delivered.
>
> I would agree that at least for Prelude types it would be nice to have
> efficient bytestring based parsers.  Do we have Read/Show classes
> specifically for working in bytestrings?  Maybe that's what the world needs
> in the bytestring api.  Then again, I'm not really qualified to comment :)
> For all I know it already exists.

partially.
>
> Jason
Mike Coleman | 7 Oct 05:14

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

Thanks for your replies.  Hoogle is pretty cool--I didn't know about that.

I ended up with this, which is pretty close to the original (it will
also bomb if given non-integer input):

import qualified Data.ByteString.Lazy.Char8 as L
import qualified Data.Maybe as M

main = do
         contents <- L.getContents
         print (sumFile contents)
             where sumFile = sum . map (fst . M.fromJust . L.readInt) . L.words

One further problem I've encountered: My Haskell program runs under
'runghc', but I get a link error when compiling it (GHC 6.8.2, latest
Ubuntu).  Any suggestions?

$ ghc -o LazySumFile LazySumFile.hs
LazySumFile.o: In function `sOz_info':
(.text+0x200): undefined reference to
`bytestringzm0zi9zi0zi1_DataziByteStringziLazzyziChar8_readInt_closure'
LazySumFile.o: In function `sOF_info':
(.text+0x35d): undefined reference to
`bytestringzm0zi9zi0zi1_DataziByteStringziLazzyziChar8_words_closure'
LazySumFile.o: In function `sRF_info':
(.text+0x4d7): undefined reference to
`bytestringzm0zi9zi0zi1_DataziByteStringziLazzy_getContents_closure'
LazySumFile.o: In function `sRF_info':
(.text+0x64f): undefined reference to
`__stginit_bytestringzm0zi9zi0zi1_DataziByteStringziLazzyziChar8_'
LazySumFile.o: In function `sOF_srt':
(.data+0xa0): undefined reference to
`bytestringzm0zi9zi0zi1_DataziByteStringziLazzyziChar8_words_closure'
LazySumFile.o: In function `sOF_srt':
(.data+0xa8): undefined reference to
`bytestringzm0zi9zi0zi1_DataziByteStringziLazzyziChar8_readInt_closure'
LazySumFile.o: In function `rOh_closure':
(.data+0x110): undefined reference to
`bytestringzm0zi9zi0zi1_DataziByteStringziLazzy_getContents_closure'
collect2: ld returned 1 exit status
Don Stewart | 7 Oct 05:19
Gravatar

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

tutufan:
> Thanks for your replies.  Hoogle is pretty cool--I didn't know about that.
> 
> I ended up with this, which is pretty close to the original (it will
> also bomb if given non-integer input):
> 
> import qualified Data.ByteString.Lazy.Char8 as L
> import qualified Data.Maybe as M
> 
> main = do
>          contents <- L.getContents
>          print (sumFile contents)
>              where sumFile = sum . map (fst . M.fromJust . L.readInt) . L.words
> 
> 
> One further problem I've encountered: My Haskell program runs under
> 'runghc', but I get a link error when compiling it (GHC 6.8.2, latest
> Ubuntu).  Any suggestions?

Oh wow, runghc is an interpreter. It is on average about 30x slower than
compiled code.

To compile, try something like:

    ghc -O2 --make A.hs

And enjoy the native codes. :)

-- Don
Mike Coleman | 7 Oct 06:17

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

On Mon, Oct 6, 2008 at 10:19 PM, Don Stewart <dons <at> galois.com> wrote:
> To compile, try something like:
>
>    ghc -O2 --make A.hs

That did the trick--thanks.

For input large enough to be a good test, this is I/O-bound on my
laptop (the speed of the compiled and interpreted versions are almost
identical), so I'll have to try it on a bigger machine tomorrow.

Mike
wman | 7 Oct 05:07

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

a slight modification to compile it :

change:
where sumFile = sum . map read . L.words
to :
where sumFile = sum . map (read . L.unpack) . L.words

but it's actually _slower_ than the non-bytestring version.

i did a little test, three versions of the same script and manufactured meself  a ~50 MB file containing 1M of  ints 0-65535. and replaced the sum with length for obvious reasons.

module Main where
import qualified Data.ByteString.Lazy.Char8 as L

main1 = do
        contents <- L.getContents
        print (sumFile contents)
            where sumFile = length . map L.readInt . L.words

main2 = do
        contents <- getContents
        print (sumFile contents)
            where sumFile = length . map (read :: String -> Int) . words

main3 = do
        contents <- L.getContents
        print (sumFile contents)
            where sumFile = length . map ((read :: String -> Int) . L.unpack) . L.words 

time main3 < nums
real    0m22.421s
user    0m0.031s
sys     0m0.000s

time main2 < nums
real    0m14.296s
user    0m0.015s
sys     0m0.016s

time main1 < nums
real    0m22.078s
user    0m0.015s
sys     0m0.015s

i expected the conversions (L.unpack in main3) to kill the performance a little, but not to make it nearly two times as slow.
and i certainly did not expect that even the version using the bytestring readInt to be as slow  ...

did I do something wrong ?


On Tue, Oct 7, 2008 at 4:06 AM, Mike Coleman <tutufan <at> gmail.com> wrote:
Hi,

I could use a little help.  I was looking through the Real World
Haskell book and came across a trivial program for summing numbers in
a file.  They mentioned that that implementation was very slow, as
it's based on String's, so I thought I'd try my hand at converting it
to use lazy ByteString's.  I've made some progress, but now I'm a
little stuck because that module doesn't seem to have a 'read' method.

There's a readInt method, which I guess I could use, but it returns a
Maybe, and I don't see how I can easily strip that off.

So:

1.  Is there an easy way to strip off the Maybe that would allow an
equivalently concise definition for sumFile?  I can probably figure
out how to do it with pattern matching and a separate function--I'm
just wondering if there's a more concise way.

2.  Why doesn't ByteString implement 'read'?  Is it just that this
function (like 'input' in Python) isn't really very useful for real
programs?

3.  Why doesn't ByteString implement 'readDouble', etc.?  That is, why
are Int and Integer treated specially?  Do I not need readDouble?

Thanks,
Mike


-- lazy version (doesn't compile)

-- file: ch08/SumFile.hs

import qualified Data.ByteString.Lazy.Char8 as L

main = do
        contents <- L.getContents
        print (sumFile contents)
            where sumFile = sum . map read . L.words
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Don Stewart | 7 Oct 05:12
Gravatar

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

666wman:
>    a slight modification to compile it :
> 
>    change:
>    where sumFile = sum . map read . L.words
>    to :
>    where sumFile = sum . map (read . L.unpack) . L.words
> 
>    but it's actually _slower_ than the non-bytestring version.

Never unpack a bytestring.

    import qualified Data.ByteString.Lazy.Char8 as S

    main = print . go 0 =<< S.getContents
      where
        go n s = case S.readInt s of
                        Nothing     -> n
                        Just (k,t)  -> go (n+k) (S.tail t)

Assuming you're reading int, integers or doubles.
wman | 7 Oct 05:17

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

the problem is that using readInt is actually _as slow_, at least using my test script :-((

On Tue, Oct 7, 2008 at 5:12 AM, Don Stewart <dons <at> galois.com> wrote:
666wman:
>    a slight modification to compile it :
>
>    change:
>    where sumFile = sum . map read . L.words
>    to :
>    where sumFile = sum . map (read . L.unpack) . L.words
>
>    but it's actually _slower_ than the non-bytestring version.

Never unpack a bytestring.

   import qualified Data.ByteString.Lazy.Char8 as S

   main = print . go 0 =<< S.getContents
     where
       go n s = case S.readInt s of
                       Nothing     -> n
                       Just (k,t)  -> go (n+k) (S.tail t)

Assuming you're reading int, integers or doubles.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Don Stewart | 7 Oct 05:19
Gravatar

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

Hmm. How are you compiling it? Using bytestring 0.9.1.x ?

Should be fast,
    http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=all

Assuming you're turning on optimisations ( ghc -O2 )

-- Don

666wman:
>    the problem is that using readInt is actually _as slow_, at least using my
>    test script :-((
> 
>    On Tue, Oct 7, 2008 at 5:12 AM, Don Stewart <[1]dons <at> galois.com> wrote:
> 
>      666wman:
>      >    a slight modification to compile it :
>      >
>      >    change:
>      >    where sumFile = sum . map read . L.words
>      >    to :
>      >    where sumFile = sum . map (read . L.unpack) . L.words
>      >
>      >    but it's actually _slower_ than the non-bytestring version.
> 
>      Never unpack a bytestring.
> 
>         import qualified Data.ByteString.Lazy.Char8 as S
> 
>         main = print . go 0 =<< S.getContents
>           where
>             go n s = case S.readInt s of
>                             Nothing     -> n
>                             Just (k,t)  -> go (n+k) (S.tail t)
> 
>      Assuming you're reading int, integers or doubles.
> 
> References
> 
>    Visible links
>    1. mailto:dons <at> galois.com
wman | 7 Oct 05:36

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

new figures, after updating bytestring (0.9.0.1.1 -> 0.9.1.2) && using -O2

time Main < nums
real    0m2.531s
user    0m0.015s
sys     0m0.015s

time Main2 < nums
real    0m13.999s
user    0m0.015s
sys     0m0.015s

time Main3 < nums
real    0m2.796s
user    0m0.015s
sys     0m0.015s

thats more like it, even the unpacking didn't hurt so much.

the morals: "Thou shalt update your libraries" & "Thou shalt not forget to turn on optimizations" before bitching it's too slow ;-)))

thx

On Tue, Oct 7, 2008 at 5:19 AM, Don Stewart <dons <at> galois.com> wrote:
Hmm. How are you compiling it? Using bytestring 0.9.1.x ?

Should be fast,
   http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=all

Assuming you're turning on optimisations ( ghc -O2 )

-- Don

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
wman | 7 Oct 05:54

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

just for the kicks i tried the new version of bytestring without -O2 and the results were even worse:

time Main1 < nums
real    0m40.092s
user    0m0.015s
sys     0m0.015s

time Main3 < nums
real    0m41.405s
user    0m0.015s
sys     0m0.015s

it got pwned even by this very naive ruby scipt (which, btw, chewed through some ~600 MB of memory ;-))

File.open("nums","r") do |f|
  puts((f.read.split.each {|x| x.to_i }).length)
end

time ruby nums.rb
real    0m21.609s
user    0m0.015s
sys     0m0.015s

so it probably can't be stressed enough: repeat "-O2"

and now seriously:

is there a reason why -O2 shouldn't be made the default (and allowing to turn off optimizations by -O0 perhaps) ?

On Tue, Oct 7, 2008 at 5:36 AM, wman <666wman <at> gmail.com> wrote:
new figures, after updating bytestring (0.9.0.1.1 -> 0.9.1.2) && using -O2

time Main < nums
real    0m2.531s

user    0m0.015s
sys     0m0.015s

time Main2 < nums
real    0m13.999s

user    0m0.015s
sys     0m0.015s

time Main3 < nums
real    0m2.796s

user    0m0.015s
sys     0m0.015s

thats more like it, even the unpacking didn't hurt so much.

the morals: "Thou shalt update your libraries" & "Thou shalt not forget to turn on optimizations" before bitching it's too slow ;-)))

thx


On Tue, Oct 7, 2008 at 5:19 AM, Don Stewart <dons <at> galois.com> wrote:
Hmm. How are you compiling it? Using bytestring 0.9.1.x ?

Should be fast,
   http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=all

Assuming you're turning on optimisations ( ghc -O2 )

-- Don


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Favicon

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

On 2008 Oct 6, at 23:54, wman wrote:
just for the kicks i tried the new version of bytestring without -O2 and the results were even worse:

Yep, ByteString does a lot of stuff that is really bad when GHC isn't performing stream fusion --- but when it is, the code compiles down to tight machine code that can rival C.

is there a reason why -O2 shouldn't be made the default (and allowing to turn off optimizations by -O0 perhaps) ?

-O2 breaks on some platforms, I think.

-- 
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery <at> kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery <at> ece.cmu.edu
electrical and computer engineering, carnegie mellon university    KF8NH


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Bulat Ziganshin | 7 Oct 12:25

Re[2]: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

Hello Brandon,

Tuesday, October 7, 2008, 7:59:06 AM, you wrote:

> is there a reason why -O2 shouldn't be made the default (and
> allowing to turn off optimizations by -O0 perhaps) ?

it compiles ~2x slower and firces more recompilation (because it does
inter-module inlining). so it's not perfect for edit-compile-debug
runs

--

-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin <at> gmail.com
Don Stewart | 7 Oct 06:27
Gravatar

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

666wman:
>    just for the kicks i tried the new version of bytestring without -O2 and
>    the results were even worse:

Note that without -O or -O2 no strictness analysis is performed. So that
tail recursive loop ... won't be. You could try -Onot -fstrictness just
for kicks, to see why strictness analysis is important when writing in a
tail recursive style.

-- Don
wman | 7 Oct 06:44

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

ghc -Onot -fstrictness --make Main1.hs && ghc -Onot -fstrictness --make Main2.hs && ghc -Onot -fstrictness --make Main3.hs

time Main1 < nums
real    0m39.530s
user    0m0.015s
sys     0m0.030s

time Main2 < nums
real    0m14.078s
user    0m0.015s
sys     0m0.015s

time Main3.exe < nums
real    0m41.342s
user    0m0.015s
sys     0m0.015s

still, i'm going to google up strictness analysis to at least know what made no difference in this case ;-)

btw, why is the example #2 (http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=ghc&id=2) (which kicks collective asses of all other participants) not considered in the shootout ? Too much optimizations ?

On Tue, Oct 7, 2008 at 6:27 AM, Don Stewart <dons <at> galois.com> wrote:
666wman:
>    just for the kicks i tried the new version of bytestring without -O2 and
>    the results were even worse:

Note that without -O or -O2 no strictness analysis is performed. So that
tail recursive loop ... won't be. You could try -Onot -fstrictness just
for kicks, to see why strictness analysis is important when writing in a
tail recursive style.

-- Don

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Don Stewart | 7 Oct 06:50
Gravatar

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

666wman:
>    ghc -Onot -fstrictness --make Main1.hs && ghc -Onot -fstrictness --make
>    Main2.hs && ghc -Onot -fstrictness --make Main3.hs
> 
>    time Main1 < nums
>    real    0m39.530s
>    user    0m0.015s
>    sys     0m0.030s
> 
>    time Main2 < nums
>    real    0m14.078s
>    user    0m0.015s
>    sys     0m0.015s
> 
>    time Main3.exe < nums
>    real    0m41.342s
>    user    0m0.015s
>    sys     0m0.015s
> 
>    still, i'm going to google up strictness analysis to at least know what
>    made no difference in this case ;-)

Oh, that's interesting. Did they actually recompile? (you used
-fforce-recomp). Otherwise, might be an idea to use hpaste.org to paste
teh Main1,2,3 source code, since I think people reading don't know what
is represented in each program at this point.

-- Don
Bulat Ziganshin | 7 Oct 12:28

Re[2]: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

Hello wman,

Tuesday, October 7, 2008, 8:44:48 AM, you wrote:

> btw, why is the example #2
> (http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=ghc&id=2)
> (which kicks collective asses of all other participants) not
> considered in the shootout ? Too much optimizations ?

it's cheating - application code written in low-level style instead of
using library functions (if you will write in the same style in C it
will probably even several times faster)

--

-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin <at> gmail.com
George Pollard | 7 Oct 23:19
Favicon
Gravatar

Re: newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

On Mon, 2008-10-06 at 21:06 -0500, Mike Coleman wrote:
> There's a readInt method, which I guess I could use, but it returns a
> Maybe, and I don't see how I can easily strip that off.

You can use Data.Maybe's 'mapMaybe' function
"The mapMaybe function is a version of map which can throw out elements.
In particular, the functional argument returns something of type Maybe
b. If this is Nothing, no element is added on to the result list. If it
just Just b, then b is included in the result list."

So we change this:
> -- lazy version (doesn't compile)
> 
> -- file: ch08/SumFile.hs
> 
> import qualified Data.ByteString.Lazy.Char8 as L
> 
> main = do
>          contents <- L.getContents
>          print (sumFile contents)
>              where sumFile = sum . map read . L.words

To this:
import qualified Data.ByteString.Lazy.Char8 as B
import Data.Maybe (mapMaybe)

main = do
        contents <- B.getContents
        print (sumFile contents)
        where sumFile = sum . map fst . mapMaybe B.readInt . B.words

Now, there is  a problem with this; the original version will not
read integers that are followed by garbage, whereas this one will.
The difference is between readInt and read:

(read "2a") will fail,
(readInt "2a") will return Just (2,"a")

readInt is really designed with parsers in mind. So, we need to filter
out anything readInt returns with a non-empty string in the right part
of the tuple. Ending up with this:

sumFile = sum . map fst . filter (B.null . snd) . mapMaybe B.readInt . B.words

Actually, after saying this, the original version probably
just dies when given garbage :P
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Gmane