Alfredo Di Napoli | 16 Oct 14:46 2012

I would like a clarification about Enumerators and Iteratees, please :)

Hi guys,


I've started playing with Iteratee and Enumerators: very cool and addictive stuff.

I have written this simple code:


In a nutshell, it returns the number of occurrences of a single character when the argument passed on the command line is a single character, or else the number of lines in the entire file.
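The code itself didn't make it into the archive, so here is a minimal sketch of the same idea (counting occurrences of a byte in a file in constant memory) using plain chunked Handle IO from bytestring; the function name and chunk size are my own, not from the original program:

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.IO

-- Count occurrences of a byte by reading the file in fixed-size
-- chunks, so only one chunk is resident in memory at a time.
-- The bang on the accumulator keeps it evaluated at every step.
countByteInFile :: Word8 -> FilePath -> IO Int
countByteInFile c path = withFile path ReadMode (go 0)
  where
    go !acc h = do
      chunk <- B.hGetSome h 65536   -- read at most 64 KiB at a time
      if B.null chunk
        then return acc
        else go (acc + B.count c chunk) h
```

Calling `countByteInFile 10 path` (10 is the byte value of `'\n'`) gives the line count; any other byte gives the character-occurrence count.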

I've tried it first on a small file (~100MB) and then on a huge one (~3GB). As far as I understand, iteratee IO should be a smart way to do IO, avoiding keeping the entire file in memory; what I observe in the second case, instead, is a sort of memory leak. The memory grows and grows until the entire machine hangs.

If I split the huge file into "small" chunks, memory consumption is still high but the computation terminates and is fast.

So my two questions:

a) What am I missing? Should the memory usage be constant? And how can I achieve that?
b) I'm using the enumerator package because, as far as I can see, it is the most used. How does it compare with "iteratee"? Which of the two is more performant?


Bye and thanks,
A.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Gregory Collins | 16 Oct 14:53 2012

Re: I would like a clarification about Enumerators and Iteratees, please :)

You have a space leak in "countCharBS". Put a bang pattern on your accumulator.
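The leak Gregory is pointing at comes from the accumulator being carried along as an ever-growing chain of unevaluated thunks; a bang pattern forces it at each step. The original `countCharBS` isn't shown in the thread, so this is an illustrative sketch of the fix on a plain recursive fold, not the actual function:

```haskell
{-# LANGUAGE BangPatterns #-}

-- Leaky: `acc` is never demanded until the very end, so each step
-- builds a thunk (acc + 1) on top of all the previous ones.
countCharLeaky :: Char -> String -> Int
countCharLeaky c = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (if x == c then acc + 1 else acc) xs

-- Fixed: the bang pattern forces `acc` to weak head normal form
-- on every call, so the accumulator stays a plain evaluated Int.
countCharStrict :: Char -> String -> Int
countCharStrict c = go 0
  where
    go !acc []     = acc
    go !acc (x:xs) = go (if x == c then acc + 1 else acc) xs
```

Both versions compute the same result; only the strict one runs in constant space on large inputs (compiling with `-O` lets GHC's strictness analysis often fix the leaky version too, but the bang pattern makes the intent explicit).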

On Tue, Oct 16, 2012 at 2:46 PM, Alfredo Di Napoli <alfredo.dinapoli <at> gmail.com> wrote:
--
Gregory Collins <greg <at> gregorycollins.net>