Nick Rudnick | 2 Feb 21:19 2013

Substantial (1:10??) system dependencies of runtime performance??

Dear all,


For quite a while now I have been observing this issue with some curiosity; yesterday it happened again, when a program that previously took well over an hour needed only about ten minutes after a system reboot (recent Ubuntu) with no browser started -- so I finally decided to post this.

I still can't reproduce these effects reliably, but there are indications that they are connected with browser use (mostly Google Chrome, usually with tens of windows and ~100 folders open) and especially with video players; closing or killing the browser doesn't seem to free the resources -- a reboot, or at least a suspend to disk, seems to be necessary (suspend to RAM doesn't seem to be enough).

Roughly, I would say the difference in runtime can often reach a factor of as much as 1:10 -- so I am curious whether this has already been observed, or better still discussed, elsewhere. I have spoken to somebody about it, and our only plausible conclusion was that software like a web browser can rather aggressively claim system resources high up in the privilege hierarchy (cache? registers?), so that they are no longer available to other programs.

I hope this is interesting to others as well; I suppose it is an issue for anybody writing computation-intensive code that will run on ordinary systems alongside other applications, while still having to give the client a runtime estimate.

Maybe I have overlooked some libraries that can already inspect the system state in this regard, or even tell a Haskell program to behave 'less nicely' when other applications are known to be of lower priority?

Thanks a lot in advance, Nick
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Gwern Branwen | 2 Feb 21:52 2013

Re: Substantial (1:10??) system dependencies of runtime performance??

On Sat, Feb 2, 2013 at 3:19 PM, Nick Rudnick <nick.rudnick <at> gmail.com> wrote:
> Roughly, I would say the difference in runtime can often reach a factor
> of as much as 1:10 -- so I am curious whether this has already been
> observed, or better still discussed, elsewhere. I have spoken to somebody
> about it, and our only plausible conclusion was that software like a web
> browser can rather aggressively claim system resources high up in the
> privilege hierarchy (cache? registers?), so that they are no longer
> available to other programs.

Maybe the Haskell program requires a lot of disk IO? That could easily
lead to a big performance change since disk is so slow compared to
everything else these days. You could try looking with 'lsof' to see
if the browser has a ton of files open or try running the Haskell
program with higher or lower disk IO priority via 'ionice'.
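
A rough sketch of the kind of check I mean (untested, and the PID is made
up) -- shelling out to lsof from Haskell to see how many files a process
is holding open before you kick off a long run:

    import System.Process (readProcess)

    -- lsof prints one header line followed by one line per open file,
    -- so the open-file count is the number of output lines minus one.
    openFileCount :: String -> IO Int
    openFileCount pid = do
      out <- readProcess "lsof" ["-p", pid] ""
      return (max 0 (length (lines out) - 1))

    main :: IO ()
    main = do
      n <- openFileCount "12345"   -- hypothetical browser PID
      putStrLn ("open files: " ++ show n)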

-- 
gwern
http://www.gwern.net
Nick Rudnick | 2 Feb 22:09 2013

Re: Substantial (1:10??) system dependencies of runtime performance??

Hi Gwern,

thanks for the interesting info. I quite often process CSV files of about 100M-1G.

Thanks a lot, Nick

Erik de Castro Lopo | 3 Feb 01:17 2013

Re: Substantial (1:10??) system dependencies of runtime performance??

Nick Rudnick wrote:

> thanks for the interesting info. I quite often process CSV files of
> about 100M-1G.

What library are you using to process the CSV? I have had problems
with excessive laziness, where processing a 75 meg CSV file was
consuming 500+ megabytes of memory; after I fixed it, memory usage
dropped to under a megabyte and processing time dropped from over 10
minutes to about 2 minutes.

I blogged my problem and solution here:

    http://www.mega-nerd.com/erikd/Blog/CodeHacking/Haskell/my_space_is_leaking.html

I should probably revisit that, because the problem can likely be
fixed without deepseq-generics, just by using BangPatterns.
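
Roughly what I have in mind (a toy sketch, not the actual code from the
blog post -- the row type is made up):

    {-# LANGUAGE BangPatterns #-}

    -- The bang on the accumulator forces each intermediate sum as it is
    -- produced, so the fold runs in constant space instead of building
    -- up a chain of thunks.
    sumValues :: [(String, Double)] -> Double
    sumValues = go 0
      where
        go !acc []              = acc
        go !acc ((_, x) : rest) = go (acc + x) rest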

Erik
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
Ozgun Ataman | 3 Feb 02:14 2013

Re: Substantial (1:10??) system dependencies of runtime performance??

If you are doing row-by-row transformations, I would recommend giving a try to my csv-conduit or
csv-enumerator packages on Hackage. They were designed with constant space operation in mind, which may
help you here. 

If you're keeping an accumulator around, however, you may still run into issues with too much laziness. 

Ozgun

Johan Tibell | 3 Feb 02:19 2013

Re: Substantial (1:10??) system dependencies of runtime performance??

On Sat, Feb 2, 2013 at 5:14 PM, Ozgun Ataman <ozataman <at> gmail.com> wrote:
> If you are doing row-by-row transformations, I would recommend giving a try
> to my csv-conduit or csv-enumerator packages on Hackage. They were designed
> with constant space operation in mind, which may help you here.
>
> If you're keeping an accumulator around, however, you may still run into
> issues with too much laziness.

The cassava package also has a Streaming and an Incremental module for
constant space parsing.
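
Something along these lines, if a strict fold over the rows is all you
need (untested sketch; the file name and record type are made up):

    import qualified Data.ByteString.Lazy as BL
    import           Data.Csv (HasHeader (NoHeader))
    import           Data.Csv.Streaming (Records, decode)
    import qualified Data.Foldable as F

    main :: IO ()
    main = do
      csv <- BL.readFile "data.csv"      -- hypothetical 100M-1G input
      let rows  = decode NoHeader csv :: Records (String, Double)
          -- strict left fold over the stream; rows that fail to parse
          -- are skipped when folding over Records
          total = F.foldl' (\acc (_, x) -> acc + x) 0 rows
      print total
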
Erik de Castro Lopo | 3 Feb 04:04 2013

Re: Substantial (1:10??) system dependencies of runtime performance??

Ozgun Ataman wrote:

> If you are doing row-by-row transformations, I would recommend
> giving a try to my csv-conduit

I was using csv-conduit.

> If you're keeping an accumulator around, however, you may still
> run into issues with too much laziness. 

This was the problem, which I solved with deepseq-generics. However,
I suspect deepseq-generics was a bigger hammer than was actually
needed.
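
What I have in mind is something like this (sketch only) -- with strict
fields in the accumulator, each update is forced as it happens, so there
is nothing left over to deepseq at the end:

    -- Strict fields mean a plain foldl' over the rows fully evaluates the
    -- running counts at every step; no NFData instance or generic forcing
    -- is needed afterwards.
    data Acc = Acc { rowCount :: !Int, total :: !Double }

    step :: Acc -> (String, Double) -> Acc
    step (Acc n t) (_, x) = Acc (n + 1) (t + x)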

Erik
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
