Tom Ellis | 24 Feb 18:49 2013

Thunks and GHC pessimisation

To avoid retaining a large lazy data structure in memory it is useful to
hide it behind a function call.  Below, "many" is used twice.  It is hidden
behind a function call so it can be garbage collected between uses.  That's
good.  When compiling with "-O" it seems that GHC 7.4.1 decides to keep it
in memory anyway.  That's bad.  (I can't read core so I don't know exactly
what's going on).  Replacing one of the "many" in "twice" with
"different_many" makes everything fine again.

Is this considered a bug in GHC?  Is it a known bug?  It is incredibly
concerning that GHC would perform this kind of pessimisation.

Tom

% cat thunkfail.hs                                                               
{-# OPTIONS_GHC -fno-warn-unused-binds #-}

import Data.List

million :: Int
million = 10 ^ (6 :: Int)

many :: () -> [Int]
many () = [1..million]

different_many :: () -> [Int]
different_many () = [1..million]

twice :: (Int, Int)
twice = (foldl' (+) 0 (many ()), foldl' (+) 0 (many ()))


Joachim Breitner | 24 Feb 19:12 2013

Re: Thunks and GHC pessimisation

Hi,

I believe, without checking, that

Am Sonntag, den 24.02.2013, 17:49 +0000 schrieb Tom Ellis:
> many :: () -> [Int]
> many () = [1..million]

is indeed called multiple times. The problem is that [1..million] does not
depend on the parameter, so GHC will float it out to a top-level
declaration and hence share it between the multiple calls to many.
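The effect of that float-out can be sketched in source terms (an illustration of the transformation, not GHC's actual Core output; the name lvl is made up):

```haskell
million :: Int
million = 10 ^ (6 :: Int)

-- Before floating: the list looks local to the function body.
many :: () -> [Int]
many () = [1 .. million]

-- After full laziness, the program behaves roughly as if the constant
-- subexpression had been written at the top level, so every call to
-- many' returns the same shared list, which is then retained.
lvl :: [Int]
lvl = [1 .. million]

many' :: () -> [Int]
many' () = lvl
```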

You should try:

> million :: () -> Int
> million _ = 10 ^ (6 :: Int)
>
> many :: () -> [Int]
> many x = [1..million x]

Greetings,
Joachim

(Maybe I should continue to work on http://arxiv.org/abs/1106.3478 and
related ideas, like annotating thunks in the code as unshareable... or
does this mostly occur in small example programs like this?)

-- 
Joachim "nomeata" Breitner
Debian Developer

Tom Ellis | 24 Feb 19:41 2013

Re: Thunks and GHC pessimisation

On Sun, Feb 24, 2013 at 07:12:24PM +0100, Joachim Breitner wrote:
> You should try:
> 
> > million :: () -> Int
> > million _ = 10 ^ (6 :: Int)
> >
> > many :: () -> [Int]
> > many x = [1..million x]

Thanks Joachim, but that doesn't work either.

Tom

% cat thunkfail.hs 
{-# OPTIONS_GHC -fno-warn-unused-binds #-}

import Data.List

million :: () -> Int
million _ = 10 ^ (6 :: Int)

many :: () -> [Int]
many x = [1..million x]

different_many :: () -> [Int]
different_many x = [1..million x]

twice :: (Int, Int)
twice = (foldl' (+) 0 (many ()), foldl' (+) 0 (many ()))


Joachim Breitner | 25 Feb 09:44 2013

Re: Thunks and GHC pessimisation

Hi,

Am Sonntag, den 24.02.2013, 18:41 +0000 schrieb Tom Ellis:
> On Sun, Feb 24, 2013 at 07:12:24PM +0100, Joachim Breitner wrote:
> > You should try:
> > 
> > > million :: () -> Int
> > > million _ = 10 ^ (6 :: Int)
> > >
> > > many :: () -> [Int]
> > > many x = [1..million x]
> 
> Thanks Joachim, but that doesn't work either.

oh, strange, I thought I had added
        Maybe you will need {-# NOINLINE million #-} as well.
to the mail. I must have skipped or accidentally removed it... anyway,
with that pragma it works.
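Putting the suggestion and the pragma together, the intended fix would look something like this (a sketch assembled from the thread; whether it actually defeats the float-out depends on the GHC version and flags):

```haskell
import Data.List (foldl')

-- NOINLINE keeps GHC from inlining 'million' and discovering that the
-- list below is really a constant it could float out and share.
million :: () -> Int
million _ = 10 ^ (6 :: Int)
{-# NOINLINE million #-}

many :: () -> [Int]
many x = [1 .. million x]

-- Each component should now rebuild the list, so the two traversals
-- need not retain it in memory at the same time.
twice :: (Int, Int)
twice = (foldl' (+) 0 (many ()), foldl' (+) 0 (many ()))
```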

Greetings,
Joachim

-- 
Joachim "nomeata" Breitner
Debian Developer
  nomeata <at> debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C
  JID: nomeata <at> joachim-breitner.de | http://people.debian.org/~nomeata


Don Stewart | 24 Feb 19:25 2013

Re: Thunks and GHC pessimisation

In toy examples like this it will generally be hard to convince GHC not to collapse your program down to a constant once you turn up the optimization level.

In particular, you are implying -ffull-laziness with -O (or -O2), which can increase sharing.

GHC doesn't implement complete full laziness. When optimisation is on, and -fno-full-laziness is not given, some transformations that increase sharing are performed, such as extracting repeated computations from a loop. These are the same transformations that a fully lazy implementation would perform; the difference is that GHC doesn't consistently apply full laziness, so don't rely on it.
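The "repeated computation in a loop" case mentioned there can be illustrated in source terms (a hand-written before/after sketch, not compiler output):

```haskell
-- Before floating: 'sum xs' is under the lambda, so a naive reading
-- re-evaluates it for every element of ys.
addMean :: [Double] -> [Double] -> [Double]
addMean xs ys = map (\y -> y + sum xs / fromIntegral (length xs)) ys

-- After full laziness: the computation that does not depend on y is
-- hoisted out of the lambda and evaluated once, shared by all elements.
addMean' :: [Double] -> [Double] -> [Double]
addMean' xs ys =
  let m = sum xs / fromIntegral (length xs)
  in map (\y -> y + m) ys
```

Here the sharing is pure gain; the trouble in this thread arises when the hoisted value is a large structure that the floating then keeps alive.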


If you explicitly rely on this not happening, turn it off:

$ ghc -O2 -fno-full-laziness --make A.hs -rtsopts -fforce-recomp 
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

$ time ./A +RTS -M750k
(500000500000,500000500000)
./A +RTS -M750k  0.06s user 0.00s system 97% cpu 0.069 total

A 750k heap should be enough for anyone :)

-- Don

On Sun, Feb 24, 2013 at 5:49 PM, Tom Ellis <tom-lists-haskell-cafe-2013 <at> jaguarpaw.co.uk> wrote:
To avoid retaining a large lazy data structure in memory it is useful to
hide it behind a function call.  Below, "many" is used twice.  It is hidden
behind a function call so it can be garbage collected between uses.  That's
good.  When compiling with "-O" it seems that GHC 7.4.1 decides to keep it
in memory anyway.  That's bad.  (I can't read core so I don't know exactly
what's going on).  Replacing one of the "many" in "twice" with
"different_many" makes everything fine again.

Is this considered a bug in GHC?  Is it a known bug?  It is incredibly
concerning that GHC would perform this kind of pessimisation.

Tom


% cat thunkfail.hs
{-# OPTIONS_GHC -fno-warn-unused-binds #-}

import Data.List

million :: Int
million = 10 ^ (6 :: Int)

many :: () -> [Int]
many () = [1..million]

different_many :: () -> [Int]
different_many () = [1..million]

twice :: (Int, Int)
twice = (foldl' (+) 0 (many ()), foldl' (+) 0 (many ()))

main :: IO ()
main = print twice

% ghc -fforce-recomp -Wall -Werror -rtsopts thunkfail.hs && ./thunkfail +RTS -M5M
[1 of 1] Compiling Main             ( thunkfail.hs, thunkfail.o )
Linking thunkfail ...
(1784293664,1784293664)

% ghc -O -fforce-recomp -Wall -Werror -rtsopts thunkfail.hs && ./thunkfail +RTS -M5M
[1 of 1] Compiling Main             ( thunkfail.hs, thunkfail.o )
Linking thunkfail ...
Heap exhausted;
Current maximum heap size is 5242880 bytes (5 MB);
use `+RTS -M<size>' to increase it.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Tom Ellis | 24 Feb 19:55 2013

Re: Thunks and GHC pessimisation

On Sun, Feb 24, 2013 at 06:25:56PM +0000, Don Stewart wrote:
> If you explicitly rely on this not happening, turn it off:
> 
> $ ghc -O2 -fno-full-laziness --make A.hs -rtsopts -fforce-recomp
> [1 of 1] Compiling Main             ( A.hs, A.o )
> Linking A ...
> 
> $ time ./A +RTS -M750k
> (500000500000,500000500000)
> ./A +RTS -M750k  0.06s user 0.00s system 97% cpu 0.069 total

Thanks Don, that's good to know.  I am not currently bitten by this issue in
any production code, but I am concerned it will crop up when I least expect
it.

Tom
oleg | 26 Feb 11:00 2013

Thunks and GHC pessimisation


Tom Ellis wrote:
> To avoid retaining a large lazy data structure in memory it is useful to
> hide it behind a function call.  Below, "many" is used twice.  It is hidden
> behind a function call so it can be garbage collected between uses. 

As you discovered, it is quite challenging to ``go against the grain''
and force recomputation. GHC is quite good at avoiding
recomputation. This is a trade-off of time vs space. For a large
search tree, space is at a premium, and laziness and similar
strategies make exactly the wrong trade-off.

The solution (which I've seen in some of the internal library code) is
to confuse GHC with extra functions:
        http://okmij.org/ftp/Haskell/misc.html#memo-off
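The trick amounts to hiding the computation behind an argument-taking function that the optimiser cannot see through; a minimal paraphrase (my own sketch, not the code from that page, and not guaranteed to survive every GHC version or flag combination):

```haskell
-- A barrier GHC must not inline: each call re-runs f, so the result is
-- rebuilt rather than shared between uses (assuming the optimiser does
-- not see through the NOINLINE boundary).
{-# NOINLINE recompute #-}
recompute :: (Int -> [Int]) -> [Int]
recompute f = f 1

-- The body depends on the argument, so full laziness has no constant
-- subexpression to float out.
bigList :: Int -> [Int]
bigList lo = [lo .. 1000000]

-- Each use should construct the list afresh:
sums :: (Int, Int)
sums = (sum (recompute bigList), sum (recompute bigList))
```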

So, eventually it is possible to force recomputation. But the solution
leaves a poor taste -- fighting a compiler is never a good idea. So,
this is a bug of sorts -- not a bug in GHC, but in lazy
evaluation. Lazy evaluation is not the best evaluation strategy. It is
a trade-off, which suits one large class of problems and punishes
another large class of problems.
Tom Ellis | 26 Feb 22:13 2013

Re: Thunks and GHC pessimisation

On Tue, Feb 26, 2013 at 10:00:32AM -0000, oleg <at> okmij.org wrote:
> Tom Ellis wrote:
> > To avoid retaining a large lazy data structure in memory it is useful to
> > hide it behind a function call.  Below, "many" is used twice.  It is hidden
> > behind a function call so it can be garbage collected between uses. 
> 
> As you discovered, it is quite challenging to ``go against the grain''
> and force recomputation. GHC is quite good at avoiding
> recomputation. This is a trade-off, of time vs space. For large
> search tree, it is space that is a premium, and laziness and similar
> strategies are exactly the wrong trade-off. 

Hi Oleg,

I have good news: Don's suggestion of -fno-full-laziness that fixed the
space leak in my toy example also fixes the space leak in your iterative
deepening code.  To be precise, there is no longer any space leak in your
second implementation where subtrees are hidden behind a function call.

With full laziness your second implementation consumes 99M. To avoid
sharing, hacks are required, and your third implementation consumes 2M.
However, without full laziness your second implementation consumes only 2M.

> The solution (which I've seen in some of the internal library code) is
> to confuse GHC with extra functions:
>         http://okmij.org/ftp/Haskell/misc.html#memo-off

Yes, when I read your exposition several months ago it started the thinking
process which culminated in this thread.

> So, eventually it is possible to force recomputation. But the solution
> leaves a poor taste -- fighting a compiler is never a good idea.

I agree, but I'm glad to discover that disabling full laziness is enough to
avoid having to fight the compiler.

> So, this is a bug of sort -- not the bug of GHC, but of lazy evaluation. 
> Lazy evaluation is not the best evaluation strategy.  It is a trade-off,
> which suits a large class of problems and punishes another large class of
> problems.

I'm not as pessimistic as you on this issue.  If the results of function
calls may be shared with previous callers then the situation is problematic. 
However, if it is known that function calls return new thunks then the
programmer has a lot of power to avoid space leaks.

My conclusion is that this is not a bug in lazy evaluation. I would prefer
GHC to be more discerning about when it applies full laziness. It is
unfortunate that it will apply it in cases where unbounded amounts of
additional memory usage may result.

Tom

% wget --quiet http://okmij.org/ftp/Haskell/STrees.hs

% ghc -O2 -fforce-recomp -rtsopts -main-is STrees.main2 STrees.hs && ./STrees iter 100 +RTS -tstderr > /dev/null
[1 of 1] Compiling STrees           ( STrees.hs, STrees.o )
Linking STrees ...
<<ghc: 159988256 bytes, 308 GCs, 12946822/42139812 avg/max bytes residency (10 samples), 99M in use, 0.00 INIT (0.00 elapsed), 8.40 MUT (8.40 elapsed), 0.95 GC (0.95 elapsed) :ghc>>

% ghc -O2 -fforce-recomp -rtsopts -main-is STrees.main3 STrees.hs && ./STrees iter 100 +RTS -tstderr > /dev/null
[1 of 1] Compiling STrees           ( STrees.hs, STrees.o )
Linking STrees ...
<<ghc: 6488675656 bytes, 12415 GCs, 214291/384036 avg/max bytes residency (37 samples), 2M in use, 0.00 INIT (0.00 elapsed), 8.64 MUT (8.64 elapsed), 0.32 GC (0.32 elapsed) :ghc>>

% ghc -O2 -fno-full-laziness -fforce-recomp -rtsopts -main-is STrees.main2 STrees.hs && ./STrees iter 100 +RTS -tstderr > /dev/null
[1 of 1] Compiling STrees           ( STrees.hs, STrees.o )
Linking STrees ...
<<ghc: 11235944260 bytes, 21510 GCs, 208262/456880 avg/max bytes residency (44 samples), 2M in use, 0.00 INIT (0.00 elapsed), 8.03 MUT (8.03 elapsed), 0.52 GC (0.52 elapsed) :ghc>>
