Peter Divianszky | 3 Nov 10:05 2012
Picon

more sharing in generated code

Hi,

I have a question about sharing at the Haskell run-time level.

Suppose we have a record update

   r { x = f (r x)}

and suppose that most of the time f returns it's argument unchanged.

I have the following questions:

1. Does the generated code for the record update build an identical 
record when f returns it's argument unchanged instead of sharing the old 
one? (I guess yes.)

2. Can we prevent building identical records?

    Recently I've heard about Q-combinators.
    Central idea: Change (f :: a -> a) to (f' :: a -> Maybe a) returning
    Nothing when the value didn't change.
    Then we can replace the record update with smarter code which
    preserves more sharing.

    My question is: Can (or could) we enable more sharing without
    changing the source code?
    Ideally the compiler would generate record update code which
    checks poiter-equality of the updated field value to decide
    whether a copy of the original record is needed or it can be shared
    (given an optimisation flag enabled).
(Continue reading)

Andreas Abel | 3 Nov 10:47 2012
Picon

Re: more sharing in generated code

On 03.11.12 10:05 AM, Peter Divianszky wrote:
> Suppose we have a record update
>
>    r { x = f (r x)}
>
> and suppose that most of the time f returns it's argument unchanged.
>
>     Recently I've heard about Q-combinators.
>     Central idea: Change (f :: a -> a) to (f' :: a -> Maybe a) returning
>     Nothing when the value didn't change.
>     Then we can replace the record update with smarter code which
>     preserves more sharing.

Just adding a remark here:  I actually played with these Q-combinators, 
they actually worsened performance of Agda.  The problem is that they 
make the code strict.  The performance loss due to strictness 
outweighted the potential performance gain by increased sharing. 
Q-combinators were developed by John Harrison in the context of ML, 
which is strict anyway.

I guess a compiler support for smart record update would not have the 
strictness penalty.

Cheers,
Andreas

--

-- 
Andreas Abel  <><      Du bist der geliebte Mensch.

Theoretical Computer Science, University of Munich
(Continue reading)

Peter Divianszky | 3 Nov 11:20 2012
Picon

Re: more sharing in generated code

On 03/11/2012 10:47, Andreas Abel wrote:
> On 03.11.12 10:05 AM, Peter Divianszky wrote:
>> Suppose we have a record update
>>
>>    r { x = f (r x)}
>>
>> and suppose that most of the time f returns it's argument unchanged.
>>
>>     Recently I've heard about Q-combinators.
>>     Central idea: Change (f :: a -> a) to (f' :: a -> Maybe a) returning
>>     Nothing when the value didn't change.
>>     Then we can replace the record update with smarter code which
>>     preserves more sharing.
>
> Just adding a remark here:  I actually played with these Q-combinators,
> they actually worsened performance of Agda.  The problem is that they
> make the code strict.  The performance loss due to strictness
> outweighted the potential performance gain by increased sharing.
> Q-combinators were developed by John Harrison in the context of ML,
> which is strict anyway.
>
> I guess a compiler support for smart record update would not have the
> strictness penalty.

Yes, for that we need a copy first and an update later.
This can be implemented by replacing every record update

    r' = r { x = y }

with
(Continue reading)

Peter Divianszky | 3 Nov 11:26 2012
Picon

Re: more sharing in generated code


On 03/11/2012 11:20, Peter Divianszky wrote:
> On 03/11/2012 10:47, Andreas Abel wrote:
>> On 03.11.12 10:05 AM, Peter Divianszky wrote:
>>> Suppose we have a record update
>>>
>>>    r { x = f (r x)}
>>>
>>> and suppose that most of the time f returns it's argument unchanged.
>>>
>>>     Recently I've heard about Q-combinators.
>>>     Central idea: Change (f :: a -> a) to (f' :: a -> Maybe a) returning
>>>     Nothing when the value didn't change.
>>>     Then we can replace the record update with smarter code which
>>>     preserves more sharing.
>>
>> Just adding a remark here:  I actually played with these Q-combinators,
>> they actually worsened performance of Agda.  The problem is that they
>> make the code strict.  The performance loss due to strictness
>> outweighted the potential performance gain by increased sharing.
>> Q-combinators were developed by John Harrison in the context of ML,
>> which is strict anyway.
>>
>> I guess a compiler support for smart record update would not have the
>> strictness penalty.
>
> Yes, for that we need a copy first and an update later.
> This can be implemented by replacing every record update
>
>     r' = r { x = y }
(Continue reading)

Joachim Breitner | 4 Nov 13:00 2012
Picon

Re: more sharing in generated code

Hi,

Am Samstag, den 03.11.2012, 11:26 +0100 schrieb Peter Divianszky:
> > This can be implemented by replacing every record update
> >
> >     r' = r { x = y }
> >
> > with
> >
> >     r' = r { x = unsafePerformIO (cond-update r r' (x r) y) }
> >
> > where we keep the current record update mechanism and implement
> > cond-update like
> >
>
> cond-update :: r -> r -> x -> IO x
> cond-update r_old r_new x_old x_new = do
>       eval x_new
>       b <- x_old === x_new
>       when b (replace r_new r_old)
>       return x_new
> 
> > where (===) is pointer-equality and replace is a low-level function
> > which replaces thunks in the heap.

I think one problem with this approach is that now, until "x r'" is
evaluated, you keep both r and r' alive. If r was _not_ retained by
anything else, then you are now preventing the GC from freeing it and
hence increasing the memory footprint.

(Continue reading)

Dennis Felsing | 3 Nov 11:52 2012
Picon

Re: more sharing in generated code

On 2012-11-03T10:05+0100, Peter Divianszky wrote:
> Suppose we have a record update
> 
>   r { x = f (r x)}

Hi Peter,

I think you mean this:

  r { x = f (x r) }

> 1. Does the generated code for the record update build an identical
> record when f returns it's argument unchanged instead of sharing the
> old one? (I guess yes.)

In GHC: In your example a new record is built, but all its entries (x in
this case) are shared. Here's how I found out:

  λ> let f = id
  λ> data R = R { x :: Int }
  λ> let r = R 0
  λ> let u = r { x = f (x r) }
  λ> :view r
  λ> :view u

:view is from ghc-vis[1], the resulting visualization is attached.

In this simple example, when you compile it with -O1 the compiler
figures out that u will always be the same as r and shares them. This
does not work when it's dynamic whether u will be identical to r.
(Continue reading)

Peter Divianszky | 3 Nov 13:39 2012
Picon

Re: more sharing in generated code

Hi Dennis,

 > I think you mean this:
>
>    r { x = f (x r) }

Yes, I made a typo.

Thanks for advising ghc-vis.

> In GHC: In your example a new record is built, but all its entries (x in
> this case) are shared.

The problem is, that in case of nested records, if an inner record is 
updated, the whole path to that record is copied.

Of course copying just the path to the inner data is a lot better than 
copying the whole tree of data, but my proposal is about to optimize it 
further: don't copy of the path if the updated value is identical to the 
original one. The Agda compiler may benefit by this optimization for 
example.

In fact, the last version of the proposal[^1] will copy the path to keep 
laziness properties, but it does it in such a way that the GC can 
collect the copied path instead of trying to collect the old path (which 
will if it is shared).

[^1]: 
http://www.haskell.org/pipermail/haskell-cafe/2012-November/104311.html

(Continue reading)


Gmane