19 Feb 2013 12:10

## ANN: data-fix-cse -- Common subexpression elimination for EDSLs

I'm glad to announce the package for Common subexpression elimination [1].
It's an implementation of the hashconsig algorithm as described in the paper
'Implementing Explicit and Finding Implicit Sharing in EDSLs' by Oleg Kiselyov.

Main point of the library is to define this algorithm in the most generic way.
You can define the AST for your DSL as fixpoint type[2]. And then all you need
to use the library is to define the instance for type class `Traversable`.
This idea is inspired by `data-reify` [3] package which you can use to transform
your ASTs to DAGs too. But it relies on inspection of the references for values
when `data-fix-cse` doesn't sacrifices the purity.

A short example:

Let's define a tiny DSL for signals

import Data.Fix

type Name = String

type E = Fix Exp

data Exp a = Const Double | ReadPort Name | Tfm Name [a] | Mix a a
deriving (Show, Eq, Ord)

We can make constant signals, read them from some ports and transform them
(apply some named function to the list of signals) and mix two signals.

Let's define an instance of the Traversable (hence for the Functor and Foldable)

import Control.Applicative

import Data.Monoid
import Data.Traversable
import Data.Foldable

instance Functor Exp where
fmap f x = case x of
Const d -> Const d
Mix a b -> Mix (f a) (f b)
Tfm n as -> Tfm n \$ fmap f as

instance Foldable Exp where
foldMap f x = case x of
Mix a b -> f a <> f b
Tfm n as -> mconcat \$ fmap f as
_ -> mempty

instance Traversable Exp where
traverse f x = case x of
Mix a b -> Mix <\$> f a <*> f b
Tfm n as -> Tfm n <\$> traverse f as
a -> pure a

Now we can use the functio `cse`

cse :: (Eq (f Int), Ord (f Int), Traversable f) => Fix f -> Dag f

to transform our AST to DAG. DAG is already sorted.

Later we can define a handy wrapper to hide the details from the client

newtype Sig = Sig { unSig :: E }
You can find examples in the package archive

Extra-Source-Files: test/Exp.hs test/Impl.hs
test/Expl.hsIf you want to see a real world example of usage you can find it
in the csound-expression[4]. An edsl for the Csound language.

One side-note form my experience: Fixpoint types can be very flexible.
It's easy to compose them. If suddenly we need to add some extra data
to all cases from the example above we can easily do it with just another
Functor:

Imagine that we want to use a SampleRate value with all signals.
Then we can do it like this:

type E = Fix SampledExp

data SampledExp a = SampledExp SampleRate (Exp a)

then we should define an instance of the type class Traversable
for our new type SampleRate. The Exp doesn't change.

Anton
```_______________________________________________
```
19 Feb 2013 13:30

### Re: ANN: data-fix-cse -- Common subexpression elimination for EDSLs

```2013-02-19 12:10, Anton Kholomiov skrev:
> I'm glad to announce the package for Commonsubexpression elimination [1].
> It's an implementation of the hashconsig algorithm as described in the
> paper
> 'Implementing Explicit and Finding Implicit Sharing in EDSLs' by Oleg
> Kiselyov.
>
> Main point of the library is to define this algorithm in the most
> generic way.
> You can define the AST for your DSL as fixpoint type[2]. And then all
> you need
> to use the library is to define the instance for type class `Traversable`.

One way to make the library even more useful would have been to base it
on compdata instead of data-fix. Compdata has support for composable
types and lots of extra functionality. On the other hand, it's easy
enough to translate from compdata terms to your `Fix`...

> One side-note form my experience: Fixpoint types can be very flexible.
> It's easy to compose them. If suddenly we need to add some extra data
> to all cases from the example above we can easily do it with just another
> Functor:
>
> Imagine that we want to use a SampleRate value with all signals.
> Then we can do it like this:
>
> type E = Fix SampledExp
>
> data SampledExp a = SampledExp SampleRate (Exp a)
>
> then we should define an instance of the type class Traversable
> for our new type SampleRate. The Exp doesn't change.

Very useful indeed! A more principled way to extend data types in this
way is Data Types à la Carte:

http://dx.doi.org/10.1017/S0956796808006758

(Implemented in compdata.)

/ Emil
```
19 Feb 2013 14:12

### Re: ANN: data-fix-cse -- Common subexpression elimination for EDSLs

There are several packages that already define fixpoints (another one is
about unification), but all packages that I'm aware of define a lot of functionality
that I don't need (and actually don't understand, packages with fixpoint types
tend to be rather dense with math). I'd like it to be simple and lightweight.
Just fixpoints, just folds and unfolds.

2013/2/19 Emil Axelsson
2013-02-19 12:10, Anton Kholomiov skrev:
I'm glad to announce the package for Commonsubexpression elimination [1].

It's an implementation of the hashconsig algorithm as described in the
paper
'Implementing Explicit and Finding Implicit Sharing in EDSLs' by Oleg
Kiselyov.

Main point of the library is to define this algorithm in the most
generic way.
You can define the AST for your DSL as fixpoint type[2]. And then all
you need
to use the library is to define the instance for type class `Traversable`.

One way to make the library even more useful would have been to base it on compdata instead of data-fix. Compdata has support for composable types and lots of extra functionality. On the other hand, it's easy enough to translate from compdata terms to your `Fix`...

One side-note form my experience: Fixpoint types can be very flexible.
It's easy to compose them. If suddenly we need to add some extra data
to all cases from the example above we can easily do it with just another
Functor:

Imagine that we want to use a SampleRate value with all signals.
Then we can do it like this:

type E = Fix SampledExp

data SampledExp a = SampledExp SampleRate (Exp a)

then we should define an instance of the type class Traversable
for our new type SampleRate. The Exp doesn't change.

Very useful indeed! A more principled way to extend data types in this way is Data Types à la Carte:

http://dx.doi.org/10.1017/S0956796808006758

(Implemented in compdata.)

/ Emil

```_______________________________________________
```
19 Feb 2013 14:28

### Re: ANN: data-fix-cse -- Common subexpression elimination for EDSLs

```Fully understandable! Compdata would be quite a heavy dependency for

I'm just generally fond of the idea of collecting all DSL implementation
tricks under one umbrella. That requires using the same term representation.

/ Emil

2013-02-19 14:12, Anton Kholomiov skrev:
> There are several packages that already define fixpoints (another one is
> about unification), but all packages that I'm aware of define a lot of
> functionality
> that I don't need (and actually don't understand, packages with fixpoint
> types
> tend to be rather dense with math). I'd like it to be simple and
> lightweight.
> Just fixpoints, just folds and unfolds.
>
>
> 2013/2/19 Emil Axelsson <emax <at> chalmers.se <mailto:emax <at> chalmers.se>>
>
>     2013-02-19 12:10, Anton Kholomiov skrev:
>
>         I'm glad to announce the package for Commonsubexpression
>         elimination [1].
>
>         It's an implementation of the hashconsig algorithm as described
>         in the
>         paper
>         'Implementing Explicit and Finding Implicit Sharing in EDSLs' by
>         Oleg
>         Kiselyov.
>
>         Main point of the library is to define this algorithm in the most
>         generic way.
>         You can define the AST for your DSL as fixpoint type[2]. And
>         then all
>         you need
>         to use the library is to define the instance for type class
>         `Traversable`.
>
>
>     One way to make the library even more useful would have been to base
>     it on compdata instead of data-fix. Compdata has support for
>     composable types and lots of extra functionality. On the other hand,
>     it's easy enough to translate from compdata terms to your `Fix`...
>
>
>
>
>         One side-note form my experience: Fixpoint types can be very
>         flexible.
>         It's easy to compose them. If suddenly we need to add some extra
>         data
>         to all cases from the example above we can easily do it with
>         just another
>         Functor:
>
>         Imagine that we want to use a SampleRate value with all signals.
>         Then we can do it like this:
>
>         type E = Fix SampledExp
>
>         data SampledExp a = SampledExp SampleRate (Exp a)
>
>         then we should define an instance of the type class Traversable
>         for our new type SampleRate. The Exp doesn't change.
>
>
>     Very useful indeed! A more principled way to extend data types in
>     this way is Data Types à la Carte:
>
>     http://dx.doi.org/10.1017/__S0956796808006758
>     <http://dx.doi.org/10.1017/S0956796808006758>
>
>     (Implemented in compdata.)
>
>     / Emil
>
>
```
20 Feb 2013 01:58

### Re: ANN: data-fix-cse -- Common subexpression elimination for EDSLs

What a delightfully elegant approach to CSE! I've been thinking about CSE for DSELs and about functor fixpoints, but it never occurred to me to put the two together.

Do you think the approach can be extended for non-regular (nested) algebraic types (where the recursive data type is sometimes at a different type instance)? For instance, it's very handy to use GADTs to capture embedded language types in host language (Haskell) types, which leads to non-regularity.

- Conal

On Tue, Feb 19, 2013 at 3:10 AM, Anton Kholomiov wrote:
I'm glad to announce the package for Common subexpression elimination [1].
It's an implementation of the hashconsig algorithm as described in the paper
'Implementing Explicit and Finding Implicit Sharing in EDSLs' by Oleg Kiselyov.

Main point of the library is to define this algorithm in the most generic way.
You can define the AST for your DSL as fixpoint type[2]. And then all you need
to use the library is to define the instance for type class `Traversable`.
This idea is inspired by `data-reify` [3] package which you can use to transform
your ASTs to DAGs too. But it relies on inspection of the references for values
when `data-fix-cse` doesn't sacrifices the purity.

A short example:

Let's define a tiny DSL for signals

import Data.Fix

type Name = String

type E = Fix Exp

data Exp a = Const Double | ReadPort Name | Tfm Name [a] | Mix a a
deriving (Show, Eq, Ord)

We can make constant signals, read them from some ports and transform them
(apply some named function to the list of signals) and mix two signals.

Let's define an instance of the Traversable (hence for the Functor and Foldable)

import Control.Applicative

import Data.Monoid
import Data.Traversable
import Data.Foldable

instance Functor Exp where
fmap f x = case x of
Const d -> Const d
Mix a b -> Mix (f a) (f b)
Tfm n as -> Tfm n \$ fmap f as

instance Foldable Exp where
foldMap f x = case x of
Mix a b -> f a <> f b
Tfm n as -> mconcat \$ fmap f as
_ -> mempty

instance Traversable Exp where
traverse f x = case x of
Mix a b -> Mix <\$> f a <*> f b
Tfm n as -> Tfm n <\$> traverse f as
a -> pure a

Now we can use the functio `cse`

cse :: (Eq (f Int), Ord (f Int), Traversable f) => Fix f -> Dag f

to transform our AST to DAG. DAG is already sorted.

Later we can define a handy wrapper to hide the details from the client

newtype Sig = Sig { unSig :: E }
You can find examples in the package archive

Extra-Source-Files: test/Exp.hs test/Impl.hs
test/Expl.hsIf you want to see a real world example of usage you can find it
in the csound-expression[4]. An edsl for the Csound language.

One side-note form my experience: Fixpoint types can be very flexible.
It's easy to compose them. If suddenly we need to add some extra data
to all cases from the example above we can easily do it with just another
Functor:

Imagine that we want to use a SampleRate value with all signals.
Then we can do it like this:

type E = Fix SampledExp

data SampledExp a = SampledExp SampleRate (Exp a)

then we should define an instance of the type class Traversable
for our new type SampleRate. The Exp doesn't change.

Anton

_______________________________________________

```_______________________________________________
```
20 Feb 2013 06:28

### Re: ANN: data-fix-cse -- Common subexpression elimination for EDSLs

Do you think the approach can be extended for non-regular (nested) algebraic types (where the recursive data type is sometimes at a different type instance)? For instance, it's very handy to use GADTs to capture embedded language types in host language (Haskell) types, which leads to non-regularity.

I'm not sure I understand the case you are talking about. Can you write a simple example
of the types like this?

Cheers,
Anton
```_______________________________________________
```
23 Feb 2013 02:12

### Re: ANN: data-fix-cse -- Common subexpression elimination for EDSLs

On Tue, Feb 19, 2013 at 9:28 PM, Anton Kholomiov wrote:

Do you think the approach can be extended for non-regular (nested) algebraic types (where the recursive data type is sometimes at a different type instance)? For instance, it's very handy to use GADTs to capture embedded language types in host language (Haskell) types, which leads to non-regularity.

I'm not sure I understand the case you are talking about. Can you write a simple example
of the types like this?

Here's an example of a type-embedded DSEL, represented as a GADT:

> data E :: * -> * where
>   Lit :: Show a => a -> E a
>   Op  :: Op a -> E a
>   App :: E (a -> b) -> E a -> E b
>   ...
>
> data Op :: * -> * where
>   Add :: Num a => E (a -> a -> a)
>   Mul :: Num a => E (a -> a -> a)
>   Neg :: Num a => E (a -> a)
>   ...

-- Conal
```_______________________________________________
```
23 Feb 2013 12:04

### Re: ANN: data-fix-cse -- Common subexpression elimination for EDSLs

I don't know how to express it. You need to have some dynamic representation since
dag is a container of `(Int, f Int)`. I've tried to go along this road

type Exp a = Fix (E a)

data E c :: * -> * where
Lit :: Show a => a -> E a c
Op  :: Op a -> E a c
App :: Phantom (a -> b) c -> Phantom a c -> E b c

data Op :: * -> * where
Add :: Num a => Op (a -> a -> a)
Mul :: Num a => Op (a -> a -> a)
Neg :: Num a => Op (a -> a)

newtype Phantom a b = Phantom { unPhantom :: b }

But got stuck with the definition of

app :: Exp (a -> b) -> Exp a -> Exp b
app f a = Fix \$ App (Phantom f) (Phantom a)

App requires the arguments to be of the same type (in the second type argument `c`).

2013/2/23 Conal Elliott

On Tue, Feb 19, 2013 at 9:28 PM, Anton Kholomiov wrote:

Do you think the approach can be extended for non-regular (nested) algebraic types (where the recursive data type is sometimes at a different type instance)? For instance, it's very handy to use GADTs to capture embedded language types in host language (Haskell) types, which leads to non-regularity.

I'm not sure I understand the case you are talking about. Can you write a simple example
of the types like this?

Here's an example of a type-embedded DSEL, represented as a GADT:

> data E :: * -> * where
>   Lit :: Show a => a -> E a
>   Op  :: Op a -> E a
>   App :: E (a -> b) -> E a -> E b
>   ...
>
> data Op :: * -> * where
>   Add :: Num a => E (a -> a -> a)
>   Mul :: Num a => E (a -> a -> a)
>   Neg :: Num a => E (a -> a)
>   ...

-- Conal

```_______________________________________________
```
21 Feb 2013 11:49

### Re: ANN: data-fix-cse -- Common subexpression elimination for EDSLs

```This should be possible using higher-order terms, as in

The only complication I see is that the Dag nodes would get
heterogeneous types requiring existential quantification with a
`Typeable` constraint. A better representation might be typed ASGs [1]

Syntactic has typed ASTs and it has a module that does something similar
to data-fix-cse (uses a combination of stable names and hashing), but it
needs some fixing up.

/ Emil

[1]: http://dl.acm.org/citation.cfm?id=2426909

2013-02-20 01:58, Conal Elliott skrev:
> Do you think the approach can be extended for non-regular (nested)
> algebraic types (where the recursive data type is sometimes at a
> different type instance)? For instance, it's very handy to use GADTs to
> capture embedded language types in host language (Haskell) types, which