8 Nov 2012 16:00

## Why doesn't GHC optimize recursive functions by converting them into `let`?

Hi,

recently I found out that some recursive functions can be greatly optimized just by rewriting them using `let` so that they're not recursive themselves any more (only the inner let is). To give an example:

> fix :: (a -> a) -> a
> fix f = f (fix f)

isn't optimized, because it's a recursive function and so it isn't inlined or anything. However, defining it as

> fix f = let x = f x in x

makes it much more efficient, since `fix` is not recursive any more, so it gets inlined and then optimized for a given argument `f`.

Another example is the well-known generic catamorphism function:

> newtype Fix f = Fix { unfix :: f (Fix f) }
>
> catam :: (Functor f) => (f a -> a) -> (Fix f -> a)
> catam f = f . fmap (catam f) . unfix

is not optimized. When `catam` is used on a data structure such as [] or a tree, at each level `fmap` creates new constructors that are immediately consumed by `f`. But when written as

> catam f = let u = f . fmap u . unfix in u

this gets inlined and then optimized, and constructor creation/consumption is fused into very effective code.

As one comment asked, this seems like something GHC could do automatically. Would it be difficult? Is there a reason against it? I suppose in cases where it doesn't help, the additional `let` would get optimized away, so no harm would be done. And in cases like `fix` or `catam` the gain would be substantial.

Best regards,
Petr
```_______________________________________________
```
8 Nov 2012 22:46

### Re: Why doesn't GHC optimize recursive functions by converting them into `let`?

Hey Petr, interesting post! (and links)

I imagine that one issue that would make it not a default activity by GHC is that this sort of strategy has the following implications:

1) ghc in general doesn't always want to inline.... in general, inlining increases the size of code! and depending on how that happens that can increase compilation time and sometime decrease performance. That said, there are many instances where perfomance is

2) This approach *can* be extended to mutually recursive functions, but again, the naive inlining to "depth k" would have on the order of ~ 2^k code blow up potentially (or so I think offhand)

theres probably a few other problems with doing this automatically, but it looks like a nice performance trick I may consider using time.

cheers
-Carter

On Thu, Nov 8, 2012 at 10:00 AM, Petr P wrote:
Hi,

recently I found out that some recursive functions can be greatly optimized just by rewriting them using `let` so that they're not recursive themselves any more (only the inner let is). To give an example:

> fix :: (a -> a) -> a
> fix f = f (fix f)

isn't optimized, because it's a recursive function and so it isn't inlined or anything. However, defining it as

> fix f = let x = f x in x

makes it much more efficient, since `fix` is not recursive any more, so it gets inlined and then optimized for a given argument `f`.

Another example is the well-known generic catamorphism function:

> newtype Fix f = Fix { unfix :: f (Fix f) }
>
> catam :: (Functor f) => (f a -> a) -> (Fix f -> a)
> catam f = f . fmap (catam f) . unfix

is not optimized. When `catam` is used on a data structure such as [] or a tree, at each level `fmap` creates new constructors that are immediately consumed by `f`. But when written as

> catam f = let u = f . fmap u . unfix in u

this gets inlined and then optimized, and constructor creation/consumption is fused into very effective code.

As one comment asked, this seems like something GHC could do automatically. Would it be difficult? Is there a reason against it? I suppose in cases where it doesn't help, the additional `let` would get optimized away, so no harm would be done. And in cases like `fix` or `catam` the gain would be substantial.

Best regards,
Petr

_______________________________________________

```_______________________________________________
```
9 Nov 2012 09:16

### Re: Why doesn't GHC optimize recursive functions by converting them into `let`?

Hi Carter,

2012/11/8 Carter Schonwald
I imagine that one issue that would make it not a default activity by GHC is that this sort of strategy has the following implications:

1) ghc in general doesn't always want to inline.... in general, inlining increases the size of code! and depending on how that happens that can increase compilation time and sometime decrease performance. That said, there are many instances where perfomance is

I'd say converting a recursive function into a non-recursive one using `let` doesn't necessarily mean that it will get inlined. It's up to GHC if it inlines the function or not, just as with other functions. So I don't think this would be a problem.

2) This approach *can* be extended to mutually recursive functions, but again, the naive inlining to "depth k" would have on the order of ~ 2^k code blow up potentially (or so I think offhand)

I didn't think of mutually recursive functions before. But I'd say they could be optimized the same way, without an exponential code blowup, just by creating more complex `let` expressions:

f u = somethingF u (f u) (g u)
g u = somethingG u (f u) (g u)

could be optimized as

fg u = let x = somethingF u x y ; y = somethingG u x y in (x,y)
f u = fst (fg u)
g u = snd (fg u)

So again, no recursion is at the top level, only within `let`, so both `f` and `g` can be inlined if desired (and fst/snd/(,) will then get fused away).

Best regards,
Petr

On Thu, Nov 8, 2012 at 10:00 AM, Petr P wrote:
Hi,

recently I found out that some recursive functions can be greatly optimized just by rewriting them using `let` so that they're not recursive themselves any more (only the inner let is). To give an example:

> fix :: (a -> a) -> a
> fix f = f (fix f)

isn't optimized, because it's a recursive function and so it isn't inlined or anything. However, defining it as

> fix f = let x = f x in x

makes it much more efficient, since `fix` is not recursive any more, so it gets inlined and then optimized for a given argument `f`.

Another example is the well-known generic catamorphism function:

> newtype Fix f = Fix { unfix :: f (Fix f) }
>
> catam :: (Functor f) => (f a -> a) -> (Fix f -> a)
> catam f = f . fmap (catam f) . unfix

is not optimized. When `catam` is used on a data structure such as [] or a tree, at each level `fmap` creates new constructors that are immediately consumed by `f`. But when written as

> catam f = let u = f . fmap u . unfix in u

this gets inlined and then optimized, and constructor creation/consumption is fused into very effective code.

As one comment asked, this seems like something GHC could do automatically. Would it be difficult? Is there a reason against it? I suppose in cases where it doesn't help, the additional `let` would get optimized away, so no harm would be done. And in cases like `fix` or `catam` the gain would be substantial.

Best regards,
Petr

_______________________________________________
```_______________________________________________