Petr Pudlák | 18 Feb 17:10 2013
Picon

generalized, tail-recursive left fold that can finish tne computation prematurely

Dear Haskellers,

while playing with folds and trying to implement `!!` by folding, I came to the conclusion that:

- `foldr` is unsuitable because it counts the elements from the end, while `!!` needs counting from the start (and it's not tail recursive).
- `foldl` is also unsuitable, because it always traverses the whole list.

I came up with the following tail-recursive generalization of `foldl` that allows exiting the computation prematurely:

    foldlE :: (a -> c) -> (a -> b -> Either c a) -> Either c a -> [b] -> c
    foldlE f g = fld
      where
        fld (Left c)  _         = c
        fld (Right a) []        = f a
        fld (Right a) (x:xs)    = fld (g a x) xs

`foldl` can be defined from it  as

    foldl'' :: (a -> b -> a) -> a -> [b] -> a
    foldl'' f z = foldlE id ((Right .) . f) (Right z)

and `!!` as:

    -- Checks for a negative index omitted for brevity.
    index :: Int -> [a] -> a
    index i = foldlE (error $ "No such index") f (Right i)
      where
        f 0 x = Left x
        f n _ = Right (n - 1)

Is something like that already available somewhere?

  Best regards,
  Petr Pudlak
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Petr Pudlák | 18 Feb 17:10 2013
Picon

generalized, tail-recursive left fold that can finish tne computation prematurely

Dear Haskellers,

while playing with folds and trying to implement `!!` by folding, I came to the conclusion that:

- `foldr` is unsuitable because it counts the elements from the end, while `!!` needs counting from the start (and it's not tail recursive).
- `foldl` is also unsuitable, because it always traverses the whole list.

I came up with the following tail-recursive generalization of `foldl` that allows exiting the computation prematurely:

    foldlE :: (a -> c) -> (a -> b -> Either c a) -> Either c a -> [b] -> c
    foldlE f g = fld
      where
        fld (Left c)  _         = c
        fld (Right a) []        = f a
        fld (Right a) (x:xs)    = fld (g a x) xs

`foldl` can be defined from it  as

    foldl'' :: (a -> b -> a) -> a -> [b] -> a
    foldl'' f z = foldlE id ((Right .) . f) (Right z)

and `!!` as:

    -- Checks for a negative index omitted for brevity.
    index :: Int -> [a] -> a
    index i = foldlE (error $ "No such index") f (Right i)
      where
        f 0 x = Left x
        f n _ = Right (n - 1)

Is something like that already available somewhere?

  Best regards,
  Petr Pudlak
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Andres Löh | 18 Feb 17:26 2013
Picon

Re: generalized, tail-recursive left fold that can finish tne computation prematurely

Hi.

> while playing with folds and trying to implement `!!` by folding, I came to
> the conclusion that:
>
> - `foldr` is unsuitable because it counts the elements from the end, while
> `!!` needs counting from the start (and it's not tail recursive).

What is the problem with the following definition using foldr?

> index :: Int -> [a] -> a
> index n xs =
>   foldr
>     (\ x r n -> if n == 0 then x else r (n - 1))
>     (const (error $ "No such index"))
>     xs
>     n

Cheers,
  Andres
Roman Cheplyaka | 18 Feb 17:28 2013

Re: generalized, tail-recursive left fold that can finish tne computation prematurely

* Petr Pudlák <petr.mvd <at> gmail.com> [2013-02-18 17:10:26+0100]
> Dear Haskellers,
> 
> while playing with folds and trying to implement `!!` by folding, I came to
> the conclusion that:
> 
> - `foldr` is unsuitable because it counts the elements from the end, while
> `!!` needs counting from the start (and it's not tail recursive).
> - `foldl` is also unsuitable, because it always traverses the whole list.

Every structurally-recursive function is definable through foldr,
because foldr *is* the structural recursion (aka catamorphism) operation
for lists.

Here the trick is to make the accumulator a function. This way you can
pass a value from left to right.

Something like

  foldr (\x rest n -> ...) id list 0

I'll leave filling in the dots as an exercise.

Roman
Roman Cheplyaka | 18 Feb 17:31 2013

Re: generalized, tail-recursive left fold that can finish tne computation prematurely

* Roman Cheplyaka <roma <at> ro-che.info> [2013-02-18 18:28:47+0200]
> * Petr Pudlák <petr.mvd <at> gmail.com> [2013-02-18 17:10:26+0100]
> > Dear Haskellers,
> > 
> > while playing with folds and trying to implement `!!` by folding, I came to
> > the conclusion that:
> > 
> > - `foldr` is unsuitable because it counts the elements from the end, while
> > `!!` needs counting from the start (and it's not tail recursive).
> > - `foldl` is also unsuitable, because it always traverses the whole list.
> 
> Every structurally-recursive function is definable through foldr,
> because foldr *is* the structural recursion (aka catamorphism) operation
> for lists.
> 
> Here the trick is to make the accumulator a function. This way you can
> pass a value from left to right.
> 
> Something like
> 
>   foldr (\x rest n -> ...) id list 0
> 
> I'll leave filling in the dots as an exercise.

Er, my template is not quite right — I had 'length' in mind while writing
it. Anyway, Andres showed the correct definition.

Roman

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Petr Pudlák | 18 Feb 18:12 2013
Picon

Re: generalized, tail-recursive left fold that can finish tne computation prematurely

Thanks Roman and Andres for the tip. I knew the trick with accumulating a function, but I had never imagined it could work so efficiently. I thought the problem with using foldr would be that the function would be neither tail recursive nor efficient and so I hadn't even tried. Apparently that was wrong. After your suggestion I checked its performance and how it compiles to core and to my surprise GHC optimizes the whole thing into a most-efficient tail recursive function!

  Best regards,
  Petr


2013/2/18 Roman Cheplyaka <roma <at> ro-che.info>
* Petr Pudlįk <petr.mvd <at> gmail.com> [2013-02-18 17:10:26+0100]
> Dear Haskellers,
>
> while playing with folds and trying to implement `!!` by folding, I came to
> the conclusion that:
>
> - `foldr` is unsuitable because it counts the elements from the end, while
> `!!` needs counting from the start (and it's not tail recursive).
> - `foldl` is also unsuitable, because it always traverses the whole list.

Every structurally-recursive function is definable through foldr,
because foldr *is* the structural recursion (aka catamorphism) operation
for lists.

Here the trick is to make the accumulator a function. This way you can
pass a value from left to right.

Something like

  foldr (\x rest n -> ...) id list 0

I'll leave filling in the dots as an exercise.

Roman

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Niklas Hambüchen | 18 Feb 18:48 2013

Re: generalized, tail-recursive left fold that can finish tne computation prematurely

On 18/02/13 16:10, Petr Pudlák wrote:
> - `foldr` is unsuitable because it counts the elements from the end,
> while `!!` needs counting from the start (and it's not tail recursive).

It is common misconception that foldr processes the list "from the right".

foldr "brackets" from the right, but this has nothing to do with
processing direction; all [a] are processed left to right, since this is
the only way to structurally deconstruct them.

This is the reason why it is possible to write
foldr (:) [] [1..]

If foldr processed the list from the right, it would on infinite lists -
and it does.
oleg | 19 Feb 07:27 2013

Re: generalized, tail-recursive left fold that can finish tne computation prematurely


As others have pointed out, _in principle_, foldr is not at all
deficient. We can, for example, express foldl via foldr. Moreover, we
can express head, tail, take, drop and even zipWith through
foldr. That is, the entire list processing library can be written in
terms of foldr:

        http://okmij.org/ftp/Algorithms.html#zip-folds

That said, to express foldl via foldr, we need a higher-order
fold. There are various problems with higher-order folds, related to
the cost of building closures. The problems are especially severe 
in strict languages or strict contexts. Indeed,

foldl_via_foldr f z l = foldr (\e a z -> a (f z e)) id l z

first constructs the closure and then applies it to z. The closure has
the same structure as the list -- it is isomorphic to the
list. However, the closure representation of a list takes typically
quite more space than the list. So, in strict languages, expressing
foldl via foldr is a really bad idea. It won't work for big lists.
BTW, this is why foldM is _left_ fold.

The arguments against higher-order folds as a `big hammer' were made
back in 1998 by Gibbons and Jones
        http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.1735

So, the left-fold with the early termination has a good
justification. In fact, this is how Iteratees were first presented, at
the DEFUN08 tutorial (part of the ICFP2008 conference). The idea of
left fold with early termination is much older though. For example, Takusen
(a database access framework) has been using it since 2003 or so. For
a bit of history, see

        http://okmij.org/ftp/Streams.html#fold-stream
Roman Cheplyaka | 19 Feb 08:42 2013

Re: generalized, tail-recursive left fold that can finish tne computation prematurely

* oleg <at> okmij.org <oleg <at> okmij.org> [2013-02-19 06:27:10-0000]
> 
> As others have pointed out, _in principle_, foldr is not at all
> deficient. We can, for example, express foldl via foldr. Moreover, we
> can express head, tail, take, drop and even zipWith through
> foldr. That is, the entire list processing library can be written in
> terms of foldr:
> 
>         http://okmij.org/ftp/Algorithms.html#zip-folds
> 
> That said, to express foldl via foldr, we need a higher-order
> fold. There are various problems with higher-order folds, related to
> the cost of building closures. The problems are especially severe 
> in strict languages or strict contexts. Indeed,
> 
> foldl_via_foldr f z l = foldr (\e a z -> a (f z e)) id l z
> 
> first constructs the closure and then applies it to z. The closure has
> the same structure as the list -- it is isomorphic to the
> list. However, the closure representation of a list takes typically
> quite more space than the list. So, in strict languages, expressing
> foldl via foldr is a really bad idea. It won't work for big lists.

If we unroll foldr once (assuming l is not empty), we'll get

  \z -> foldr (\e a z -> a (f z e)) id (tail l) (f z (head l))

which is a (shallow) closure. In order to observe what you describe (a
closure isomorphic to the list) we'd need a language which does
reductions inside closures.

Am I wrong?

Roman
oleg | 20 Feb 05:23 2013

Re: generalized, tail-recursive left fold that can


> > That said, to express foldl via foldr, we need a higher-order
> > fold. There are various problems with higher-order folds, related to
> > the cost of building closures. The problems are especially severe 
> > in strict languages or strict contexts. Indeed,
> > 
> > foldl_via_foldr f z l = foldr (\e a z -> a (f z e)) id l z
> > 
> > first constructs the closure and then applies it to z. The closure has
> > the same structure as the list -- it is isomorphic to the
> > list. However, the closure representation of a list takes typically
> > quite more space than the list. So, in strict languages, expressing
> > foldl via foldr is a really bad idea. It won't work for big lists.
>
> If we unroll foldr once (assuming l is not empty), we'll get
>
>   \z -> foldr (\e a z -> a (f z e)) id (tail l) (f z (head l))
>
> which is a (shallow) closure. In order to observe what you describe (a
> closure isomorphic to the list) we'd need a language which does
> reductions inside closures.

I should've elaborated this point.

Let us consider monadic versions of foldr and foldl. First, monads,
sort of emulate strict contexts, making it easier to see when
closures are constructed. Second, we can easily add tracing.

import Control.Monad.Trans

-- The following is just the ordinary foldr, with a specialized
-- type for the seed: m z
foldrM :: Monad m =>
          (a -> m z -> m z) -> m z -> [a] -> m z
-- The code below is identical to that of foldr
foldrM f z [] = z
foldrM f z (h:t) = f h (foldrM f z t)

-- foldlM is identical Control.Monad.foldM 
-- Its code is shown below for reference.
foldlM, foldlM' :: Monad m =>
          (z -> a -> m z) -> z -> [a] -> m z
foldlM f z []    = return z
foldlM f z (h:t) = f z h >>= \z' -> foldlM f z' t

t1 = foldlM (\z a -> putStrLn ("foldlM: " ++ show a) >>
                     return (a:z)) [] [1,2,3]

{-
foldlM: 1
foldlM: 2
foldlM: 3
[3,2,1]
-}

-- foldlM' is foldlM expressed via foldrM
foldlM' f z l = 
    foldrM (\e am -> am >>= \k -> return $ \z -> f z e >>= k)
           (return return) l >>= \f -> f z

-- foldrM'' is foldlM' with trace printing
foldlM'' :: (MonadIO m, Show a) =>
          (z -> a -> m z) -> z -> [a] -> m z
foldlM'' f z l = 
    foldrM (\e am -> liftIO (putStrLn $ "foldR: " ++ show e) >>
            am >>= \k -> return $ \z -> f z e >>= k)
           (return return) l >>= \f -> f z

t2 = foldlM'' (\z a -> putStrLn ("foldlM: " ++ show a) >>
                       return (a:z)) [] [1,2,3]

{-
foldR: 1
foldR: 2
foldR: 3
foldlM: 1
foldlM: 2
foldlM: 3
[3,2,1]
-}

As we can see from the trace printing, first the whole list is
traversed by foldR and the closure is constructed. Only after foldr
has finished, the closure is applied to z ([] in our case), and
foldl's function f gets a chance to work. The list is effectively
traversed twice, which means the `copy' of the list has to be
allocated -- that is, the closure that incorporates the calls to
f e1, f e2, etc. 
Roman Cheplyaka | 21 Feb 20:13 2013

Re: generalized, tail-recursive left fold that can

Thanks, I see now where my mistake was.

Laziness (or call by name) is needed to make the step from

  (\e a z -> a (f z e))
    (head l)
    (foldr (\e a z -> a (f z e)) id (tail l) z)
    (f z (head l))

to

  \z -> foldr (\e a z -> a (f z e)) id (tail l) (f z (head l))

without evaluating foldr further.

Roman

* oleg <at> okmij.org <oleg <at> okmij.org> [2013-02-20 04:23:34-0000]
> 
> > > That said, to express foldl via foldr, we need a higher-order
> > > fold. There are various problems with higher-order folds, related to
> > > the cost of building closures. The problems are especially severe 
> > > in strict languages or strict contexts. Indeed,
> > > 
> > > foldl_via_foldr f z l = foldr (\e a z -> a (f z e)) id l z
> > > 
> > > first constructs the closure and then applies it to z. The closure has
> > > the same structure as the list -- it is isomorphic to the
> > > list. However, the closure representation of a list takes typically
> > > quite more space than the list. So, in strict languages, expressing
> > > foldl via foldr is a really bad idea. It won't work for big lists.
> >
> > If we unroll foldr once (assuming l is not empty), we'll get
> >
> >   \z -> foldr (\e a z -> a (f z e)) id (tail l) (f z (head l))
> >
> > which is a (shallow) closure. In order to observe what you describe (a
> > closure isomorphic to the list) we'd need a language which does
> > reductions inside closures.
> 
> I should've elaborated this point.
> 
> Let us consider monadic versions of foldr and foldl. First, monads,
> sort of emulate strict contexts, making it easier to see when
> closures are constructed. Second, we can easily add tracing.
> 
> 
> import Control.Monad.Trans
> 
> -- The following is just the ordinary foldr, with a specialized
> -- type for the seed: m z
> foldrM :: Monad m =>
>           (a -> m z -> m z) -> m z -> [a] -> m z
> -- The code below is identical to that of foldr
> foldrM f z [] = z
> foldrM f z (h:t) = f h (foldrM f z t)
> 
> -- foldlM is identical Control.Monad.foldM 
> -- Its code is shown below for reference.
> foldlM, foldlM' :: Monad m =>
>           (z -> a -> m z) -> z -> [a] -> m z
> foldlM f z []    = return z
> foldlM f z (h:t) = f z h >>= \z' -> foldlM f z' t
> 
> t1 = foldlM (\z a -> putStrLn ("foldlM: " ++ show a) >>
>                      return (a:z)) [] [1,2,3]
> 
> {-
> foldlM: 1
> foldlM: 2
> foldlM: 3
> [3,2,1]
> -}
> 
> -- foldlM' is foldlM expressed via foldrM
> foldlM' f z l = 
>     foldrM (\e am -> am >>= \k -> return $ \z -> f z e >>= k)
>            (return return) l >>= \f -> f z
> 
> -- foldrM'' is foldlM' with trace printing
> foldlM'' :: (MonadIO m, Show a) =>
>           (z -> a -> m z) -> z -> [a] -> m z
> foldlM'' f z l = 
>     foldrM (\e am -> liftIO (putStrLn $ "foldR: " ++ show e) >>
>             am >>= \k -> return $ \z -> f z e >>= k)
>            (return return) l >>= \f -> f z
> 
> 
> t2 = foldlM'' (\z a -> putStrLn ("foldlM: " ++ show a) >>
>                        return (a:z)) [] [1,2,3]
> 
> {-
> foldR: 1
> foldR: 2
> foldR: 3
> foldlM: 1
> foldlM: 2
> foldlM: 3
> [3,2,1]
> -}
> 
> 
> As we can see from the trace printing, first the whole list is
> traversed by foldR and the closure is constructed. Only after foldr
> has finished, the closure is applied to z ([] in our case), and
> foldl's function f gets a chance to work. The list is effectively
> traversed twice, which means the `copy' of the list has to be
> allocated -- that is, the closure that incorporates the calls to
> f e1, f e2, etc. 
> 

Gmane