Edward Z. Yang | 22 Apr 03:07 2013
Picon

Stream fusion and span/break/group/init/tails

Hello all, (cc'd stream fusion paper authors)

I noticed that the current implementation of stream fusion does
not support "multiple-return" stream combinators, e.g.
break :: (a -> Bool) -> [a] -> ([a], [a]).  I thought a little
bit about how might one go about implement this, but the problem
seems nontrivial. (One possibility is to extend the definition
of Step to support multiple return, but the details are a mess!)
Nor, as far as I can tell, does the paper give any treatment of
the subject.  Has anyone thought about this subject in some detail?

Thanks,
Edward
Ben Lippmeier | 22 Apr 04:12 2013
Picon

Re: Stream fusion and span/break/group/init/tails


On 22/04/2013, at 11:07 , Edward Z. Yang <ezyang <at> mit.edu> wrote:

> Hello all, (cc'd stream fusion paper authors)
> 
> I noticed that the current implementation of stream fusion does
> not support "multiple-return" stream combinators, e.g.
> break :: (a -> Bool) -> [a] -> ([a], [a]).  I thought a little
> bit about how might one go about implement this, but the problem
> seems nontrivial. (One possibility is to extend the definition
> of Step to support multiple return, but the details are a mess!)
> Nor, as far as I can tell, does the paper give any treatment of
> the subject.  Has anyone thought about this subject in some detail?

I've spent the last few months fighting this exact problem.

The example you state is one instance of a more general limitation. Stream fusion (and most other short-cut
fusion approaches) cannot fuse a producer into multiple consumers. The fusion systems don't support any
"unzip-like" function, where elements from the input stream end up in multiple output streams. For example:

unzip :: [(a, b)] -> ([a], [b])

dup   :: [a] -> ([a], [a])

The general problem is that if elements of one output stream are demanded before the other, then the stream
combinator must buffer elements until they are demanded by both outputs.

John Hughes described this problem in his thesis, and gave an informal proof that it cannot be solved
without some form of "concurrency" -- meaning the evaluation of the two consumers must be interleaved.

(Continue reading)

Edward Z. Yang | 22 Apr 04:23 2013
Picon

Re: Stream fusion and span/break/group/init/tails

> I've got a solution for this problem and it will form the basis of
> Repa 4, which I'm hoping to finish a paper about for  the upcoming
> Haskell Symposium.

Sounds great! You should forward me a preprint when you have something
in presentable shape. I suppose before then, I should look at repa-head/repa-stream
to figure out what the details are?

Edward
Ben Lippmeier | 22 Apr 04:29 2013
Picon

Re: Stream fusion and span/break/group/init/tails


On 22/04/2013, at 12:23 , "Edward Z. Yang" <ezyang <at> MIT.EDU> wrote:

>> I've got a solution for this problem and it will form the basis of
>> Repa 4, which I'm hoping to finish a paper about for  the upcoming
>> Haskell Symposium.
> 
> Sounds great! You should forward me a preprint when you have something
> in presentable shape. I suppose before then, I should look at repa-head/repa-stream
> to figure out what the details are?

The basic approach is already described in:

Automatic Transformation of Series Expressions into Loops
Richard Waters, TOPLAS 1991

The Anatomy of a Loop
Olin Shivers, ICFP 2005

The contribution of the HS paper is planning to be:
 1) How to extend the approach to the combinators we need for DPH
 2) How to package it nicely into a Haskell library.

I'm still working on the above...

Ben.
Edward Z. Yang | 22 Apr 09:27 2013
Picon

Re: Stream fusion and span/break/group/init/tails

So, if I understand correctly, you're using the "online/offline"
criterion to resolve non-directed cycles in pipelines?  (I couldn't
tell how the Shivers paper was related.)

Cheers,
Edward

Excerpts from Ben Lippmeier's message of Sun Apr 21 19:29:29 -0700 2013:
> 
> On 22/04/2013, at 12:23 , "Edward Z. Yang" <ezyang <at> MIT.EDU> wrote:
> 
> >> I've got a solution for this problem and it will form the basis of
> >> Repa 4, which I'm hoping to finish a paper about for  the upcoming
> >> Haskell Symposium.
> > 
> > Sounds great! You should forward me a preprint when you have something
> > in presentable shape. I suppose before then, I should look at repa-head/repa-stream
> > to figure out what the details are?
> 
> The basic approach is already described in:
> 
> Automatic Transformation of Series Expressions into Loops
> Richard Waters, TOPLAS 1991
> 
> The Anatomy of a Loop
> Olin Shivers, ICFP 2005
> 
> 
> The contribution of the HS paper is planning to be:
>  1) How to extend the approach to the combinators we need for DPH
(Continue reading)

Ben Lippmeier | 23 Apr 04:39 2013
Picon

Re: Stream fusion and span/break/group/init/tails


On 22/04/2013, at 5:27 PM, Edward Z. Yang wrote:

> So, if I understand correctly, you're using the "online/offline"
> criterion to resolve non-directed cycles in pipelines?  (I couldn't
> tell how the Shivers paper was related.)

The "online" criteria guarantees that the stream operator does not need to buffer an unbounded amount of
data (I think). 

I'm not sure what you mean by "resolve non-directed cycles".

The Shivers paper describes the same basic approach of splitting the code for a stream operator in to parts
that run before the loop/for each element of a loop/after the loop etc. Splitting multiple operators this
way and then merging the parts into a single loop provides the "concurrency" required by the description
in John Hughes's thesis.

Ben.
Duncan Coutts | 24 Apr 19:47 2013

Re: Stream fusion and span/break/group/init/tails

On Sun, 2013-04-21 at 18:07 -0700, Edward Z. Yang wrote:
> Hello all, (cc'd stream fusion paper authors)
> 
> I noticed that the current implementation of stream fusion does
> not support "multiple-return" stream combinators, e.g.
> break :: (a -> Bool) -> [a] -> ([a], [a]).  I thought a little
> bit about how might one go about implement this, but the problem
> seems nontrivial. (One possibility is to extend the definition
> of Step to support multiple return, but the details are a mess!)
> Nor, as far as I can tell, does the paper give any treatment of
> the subject.  Has anyone thought about this subject in some detail?

I address it briefly in my thesis [1], Section 4.8.2. I think it's a
fundamental limitation of stream fusion.

It looks like fold and unfold fusion systems have dual limitations:
fold-based fusion cannot handle zip style functions, while unfold-based
fusion cannot handle unzip style functions. That is fold-based cannot
consume multiple inputs, while unfold-based cannot produce multiple
outputs.

I'll be interested to see in more detail the approach that Ben is
talking about. As Ben says, intuitively the problem is that when you've
got multiple outputs so you need to make sure that someone is consuming
them and that that consumption is appropriately synchronised so that you
don't have to buffer (buffering would almost certainly eliminate the
gains from fusion). That might be possible if ultimately the multiple
outputs are combined again in some way, so that overall you still have a
single consumer, that can be turned into a single lazy or eager loop.

(Continue reading)

Bryan O'Sullivan | 24 Apr 19:56 2013

Re: Stream fusion and span/break/group/init/tails

On Wed, Apr 24, 2013 at 10:47 AM, Duncan Coutts <duncan.coutts <at> googlemail.com> wrote:
I address it briefly in my thesis [1], Section 4.8.2. I think it's a
fundamental limitation of stream fusion.

See also concat, where the naive fusion-based implementation has quadratic performance:

concat :: [Text] -> Text
concat txts = unstream (Stream.concat (List.map stream txts))

I've never figured out how to implement this with sensible characteristics within the fusion framework.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Duncan Coutts | 24 Apr 20:00 2013

Re: Stream fusion and span/break/group/init/tails

On Wed, 2013-04-24 at 10:56 -0700, Bryan O'Sullivan wrote:
> On Wed, Apr 24, 2013 at 10:47 AM, Duncan Coutts <
> duncan.coutts <at> googlemail.com> wrote:
> 
> > I address it briefly in my thesis [1], Section 4.8.2. I think it's a
> > fundamental limitation of stream fusion.
> >
> 
> See also concat, where the naive fusion-based implementation has quadratic
> performance:
> 
> concat :: [Text] -> Text
> concat txts = unstream (Stream.concat (List.map stream txts))
> 
> I've never figured out how to implement this with sensible characteristics
> within the fusion framework.

Well of course concatMap is another issue. I address that in section
4.8.3 :-)

Summary there is that I don't think it is a fundamental limitation, but
certainly we don't do it properly in practice now. I have a suggestion
in that section for how we might do it.

Duncan
Gábor Lehel | 25 Apr 00:52 2013
Picon

Re: Stream fusion and span/break/group/init/tails



On Wed, Apr 24, 2013 at 7:56 PM, Bryan O'Sullivan <bos <at> serpentine.com> wrote:
On Wed, Apr 24, 2013 at 10:47 AM, Duncan Coutts <duncan.coutts <at> googlemail.com> wrote:
I address it briefly in my thesis [1], Section 4.8.2. I think it's a
fundamental limitation of stream fusion.

See also concat, where the naive fusion-based implementation has quadratic performance:

concat :: [Text] -> Text
concat txts = unstream (Stream.concat (List.map stream txts))

I've never figured out how to implement this with sensible characteristics within the fusion framework.

If you could "solve" concat, might that also lead to be being able to do without the Skip constructor? Skip was added explicitly to be able to efficiently handle things like filter (without it the Step datatype is exactly the base functor for lists), but if concat "works", then filter p can be expressed as concat . map (\x -> if (p x) then [x] else []). Of course, presumably filter isn't the only function that requires Skip, I don't know what the others are. (Also of course "solve" and "works" are intentionally fuzzy, with the point being that I don't know if "solving" concat implies that the desirable properties would survive composition in the suggested manner.)

--
Your ship was destroyed in a monadic eruption.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Dan Doel | 25 Apr 01:06 2013
Picon

Re: Stream fusion and span/break/group/init/tails

Presumably concat also has to use skip, for the same reason as filter. Otherwise it has to recursively process the outer stream until it gets to a non-empty inner stream, which breaks the rule that only the final consumer is recursive.

    concat [[1,2,3],[4,5],[],[6,7]]

probably looks something like:

    Yield 1, Yield 2, Yield 3, Skip, Yield 4, Yield 5, Skip, Skip, Yield 6, Yield 7, Skip, Done

-- Dan



On Wed, Apr 24, 2013 at 6:52 PM, Gábor Lehel <illissius <at> gmail.com> wrote:


On Wed, Apr 24, 2013 at 7:56 PM, Bryan O'Sullivan <bos <at> serpentine.com> wrote:
On Wed, Apr 24, 2013 at 10:47 AM, Duncan Coutts <duncan.coutts <at> googlemail.com> wrote:
I address it briefly in my thesis [1], Section 4.8.2. I think it's a
fundamental limitation of stream fusion.

See also concat, where the naive fusion-based implementation has quadratic performance:

concat :: [Text] -> Text
concat txts = unstream (Stream.concat (List.map stream txts))

I've never figured out how to implement this with sensible characteristics within the fusion framework.

If you could "solve" concat, might that also lead to be being able to do without the Skip constructor? Skip was added explicitly to be able to efficiently handle things like filter (without it the Step datatype is exactly the base functor for lists), but if concat "works", then filter p can be expressed as concat . map (\x -> if (p x) then [x] else []). Of course, presumably filter isn't the only function that requires Skip, I don't know what the others are. (Also of course "solve" and "works" are intentionally fuzzy, with the point being that I don't know if "solving" concat implies that the desirable properties would survive composition in the suggested manner.)

--
Your ship was destroyed in a monadic eruption.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Gábor Lehel | 25 Apr 01:43 2013
Picon

Re: Stream fusion and span/break/group/init/tails

Ah, good point.

On Thu, Apr 25, 2013 at 1:06 AM, Dan Doel <dan.doel <at> gmail.com> wrote:
Presumably concat also has to use skip, for the same reason as filter. Otherwise it has to recursively process the outer stream until it gets to a non-empty inner stream, which breaks the rule that only the final consumer is recursive.

    concat [[1,2,3],[4,5],[],[6,7]]

probably looks something like:

    Yield 1, Yield 2, Yield 3, Skip, Yield 4, Yield 5, Skip, Skip, Yield 6, Yield 7, Skip, Done

-- Dan



On Wed, Apr 24, 2013 at 6:52 PM, Gábor Lehel <illissius <at> gmail.com> wrote:


On Wed, Apr 24, 2013 at 7:56 PM, Bryan O'Sullivan <bos <at> serpentine.com> wrote:
On Wed, Apr 24, 2013 at 10:47 AM, Duncan Coutts <duncan.coutts <at> googlemail.com> wrote:
I address it briefly in my thesis [1], Section 4.8.2. I think it's a
fundamental limitation of stream fusion.

See also concat, where the naive fusion-based implementation has quadratic performance:

concat :: [Text] -> Text
concat txts = unstream (Stream.concat (List.map stream txts))

I've never figured out how to implement this with sensible characteristics within the fusion framework.

If you could "solve" concat, might that also lead to be being able to do without the Skip constructor? Skip was added explicitly to be able to efficiently handle things like filter (without it the Step datatype is exactly the base functor for lists), but if concat "works", then filter p can be expressed as concat . map (\x -> if (p x) then [x] else []). Of course, presumably filter isn't the only function that requires Skip, I don't know what the others are. (Also of course "solve" and "works" are intentionally fuzzy, with the point being that I don't know if "solving" concat implies that the desirable properties would survive composition in the suggested manner.)

--
Your ship was destroyed in a monadic eruption.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe





--
Your ship was destroyed in a monadic eruption.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Duncan Coutts | 29 Apr 16:05 2013

Re: Stream fusion and span/break/group/init/tails

On Thu, 2013-04-25 at 00:52 +0200, Gábor Lehel wrote:
> On Wed, Apr 24, 2013 at 7:56 PM, Bryan O'Sullivan <bos <at> serpentine.com>wrote:
> 
> > On Wed, Apr 24, 2013 at 10:47 AM, Duncan Coutts <
> > duncan.coutts <at> googlemail.com> wrote:
> >
> >> I address it briefly in my thesis [1], Section 4.8.2. I think it's a
> >> fundamental limitation of stream fusion.
> >>
> >
> > See also concat, where the naive fusion-based implementation has quadratic
> > performance:
> >
> > concat :: [Text] -> Text
> > concat txts = unstream (Stream.concat (List.map stream txts))
> >
> > I've never figured out how to implement this with sensible characteristics
> > within the fusion framework.
> >
> 
> If you could "solve" concat, might that also lead to be being able to do
> without the Skip constructor?

Dan is right, we still need Skip. My suggested "solution" to the
concatmap problem is also mostly independent of the skip issue.

You shouldn't think of skip as being a hack. It's not. It's how we
express a more general class of producers in a way that is productive. 

You can think of a stream as being a little state machine and sometimes
the state machine needs to be able to make transitions without producing
any output. One solution to that is to hide those transitions (by
running the state machine until it does produce something, ie using
recursion/loops) and the other is to expose the transition as a skip.
The skip approach where we don't use recursion/loops allows us to do the
various transformations we need to be able to effectively optimise the
whole thing.

If you're interested in this stuff, you can look at the section of my
thesis that goes on about this state machine perspective on things. I
think it's quite a useful way to understand it (and understand how we
optimise stream functions by composing these state machines). More
generally, that chapter explains why stream fusion should actually be an
optimisation.

As for step and the list base functor. Yes, absolutely. And adding skip
does make things harder to prove, because it adds more "junk" values.
The other major chapter of my thesis explains why it's all still true,
even when we have skip, or rather how we have to do things carefully so
that it does still remain valid.

Duncan

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Gábor Lehel | 29 Apr 20:19 2013
Picon

Re: Stream fusion and span/break/group/init/tails



On Mon, Apr 29, 2013 at 4:05 PM, Duncan Coutts <duncan.coutts <at> googlemail.com> wrote:
On Thu, 2013-04-25 at 00:52 +0200, Gábor Lehel wrote:
> On Wed, Apr 24, 2013 at 7:56 PM, Bryan O'Sullivan <bos <at> serpentine.com>wrote:
>
> > On Wed, Apr 24, 2013 at 10:47 AM, Duncan Coutts <
> > duncan.coutts <at> googlemail.com> wrote:
> >
> >> I address it briefly in my thesis [1], Section 4.8.2. I think it's a
> >> fundamental limitation of stream fusion.
> >>
> >
> > See also concat, where the naive fusion-based implementation has quadratic
> > performance:
> >
> > concat :: [Text] -> Text
> > concat txts = unstream (Stream.concat (List.map stream txts))
> >
> > I've never figured out how to implement this with sensible characteristics
> > within the fusion framework.
> >
>
> If you could "solve" concat, might that also lead to be being able to do
> without the Skip constructor?

Dan is right, we still need Skip. My suggested "solution" to the
concatmap problem is also mostly independent of the skip issue.

You shouldn't think of skip as being a hack. It's not. It's how we
express a more general class of producers in a way that is productive.

You can think of a stream as being a little state machine and sometimes
the state machine needs to be able to make transitions without producing
any output. One solution to that is to hide those transitions (by
running the state machine until it does produce something, ie using
recursion/loops) and the other is to expose the transition as a skip.
The skip approach where we don't use recursion/loops allows us to do the
various transformations we need to be able to effectively optimise the
whole thing.

If you're interested in this stuff, you can look at the section of my
thesis that goes on about this state machine perspective on things. I
think it's quite a useful way to understand it (and understand how we
optimise stream functions by composing these state machines). More
generally, that chapter explains why stream fusion should actually be an
optimisation.

As for step and the list base functor. Yes, absolutely. And adding skip
does make things harder to prove, because it adds more "junk" values.
The other major chapter of my thesis explains why it's all still true,
even when we have skip, or rather how we have to do things carefully so
that it does still remain valid.

Duncan


Thanks for the explanation. I looked at your thesis previously, but only read through a couple of sections (including the one about concatMap). I might go through the state machine parts as well now that I know the significance/relevance.

The thing in particular that was motivating me is that if it weren't for Skip, it seems that to some extent (I haven't had time to investigate precisely what extent) you could write a stream fusion framework in a datatype-generic way, parameterized over the base functor. But it wasn't obvious to me how (or whether) you would translate Skip. But maybe the state machine perspective will provide some insight into that. I'll think about it.

--
Your ship was destroyed in a monadic eruption.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Duncan Coutts | 29 Apr 20:35 2013

Re: Stream fusion and span/break/group/init/tails

On Mon, 2013-04-29 at 20:19 +0200, Gábor Lehel wrote:

> Thanks for the explanation. I looked at your thesis previously, but only
> read through a couple of sections (including the one about concatMap). I
> might go through the state machine parts as well now that I know the
> significance/relevance.
> 
> The thing in particular that was motivating me is that if it weren't for
> Skip, it seems that to some extent (I haven't had time to investigate
> precisely what extent) you could write a stream fusion framework in a
> datatype-generic way, parameterized over the base functor. But it wasn't
> obvious to me how (or whether) you would translate Skip. But maybe the
> state machine perspective will provide some insight into that. I'll think
> about it.

Oh I think you can write it in a data-type generic way.

If your datatype is described by a base functor F, then the skip version
is a simple transformation on that functor.

F_skip a = F a + a

And then the stream type for F is  nu a. F_skip a

See section 3.6.

In most of my theory chapter I write it in this style, rather than using
the list functor specifically.

Duncan

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Gábor Lehel | 7 May 18:35 2013
Picon

Re: Stream fusion and span/break/group/init/tails



On Mon, Apr 29, 2013 at 8:35 PM, Duncan Coutts <duncan.coutts <at> googlemail.com> wrote:
On Mon, 2013-04-29 at 20:19 +0200, Gábor Lehel wrote:

> Thanks for the explanation. I looked at your thesis previously, but only
> read through a couple of sections (including the one about concatMap). I
> might go through the state machine parts as well now that I know the
> significance/relevance.
>
> The thing in particular that was motivating me is that if it weren't for
> Skip, it seems that to some extent (I haven't had time to investigate
> precisely what extent) you could write a stream fusion framework in a
> datatype-generic way, parameterized over the base functor. But it wasn't
> obvious to me how (or whether) you would translate Skip. But maybe the
> state machine perspective will provide some insight into that. I'll think
> about it.

Oh I think you can write it in a data-type generic way.

If your datatype is described by a base functor F, then the skip version
is a simple transformation on that functor.

F_skip a = F a + a

And then the stream type for F is  nu a. F_skip a

See section 3.6.

In most of my theory chapter I write it in this style, rather than using
the list functor specifically.

Duncan


Oh. So basically it does just amount to adding a `Skip s` constructor to whatever the base functor was. I definitely should read more of it. Thanks again!

--
Your ship was destroyed in a monadic eruption.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Dan Doel | 29 Apr 20:40 2013
Picon

Re: Stream fusion and span/break/group/init/tails

On Mon, Apr 29, 2013 at 10:05 AM, Duncan Coutts <duncan.coutts <at> googlemail.com> wrote:
On Thu, 2013-04-25 at 00:52 +0200, Gábor Lehel wrote:
> On Wed, Apr 24, 2013 at 7:56 PM, Bryan O'Sullivan <bos <at> serpentine.com>wrote:
>
> > On Wed, Apr 24, 2013 at 10:47 AM, Duncan Coutts <
> > duncan.coutts <at> googlemail.com> wrote:
> >
> >> I address it briefly in my thesis [1], Section 4.8.2. I think it's a
> >> fundamental limitation of stream fusion.
> >>
> >
> > See also concat, where the naive fusion-based implementation has quadratic
> > performance:
> >
> > concat :: [Text] -> Text
> > concat txts = unstream (Stream.concat (List.map stream txts))
> >
> > I've never figured out how to implement this with sensible characteristics
> > within the fusion framework.
> >
>
> If you could "solve" concat, might that also lead to be being able to do
> without the Skip constructor?

Dan is right, we still need Skip. My suggested "solution" to the
concatmap problem is also mostly independent of the skip issue.

You shouldn't think of skip as being a hack. It's not. It's how we
express a more general class of producers in a way that is productive.

 To further this, note that in a total language, with the type:

    codata Stream a = End | Yield a (Stream a)

filter is not definable; it is not a total function. At least, barring an extra proof of some sort that the predicate will yield true after a finite amount of time. concat is similar.

Also, adding Skip (Stream a) is a relatively standard way of explicitly representing lazy, partial values. (This is opposed to the partiality monad, which is like an encoding of strict general recursion). That is, if νF is the type of total values, then ν(F + Id) is the type of partial values. I don't know how easy it is to delete from a more complex tree using just that extension, but in theory you could productively represent arbitrary manipulations with just that, I believe.

-- Dan
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Gábor Lehel | 7 May 18:43 2013
Picon

Re: Stream fusion and span/break/group/init/tails



On Mon, Apr 29, 2013 at 8:40 PM, Dan Doel <dan.doel <at> gmail.com> wrote:
On Mon, Apr 29, 2013 at 10:05 AM, Duncan Coutts <duncan.coutts <at> googlemail.com> wrote:
On Thu, 2013-04-25 at 00:52 +0200, Gábor Lehel wrote:
> On Wed, Apr 24, 2013 at 7:56 PM, Bryan O'Sullivan <bos <at> serpentine.com>wrote:
>
> > On Wed, Apr 24, 2013 at 10:47 AM, Duncan Coutts <
> > duncan.coutts <at> googlemail.com> wrote:
> >
> >> I address it briefly in my thesis [1], Section 4.8.2. I think it's a
> >> fundamental limitation of stream fusion.
> >>
> >
> > See also concat, where the naive fusion-based implementation has quadratic
> > performance:
> >
> > concat :: [Text] -> Text
> > concat txts = unstream (Stream.concat (List.map stream txts))
> >
> > I've never figured out how to implement this with sensible characteristics
> > within the fusion framework.
> >
>
> If you could "solve" concat, might that also lead to be being able to do
> without the Skip constructor?

Dan is right, we still need Skip. My suggested "solution" to the
concatmap problem is also mostly independent of the skip issue.

You shouldn't think of skip as being a hack. It's not. It's how we
express a more general class of producers in a way that is productive.

 To further this, note that in a total language, with the type:

    codata Stream a = End | Yield a (Stream a)

filter is not definable; it is not a total function. At least, barring an extra proof of some sort that the predicate will yield true after a finite amount of time. concat is similar.

Also, adding Skip (Stream a) is a relatively standard way of explicitly representing lazy, partial values. (This is opposed to the partiality monad, which is like an encoding of strict general recursion). That is, if νF is the type of total values, then ν(F + Id) is the type of partial values. I don't know how easy it is to delete from a more complex tree using just that extension, but in theory you could productively represent arbitrary manipulations with just that, I believe.

-- Dan

Very interesting (and makes sense), thank you.

--
Your ship was destroyed in a monadic eruption.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Ben Lippmeier | 26 Apr 04:46 2013
Picon

Re: Stream fusion and span/break/group/init/tails


On 25/04/2013, at 3:47 AM, Duncan Coutts wrote:

> It looks like fold and unfold fusion systems have dual limitations:
> fold-based fusion cannot handle zip style functions, while unfold-based
> fusion cannot handle unzip style functions. That is fold-based cannot
> consume multiple inputs, while unfold-based cannot produce multiple
> outputs.

Yes. This is a general property of data-flow programs and not just compilation via Data.Vector style
co-inductive stream fusion, or a property of fold/unfold/hylo fusion. 

Consider these general definitions of streams and costreams.

-- A stream must always produce an element.
type Stream a   = IO a

-- A costream must always consume an element.
type CoStream a = a -> IO ()

And operators on them (writing S for Stream and C for CoStream).

-- Versions of map.
map     :: (a -> b) -> S a -> S b    (ok)
comap   :: (a -> b) -> C b -> C a    (ok)

-- Versions of unzip.
unzip   :: S (a, b) -> (S a, S b)    (bad)
counzip :: C a -> C b -> C (a, b)    (ok)
unzipc  :: S (a, b) -> C b -> S a    (ok)

-- Versions of zip.
zip     :: S a -> S b -> S (a, b)    (ok)
cozip   :: C (a, b) -> (C a, C b)    (bad)
zipc    :: C (a, b) -> S a -> C b    (ok)

The operators marked (ok) can be implemented without buffering data, while the combinators marked (bad)
may need an arbitrary sized buffer.

Starting with 'unzip', suppose we pull elements from the first component of the result (the (S a)) but not
the second component (the (S b)). To provide these 'a' elements, 'unzip' must pull tuples from its source
stream (S (a, b)) and buffer the 'b' part until someone pulls from the (S b).

Dually, with 'cozip', suppose we push elements into the first component of the result (the (C a)). The
implementation must buffer them until someone pushes the corresponding element into the (C b), only then
can it push the whole tuple into the source (C (a, b)) costream.

The two combinators unzipc and zipc are hybrids:

For 'unzipc', if we pull an element from the (S a), then the implementation can pull a whole (a, b) tuple from
the source (S (a, b)) and then get rid of the 'b' part by pushing it into the (C b). The fact that it can get rid of
the 'b' part means it doesn't need a buffer.

Similarly, for 'zipc', if we push a 'b' into the (C b) then the implementation can pull the corresponding 'a'
part from the (S a) and then push the whole (a, b) tuple into the C (a, b). The fact that it can get the
corresponding 'a' means it doesn't need a buffer.

I've got some hand drawn diagrams of this if anyone wants them (mail me), but none of it helps implement
'unzip' for streams or 'cozip' for costreams. 

> I'll be interested to see in more detail the approach that Ben is
> talking about. As Ben says, intuitively the problem is that when you've
> got multiple outputs so you need to make sure that someone is consuming
> them and that that consumption is appropriately synchronised so that you
> don't have to buffer (buffering would almost certainly eliminate the
> gains from fusion). That might be possible if ultimately the multiple
> outputs are combined again in some way, so that overall you still have a
> single consumer, that can be turned into a single lazy or eager loop.

At least for high performance applications, I think we've reached the limit of what short-cut fusion
approaches can provide. By "short cut fusion", I mean crafting a special source program so that the
inliner + simplifier + constructor specialisation transform can crunch down the intermediate code into
a nice loop. Geoff Mainland's recent paper extended stream fusion with support for SIMD operations, but I
don't think stream fusion can ever be made to fuse programs with unzip/cozip-like operators properly.
This is a serious problem for DPH, because the DPH vectoriser naturally produces code that contains these operators.

I'm currently working on Repa 4, which will include a GHC plugin that hijacks the intermediate GHC core code
and performs the transformation described in Richard Water's paper "Automatic transformation of
series expressions into loops". The plugin will apply to stream programs, but not affect the existing
fusion mechanism via delayed arrays. I'm using a cut down 'clock calculus' from work on synchronous
data-flow languages to guarantee that all outputs from an unzip operation are consumed in lock-step.
Programs that don't do this won't be well typed. Forcing synchronicity guarantees that Waters's
transform will apply to the program.

The Repa plugin will also do proper SIMD vectorisation for stream programs, producing the SIMD primops
that Geoff recently added. Along the way it will brutally convert all operations on boxed/lifted numeric
data to their unboxed equivalents, because I am sick of adding bang patterns to every single function
parameter in Repa programs. 

Ben.
Johan Tibell | 26 Apr 06:15 2013
Picon

Re: Stream fusion and span/break/group/init/tails

Hi Ben,

On Thu, Apr 25, 2013 at 7:46 PM, Ben Lippmeier <benl <at> ouroborus.net> wrote:
> The Repa plugin will also do proper SIMD vectorisation for stream programs, producing the SIMD primops
that Geoff recently added. Along the way it will brutally convert all operations on boxed/lifted numeric
data to their unboxed equivalents, because I am sick of adding bang patterns to every single function
parameter in Repa programs.

How far is this plugin from being usable to implement a

{-# LANGUAGE Strict #-}

pragma for treating a single module as if Haskell was strict?

Cheers,
Johan
Ben Lippmeier | 26 Apr 06:20 2013
Picon

Re: Stream fusion and span/break/group/init/tails


On 26/04/2013, at 2:15 PM, Johan Tibell wrote:

> Hi Ben,
> 
> On Thu, Apr 25, 2013 at 7:46 PM, Ben Lippmeier <benl <at> ouroborus.net> wrote:
>> The Repa plugin will also do proper SIMD vectorisation for stream programs, producing the SIMD primops
that Geoff recently added. Along the way it will brutally convert all operations on boxed/lifted numeric
data to their unboxed equivalents, because I am sick of adding bang patterns to every single function
parameter in Repa programs.
> 
> How far is this plugin from being usable to implement a
> 
> {-# LANGUAGE Strict #-}
> 
> pragma for treating a single module as if Haskell was strict?

There is already one that does this, but I haven't used it.

http://hackage.haskell.org/package/strict-ghc-plugin

It's one of the demo plugins, though you need to mark individual functions rather than the whole module
(which would be straightforward to add).

The Repa plugin is only supposed to munge functions using the Repa library, rather than the whole module.

Ben.
Johan Tibell | 26 Apr 07:35 2013
Picon

Re: Stream fusion and span/break/group/init/tails

On Thu, Apr 25, 2013 at 9:20 PM, Ben Lippmeier <benl <at> ouroborus.net> wrote:
>
> On 26/04/2013, at 2:15 PM, Johan Tibell wrote:
>
>> Hi Ben,
>>
>> On Thu, Apr 25, 2013 at 7:46 PM, Ben Lippmeier <benl <at> ouroborus.net> wrote:
>>> The Repa plugin will also do proper SIMD vectorisation for stream programs, producing the SIMD primops
that Geoff recently added. Along the way it will brutally convert all operations on boxed/lifted numeric
data to their unboxed equivalents, because I am sick of adding bang patterns to every single function
parameter in Repa programs.
>>
>> How far is this plugin from being usable to implement a
>>
>> {-# LANGUAGE Strict #-}
>>
>> pragma for treating a single module as if Haskell was strict?
>
> There is already one that does this, but I haven't used it.
>
> http://hackage.haskell.org/package/strict-ghc-plugin
>
> It's one of the demo plugins, though you need to mark individual functions rather than the whole module
(which would be straightforward to add).
>
> The Repa plugin is only supposed to munge functions using the Repa library, rather than the whole module.

I guess what I was really hoping for was a plugin that rigorously
defined what it meant to make the code strict at a source language
level, rather than at a "lets make all lets into cases" Core level. :)
Andrew Cowie | 26 Apr 07:30 2013

Re: Stream fusion and span/break/group/init/tails

On Thu, 2013-04-25 at 21:15 -0700, Johan Tibell wrote:

> {-# LANGUAGE Strict #-}

God, I would love this. Obviously the plugin approach could do it, but
could not GHC itself just _not create thunks_ for things unless told to
be lazy in the presence of such a pragma?

[at which point, we need an annotation for laziness, instead of the
annotation for strictness. We're not using ampersand for anything, are
we?

        func Int -> Thing -> WorldPeace
        func &a &b = ...

Ah, bikeshed, how we love thee]

AfC
Sydney
Johan Tibell | 26 Apr 07:34 2013
Picon

Re: Stream fusion and span/break/group/init/tails

On Thu, Apr 25, 2013 at 10:30 PM, Andrew Cowie
<andrew <at> operationaldynamics.com> wrote:
> On Thu, 2013-04-25 at 21:15 -0700, Johan Tibell wrote:
>
>> {-# LANGUAGE Strict #-}
>
> God, I would love this. Obviously the plugin approach could do it, but
> could not GHC itself just _not create thunks_ for things unless told to
> be lazy in the presence of such a pragma?
>
> [at which point, we need an annotation for laziness, instead of the
> annotation for strictness. We're not using ampersand for anything, are
> we?
>
>         func Int -> Thing -> WorldPeace
>         func &a &b = ...
>
> Ah, bikeshed, how we love thee]

We already have ~ that's used to make lazy patterns. :)

Gmane