Twan van Laarhoven | 18 Dec 17:06

Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Hello all,

As Thomas Hartman noted in a recent cafe thread [1], haskell 1.3 
included the functions 'subsequences' and 'permutations'. I think these 
functions are quite useful, and I don't know why they were ever removed. 
This is a proposal to add these two functions to Data.List. The 
implementation is taken directly from the Haskell 1.3 report [2].

Trac ticket: #1990
Discussion period ends: January 7th (since there is a holiday comming up)

Twan

[1] http://article.gmane.org/gmane.comp.lang.haskell.cafe/33535
[2] http://haskell.cs.yale.edu/haskell-report/List.html
Neil Mitchell | 18 Dec 17:34

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Hi

> As Thomas Hartman noted in a recent cafe thread [1], haskell 1.3
> included the functions 'subsequences' and 'permutations'. I think these
> functions are quite useful, and I don't know why they were ever removed.

Agreed and support. Obvious functions with unambiguous names.

Thanks

Neil
David Benbennick | 18 Dec 20:18

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Support, though I'd tweak the code in your patch a little:

subsequences            :: [a] -> [[a]]
subsequences []         =  [[]]
subsequences (x:xs)     =  let s = subsequences xs in s ++ map (x:) s

permutations            :: [a] -> [[a]]
permutations []         =  [[]]
permutations (x:xs)     =  concatMap interleave $ permutations xs
  where interleave []     =  [[x]]
        interleave (y:ys) =  (x:y:ys) : map (y:) (interleave ys)

In subsequences, make sure we don't calculate "subsequences xs" twice.
 In permutations, use : instead of ++ in one place, use concatMap
instead of list comprehension, and take out unnecessary parameter to
interleave.
Bertram Felgenhauer | 18 Dec 21:56

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

David Benbennick wrote:
> Support, though I'd tweak the code in your patch a little:
> 
> subsequences            :: [a] -> [[a]]
> subsequences []         =  [[]]
> subsequences (x:xs)     =  let s = subsequences xs in s ++ map (x:) s

This will cause a space leak, as 's' will be kept around while its first
copy is consumed.

(1)
  subsequences (x:xs)     =  let s = subsequences xs
                             in  concatMap (\xs -> [xs, x:xs]) s

behaves better in that regard, but changes the order of the result lists.
This version can also be made to work for infinite lists with a small
modification,

(2)
  subsequences (x:xs)     =  let s = subsequences xs
                             in  [] : tail (concatMap (\ys -> [ys, x:ys]) s)

finally, we could make it slightly more lazy, as follows:

(3)
  subsequences :: [a] -> [[a]]
  subsequences xs = [] : case xs of
      []     -> []
      (x:xs) -> tail (concatMap (\ys -> [ys, x:ys]) (subsequences xs))

(Continue reading)

Twan van Laarhoven | 18 Dec 23:09

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Bertram Felgenhauer wrote:
> finally, we could make it slightly more lazy, as follows:
> 
> (3)
>   subsequences :: [a] -> [[a]]
>   subsequences xs = [] : case xs of
>       []     -> []
>       (x:xs) -> tail (concatMap (\ys -> [ys, x:ys]) (subsequences xs))

You can get rid of the ugly call to 'tail':

     subsequences :: [a] -> [[a]]
     subsequences xs = [] : subsequences' xs
       where subsequences' []     = []
             subsequences' (x:xs) = [x] : concatMap (\ys -> [ys, x:ys])
                                                    (subsequences' xs)

> The only problem I see it that the output order is changed - does anybody
> feel that it is particularily important?

I don't think it matters much. This new version starts with things from 
the start of the list, as opposed to the end:

   subsequencesOld "abc" == ["","c","b","bc","a","ac","ab","abc"]
   subsequencesNew "abc" == ["","a","b","ab","c","ac","bc","abc"]

If anything, I think this makes more sense, especially since it works 
for infinite lists.

Twan
(Continue reading)

David Benbennick | 19 Dec 01:27

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

On Dec 18, 2007 12:56 PM, Bertram Felgenhauer
<bertram.felgenhauer <at> googlemail.com> wrote:
> finally, we could make it slightly more lazy

Good point, your version is much better.

The same issue applies to permutations.  I haven't had time to write
out the code yet, but I can imagine a version of permutations that
does:

permutations [1..] =
[1...],
[2,1,  ...],
[1,3,2 ...],
[2,3,1 ...],
[3,1,2 ...],
[3,2,1 ...],
...

so that the expression (take 10 $ map (take 10) $ permutations [1..])
isn't bottom.
Twan van Laarhoven | 19 Dec 03:25

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

David Benbennick wrote:

> On Dec 18, 2007 12:56 PM, Bertram Felgenhauer
> <bertram.felgenhauer <at> googlemail.com> wrote:
> 
>>finally, we could make it slightly more lazy
> 
> 
> Good point, your version is much better.
> 
> The same issue applies to permutations.  I haven't had time to write
> out the code yet, but I can imagine a version of permutations that
> does:  ...

Using mutual recursion between a version including the identity and one 
not including it, you can get:

     permutations1            :: [a] -> [[a]]
     permutations1 xxs        = xxs : permutations3b xxs
     permutations1' []        = []
     permutations1' (x:xs)    = tail $ concatMap interleave
                                         $ permutations1 xs
       where interleave []     =  [[x]]
             interleave (y:ys) =  (x:y:ys) : map (y:) (interleave ys)

Testing:

     > map (take 5) $ take 5 $ permutations1 [1..]
     [[1,2,3,4,5],[2,1,3,4,5],[2,3,1,4,5],[2,3,4,1,5],[2,3,4,5,1]]

(Continue reading)

David Benbennick | 19 Dec 03:52

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

On Dec 18, 2007 6:25 PM, Twan van Laarhoven <twanvl <at> gmail.com> wrote:
> David Benbennick wrote:
> > The same issue applies to permutations.  I haven't had time to write
> > out the code yet, but I can imagine a version of permutations that
> > does:  ...
>
> Using mutual recursion between a version including the identity and one
> not including it, you can get:

Actually, I would like permutations to satisfy:

  map (take n) (take (factorial n) $ permutations [1..]) == permutations [1..n]

Not only is that a neat property, but it forces maximal laziness.
That means, for example, that:

  take 6 $ map (take 3) $ permutations (1:2:3:undefined)

doesn't throw an exception.
Twan van Laarhoven | 20 Dec 01:57

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

David Benbennick wrote:
> Actually, I would like permutations to satisfy:
> 
>   map (take n) (take (factorial n) $ permutations [1..]) == permutations [1..n]
> 
> Not only is that a neat property, but it forces maximal laziness.

That would indeed be nice. After a lot of work I have come up with a 
function that does this:

   permutations4 xs = perms xs [[]] []
    where
     perms xxs ps qs = map (++xxs) ps
                    ++ case xxs of
                          []     -> []
                          (x:xs) -> perms xs
                                    (concatMap (interleave x) (ps ++ qs))
                                    (map (++[x]) (ps ++ qs))
     interleave x []      =  []
     interleave x (y:ys)  =  (x:y:ys) : map (y:) (interleave x ys)

Unfortunatly this function is really slow, 10.734375 sec, compared to 
3.875000 sec for the initial version (see previous message for details 
of testing). This is of course the result of way too many (++) calls.

Twan
Twan van Laarhoven | 20 Dec 05:43

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Twan van Laarhoven wrote:
> Unfortunatly this function is really slow, 10.734375 sec, compared to 
> 3.875000 sec for the initial version (see previous message for details 
> of testing). This is of course the result of way too many (++) calls.
> 
> Twan

I agree :). I have spent some time optimizing the function (both for 
readability and speed). I ended up with

   permutations5 xs = xs : perms xs [[]]
    where
     perms []     ps = []
     perms (x:xs) ps = concatMap interleave' ps
                    ++ perms xs
                      (concatMap interleave  ps)
      where interleave  []     = [[x]]
            interleave  (y:ys) = (x:y:ys)     : map (x:) (interleave  ys)
            interleave' []     = []
            interleave' (y:ys) = (x:y:ys++xs) : map (x:) (interleave' ys)

Or using 'foldr' instead of 'concatMap'

   permutations6 xs = xs : perms xs [[]]
    where
     perms []     ps = []
     perms (x:xs) ps = (flip . foldr) (interleave' id) ps
                     $ perms xs
                     $ (flip . foldr) (interleave  id) ps []
       where interleave  f []     r  =  f [x] : r
(Continue reading)

Chris Kuklewicz | 20 Dec 14:19

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Unfortunately, permutations5 has an error:

> *Main> permutations5 [1..3]
> [[1,2,3],[2,1,3],[3,2,1],[3,3,1],[3,2,2],[3,3,2]]
> *Main> permutations6 [1..3]
> [[1,2,3],[2,1,3],[3,2,1],[2,3,1],[3,1,2],[1,3,2]]
Twan van Laarhoven | 20 Dec 16:05

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Chris Kuklewicz wrote:
> Unfortunately, permutations5 has an error:
> 
> 
>>*Main> permutations5 [1..3]
>>[[1,2,3],[2,1,3],[3,2,1],[3,3,1],[3,2,2],[3,3,2]]
>>*Main> permutations6 [1..3]
>>[[1,2,3],[2,1,3],[3,2,1],[2,3,1],[3,1,2],[1,3,2]]

Oops, the "map (x:) ..." in interleave should be "map (y:) ...",

  permutations5 xs = xs : perms xs [[]]
    where
     perms []     ps = []
     perms (x:xs) ps = concatMap interleave' ps
                    ++ perms xs
                      (concatMap interleave  ps)
      where interleave  []     = [[x]]
            interleave  (y:ys) = (x:y:ys)     : map (y:) (interleave  ys)
            interleave' []     = []
            interleave' (y:ys) = (x:y:ys++xs) : map (y:) (interleave' ys)

Twan
David Benbennick | 20 Dec 18:25

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

It just occurred to me that permutations doesn't quite agree with the
mathematical concept:

>>> permutations "aaa"
["aaa","aaa","aaa","aaa","aaa","aaa"]

But that's not really a problem, since you can always just compose
permutations with (Set.toList . Set.fromList).

On Dec 20, 2007 7:05 AM, Twan van Laarhoven <twanvl <at> gmail.com> wrote:
> Chris Kuklewicz wrote:
> > Unfortunately, permutations5 has an error:

I suggest including some QuickCheck properties with this proposal,
especially a property that would have caught that error.  I'm thinking
of four properties:

1) If x = permutations y, then check length x is right.
2) Check each element of x is a permutation of y
3) Check each permutation of y is an element of x
4) Check for laziness

By the way, to improve mailing list laziness: the proposal is at
http://hackage.haskell.org/trac/ghc/ticket/1990
Yitzchak Gale | 20 Dec 23:41

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

David Benbennick wrote:
> Actually, I would like permutations to satisfy:
> map (take n) (take (factorial n) $ permutations [1..]) == permutations [1..n]

Twan van Laarhoven wrote:
>   permutations5 xs = xs : perms xs [[]]
>     where
>      perms []     ps = []
>      perms (x:xs) ps = concatMap interleave' ps
>                     ++ perms xs
>                       (concatMap interleave  ps)
>       where interleave  []     = [[x]]
>             interleave  (y:ys) = (x:y:ys)     : map (y:) (interleave  ys)
>             interleave' []     = []
>             interleave' (y:ys) = (x:y:ys++xs) : map (y:) (interleave' ys)

Twan, I think you are working a little too hard to satisfy
the consistency property. You only need to satisfy it for
permutations where the first n elements of the permutation
are the same as the first n elements of the original list.
Other than that, you can just use the faster function that you
defined earlier.

Here is a quick effort that beats permutations5 by using your
previous permutations3:

permutations7 xs = xs : (concat $ zipWith newPerms (init $ tail $ tails xs)
                                                   (init $ tail $ inits xs))
  where
    newPerms (t:ts) = map (++ts) . concatMap (interleave t) . permutations3
(Continue reading)

Twan van Laarhoven | 21 Dec 05:28

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Yitzchak Gale wrote:
> Twan, I think you are working a little too hard to satisfy
> the consistency property. You only need to satisfy it for
> permutations where the first n elements of the permutation
> are the same as the first n elements of the original list.
> Other than that, you can just use the faster function that you
> defined earlier.
> 
> Here is a quick effort that beats permutations5 by using your
> previous permutations3:
> 
> permutations7 xs = xs : (concat $ zipWith newPerms (init $ tail $ tails xs)
>                                                    (init $ tail $ inits xs))
>   where
>     newPerms (t:ts) = map (++ts) . concatMap (interleave t) . permutations3
>     interleave t [y]        = [[t, y]]
>     interleave t ys@(y:ys') = (t:ys) : map (y:) (interleave t ys')

That looks quite nice. Unfortunatly your function is too strict with the 
normal inits and tails. After replacing those with these lazier versions 
(which should be in Data.List) it works much better.

    inits' xxs = [] : case xxs of
                        []     -> []
                        (x:xs) -> map (x:) (inits' xs)

    tails' xxs = xxs : case xxs of
                        []     -> []
                        (_:xs) -> tails' xs

(Continue reading)

Yitzchak Gale | 22 Dec 19:28

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Twan van Laarhoven wrote:
>   permutations7':                        4.750 sec
>   permutations7', using 3 for recursion: 4.250 sec
>   permutations8:                         3.984 sec
>   permutations8b:                        2.250 sec
>   permutations8b, using 3 for recursion: 1.984 sec

Great work!

> My current preference is 8 or 8b, using a different function in the
> recursion is going to far for my taste.

Perhaps they could be combined somehow, e.g., re-use
interleave.

You should run your times with a control that just adds
up 10! copies of [1..10]. On my machine, that takes about
1 sec.

So when we subtract out the control, satisfying the consistency
property increases run time by at least a factor of 4.
The property is nice, but is it worth that penalty?
I'm not sure, I'd be interested in hearing other people's
opinions.

Regards,
Yitz
Yitzchak Gale | 22 Dec 19:34

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

> So when we subtract out the control, satisfying the consistency
> property increases run time by at least a factor of 4.

Oops, a factor of 3. But anyway, same point.

> The property is nice, but is it worth that penalty?
> I'm not sure, I'd be interested in hearing other people's
> opinions.

-Yitz
Twan van Laarhoven | 24 Dec 06:58

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Twan van Laarhoven wrote:
 >   permutations8b xs = xs : newPerms xs []
 >     where
 >      newPerms []     is = []
 >      newPerms (t:ts) is = foldr (interleave id) (newPerms ts (t:is))
 >                                 (permutations8b is)
 >       where interleave f []         r = r
 >             interleave f yys@(y:ys) r = f (t:yys++ts)
 >                                        : interleave (f . (y:)) ys r

Replacing interleave with

       interleave xs r = snd $ interleave' id xs r
       interleave' f []      r = (ts, r)
       interleave' f y(y:ys) r = let (us,zs) = interleave' (f . (y:)) ys r
                                 in  (y:us, f (t:y:us) : zs)

Gets rid of the (++). I call this version 8c. This has the run times:

     permuations8c, permutations3 in recursion    1.969 sec
     permuations8c, permutations8c in recursion   2.094 sec

The difference between these two variants is quite small, not worth the 
effort in my opinion. This is also getting quite close to the optimal 
non-lazy version (permutations3, 1.625 sec).

Yitzchak Gale wrote:
> So when we subtract out the control, satisfying the consistency
> property increases run time by at least a factor of 3.

(Continue reading)

Yitzchak Gale | 24 Dec 14:05

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

I wrote:
>> So when we subtract out the control, satisfying the consistency
>> property increases run time by at least a factor of 3.

Twan van Laarhoven wrote:
> What exactly are you calculating here?

The cost of the logic that actually generates the permutations,
after subtracting off the cost of adding up 10! lists of 10 numbers,
which is an artifact of our ad hoc performance test. True,
we also subtract off the cost of iterating over all of those
copies, which, due to laziness, is shared somewhat with the
generating algorithm. But that sharing is common to all algorithms,
so I think this is fair.

Anyway, no need to dwell on this point. It is not so relevant
anymore, see below.

>> The property is nice, but is it worth that penalty?
>> I'm not sure, I'd be interested in hearing other people's
>> opinions.

No one else responded. I am taking that as support for
the consistency condition and Twan's approach.
Besides, after Twan's further progress, there is hardly
a penalty anymore (less than a factor of 2 even according
to my logic, if you accept it).

As one further sanity check, let's see if we are re-inventing
the wheel by checking Knuth.
(Continue reading)

kahl | 19 Dec 02:33
Favicon

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990)

Bertram Felgenhauer proposed the following version of subsequences:

 > 
 > (3)
 >   subsequences :: [a] -> [[a]]
 >   subsequences xs = [] : case xs of
 >       []     -> []
 >       (x:xs) -> tail (concatMap (\ys -> [ys, x:ys]) (subsequences xs))

I think that lazyness is attractive, and sequence of results is
unimportant, so agree with his choice of (3).

I see additional improvements in saving the tail:

subsequences, nonEmptySubsequences :: [a] -> [[a]]
subsequences xs = [] : nonEmptySubsequences xs

nonEmptySubsequences [] = []
nonEmptySubsequences (x:xs) = [x] :
    -- concatMap (\ ys -> [ys, x:ys]) (nonEmptySubsequences xs)
    foldr f [] (nonEmptySubsequences xs)
  where f ys r = ys : (x : ys) : r

Testing this with:

main = do
  a : _ <- getArgs
  putStrLn $ show $ sum $ map sum $ subsequences [1 .. read a]

and ghc-6.8.2 -O produces the following user times
(Continue reading)

Twan van Laarhoven | 9 Jan 00:58

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990), discussion period end

Twan van Laarhoven wrote:
> Hello all,
> 
> As Thomas Hartman noted in a recent cafe thread [1], haskell 1.3 
> included the functions 'subsequences' and 'permutations'. I think these 
> functions are quite useful, and I don't know why they were ever removed. 
> This is a proposal to add these two functions to Data.List. The 
> implementation is taken directly from the Haskell 1.3 report [2].
> 
> Trac ticket: #1990
> Discussion period ends: January 7th (since there is a holiday comming up)

The discussion period for this proposal is over. Everyone here seems to 
agree it is a good idea.

The patch attached to the trac ticket contains the best versions discussed 
in this thread:
  - Wolfram(kahl <at> cas.mcmaster.ca)'s version of subsequences
  - my permutations8c

Can someone with the power to do so apply this patch?

Twan
Simon Marlow | 10 Jan 15:33

Re: Add 'subsequences' and 'permutations' to Data.List (ticket #1990), discussion period end

Twan van Laarhoven wrote:
> Twan van Laarhoven wrote:
>> Hello all,
>>
>> As Thomas Hartman noted in a recent cafe thread [1], haskell 1.3 
>> included the functions 'subsequences' and 'permutations'. I think 
>> these functions are quite useful, and I don't know why they were ever 
>> removed. This is a proposal to add these two functions to Data.List. 
>> The implementation is taken directly from the Haskell 1.3 report [2].
>>
>> Trac ticket: #1990
>> Discussion period ends: January 7th (since there is a holiday comming up)
> 
> The discussion period for this proposal is over. Everyone here seems to 
> agree it is a good idea.
> 
> The patch attached to the trac ticket contains the best versions 
> discussed in this thread:
>  - Wolfram(kahl <at> cas.mcmaster.ca)'s version of subsequences
>  - my permutations8c
> 
> Can someone with the power to do so apply this patch?

I've milestoned it for inclusion in GHC 6.10.

Cheers,
	Simon

Gmane