Twan van Laarhoven | 5 Oct 17:36

Proposal #2659: Add sortOn and friends to Data.List

Hello list,

Almost all uses of sortBy in user code use 'comparing', 'on' or a similar 
construction [1]. I think we should add a function that makes this common 
behavior more convenient:

     sortOn :: Ord b => (a -> b) -> [a] -> [a]

For consistency we should also add *On for the other *By functions in Data.List:

     nubOn           :: Eq b => (a -> b) -> [a] -> [a]
     deleteOn        :: Eq b => (a -> b) -> a -> [a] -> [a]
     deleteFirstsOn  :: Eq b => (a -> b) -> [a] -> [a] -> [a]
     unionOn         :: Eq b => (a -> b) -> [a] -> [a] -> [a]
     intersectOn     :: Eq b => (a -> b) -> [a] -> [a] -> [a]
     groupOn         :: Eq b => (a -> b) -> [a] -> [[a]]
     sortOn          :: Ord b => (a -> b) -> [a] -> [a]
     insertOn        :: Ord b => (a -> b) -> a -> [a] -> [a]
     maximumOn       :: Ord b => (a -> b) -> [a] -> a
     minimumOn       :: Ord b => (a -> b) -> [a] -> a
    (nubSortOn       :: Ord b => (a -> b) -> [a] -> [a]) -- see #2629

-- Naming --

"On": The reason for the "on" suffix is that it relates to the "on" function 
from Data.Function: "sortOn f = sort (compare `on` f)". In code, "sortOn fst" is 
reasonably natural, "sortBy fst" would be better but that is already taken.

"With": Another possible choice is the "with" suffix. There is some precedence 
for this choice [2]. A big disadvantage is that the "with" suffix is commonly 
(Continue reading)

Henning Thielemann | 5 Oct 18:00

Re: Proposal #2659: Add sortOn and friends to Data.List


On Sun, 5 Oct 2008, Twan van Laarhoven wrote:

> Almost all uses of sortBy in user code use 'comparing', 'on' or a similar 
> construction [1]. I think we should add a function that makes this common 
> behavior more convenient:
>
>    sortOn :: Ord b => (a -> b) -> [a] -> [a]
>
> For consistency we should also add *On for the other *By functions in 
> Data.List:
>
>    nubOn           :: Eq b => (a -> b) -> [a] -> [a]
>    deleteOn        :: Eq b => (a -> b) -> a -> [a] -> [a]
>    deleteFirstsOn  :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>    unionOn         :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>    intersectOn     :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>    groupOn         :: Eq b => (a -> b) -> [a] -> [[a]]
>    sortOn          :: Ord b => (a -> b) -> [a] -> [a]
>    insertOn        :: Ord b => (a -> b) -> a -> [a] -> [a]
>    maximumOn       :: Ord b => (a -> b) -> [a] -> a
>    minimumOn       :: Ord b => (a -> b) -> [a] -> a
>   (nubSortOn       :: Ord b => (a -> b) -> [a] -> [a]) -- see #2629

I also prefer these functions and have called them '*Key', because they 
work on a key:
    http://darcs.haskell.org/htam/src/Useful.hs

> Questions:
>  1. should sortOn be added to Data.List?
(Continue reading)

Duncan Coutts | 5 Oct 20:34
Gravatar

Re: Proposal #2659: Add sortOn and friends to Data.List

On Sun, 2008-10-05 at 17:39 +0200, Twan van Laarhoven wrote:
> Hello list,
> 
> 
> Almost all uses of sortBy in user code use 'comparing', 'on' or a similar 
> construction [1]. I think we should add a function that makes this common 
> behavior more convenient:
> 
>      sortOn :: Ord b => (a -> b) -> [a] -> [a]

I think fewer names and more combinations is usually best unless there
is a really compelling reason. Just because things are often used in
combination doesn't mean we have to make a new name to represent that
composition. As functional programmers we are very used to using
composition, especially simple function application.

What is wrong with

sortBy (comparing fieldFoo) ?

It's not very long, it's pretty descriptive and easily generalises using
the 'on' function.

The only thing missing here is equating. I still have to define that
myself to write:

groupBy (equating fieldBar)

I know I can write

(Continue reading)

Wolfgang Jeltsch | 7 Oct 13:17

Re: Proposal #2659: Add sortOn and friends to Data.List

Am Sonntag, 5. Oktober 2008 20:34 schrieb Duncan Coutts:
> I think fewer names and more combinations is usually best unless there
> is a really compelling reason. Just because things are often used in
> combination doesn't mean we have to make a new name to represent that
> composition. As functional programmers we are very used to using
> composition, especially simple function application.

I fully agree!

The identifiers of the proposed new functions are rather systematic.  So why 
not use the means of the language (combining higher-order functions) to 
express the underlying structure instead of relying on naming conventions and 
boilerplate code?

Best wishes,
Wolfgang
Henning Thielemann | 10 Oct 10:54

Re: Proposal #2659: Add sortOn and friends to Data.List


On Tue, 7 Oct 2008, Wolfgang Jeltsch wrote:

> Am Sonntag, 5. Oktober 2008 20:34 schrieb Duncan Coutts:
>> I think fewer names and more combinations is usually best unless there
>> is a really compelling reason. Just because things are often used in
>> combination doesn't mean we have to make a new name to represent that
>> composition. As functional programmers we are very used to using
>> composition, especially simple function application.
>
> I fully agree!
>
> The identifiers of the proposed new functions are rather systematic.  So why
> not use the means of the language (combining higher-order functions) to
> express the underlying structure instead of relying on naming conventions and
> boilerplate code?

  The problem is, that combinations of functions get larger and larger this 
way. I consider it good style to write short function definitions, which 
in turn use other functions, that make sense of their own. If every 
function definition is short, your argument would mean, that it is better 
not to define such functions at all, why not inlining all those 
definitions in the main program? Why having 'map' and 'filter', they are 
just 'foldr' with simple function arguments. I have once proposed to add 
'toMaybe' to Data.Maybe [1]. It was rejected, because it can easily be 
implemented with 'guard'. However, that's not the point. It's often the 
case that 'toMaybe' is simpler to apply and it denotes a frequent pattern. 
So I continue to define it in the Utility modules of many of my packages. 
The same discussion arose about 'concatMap' and 'intercalate'.
  I see two extremes:
(Continue reading)

Mitchell, Neil | 6 Oct 10:21
Favicon

RE: Proposal #2659: Add sortOn and friends to Data.List

Hi Twan,

> Almost all uses of sortBy in user code use 'comparing', 'on' 
> or a similar construction [1]. I think we should add a 
> function that makes this common behavior more convenient:
> 
>      sortOn :: Ord b => (a -> b) -> [a] -> [a]
> 
> For consistency we should also add *On for the other *By 
> functions in Data.List:
> 
>      nubOn           :: Eq b => (a -> b) -> [a] -> [a]
>      deleteOn        :: Eq b => (a -> b) -> a -> [a] -> [a]
>      deleteFirstsOn  :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>      unionOn         :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>      intersectOn     :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>      groupOn         :: Eq b => (a -> b) -> [a] -> [[a]]
>      sortOn          :: Ord b => (a -> b) -> [a] -> [a]
>      insertOn        :: Ord b => (a -> b) -> a -> [a] -> [a]
>      maximumOn       :: Ord b => (a -> b) -> [a] -> a
>      minimumOn       :: Ord b => (a -> b) -> [a] -> a
>     (nubSortOn       :: Ord b => (a -> b) -> [a] -> [a]) -- see #2629

All good, apart from deleteFirstsOn. I honestly don't know what that
function does, I was thinking possibly something like the first
component of tuples? But no real idea.

> When f is a slow function the call "sortOn f xs" recomputes f 
> more than necessary. A version that caches the result of f is 
> easy to define:
(Continue reading)

Twan van Laarhoven | 6 Oct 22:50

Re: Proposal #2659: Add sortOn and friends to Data.List

Mitchell, Neil wrote:
> Hi Twan,
> 
> 
>>Almost all uses of sortBy in user code use 'comparing', 'on' 
>>or a similar construction [1]. I think we should add a 
>>function that makes this common behavior more convenient:
>>
>>     sortOn :: Ord b => (a -> b) -> [a] -> [a]
>>
>>For consistency we should also add *On for the other *By 
>>functions in Data.List:
>>
>>     nubOn           :: Eq b => (a -> b) -> [a] -> [a]
>>     deleteOn        :: Eq b => (a -> b) -> a -> [a] -> [a]
>>     deleteFirstsOn  :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>>     unionOn         :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>>     intersectOn     :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>>     groupOn         :: Eq b => (a -> b) -> [a] -> [[a]]
>>     sortOn          :: Ord b => (a -> b) -> [a] -> [a]
>>     insertOn        :: Ord b => (a -> b) -> a -> [a] -> [a]
>>     maximumOn       :: Ord b => (a -> b) -> [a] -> a
>>     minimumOn       :: Ord b => (a -> b) -> [a] -> a
>>    (nubSortOn       :: Ord b => (a -> b) -> [a] -> [a]) -- see #2629
> 
> 
> All good, apart from deleteFirstsOn. I honestly don't know what that
> function does, I was thinking possibly something like the first
> component of tuples? But no real idea.

(Continue reading)

Favicon

Re: Proposal #2659: Add sortOn and friends to Data.List

How about capturing the pattern in higher order functions?

if we define:

> import Data.Function
> import Data.List

> data Cache b a = Cache b a
> fromCache (Cache _ a) = a
> toCache f a = Cache (f a) a
> cache (Cache b _) = b

> caching f g = map fromCache . g . map (toCache f)

then sortOn becomes:

> sortOn2 f = caching f (sortBy (compare `on` cache))

and with the instances

> instance Ord b => Ord (Cache b a) where
>    compare x y = compare (cache x) (cache y)
>
> instance Eq b => Eq (Cache b a) where
>    x == y = cache x == cache y

it gets even simpler:

> sortOn3 f = caching f sort

(Continue reading)

Sterling Clover | 7 Oct 00:36

Re: Proposal #2659: Add sortOn and friends to Data.List

On Mon, Oct 6, 2008 at 5:32 PM, Jean-Philippe Bernardy <bernardy <at> chalmers.se> wrote:
How about capturing the pattern in higher order functions?

I absolutely agree with the spirit of this. We're in a language with higher order functions and polymorphism -- the best there is for that! We should take advantage of this and do away with specialized implementations altogether, including By. The caching overhead is relatively minimal for simple cases, and for complex cases its a big win.

The problem with the "caching" solution as presented, however, is that it works for functions with the signature of sort. But most functions, such as maximum, have no such signature. But wait... polymorphism comes to the rescue!

First the preliminaries: 

> {-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FunctionalDependencies #-}
> module Schwartz where
> import Data.List
> import Control.Applicative
> import Data.Function

Then a type for our Schwartzian transform:

> data Schwartz a b = Schwartz {sw_out :: !a, sw_trans :: !b}
> instance Eq b => Eq (Schwartz a b) where (==) = (==) `on` sw_trans
> instance Ord b => Ord (Schwartz a b) where compare = compare `on` sw_trans

Injection: 

> sw_in f = Schwartz <*> f

And a higher order function for the [a] -> [a] case:

> schwartzAWith :: (a -> b) -> ([Schwartz a b] -> [Schwartz a b]) -> [a] -> [a]
> schwartzAWith f g = map sw_out . g . map (sw_in f)

And then for the [a] -> a case:

> schwartzBWith :: (a -> b) -> ([Schwartz a b] -> Schwartz a b) -> [a] -> a
> schwartzBWith f g =     sw_out . g . map (sw_in f)

A bit clunky, but a typeclass that embodies the pattern:

> class ToSchwartz a b c d | c -> d where
>    using :: (a -> b) -> ([Schwartz a b] -> c) -> [a] -> d

Instances: 

> instance ToSchwartz a b [Schwartz a b] [a] where
>    using u = schwartzAWith u
>
> instance ToSchwartz a b (Schwartz a b)  a  where
>    using u = schwartzBWith u

And now....
GHCi> using negate sort [1..5]
[5,4,3,2,1]
GHCi> using negate maximum [1..5]
1

A solution that would please the most discerning Haskell programmer (or Mel Brooks).

--Sterl.

_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries
Henning Thielemann | 10 Oct 11:03

Re: Proposal #2659: Add sortOn and friends to Data.List


On Mon, 6 Oct 2008, Jean-Philippe Bernardy wrote:

> How about capturing the pattern in higher order functions?
>
> if we define:
>
>> import Data.Function
>> import Data.List
>
>> data Cache b a = Cache b a
>> fromCache (Cache _ a) = a
>> toCache f a = Cache (f a) a
>> cache (Cache b _) = b
>
>> caching f g = map fromCache . g . map (toCache f)

I've already done this with plain pairs - a distinct type like Cache is of 
course better. See attachKey:
   http://darcs.haskell.org/htam/src/Useful.hs
Bart Massey | 6 Oct 11:39
Favicon
Gravatar

Re: Proposal #2659: Add sortOn and friends to Data.List

Twan van Laarhoven <twanvl <at> gmail.com> writes:
> Almost all uses of sortBy in user code use 'comparing',
> 'on' or a similar construction [1].

I certainly agree that there is room for improvement here.
But I hate to see the proliferation of functions, almost
all alike.  I guess I would support the proposal if that's
the way folks would like to go, though.

I would prefer:

  1) Get rid of any need for Data.Function by exporting
     'fix' and 'on' into the Prelude.  The rest of it is
     already in the Prelude anyway, and the implementations
     of 'fix' and 'on' are trivial.  (I'd say Data.Function
     is kind of a failed experiment at this point, but what
     do I know?)

  2) Move the sort functions into Data.Ord.  Obviously, they
     would have to be re-exported by Data.List.  It seems to
     me that 'sort' should have been there in the first
     place, and that 'sortBy" should be in the same place as
     'sort'.

  3) Incorporate the 'Down' (Is 'Descending' too long?) type
     in Data.Ord as suggested (but see below).

Now Data.Ord would be the one-stop shopping place for all
our sorting needs.

If all of this were done, I really wouldn't mind importing
Data.Ord to get the sorting functions and continuing to
write 'sortBy (comparing f)' or 'sortBy (p `on` f)' when I
wanted to do such a thing.  Even 'sortBy (comparing Down
`on` f)' isn't so bad.

Minor nits in the proposal:

  * You need to implement either (<=) or `comparing` for
    Down (preferably `comparing`), not just (<).  Otherwise
    sort nonterminates with a stack overflow.  This one took
    me a while to figure out.

    As far as I can tell you'll need to make Down an
    instance of Eq as well as Ord to get the latter instance
    to typecheck.

    These problems suggest that you haven't yet tried your
    proposal at all. :-) :-)

  * getDown might better be called fromDown?

    Bart Massey
    bart <at> cs.pdx.edu
Malcolm Wallace | 8 Oct 11:55

Re: Proposal #2659: Add sortOn and friends to Data.List

>   1) Get rid of any need for Data.Function by exporting
>      'fix' and 'on' into the Prelude.

This can't be done, for one simple reason: the Prelude is Haskell'98,
and therefore currently immutable.

Regards,
    Malcolm
apfelmus | 7 Oct 11:44
Favicon

Re: Proposal #2659: Add sortOn and friends to Data.List

Twan van Laarhoven wrote:

> -- Proposal--
> 
> Ticket:   #2659 (for sortOn and friends)
> Ticket:   #2660 (for Down newtype)
> Deadline: now+2 weeks = 2008-10-20
> 
> Questions:
>   1. should sortOn be added to Data.List?
>   2. should all other *On functions be added as well?
>   3. what name should these functions get?
>   4. should the sortOn' variations be added? What about the naming?
>   5. should Down be added to Data.Ord?

-1 concerning sortOn and friends

I think that

  sortBy (comparing f)

is a very good and modular name for the functionality intended, no need
to create another one. The only advantage of a special  sortOn  function
would be that you can document it, i.e. there's no haddock blurb for
sortBy (comparing f) .

Hm, though  sortOn  could be useful if it caches the values of  f  by
default, like the  sortOn'  proposed. But Jean-Philippe's code is a
beautiful solution for this situation.

+1 concerning Down,

but I don't like the name. In particular I don't like the "get" prefix
in "getDown". I'd simply use

  newtype Desc a = Desc a

without label or  fromDesc  if really necessary.

Regards,
apfelmus
Duncan Coutts | 7 Oct 23:04
Gravatar

Re: Proposal #2659: Add sortOn and friends to Data.List

On Tue, 2008-10-07 at 11:44 +0200, apfelmus wrote:

> I think that
> 
>   sortBy (comparing f)
> 
> is a very good and modular name for the functionality intended, no need
> to create another one. The only advantage of a special  sortOn  function
> would be that you can document it, i.e. there's no haddock blurb for
> sortBy (comparing f) .

Lets fix the sortBy documentation to mention the comparing idiom.

> Hm, though  sortOn  could be useful if it caches the values of  f  by
> default, like the  sortOn'  proposed. But Jean-Philippe's code is a
> beautiful solution for this situation.
> 
> 
> +1 concerning Down,
> 
> but I don't like the name. In particular I don't like the "get" prefix
> in "getDown". I'd simply use

Can't we do it with just some compare flip function:

sortBy (thing compare)

or

sortBy (thing $ comparing fieldFoo)

Would that work and what would be a good name?

Duncan
apfelmus | 9 Oct 12:13
Favicon

Re: Proposal #2659: Add sortOn and friends to Data.List

Duncan Coutts wrote:
>> +1 concerning Down,
>>
>> but I don't like the name. In particular I don't like the "get" prefix
>> in "getDown". I'd simply use
> 
> Can't we do it with just some compare flip function:
> 
> sortBy (thing compare)
> 
> or
> 
> sortBy (thing $ comparing fieldFoo)
> 
> Would that work and what would be a good name?

thing = flip  works :)

But  Down  could be useful on its own, for instance for converting
a heap that stores elements in maximum order into one that stores them
in minimum order, i.e. when there is an Ord class constraint instead of
a simple higher-order function. Some exploration of the design space here

  http://thread.gmane.org/gmane.comp.lang.haskell.cafe/35988

Hm, I think that the  Cache b a  thing is more general, though, it may
be worth extending this to a full proposal.

Regards,
apfelmus
Isaac Dupree | 15 Oct 22:06

Re: Proposal #2659: Add sortOn and friends to Data.List

apfelmus wrote:
> but I don't like the name. In particular I don't like the "get" prefix
> in "getDown". I'd simply use
> 
>   newtype Desc a = Desc a

what's up with "Down", "Descending" etc.?  "Ord" is a 
symmetric, mathematical concept, no more "ascending" than 
"descending"!  It's just a convention (created/used by 
functions like "sort") to sort ascending, and it annoys me a 
little that they're not called "sortAsc" (sortAscending) to 
say what they do.  I'd suggest something like "ReverseOrd" 
or "ReverseOrder", (or "Opposite..."?) (or some abbreviated 
version?)

-Isaac

Re: Proposal #2659: Add sortOn and friends to Data.List

Twan van Laarhoven wrote:
> Hello list,
>
>
> Almost all uses of sortBy in user code use 'comparing', 'on' or a similar
> construction [1]. I think we should add a function that makes this common
> behavior more convenient:
>
>     sortOn :: Ord b => (a -> b) -> [a] -> [a]
>
> For consistency we should also add *On for the other *By functions in
> Data.List:
>
>     nubOn           :: Eq b => (a -> b) -> [a] -> [a]
>     deleteOn        :: Eq b => (a -> b) -> a -> [a] -> [a]
>     deleteFirstsOn  :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>     unionOn         :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>     intersectOn     :: Eq b => (a -> b) -> [a] -> [a] -> [a]
>     groupOn         :: Eq b => (a -> b) -> [a] -> [[a]]
>     sortOn          :: Ord b => (a -> b) -> [a] -> [a]
>     insertOn        :: Ord b => (a -> b) -> a -> [a] -> [a]
>     maximumOn       :: Ord b => (a -> b) -> [a] -> a
>     minimumOn       :: Ord b => (a -> b) -> [a] -> a
>    (nubSortOn       :: Ord b => (a -> b) -> [a] -> [a]) -- see #2629

Too many functions, and too little added functionality, IMHO.

>     newtype Down a = Down { getDown :: a }

I like this one.

Bertram
roconnor | 9 Oct 20:25
Favicon

Re: Proposal #2659: Add sortOn and friends to Data.List

>>     groupOn         :: Eq b => (a -> b) -> [a] -> [[a]]

I don't know how other people use groupBy, but I'm almost always looking 
for something of the the type

Eq b => (a -> b) -> [a] -> [(b,[a])]

I just bring this up as something to think about.  It is easy to construct 
from groupOn, however it is almost as easy to construct from groupBy.  So 
groupOn doesn't really gain me much. :)

--

-- 
Russell O'Connor                                      <http://r6.ca/>
``All talk about `theft,''' the general counsel of the American Graphophone
Company wrote, ``is the merest claptrap, for there exists no property in
ideas musical, literary or artistic, except as defined by statute.''

Gmane