Ketil Malde | 1 Nov 08:45 2007

Re: Re: Why can't Haskell be faster?

Don Stewart <dons <at> galois.com> writes:

> goalieca:

>>    So in a few years time when GHC has matured we can expect performance to
>>    be on par with current Clean? So Clean is a good approximation to peak
>>    performance?

If I remember the numbers, Clean is pretty close to C for most
benchmarks, so I guess it is fair to say it is a good approximation to
practical peak performance.

Which proves that it is possible to write efficient low-level code in
Clean. 

> And remember usually Haskell is competing against 'high level' languages
> like python for adoption, where we're 5-500x faster anyway...

Unfortunately, they replaced line counts with bytes of gzip'ed code --
while the former certainly has its problems, I simply cannot imagine
what relevance the latter has (beyond hiding extreme amounts of
repetitive boilerplate in certain languages).

When we compete against Python and its ilk, we do so for programmer
productivity first, and performance second.  LOC was a nice measure,
and encouraged terser and more idiomatic programs than the current
crop of performance-tweaked low-level stuff.

BTW, Python isn't so bad, performance-wise.  Much of what I do
consists of reading some files, building up some hashes (associative
arrays) [...]

Rodrigo Queiro | 1 Nov 10:09 2007

Re: Re: Why can't Haskell be faster?

I assume the reason they switched away from LOC is to prevent
programmers from artificially reducing their LOC count, e.g. by using
a = 5; b = 6;
rather than
a = 5;
b = 6;

in languages where newlines aren't syntactically significant. When
gzipped, I guess that the ";\n" string will be represented about as
efficiently as just the single semi-colon.
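Rodrigo's guess is easy to check with a quick sketch (Python used purely for illustration; exact byte counts depend on the compressor and level):

```python
import zlib

# The same two assignments, written on one line and on two lines.
one_line = b"a = 5; b = 6;"
two_lines = b"a = 5;\nb = 6;\n"

# zlib (the algorithm underneath gzip) is deterministic for a given
# input, so the sizes are stable across runs.
gz_one = len(zlib.compress(one_line))
gz_two = len(zlib.compress(two_lines))

print(gz_one, gz_two)
```

With a typical zlib the two sizes differ by only a byte or two, supporting the point that ";\n" costs about as much as a bare ";" after compression.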

On 01/11/2007, Ketil Malde <ketil+haskell <at> ii.uib.no> wrote:
> Don Stewart <dons <at> galois.com> writes:
>
> > goalieca:
>
> >>    So in a few years time when GHC has matured we can expect performance to
> >>    be on par with current Clean? So Clean is a good approximation to peak
> >>    performance?
>
> If I remember the numbers, Clean is pretty close to C for most
> benchmarks, so I guess it is fair to say it is a good approximation to
> practical peak performance.
>
> Which proves that it is possible to write efficient low-level code in
> Clean.
>
> > And remember usually Haskell is competing against 'high level' languages
> > like python for adoption, where we're 5-500x faster anyway...
>

Bryan O'Sullivan | 1 Nov 16:19 2007

Re: Re: Why can't Haskell be faster?

Ketil Malde wrote:

> Python used to do pretty well here compared
> to Haskell, with rather efficient hashes and text parsing, although I
> suspect ByteString IO and other optimizations may have changed that
> now. 

It still does just fine.  For typical "munge a file with regexps, lists, 
and maps" tasks, Python and Perl remain on par with comparably written 
Haskell.  This is because the scripting-level code acts as a thin layer of 
glue around I/O, regexps, lists, and dicts, all of which are written in 
native code.

The Haskell regexp libraries actually give us something of a leg down 
with respect to Python and Perl.  The aggressive use of polymorphism in 
the return type of (=~) makes it hard to remember which of the possible 
return types gives me what information.  Not only did I write a regexp 
tutorial to understand the API in the first place, I have to reread it 
every time I want to match a regexp.

A suitable solution would be a return type of RegexpMatch a => Maybe a 
(to live alongside the existing types, but aiming to become the one 
that's easy to remember), with appropriate methods on a, but I don't 
have time to write up a patch.

ChrisK | 1 Nov 19:16 2007

Regex API ideas

Hi Bryan,

  I wrote the current regex API, so your suggestions are interesting to me.  The
same goes for anyone else's regex API opinions, of course.

Bryan O'Sullivan wrote:
> Ketil Malde wrote:
> 
>> Python used to do pretty well here compared
>> to Haskell, with rather efficient hashes and text parsing, although I
>> suspect ByteString IO and other optimizations may have changed that
>> now. 

> 
> It still does just fine.  For typical "munge a file with regexps, lists,
> and maps" tasks, Python and Perl remain on par with comparably written
> Haskell.  This is because the scripting-level code acts as a thin layer of
> glue around I/O, regexps, lists, and dicts, all of which are written in
> native code.
> 
> The Haskell regexp libraries actually give us something of a leg down
> with respect to Python and Perl.

True, the pure Haskell library is not as fast as a C library.  In particular,
the current regex-tdfa handles lazy bytestring in a sub-optimal manner.  This
may eventually be fixed.

But the native code libraries have also been wrapped in the same API, and they
are quite fast when combined with strict ByteStrings.


Bryan O'Sullivan | 1 Nov 19:34 2007

Re: Regex API ideas

ChrisK wrote:

>> The Haskell regexp libraries actually give us something of a leg down
>> with respect to Python and Perl.
> 
> True, the pure Haskell library is not as fast as a C library.

Actually, I wasn't referring to the performance of the libraries, merely 
to the non-stick nature of the API.  For my purposes, regex-pcre 
performs well (though I owe you some patches to make it, and other regex 
back ends, compile successfully out of the box).

> But more interesting to me is learning what API you would like to see.
> What would you like the code that uses the API to be?

Python's regexp API is pretty easy to use, and also to remember.  Here's 
what it does for match objects.

http://docs.python.org/lib/match-objects.html
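For contrast, this is what the match-object API linked above looks like in practice: `re.search` always returns either `None` or a single `Match` object, so there is exactly one set of accessors to remember (a minimal sketch with a made-up pattern and input):

```python
import re

# One return type: None on failure, a Match object on success.
m = re.search(r"(\w+)@(\w+)", "contact: dons@galois")

if m is not None:
    print(m.group(0))          # the whole match: "dons@galois"
    print(m.group(1))          # first capture group: "dons"
    print(m.group(2))          # second capture group: "galois"
    print(m.start(), m.end())  # offsets of the match in the input
```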

Tim Newsham | 1 Nov 19:46 2007

Re: Re: Why can't Haskell be faster?

> Unfortunately, they replaced line counts with bytes of gzip'ed code --
> while the former certainly has its problems, I simply cannot imagine
> what relevance the latter has (beyond hiding extreme amounts of
> repetitive boilerplate in certain languages).

Sounds pretty fair to me.  Programming is a job of compressing a solution 
set.  Excessive boilerplate might mean that you have to type a lot, but 
doesn't necessarily mean that you have to think a lot.  I think the 
previous line count was skewed in favor of very terse languages like 
Haskell, especially languages that let you put many ideas onto a single 
line.  At the very least there should be a constant factor applied when 
comparing Haskell line counts to Python line counts, for example 
(Python has very strict rules about putting multiple things on the same 
line).

Obviously no simple measure is going to satisfy everyone, but I think the 
gzip measure is more even-handed across a range of languages.  It probably 
more closely approximates the amount of mental effort, and hence time, it 
takes to construct a program (i.e. I can whip out a lot of lines of code 
in Python very quickly, but it takes a lot more of them to do the same 
work as a single, dense line of Haskell code).

> When we compete against Python and its ilk, we do so for programmer
> productivity first, and performance second.  LOC was a nice measure,
> and encouraged terser and more idiomatic programs than the current
> crop of performance-tweaked low-level stuff.

The Haskell entries to the shootout are very obviously written for speed 
and not elegance.  If you want to do better on the LOC measure, you can 
definitely do so (at the expense of speed).

Sebastian Sylvan | 1 Nov 19:58 2007

Re: Re: Why can't Haskell be faster?

On 01/11/2007, Tim Newsham <newsham <at> lava.net> wrote:
> > Unfortunately, they replaced line counts with bytes of gzip'ed code --
> > while the former certainly has its problems, I simply cannot imagine
> > what relevance the latter has (beyond hiding extreme amounts of
> > repetitive boilerplate in certain languages).
>
> Sounds pretty fair to me.  Programming is a job of compressing a solution
> set.  Excessive boilerplate might mean that you have to type a lot, but
> doesn't necessarily mean that you have to think a lot.  I think the
> previous line count was skewed in favor of very terse languages like
> haskell, especially languages that let you put many ideas onto a single
> line.  At the very least there should be a constant factor applied when
> comparing haskell line counts to python line counts, for example.
> (python has very strict rules about putting multiple things on the same
> line).
>
> Obviously no simple measure is going to satisfy everyone, but I think the
> gzip measure is more even-handed across a range of languages.  It probably
> more closely approximates the amount of mental effort, and hence time, it
> takes to construct a program (i.e. I can whip out a lot of lines of code
> in Python very quickly, but it takes a lot more of them to do the same
> work as a single, dense line of Haskell code).
>
> > When we compete against Python and its ilk, we do so for programmer
> > productivity first, and performance second.  LOC was a nice measure,
> > and encouraged terser and more idiomatic programs than the current
> > crop of performance-tweaked low-level stuff.
>
> The haskell entries to the shootout are very obviously written for speed
> and not elegance.  If you want to do better on the LoC measure, you can

Ketil Malde | 2 Nov 09:42 2007

Re: Re: Why can't Haskell be faster?

"Sebastian Sylvan" <sebastian.sylvan <at> gmail.com> writes:

[LOC vs gz as a program complexity metric]

>> Obviously no simple measure is going to satisfy everyone, but I think the
>> gzip measure is more even handed across a range of languages.  
>> It probably more closely aproximates the amount of mental effort [..]

I'm not sure I follow that reasoning?

At any rate, I think the ICFP contest is much better as a measure of
productivity. But, just like for performance, LOC for the shootout can
be used as a micro-benchmark. 

> Personally I think syntactic noise is highly distracting, and semantic
> noise is even worse!

This is important - productivity doesn't depend so much on the actual
typing as on the ease of refactoring and of identifying and fixing
bugs, i.e. *reading* code.

Verbosity means noise, and also lower information content in a
screenful of code.

I think there were some (Erlang?) papers that showed a correlation
between program size (in LOC), time of development, and possibly
number of bugs - regardless of language.

> Token count would be good, but then we'd need a parser for
> each language, which is quite a bit of work to do...

Bulat Ziganshin | 2 Nov 09:34 2007

Re[2]: Re: Why can't Haskell be faster?

Hello Sebastian,

Thursday, November 1, 2007, 9:58:45 PM, you wrote:

> the ideal. Token count would be good, but then we'd need a parser for
> each language, which is quite a bit of work to do...

I think that wc (word count) would be a good enough approximation.

-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin <at> gmail.com
Sebastian Sylvan | 2 Nov 13:03 2007

Re: Re[2]: Re: Why can't Haskell be faster?

On 02/11/2007, Bulat Ziganshin <bulat.ziganshin <at> gmail.com> wrote:
> Hello Sebastian,
>
> Thursday, November 1, 2007, 9:58:45 PM, you wrote:
>
> > the ideal. Token count would be good, but then we'd need a parser for
> > each language, which is quite a bit of work to do...
>
> i think that wc (word count) would be good enough approximation
>

Yes, as long as you police abuse (e.g.
"if(somevar)somefunccall(foo,bar,baz)" shouldn't be treated as a
single word).
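The abuse described above is easy to demonstrate with a toy word counter (a hypothetical sketch; Python's `str.split` stands in for `wc -w`):

```python
# Bulat's wc-style metric: count whitespace-separated words.
def word_count(src: str) -> int:
    return len(src.split())

# Honestly spaced code versus the crammed-together spelling: the
# abusive version collapses to a single "word", gaming the metric.
honest = "if somevar :\n    somefunccall ( foo , bar , baz )"
abusive = "if(somevar):somefunccall(foo,bar,baz)"

print(word_count(honest), word_count(abusive))
```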

-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
