Simon Peyton-Jones | 30 Nov 15:42 2012

GHC Performance Tsar

| > While writing a new nofib benchmark today I found myself wondering
| > whether all the nofib benchmarks are run just before each release,

I think we could do with a GHC Performance Tsar.  Especially now that Simon has changed jobs, we need to try
even harder to broaden the base of people who help with GHC.  It would be amazing to have someone who was
willing to:

 * Run nofib benchmarks regularly, and publish the results

 * Keep baseline figures for GHC 7.6, 7.4, etc so we can keep
   track of regressions

 * Investigate regressions to see where they come from; ideally
   propose fixes.

 * Extend nofib to contain more representative programs (as Johan is
   currently doing).

That would help keep us on the straight and narrow.  

Any offers?  It could be more than one person.
	
Simon

| -----Original Message-----
| From: glasgow-haskell-users-bounces <at> haskell.org [mailto:glasgow-haskell-
| users-bounces <at> haskell.org] On Behalf Of Simon Marlow
| Sent: 30 November 2012 12:11
| To: Johan Tibell
| Cc: glasgow-haskell-users

Tim Watson | 30 Nov 16:51 2012

Re: GHC Performance Tsar

Could we not configure travis-ci to run the benchmarks for us or something like that? A simple (free) ci
setup would be easier than finding a pair of hands to do this regularly I would've thought.


Simon Peyton-Jones | 30 Nov 16:54 2012

RE: GHC Performance Tsar

| Could we not configure travis-ci to run the benchmarks for us or
| something like that? A simple (free) ci setup would be easier than
| finding a pair of hands to do this regularly I would've thought.

Of course automation is great.  A pair of hands is still needed to figure out what to do, set it up, make sure
it stays working, and investigate performance regressions.  But it'd be silly to run nofib *manually*
every time!

Simon

| -----Original Message-----
| From: Tim Watson [mailto:watson.timothy <at> gmail.com]
| Sent: 30 November 2012 15:51
| To: Simon Peyton-Jones
| Cc: Simon Marlow; Johan Tibell; glasgow-haskell-users
| Subject: Re: GHC Performance Tsar

Nicolas Trangez | 30 Nov 17:20 2012

Re: GHC Performance Tsar

On Fri, 2012-11-30 at 15:51 +0000, Tim Watson wrote:
> Could we not configure travis-ci to run the benchmarks for us or
> something like that? A simple (free) ci setup would be easier than
> finding a pair of hands to do this regularly I would've thought.

AFAIK Travis uses some IaaS service (EC2 if I'm not mistaken) to execute
CI jobs. Experience has shown that benchmark results from VMs (especially
ones running on a public/shared IaaS service) are rather useless: there's
huge variance between results, even when executing the exact same
binaries, depending on host CPU and IO load (the latter even if your
benchmark doesn't perform any IO itself), time of day (which influences
load), the actual hardware your VM gets deployed on, and so on.
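
One way to check whether a given host is too noisy for benchmarking is to
run the same binary several times and look at the spread of the timings.
The following is only a minimal sketch of that idea: the timings are
made-up illustrative numbers, and the function names and the 5% threshold
are assumptions, not part of nofib or Travis.

```python
import statistics


def coefficient_of_variation(samples):
    """Return sample standard deviation divided by mean for a list of timings."""
    return statistics.stdev(samples) / statistics.mean(samples)


def is_too_noisy(samples, threshold=0.05):
    """True if repeated runs vary by more than `threshold` (hypothetical 5% cut-off)."""
    return coefficient_of_variation(samples) > threshold


# Made-up wall-clock times (seconds) for five runs of the same binary.
quiet_host = [10.0, 10.1, 9.9, 10.0, 10.05]
shared_vm = [10.0, 13.5, 9.2, 16.1, 11.4]

print(is_too_noisy(quiet_host))
print(is_too_noisy(shared_vm))
```

If the spread on a host is larger than the regressions you hope to detect,
results from that host cannot tell a real regression from noise.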

Just my .02,

Nicolas
Johan Tibell | 30 Nov 17:48 2012

Re: GHC Performance Tsar

Hi Simon,

I will try to find some time to set up an automatic run of nofib on my
buildbot (which is powerful enough) and have it graph the results over
time (and perhaps even email us when a benchmark dips).

-- Johan
Bryan O'Sullivan | 30 Nov 17:52 2012

Re: GHC Performance Tsar

On Fri, Nov 30, 2012 at 8:48 AM, Johan Tibell <johan.tibell <at> gmail.com> wrote:
> I will try to find some time to set up an automatic run of nofib on my
> buildbot (which is powerful enough) and have it graph the results over
> time (and perhaps even email us when a benchmark dips).

I'll pitch in with this too.
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users <at> haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Simon Peyton-Jones | 30 Nov 18:11 2012

RE: GHC Performance Tsar

If Bryan and Johan are the Performance Tsars the future looks bright.  Or at least fast.  Thank you.

Simon

From: Bryan O'Sullivan [mailto:bos <at> serpentine.com]
Sent: 30 November 2012 16:53
To: Johan Tibell
Cc: Simon Peyton-Jones; glasgow-haskell-users <at> haskell.org
Subject: Re: GHC Performance Tsar

 

Johan Tibell | 30 Nov 18:38 2012

Re: GHC Performance Tsar

On Fri, Nov 30, 2012 at 9:11 AM, Simon Peyton-Jones
<simonpj <at> microsoft.com> wrote:
> If Bryan and Johan are the Performance Tsars the future looks bright.  Or at
> least fast.  Thank you.

If someone could point me to the build bot script that we run today
that would be a great start.

-- Johan
Ian Lynagh | 30 Nov 21:37 2012

Re: GHC Performance Tsar

On Fri, Nov 30, 2012 at 09:38:10AM -0800, Johan Tibell wrote:
> On Fri, Nov 30, 2012 at 9:11 AM, Simon Peyton-Jones
> <simonpj <at> microsoft.com> wrote:
> > If Bryan and Johan are the Performance Tsars the future looks bright.  Or at
> > least fast.  Thank you.
> 
> If someone could point me to the build bot script that we run today
> that would be a great start.

The code is at http://darcs.haskell.org/builder/

The config, including the build steps, is attached.

Thanks
Ian

Attachment (Config.hs): text/x-haskell, 10 KiB
John Wiegley | 30 Nov 20:59 2012

Re: GHC Performance Tsar

>>>>> Bryan O'Sullivan <bos <at> serpentine.com> writes:

> On Fri, Nov 30, 2012 at 8:48 AM, Johan Tibell <johan.tibell <at> gmail.com> wrote:
> > I will try to find some time to set up an automatic run of nofib on my
> > buildbot (which is powerful enough) and have it graph the results over time
> > (and perhaps even email us when a benchmark dips).
>
> I'll pitch in with this too.

I'd like to offer to help with benchmarking on Mac x86_64, if it would be
useful to add another architecture to the mix.  I just need a little
hand-holding to get started.

-- 
John Wiegley
FP Complete                         Haskell tools, training and consulting
http://fpcomplete.com               johnw on #haskell/irc.freenode.net
Austin Seipp | 30 Nov 21:55 2012

Re: GHC Performance Tsar

I can also offer a decently spec'd Linux x86_64 machine, and a
functional OS X x86_64 Mountain Lion machine too. If possible I'll
offer my ARMv7 board as well, which currently fails late in the stage2
build on DPH. I haven't figured that one out just yet. These can
all be available on a regular basis (for nightly builds or whatever)
with little interruption. Any ARM machine now is slow enough that
builds would need to be once per day at best, anyway.

I was thinking of something like arewefastyet.com, which Mozilla has for
JavaScript, but comparing different GHC versions instead. CodeSpeed seems
to be that and much more, judging by the PyPy speed website. It
looks really nice. If it can accept JSON requests for build results
from certain platforms, I think that tying it into the current builder
infrastructure (which, as I understand it, runs nofib every night anyway)
would be relatively easy and save a lot of effort. It
looks like it tracks the differences between runs (and stores them in
a database), so you wouldn't need to use nofib-analyse or anything,
and could just submit raw metrics.

On Fri, Nov 30, 2012 at 1:59 PM, John Wiegley <johnw <at> fpcomplete.com> wrote:
> I'd like to offer to help with benchmarking on Mac x86_64, if it would be
> useful to add another architecture to the mix.  I just need a little
> hand-holding to get started.
>

You can find some information about the Builder infrastructure, which
currently controls the nightly build bots, here:
http://hackage.haskell.org/trac/ghc/wiki/Builder - I imagine any
solution will likely tie in with it (as Johan mentioned).

If you want to run nofib manually for fun to test results locally,
there's this page:
http://hackage.haskell.org/trac/ghc/wiki/Building/RunningNoFib

-- 
Regards,
Austin
Nicolas Trangez | 30 Nov 17:57 2012

Re: GHC Performance Tsar

On Fri, 2012-11-30 at 08:48 -0800, Johan Tibell wrote:
> Hi Simon,
> 
> I will try to find some time to set up an automatic run of nofib on my
> buildbot (which is powerful enough) and have it graph the results over
> time (and perhaps even email us when a benchmark dips).

You might be interested in CodeSpeed [1], an instance of which runs at
[2]. It supports different benchmark suites on different platforms,
compilers and compiler versions across different hosts (which might be
interesting for cross-architecture comparison?), and benchmark results
can be submitted to the app through an HTTP call in JSON format.

When integrated with some CI system, it can also point at commits
related to a certain benchmark run.
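
To give a feel for what "results in JSON format" could look like, here is
a minimal sketch that serialises one benchmark result. The field names,
the benchmark name, the host name and the commit id are all placeholders
invented for illustration; they are not CodeSpeed's actual schema, which
should be taken from its documentation, and no HTTP request is made here.

```python
import json


def make_result(project, executable, benchmark, environment, commit, seconds):
    """Build one benchmark result record as a plain dict (hypothetical fields)."""
    return {
        "project": project,
        "executable": executable,
        "benchmark": benchmark,
        "environment": environment,
        "commit": commit,
        "seconds": seconds,
    }


def encode_results(results):
    """Serialise a list of result records to a JSON string with stable key order."""
    return json.dumps(results, sort_keys=True)


# All values below are made up for the example.
payload = encode_results([
    make_result("ghc", "ghc-7.6", "example-benchmark", "example-host", "abc123", 1.25),
])
print(payload)
```

A builder would then send such a payload to whatever endpoint the results
server exposes, once per nightly run.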

Nicolas

[1] https://github.com/tobami/codespeed/
[2] http://speed.pypy.org/
David Terei | 1 Dec 00:44 2012

Re: GHC Performance Tsar

This is something I'd be happy to help out with.

On 30 November 2012 11:48, Johan Tibell <johan.tibell <at> gmail.com> wrote:
> Hi Simon,
>
> I will try to find some time to set up an automatic run of nofib on my
> buildbot (which is powerful enough) and have it graph the results over
> time (and perhaps even email us when a benchmark dips).
>
> -- Johan
Ben Lippmeier | 5 Dec 06:40 2012

Re: GHC Performance Tsar


On 01/12/2012, at 1:42 AM, Simon Peyton-Jones wrote:

> | > While writing a new nofib benchmark today I found myself wondering
> | > whether all the nofib benchmarks are run just before each release,
> 
> I think we could do with a GHC Performance Tsar.  Especially now that Simon has changed jobs, we need to try
even harder to broaden the base of people who help with GHC.  It would be amazing to have someone who was
willing to:
> 
> * Run nofib benchmarks regularly, and publish the results
> 
> * Keep baseline figures for GHC 7.6, 7.4, etc so we can keep
>   track of regressions
> 
> * Investigate regressions to see where they come from; ideally
>   propose fixes.
> 
> * Extend nofib to contain more representative programs (as Johan is
>   currently doing).
> 
> That would help keep us on the straight and narrow.  

I was running a performance regression buildbot for a while a year ago, but gave it up because I didn't have
time to chase down the breakages. At the time we were primarily worried about the asymptotic performance
of DPH, and fretting about a few percent absolute performance was too much of a distraction. 

However: if someone wants to pick this up then they may get some use out of the code I wrote for it. The
dph-buildbot package in the DPH repository should still compile. This package uses
http://hackage.haskell.org/package/buildbox-1.5.3.1 which includes code for running tests,
collecting the timings, comparing against a baseline, making pretty reports etc. There is then a second
package buildbox-tools which has a command line tool for listing the benchmarks that have deviated from
the baseline by a particular amount.
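
The idea of listing benchmarks that have deviated from a baseline by a
particular amount can be sketched as follows. This is not the buildbox or
buildbox-tools API, just a small standalone illustration; the function
name, the benchmark names, the timings and the 10% threshold are all
assumptions.

```python
def deviations(baseline, current, threshold=0.10):
    """Return (name, relative change) for benchmarks that moved by more than `threshold`.

    `baseline` and `current` map benchmark names to run times in seconds.
    Benchmarks missing from `current`, or with a zero baseline, are skipped.
    """
    flagged = []
    for name in sorted(baseline):
        if name not in current or baseline[name] == 0:
            continue
        change = (current[name] - baseline[name]) / baseline[name]
        if abs(change) > threshold:
            flagged.append((name, change))
    return flagged


# Made-up timings: "b" got 25% slower, "c" got 25% faster, "a" is within noise.
baseline = {"a": 1.0, "b": 2.0, "c": 4.0}
current = {"a": 1.05, "b": 2.5, "c": 3.0}

for name, change in deviations(baseline, current):
    print(f"{name}: {change:+.0%}")
```

Reporting both slowdowns and speedups is a deliberate choice here: an
unexpected speedup can also point at a change worth investigating.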

Here is an example of a report that dph-buildbot made: 

http://log.ouroborus.net/limitingfactor/dph/nightly-20110809_000147.txt

Ben.
Ryan Newton | 6 Dec 15:50 2012

Re: GHC Performance Tsar

I'm particularly interested in parallel performance in the >8 core space.  (In fact, we saw some regressions from 7.2->7.4 that we never tracked down properly, but maybe can now.)


If the buildbot can make it easy to add a new "slave" machine that runs and uploads its results to a central location, then I would be happy to donate a few hours of dedicated time (no other logins) on a 32-core Westmere machine, and hopefully other architectures soon.

Maybe this use case is well covered by creating a Jenkins/Travis slave and letting it move the data around?  (CodeSpeed looks pretty nice too.)

Cheers,
  -Ryan


