Eastsun | 4 Oct 20:25
Favicon

A suggestion to optimize the for loops


By the scala reference,the expression  
for(x <- e;if y) z
would translate to
e.filter((x1,x2..,xn) => y).foreach{case x => z}

It would call the filter method in this way,and sometimes the filter method
is very expensive.
Because the filter method may create a new object of type e.
And we can avoid this by translate the expression to:
e.foreach{ case x if y =>z; case _ => }

For example:

/////////////////////////////////////////////////////////////////////////////
Welcome to Scala version 2.7.2.RC1 (Java HotSpot(TM) Client VM, Java
1.6.0_10-rc).
Type in expressions to have them evaluated.
Type :help for more information.

scala> def test(call : => Unit){
     |     val start = System.currentTimeMillis
     |     var cnt = 0
     |     while(cnt < 10000){
     |        call
     |        cnt += 1
     |     }
     |     println((System.currentTimeMillis - start)+" ms")
     | }
test: (=> Unit)Unit
(Continue reading)

Harshad | 5 Oct 14:47
Gravatar

Re: A suggestion to optimize the for loops

I saw a 6 fold improvement in performance, when I used a larger range (1 to
10000) instead of (1 to 1000)

scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum += i}}
13608 ms

scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) => sum +=
n;case _ => }}
2384 ms

scala> // Running again to confirm warm up:

scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) => sum +=
n;case _ => }}
2321 ms

scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum += i}}
13477 ms

Eastsun wrote:

> scala> test(rawM1)
> 6219 ms
> 
> scala> test(rawM2)
> 2046 ms
> 

David Pollak | 5 Oct 15:27
Gravatar

Re: Re: A suggestion to optimize the for loops

A couple of observations:
  • Benchmarking anything on the JVM requires "warming up" HotSpot... running a few million iterations of what you're trying to test before running the actual benchmark.  Further, there are material differences in how HotSpot optimizes between 1.5 and 1.6, desktop and server mode, 32 and 64 bit mode, and Intel vs. SPARC.  Put another way, stuff on my MacBook Pro (1.5, 32 bit) has radically different (and much worse) performance characteristics than does stuff run on my Linux box (1.6, 64 bit)
  • There is an optimization mode for ScalaC... did you use that when compiling the code?
  • While your suggestion may work for List and Seq, for comprehensions work with *any* classes that have map, flatMap, filter, and foreach.  If the meaning of "filter" was different for my custom class, removing the filter call from the generated code would change the meaning of my program.
  • This seems like premature optimization to me.  I've been doing Scala coding for nearly 2 years.  I've rarely had to focus on the performance of a given method.  When I have had to focus on performance, I manually recode things as while loops which gives markedly better performance.  And I've got a track record of writing very high performance code... among other things, I've written the worlds fastest spreadheet engine... and yes, it was JVM based.


On Sun, Oct 5, 2008 at 5:47 AM, Harshad <harshad.rj <at> gmail.com> wrote:
I saw a 6 fold improvement in performance, when I used a larger range (1 to
10000) instead of (1 to 1000)

scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum += i}}
13608 ms

scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) => sum +=
n;case _ => }}
2384 ms

scala> // Running again to confirm warm up:

scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) => sum +=
n;case _ => }}
2321 ms

scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum += i}}
13477 ms

Eastsun wrote:

> scala> test(rawM1)
> 6219 ms
>
> scala> test(rawM2)
> 2046 ms
>





--
Lift, the simply functional web framework http://liftweb.net
Collaborative Task Management http://much4.us
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp

Re: Re: A suggestion to optimize the for loops

 
  • .... among other things, I've written the worlds fastest spreadheet engine... and yes, it was JVM based.


Interesting! Would you care educating the ignorant?

--
 __~O
-\ <,       Christos KK Loverdos
(*)/ (*)      http://ckkloverdos.com
David Pollak | 5 Oct 15:58
Gravatar

Re: Re: A suggestion to optimize the for loops



On Sun, Oct 5, 2008 at 6:44 AM, Christos KK Loverdos <loverdos <at> gmail.com> wrote:
 
  • .... among other things, I've written the worlds fastest spreadheet engine... and yes, it was JVM based.


Interesting! Would you care educating the ignorant?

Integer was a multi-user server-based spreadsheet written in Java.  It was accessible by custom app, by HTML browser, by applet running in browser.  It accepted real-time data feeds (I invented the real-time spreadsheet in 1992) and had lots of other features (replication, audit logs, etc.)

On a single CPU, WinTel machine, Integer had numeric performance equal to that of Excel.   However, Integer broke the recalculation task down into separate groups and those groups could be distributed across many CPUs (the algorythm, if I do say so myself, was pretty clever, as it was O(n log n) for time and O(n) for space except in pathological cases... and it even detected pathological cases and switched to alternative algorythms.)

As part of being onstage for the Java 2 launch, Sun put Integer through its paces.  We ran Integer on an E-10000 with 32 CPUs.  Integer scaled linearly to using 16 CPUs and plateaued at 20 CPUs.   The tests had to do with calculating risk across large portfolios.  Integer outperformed every system (spreadsheet or custom app that Sun had ever seen) for this particular benchmark.
 


--
 __~O
-\ <,       Christos KK Loverdos
(*)/ (*)      http://ckkloverdos.com



--
Lift, the simply functional web framework http://liftweb.net
Collaborative Task Management http://much4.us
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp

Re: Re: A suggestion to optimize the for loops

Impressive!

Thanx for the info.


Respectfully yours,

Christos.

On Sun, Oct 5, 2008 at 4:58 PM, David Pollak <feeder.of.the.bears <at> gmail.com> wrote:


On Sun, Oct 5, 2008 at 6:44 AM, Christos KK Loverdos <loverdos <at> gmail.com> wrote:
 
  • .... among other things, I've written the worlds fastest spreadheet engine... and yes, it was JVM based.


Interesting! Would you care educating the ignorant?

Integer was a multi-user server-based spreadsheet written in Java.  It was accessible by custom app, by HTML browser, by applet running in browser.  It accepted real-time data feeds (I invented the real-time spreadsheet in 1992) and had lots of other features (replication, audit logs, etc.)

On a single CPU, WinTel machine, Integer had numeric performance equal to that of Excel.   However, Integer broke the recalculation task down into separate groups and those groups could be distributed across many CPUs (the algorythm, if I do say so myself, was pretty clever, as it was O(n log n) for time and O(n) for space except in pathological cases... and it even detected pathological cases and switched to alternative algorythms.)

As part of being onstage for the Java 2 launch, Sun put Integer through its paces.  We ran Integer on an E-10000 with 32 CPUs.  Integer scaled linearly to using 16 CPUs and plateaued at 20 CPUs.   The tests had to do with calculating risk across large portfolios.  Integer outperformed every system (spreadsheet or custom app that Sun had ever seen) for this particular benchmark.
 


--
 __~O
-\ <,       Christos KK Loverdos
(*)/ (*)      http://ckkloverdos.com



--
Lift, the simply functional web framework http://liftweb.net
Collaborative Task Management http://much4.us
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp



--
 __~O
-\ <,       Christos KK Loverdos
(*)/ (*)      http://ckkloverdos.com
Sean McDirmid | 5 Oct 15:41

Re: Re: A suggestion to optimize the for loops

There is some ground in the middle for non-closure based looping in
Scala--I don't think its a matter of if but when. It is very annoying
to recode loops as while loops, though agree map/foreach/filter are
very convenient.

The question is can we get a base level of optimization that
inlines/eliminates simple closures in a predictable consistent way?
Right now, optimization in Scala is this "fragile/experimental/you
probably don't want to use it very much" kind of thing. Simple closure
elimination/inlining will have to go into the the standard
non-optimize version of scalac like tail recursion elimination already
is. For that to happen, the Scala compiler will have to be more
aggressive about inlining small foreach methods...to then inline the
closure argument...its not that hard actually. The question is
when/what to inline, methods like Range.foreach seem obvious.

This isn't really something we can leave up to the JIT. It has no idea
about closures, and sees them as just objects with real identity, hash
codes, equality, locks, and so on... Perhaps the CLR is better about
this, as closures/delegates aren't objects.

Sean

On Sun, Oct 5, 2008 at 9:27 PM, David Pollak
<feeder.of.the.bears <at> gmail.com> wrote:
> A couple of observations:
>
> Benchmarking anything on the JVM requires "warming up" HotSpot... running a
> few million iterations of what you're trying to test before running the
> actual benchmark.  Further, there are material differences in how HotSpot
> optimizes between 1.5 and 1.6, desktop and server mode, 32 and 64 bit mode,
> and Intel vs. SPARC.  Put another way, stuff on my MacBook Pro (1.5, 32 bit)
> has radically different (and much worse) performance characteristics than
> does stuff run on my Linux box (1.6, 64 bit)
> There is an optimization mode for ScalaC... did you use that when compiling
> the code?
> While your suggestion may work for List and Seq, for comprehensions work
> with *any* classes that have map, flatMap, filter, and foreach.  If the
> meaning of "filter" was different for my custom class, removing the filter
> call from the generated code would change the meaning of my program.
> This seems like premature optimization to me.  I've been doing Scala coding
> for nearly 2 years.  I've rarely had to focus on the performance of a given
> method.  When I have had to focus on performance, I manually recode things
> as while loops which gives markedly better performance.  And I've got a
> track record of writing very high performance code... among other things,
> I've written the worlds fastest spreadheet engine... and yes, it was JVM
> based.
>
> On Sun, Oct 5, 2008 at 5:47 AM, Harshad <harshad.rj <at> gmail.com> wrote:
>>
>> I saw a 6 fold improvement in performance, when I used a larger range (1
>> to
>> 10000) instead of (1 to 1000)
>>
>> scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum += i}}
>> 13608 ms
>>
>> scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) => sum
>> +=
>> n;case _ => }}
>> 2384 ms
>>
>> scala> // Running again to confirm warm up:
>>
>> scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) => sum
>> +=
>> n;case _ => }}
>> 2321 ms
>>
>> scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum += i}}
>> 13477 ms
>>
>> Eastsun wrote:
>>
>> > scala> test(rawM1)
>> > 6219 ms
>> >
>> > scala> test(rawM2)
>> > 2046 ms
>> >
>>
>>
>
>
>
> --
> Lift, the simply functional web framework http://liftweb.net
> Collaborative Task Management http://much4.us
> Follow me: http://twitter.com/dpp
> Git some: http://github.com/dpp
>

David Pollak | 5 Oct 15:46
Gravatar

Re: Re: A suggestion to optimize the for loops



On Sun, Oct 5, 2008 at 6:41 AM, Sean McDirmid <sean.mcdirmid <at> gmail.com> wrote:
There is some ground in the middle for non-closure based looping in
Scala--I don't think its a matter of if but when. It is very annoying
to recode loops as while loops, though agree map/foreach/filter are
very convenient.

The question is can we get a base level of optimization that
inlines/eliminates simple closures in a predictable consistent way?
Right now, optimization in Scala is this "fragile/experimental/you
probably don't want to use it very much" kind of thing. Simple closure
elimination/inlining will have to go into the the standard
non-optimize version of scalac like tail recursion elimination already
is. For that to happen, the Scala compiler will have to be more
aggressive about inlining small foreach methods...to then inline the
closure argument...its not that hard actually. The question is
when/what to inline, methods like Range.foreach seem obvious.

This isn't really something we can leave up to the JIT. It has no idea
about closures, and sees them as just objects with real identity, hash
codes, equality, locks, and so on...

Too bad you're dead wrong about this.  HotSpot will inline a variety of code.  In the case of function that are called from a single site (as is the case for the anonymous functions generated as part of the for comprehension and passed to filter/map, etc., these will be inlined... no stack frame generated, etc. 
 
Perhaps the CLR is better about
this, as closures/delegates aren't objects.

Sean


On Sun, Oct 5, 2008 at 9:27 PM, David Pollak
<feeder.of.the.bears <at> gmail.com> wrote:
> A couple of observations:
>
> Benchmarking anything on the JVM requires "warming up" HotSpot... running a
> few million iterations of what you're trying to test before running the
> actual benchmark.  Further, there are material differences in how HotSpot
> optimizes between 1.5 and 1.6, desktop and server mode, 32 and 64 bit mode,
> and Intel vs. SPARC.  Put another way, stuff on my MacBook Pro (1.5, 32 bit)
> has radically different (and much worse) performance characteristics than
> does stuff run on my Linux box (1.6, 64 bit)
> There is an optimization mode for ScalaC... did you use that when compiling
> the code?
> While your suggestion may work for List and Seq, for comprehensions work
> with *any* classes that have map, flatMap, filter, and foreach.  If the
> meaning of "filter" was different for my custom class, removing the filter
> call from the generated code would change the meaning of my program.
> This seems like premature optimization to me.  I've been doing Scala coding
> for nearly 2 years.  I've rarely had to focus on the performance of a given
> method.  When I have had to focus on performance, I manually recode things
> as while loops which gives markedly better performance.  And I've got a
> track record of writing very high performance code... among other things,
> I've written the worlds fastest spreadheet engine... and yes, it was JVM
> based.
>
> On Sun, Oct 5, 2008 at 5:47 AM, Harshad <harshad.rj <at> gmail.com> wrote:
>>
>> I saw a 6 fold improvement in performance, when I used a larger range (1
>> to
>> 10000) instead of (1 to 1000)
>>
>> scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum += i}}
>> 13608 ms
>>
>> scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) => sum
>> +=
>> n;case _ => }}
>> 2384 ms
>>
>> scala> // Running again to confirm warm up:
>>
>> scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) => sum
>> +=
>> n;case _ => }}
>> 2321 ms
>>
>> scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum += i}}
>> 13477 ms
>>
>> Eastsun wrote:
>>
>> > scala> test(rawM1)
>> > 6219 ms
>> >
>> > scala> test(rawM2)
>> > 2046 ms
>> >
>>
>>
>
>
>
> --
> Lift, the simply functional web framework http://liftweb.net
> Collaborative Task Management http://much4.us
> Follow me: http://twitter.com/dpp
> Git some: http://github.com/dpp
>



--
Lift, the simply functional web framework http://liftweb.net
Collaborative Task Management http://much4.us
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp
David MacIver | 5 Oct 16:13

Re: Re: A suggestion to optimize the for loops

On Sun, Oct 5, 2008 at 2:46 PM, David Pollak
<feeder.of.the.bears <at> gmail.com> wrote:
>
>
> On Sun, Oct 5, 2008 at 6:41 AM, Sean McDirmid <sean.mcdirmid <at> gmail.com>
> wrote:
>>
>> There is some ground in the middle for non-closure based looping in
>> Scala--I don't think its a matter of if but when. It is very annoying
>> to recode loops as while loops, though agree map/foreach/filter are
>> very convenient.
>>
>> The question is can we get a base level of optimization that
>> inlines/eliminates simple closures in a predictable consistent way?
>> Right now, optimization in Scala is this "fragile/experimental/you
>> probably don't want to use it very much" kind of thing. Simple closure
>> elimination/inlining will have to go into the the standard
>> non-optimize version of scalac like tail recursion elimination already
>> is. For that to happen, the Scala compiler will have to be more
>> aggressive about inlining small foreach methods...to then inline the
>> closure argument...its not that hard actually. The question is
>> when/what to inline, methods like Range.foreach seem obvious.
>>
>> This isn't really something we can leave up to the JIT. It has no idea
>> about closures, and sees them as just objects with real identity, hash
>> codes, equality, locks, and so on...
>
> Too bad you're dead wrong about this.  HotSpot will inline a variety of
> code.  In the case of function that are called from a single site (as is the
> case for the anonymous functions generated as part of the for comprehension
> and passed to filter/map, etc., these will be inlined... no stack frame
> generated, etc.

First off, the primary benefit of inlining closures isn't primarily
about inlining the corresponding method calls. The biggest single
direct benefit would be to dramatically reduce the number of small
classes being generated (yes, this is a big deal. For those of us not
writing web applications the small classes are a big pain point
because they really hurt startup time).

Second off, even ignoring the small classes issue and focusing on code
performance hotspot's inlining capabilities are insufficient for the
purpose here. In order to see why we need to realise what inlining is
for rather than to treat it as a goal in and of itself.

The purpose of inlining is not primarily to remove the overhead of
function calls. Doing so certainly helps, but if that was all it did
it would be a much less important optimisation. The purpose of
inlining is to increase the scope of other optimisations, as most
optimisations run on the body of a single method so by inlining other
methods into the current one you have a wider range of applicable
optimisations.

Hotspot is very good at inlining. And that's fine and dandy. It's
great for widening the scope of optimisations that hotspot does. What
does it do for widening the scope of optimisations scalac does?
Absolutely nothing, because it runs after scalac has finished and spat
out the bytecode.

Among the optimisations that need doing are removing box/unbox code.
scala is much more polymorphic in its primitives than Java code tends
to be (the Range example that started this thread is an instance of
that) and type erasure makes that hurt a lot. Scala is pretty good
about generating code which uses primitives in some cases, but when
passing through calls to anonymous functions it sucks at it. Inlining
function calls allows you to widen the area it can leave things
unboxed because you've removed the passage through erased code. Again,
taking the Range as an example, you'd find the performance of it very
close to that of your manually rewritten while loops if scalac were
better about inlining the definition.

Could hotspot do some of this? Maybe. It certainly doesn't do it now.
Eventually it probably will - there is a lot of talk (and I think
prototype implementations?) about bringing in scalar replacement to
replace objects with exploded versions when they don't escape from
their allocating method, which would help in cases like this. But
there are limits to how smart it can be because Java makes strong
guarantees about object identity. Thus there will probably remain
valid optimisations for Scala which hotspot can't perform.

Sean McDirmid | 5 Oct 16:19

Re: Re: A suggestion to optimize the for loops

I'm wondering if we should consider proposing that functions not
inherit from AnyRef anymore. They don't really have a useful notion of
equality and people really shouldn't be calling synchronized on them.
By making functions not really objects, it would clear the way for
more aggressive optimizations in the future.

On Sun, Oct 5, 2008 at 10:13 PM, David MacIver <david.maciver <at> gmail.com> wrote:

> Could hotspot do some of this? Maybe. It certainly doesn't do it now.
> Eventually it probably will - there is a lot of talk (and I think
> prototype implementations?) about bringing in scalar replacement to
> replace objects with exploded versions when they don't escape from
> their allocating method, which would help in cases like this. But
> there are limits to how smart it can be because Java makes strong
> guarantees about object identity. Thus there will probably remain
> valid optimisations for Scala which hotspot can't perform.
>

David MacIver | 5 Oct 16:22

Re: Re: A suggestion to optimize the for loops

On Sun, Oct 5, 2008 at 3:19 PM, Sean McDirmid <sean.mcdirmid <at> gmail.com> wrote:
> I'm wondering if we should consider proposing that functions not
> inherit from AnyRef anymore. They don't really have a useful notion of
> equality and people really shouldn't be calling synchronized on them.
> By making functions not really objects, it would clear the way for
> more aggressive optimizations in the future.

I don't think that would be particularly beneficial. For example a lot
of things extend Function, so many functions legitimately are AnyRefs.
The only functions one should really be aggressively optimised are
ones created locally as lambdas.

Sean McDirmid | 5 Oct 16:27

Re: Re: A suggestion to optimize the for loops

I don't see why we'd want to shut off inheritance, just put them in a
separate tree and don't give them anyref by default. I guess one model
that we could follow would be .NET with its separate inheritance tree
for delegates, though delegates seem to be objects there.

Sean

On Sun, Oct 5, 2008 at 10:22 PM, David MacIver <david.maciver <at> gmail.com> wrote:
> On Sun, Oct 5, 2008 at 3:19 PM, Sean McDirmid <sean.mcdirmid <at> gmail.com> wrote:
>> I'm wondering if we should consider proposing that functions not
>> inherit from AnyRef anymore. They don't really have a useful notion of
>> equality and people really shouldn't be calling synchronized on them.
>> By making functions not really objects, it would clear the way for
>> more aggressive optimizations in the future.
>
> I don't think that would be particularly beneficial. For example a lot
> of things extend Function, so many functions legitimately are AnyRefs.
> The only functions one should really be aggressively optimised are
> ones created locally as lambdas.
>

Sean McDirmid | 5 Oct 15:59

Re: Re: A suggestion to optimize the for loops

Are you positive this will always occur?

I don't know, I doubt it especially after the conversations I've had
with Hotspot engineers at Sun a few years ago. And even if the apply
method is inlined, the object for the closure will still be allocated
on the heap. We can argue that the object is short live and so will
never make it out of the nursery, but this will still cause some
pressure. Finally, there are so many special conditions that have to
occur for hotspot to kick in, and you have no idea of knowing if your
code will qualify for whatever magical optimization that hotspot
performs. An optimization that you can't depend on is useless (tm
Alastair Reid).

At anyrate, its easy enough to tell if inlining is occurring just by
micro-benchmarking a for loop (after warm up in hotspot), which I
think plenty of people have done. Mixed inconclusive results
follow....

Sean

On Sun, Oct 5, 2008 at 9:46 PM, David Pollak
<feeder.of.the.bears <at> gmail.com> wrote:
>
>
> Too bad you're dead wrong about this.  HotSpot will inline a variety of
> code.  In the case of function that are called from a single site (as is the
> case for the anonymous functions generated as part of the for comprehension
> and passed to filter/map, etc., these will be inlined... no stack frame
> generated, etc.
>
>>
>> Perhaps the CLR is better about
>> this, as closures/delegates aren't objects.
>>
>> Sean
>>
>>
>> On Sun, Oct 5, 2008 at 9:27 PM, David Pollak
>> <feeder.of.the.bears <at> gmail.com> wrote:
>> > A couple of observations:
>> >
>> > Benchmarking anything on the JVM requires "warming up" HotSpot...
>> > running a
>> > few million iterations of what you're trying to test before running the
>> > actual benchmark.  Further, there are material differences in how
>> > HotSpot
>> > optimizes between 1.5 and 1.6, desktop and server mode, 32 and 64 bit
>> > mode,
>> > and Intel vs. SPARC.  Put another way, stuff on my MacBook Pro (1.5, 32
>> > bit)
>> > has radically different (and much worse) performance characteristics
>> > than
>> > does stuff run on my Linux box (1.6, 64 bit)
>> > There is an optimization mode for ScalaC... did you use that when
>> > compiling
>> > the code?
>> > While your suggestion may work for List and Seq, for comprehensions work
>> > with *any* classes that have map, flatMap, filter, and foreach.  If the
>> > meaning of "filter" was different for my custom class, removing the
>> > filter
>> > call from the generated code would change the meaning of my program.
>> > This seems like premature optimization to me.  I've been doing Scala
>> > coding
>> > for nearly 2 years.  I've rarely had to focus on the performance of a
>> > given
>> > method.  When I have had to focus on performance, I manually recode
>> > things
>> > as while loops which gives markedly better performance.  And I've got a
>> > track record of writing very high performance code... among other
>> > things,
>> > I've written the worlds fastest spreadheet engine... and yes, it was JVM
>> > based.
>> >
>> > On Sun, Oct 5, 2008 at 5:47 AM, Harshad <harshad.rj <at> gmail.com> wrote:
>> >>
>> >> I saw a 6 fold improvement in performance, when I used a larger range
>> >> (1
>> >> to
>> >> 10000) instead of (1 to 1000)
>> >>
>> >> scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum +=
>> >> i}}
>> >> 13608 ms
>> >>
>> >> scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) =>
>> >> sum
>> >> +=
>> >> n;case _ => }}
>> >> 2384 ms
>> >>
>> >> scala> // Running again to confirm warm up:
>> >>
>> >> scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) =>
>> >> sum
>> >> +=
>> >> n;case _ => }}
>> >> 2321 ms
>> >>
>> >> scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum +=
>> >> i}}
>> >> 13477 ms
>> >>
>> >> Eastsun wrote:
>> >>
>> >> > scala> test(rawM1)
>> >> > 6219 ms
>> >> >
>> >> > scala> test(rawM2)
>> >> > 2046 ms
>> >> >
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Lift, the simply functional web framework http://liftweb.net
>> > Collaborative Task Management http://much4.us
>> > Follow me: http://twitter.com/dpp
>> > Git some: http://github.com/dpp
>> >
>
>
>
> --
> Lift, the simply functional web framework http://liftweb.net
> Collaborative Task Management http://much4.us
> Follow me: http://twitter.com/dpp
> Git some: http://github.com/dpp
>

David Pollak | 5 Oct 16:04
Gravatar

Re: Re: A suggestion to optimize the for loops



On Sun, Oct 5, 2008 at 6:59 AM, Sean McDirmid <sean.mcdirmid <at> gmail.com> wrote:
Are you positive this will always occur?

I participate on the JVM languages list, have had a bunch of conversations about this kind of issue with John Rose, Charlie Nutter, and others, etc.  This is the kind of thing HotSpot looks for now and it will be looking more and more aggressively for it in the future.
 


I don't know, I doubt it especially after the conversations I've had
with Hotspot engineers at Sun a few years ago. And even if the apply
method is inlined, the object for the closure will still be allocated
on the heap. We can argue that the object is short live and so will
never make it out of the nursery, but this will still cause some
pressure. Finally, there are so many special conditions that have to
occur for hotspot to kick in, and you have no idea of knowing if your
code will qualify for whatever magical optimization that hotspot
performs. An optimization that you can't depend on is useless (tm
Alastair Reid).

At anyrate, its easy enough to tell if inlining is occurring just by
micro-benchmarking a for loop (after warm up in hotspot), which I
think plenty of people have done. Mixed inconclusive results
follow....

Sean

On Sun, Oct 5, 2008 at 9:46 PM, David Pollak
<feeder.of.the.bears <at> gmail.com> wrote:
>
>
> Too bad you're dead wrong about this.  HotSpot will inline a variety of
> code.  In the case of function that are called from a single site (as is the
> case for the anonymous functions generated as part of the for comprehension
> and passed to filter/map, etc., these will be inlined... no stack frame
> generated, etc.
>
>>
>> Perhaps the CLR is better about
>> this, as closures/delegates aren't objects.
>>
>> Sean
>>
>>
>> On Sun, Oct 5, 2008 at 9:27 PM, David Pollak
>> <feeder.of.the.bears <at> gmail.com> wrote:
>> > A couple of observations:
>> >
>> > Benchmarking anything on the JVM requires "warming up" HotSpot...
>> > running a
>> > few million iterations of what you're trying to test before running the
>> > actual benchmark.  Further, there are material differences in how
>> > HotSpot
>> > optimizes between 1.5 and 1.6, desktop and server mode, 32 and 64 bit
>> > mode,
>> > and Intel vs. SPARC.  Put another way, stuff on my MacBook Pro (1.5, 32
>> > bit)
>> > has radically different (and much worse) performance characteristics
>> > than
>> > does stuff run on my Linux box (1.6, 64 bit)
>> > There is an optimization mode for ScalaC... did you use that when
>> > compiling
>> > the code?
>> > While your suggestion may work for List and Seq, for comprehensions work
>> > with *any* classes that have map, flatMap, filter, and foreach.  If the
>> > meaning of "filter" was different for my custom class, removing the
>> > filter
>> > call from the generated code would change the meaning of my program.
>> > This seems like premature optimization to me.  I've been doing Scala
>> > coding
>> > for nearly 2 years.  I've rarely had to focus on the performance of a
>> > given
>> > method.  When I have had to focus on performance, I manually recode
>> > things
>> > as while loops which gives markedly better performance.  And I've got a
>> > track record of writing very high performance code... among other
>> > things,
>> > I've written the worlds fastest spreadheet engine... and yes, it was JVM
>> > based.
>> >
>> > On Sun, Oct 5, 2008 at 5:47 AM, Harshad <harshad.rj <at> gmail.com> wrote:
>> >>
>> >> I saw a 6 fold improvement in performance, when I used a larger range
>> >> (1
>> >> to
>> >> 10000) instead of (1 to 1000)
>> >>
>> >> scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum +=
>> >> i}}
>> >> 13608 ms
>> >>
>> >> scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) =>
>> >> sum
>> >> +=
>> >> n;case _ => }}
>> >> 2384 ms
>> >>
>> >> scala> // Running again to confirm warm up:
>> >>
>> >> scala> test {var sum=0; (0 to 10000).foreach{ case n if(n%2 == 0) =>
>> >> sum
>> >> +=
>> >> n;case _ => }}
>> >> 2321 ms
>> >>
>> >> scala> test {var sum=0; for (i <- 1 to 10000; if (i%2) == 0) {sum +=
>> >> i}}
>> >> 13477 ms
>> >>
>> >> Eastsun wrote:
>> >>
>> >> > scala> test(rawM1)
>> >> > 6219 ms
>> >> >
>> >> > scala> test(rawM2)
>> >> > 2046 ms
>> >> >
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Lift, the simply functional web framework http://liftweb.net
>> > Collaborative Task Management http://much4.us
>> > Follow me: http://twitter.com/dpp
>> > Git some: http://github.com/dpp
>> >
>
>
>
> --
> Lift, the simply functional web framework http://liftweb.net
> Collaborative Task Management http://much4.us
> Follow me: http://twitter.com/dpp
> Git some: http://github.com/dpp
>



--
Lift, the simply functional web framework http://liftweb.net
Collaborative Task Management http://much4.us
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp
Sean McDirmid | 5 Oct 16:12

Re: Re: A suggestion to optimize the for loops

I don't see the optimization happening right now but I guess with
closures being added to the Java they will actually focus on this
more. That is a good thing for us, and hopefully, we'll get better
hot-code replace for closures also (especially when the JVM can tell
the difference between a closure and an object...).

On Sun, Oct 5, 2008 at 10:04 PM, David Pollak
<feeder.of.the.bears <at> gmail.com> wrote:

> I participate on the JVM languages list, have had a bunch of conversations
> about this kind of issue with John Rose, Charlie Nutter, and others, etc.
> This is the kind of thing HotSpot looks for now and it will be looking more
> and more aggressively for it in the future.

Judson, Ross | 15 Oct 23:14
Favicon

RE: Re: A suggestion to optimize the for loops

Many (most?) closures generated by the Scala compiler are trivial. I
think I've noted in the past that runtime generation of things like
closure classes makes a good deal of sense. When you have hundreds of
essentially identical (in structure, and often in type) anonymous
classes, the generation of the closure's byte code can be deferred until
just before execution, on a lazy basis. Sun's JDK has asm built in. I've
found that runtime generation of some simple class "shapes" is
significantly faster than reflection/proxies, although I would expect
that advantage to disappear over time.

Runtime generation that does inlining and outlining in well-known places
could shed a lot of little classes!  :)

RJ

Arthur Peters | 15 Oct 23:34

Re: Re: A suggestion to optimize the for loops

Method handles are planned for inclusion in Java 7 (I think). Could closures in Scala be implemented as method handles instead of independent Java classes?

That would give a big space and time advantage I would think. However if there is no separate object representing the closure it would be impossible to encode local variable state inside the closure. So that might be out.

But there is also a lightweight bytecode loading proposal. That would allow for easy runtime generation of classes like you are talking about.

I think scala could gain a lot of performance and implementation cleanliness as the mlvm feature filter into the JVM.

-Arthur


On Wed, Oct 15, 2008 at 5:14 PM, Judson, Ross <rjudson <at> managedobjects.com> wrote:
Many (most?) closures generated by the Scala compiler are trivial. I
think I've noted in the past that runtime generation of things like
closure classes makes a good deal of sense. When you have hundreds of
essentially identical (in structure, and often in type) anonymous
classes, the generation of the closure's byte code can be deferred until
just before execution, on a lazy basis. Sun's JDK has asm built in. I've
found that runtime generation of some simple class "shapes" is
significantly faster than reflection/proxies, although I would expect
that advantage to disappear over time.

Runtime generation that does inlining and outlining in well-known places
could shed a lot of little classes!  :)

RJ

David MacIver | 5 Oct 16:18

Re: Re: A suggestion to optimize the for loops

On Sun, Oct 5, 2008 at 2:41 PM, Sean McDirmid <sean.mcdirmid <at> gmail.com> wrote:
> There is some ground in the middle for non-closure based looping in
> Scala--I don't think its a matter of if but when. It is very annoying
> to recode loops as while loops, though agree map/foreach/filter are
> very convenient.
>
> The question is can we get a base level of optimization that
> inlines/eliminates simple closures in a predictable consistent way?
> Right now, optimization in Scala is this "fragile/experimental/you
> probably don't want to use it very much" kind of thing. Simple closure
> elimination/inlining will have to go into the the standard
> non-optimize version of scalac like tail recursion elimination already
> is. For that to happen, the Scala compiler will have to be more
> aggressive about inlining small foreach methods...to then inline the
> closure argument...its not that hard actually. The question is
> when/what to inline, methods like Range.foreach seem obvious.

Inlining is actually an orthogonal issue. Even if you inline the call
to filter, it still has to do the filtering. Of course, the filter
could be lazy (it is in the case of Range actually, it's just that the
default foreach on filtered projections isn't very smart. I might fix
that), in which case inlining would have the usual benefits, but for
things like arrays, etc. it's not.

David MacIver | 5 Oct 15:53

Re: Re: A suggestion to optimize the for loops

On Sun, Oct 5, 2008 at 2:27 PM, David Pollak
<feeder.of.the.bears <at> gmail.com> wrote:
> A couple of observations:

A couple of observations on your observations:

> There is an optimization mode for ScalaC... did you use that when compiling
> the code?

I doubt it. The code actually ran. ;-)

> While your suggestion may work for List and Seq, for comprehensions work
> with *any* classes that have map, flatMap, filter, and foreach.  If the
> meaning of "filter" was different for my custom class, removing the filter
> call from the generated code would change the meaning of my program.

Dare I say that if you've written code such that the meaning of for
(...; if (stuff)) { ... } is different from for(...) { if(stuff) {
...} } then you've done something stupid and your code deserves to be
broken. :-)

> This seems like premature optimization to me.  I've been doing Scala coding
> for nearly 2 years.  I've rarely had to focus on the performance of a given
> method.  When I have had to focus on performance, I manually recode things
> as while loops which gives markedly better performance.  And I've got a

Um. So let me get this straight:

When you want performance, for loops are not fast enough. Therefore
for loop performance should not be improved?

That seems... odd logic.

It's also just wrong. :-)

It's certainly true that for integer ranges a while loop will
outperform a for one, but that's almost purely boxing overhead. It's
probably true for some random access sequences too, mainly arrays of
primitives. That's about it. For collections which are already storing
their contents boxed the overhead of a for loop is very very small. If
you were to switch to using an iterator and a while loop on that you
would in many cases *lose* performance. Internal iteration can be
dramatically faster than using an iterator (for example, this is the
case for all the new collections I added in 2.7.2).

David Pollak | 5 Oct 16:02
Gravatar

Re: Re: A suggestion to optimize the for loops



On Sun, Oct 5, 2008 at 6:53 AM, David MacIver <david.maciver <at> gmail.com> wrote:
On Sun, Oct 5, 2008 at 2:27 PM, David Pollak
<feeder.of.the.bears <at> gmail.com> wrote:
> A couple of observations:

A couple of observations on your observations:


> While your suggestion may work for List and Seq, for comprehensions work
> with *any* classes that have map, flatMap, filter, and foreach.  If the
> meaning of "filter" was different for my custom class, removing the filter
> call from the generated code would change the meaning of my program.

Dare I say that if you've written code such that the meaning of for
(...; if (stuff)) { ... } is different from for(...) { if(stuff) {
...} } then you've done something stupid and your code deserves to be
broken. :-)

If we're changing the rules of how for comprehensions are compiled, something may break.  Backward compatibility is important... and Martin has been selling that as a part of Scala for a while.  No matter how brain damaged my code is, breaking it is on its face, bad.
 


> This seems like premature optimization to me.  I've been doing Scala coding
> for nearly 2 years.  I've rarely had to focus on the performance of a given
> method.  When I have had to focus on performance, I manually recode things
> as while loops which gives markedly better performance.  And I've got a

Um. So let me get this straight:

When you want performance, for loops are not fast enough. Therefore
for loop performance should not be improved?

You'll never get the kind of loop performance from a for comprehension that you get from tuned while-loop code.  So, improving something 2x when most people either need 10x or nothing at all seems like a waste.

David
 


That seems... odd logic.

It's also just wrong. :-)

It's certainly true that for integer ranges a while loop will
outperform a for one, but that's almost purely boxing overhead. It's
probably true for some random access sequences too, mainly arrays of
primitives. That's about it. For collections which are already storing
their contents boxed the overhead of a for loop is very very small. If
you were to switch to using an iterator and a while loop on that you
would in many cases *lose* performance. Internal iteration can be
dramatically faster than using an iterator (for example, this is the
case for all the new collections I added in 2.7.2).



--
Lift, the simply functional web framework http://liftweb.net
Collaborative Task Management http://much4.us
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp
Andrew Gaydenko | 5 Oct 16:20

Re: Re: A suggestion to optimize the for loops

======= On Sunday 05 October 2008, David Pollak wrote: =======
...
>
> You'll never get the kind of loop performance from a for comprehension
> that you get from tuned while-loop code.  So, improving something 2x when
> most people either need 10x or nothing at all seems like a waste.
>
> David

David, 

It is off topic, but about you and performance :-) Probably my memory is 
wrong, but AFAIR there was a mailing list thread in which you resolved some 
problem wrt performance of Scala combinators. Will you be so kind to point 
me that discussion? The thing is, I use the combinators for simple text 
templating, and they are bottleneck in all request-processing-response 
chain in my httpd service (say, 900 req/seq with simple template vs 7000 
req/seq with direct the ~same stream writing).

Andrew

David MacIver | 5 Oct 16:25

Re: Re: A suggestion to optimize the for loops

On Sun, Oct 5, 2008 at 3:02 PM, David Pollak
<feeder.of.the.bears <at> gmail.com> wrote:
>> > While your suggestion may work for List and Seq, for comprehensions work
>> > with *any* classes that have map, flatMap, filter, and foreach.  If the
>> > meaning of "filter" was different for my custom class, removing the
>> > filter
>> > call from the generated code would change the meaning of my program.
>>
>> Dare I say that if you've written code such that the meaning of for
>> (...; if (stuff)) { ... } is different from for(...) { if(stuff) {
>> ...} } then you've done something stupid and your code deserves to be
>> broken. :-)
>
> If we're changing the rules of how for comprehensions are compiled,
> something may break.  Backward compatibility is important... and Martin has
> been selling that as a part of Scala for a while.  No matter how brain
> damaged my code is, breaking it is on its face, bad.

And not improving something because making it better for non-broken
code might break broken code is worse.

>>
>> > This seems like premature optimization to me.  I've been doing Scala
>> > coding
>> > for nearly 2 years.  I've rarely had to focus on the performance of a
>> > given
>> > method.  When I have had to focus on performance, I manually recode
>> > things
>> > as while loops which gives markedly better performance.  And I've got a
>>
>> Um. So let me get this straight:
>>
>> When you want performance, for loops are not fast enough. Therefore
>> for loop performance should not be improved?
>
> You'll never get the kind of loop performance from a for comprehension that
> you get from tuned while-loop code.  So, improving something 2x when most
> people either need 10x or nothing at all seems like a waste.

Sorry, I don't know how to respond to an unbacked assertion which
flies in the face of observed facts. Could you rephrase that as a
useful sentence?

David MacIver | 5 Oct 17:06

Re: Re: A suggestion to optimize the for loops

On Sun, Oct 5, 2008 at 3:25 PM, David MacIver <david.maciver <at> gmail.com> wrote:
>> You'll never get the kind of loop performance from a for comprehension that
>> you get from tuned while-loop code.  So, improving something 2x when most
>> people either need 10x or nothing at all seems like a waste.
>
> Sorry, I don't know how to respond to an unbacked assertion which
> flies in the face of observed facts. Could you rephrase that as a
> useful sentence?

And in order to do something shocking and back an opinion with actual
evidence, here's some code:

trait IntAction{
  def apply(i : Int) : Unit;
}

object IntLoop{
  def times(size : Int, action : IntAction) {
    var i = 0;
    while(i < size){
      action(i);
      i += 1;
    }
  }
}

object LoopBenchmark extends scalax.testing.SizedBenchmark{
  var sum = 0;

  override val sizes = Array.range(0, 22).map(i => 1 << i)

  new Benchmark[Int]("while"){
    def data(size : Int) = size;
    def test(size : Int) = {
      var i = 0;
      while (i < size){
        i += 1;
        sum += i;
      }
    }
  }

  new Benchmark[Int]("IntLoop"){
    def data(size : Int) = size;
    def test(size : Int) = {
      IntLoop.times(size, new IntAction(){
        def apply(i : Int) { sum += i; }
      })
    }
  }
}

And here are some numbers for running it on the Java 6 server VM (the
client VM gives similar results):

scala> LoopBenchmark.main(null)
Size, while, IntLoop
1, 0.9288, 1.4625
2, 1.4897, 2.1465
4, 1.8854, 2.6657
8, 2.5849, 3.7209
16, 4.0271, 5.7881
32, 3.8079, 5.2768
64, 0.0688, 0.0668
128, 0.1212, 0.1197
256, 0.215, 0.2128
512, 0.3988, 0.4338
1024, 0.7729, 0.7717
2048, 1.5298, 1.5148
4096, 3.0078, 3.004
8192, 6.0164, 5.98
16384, 11.9394, 11.9403
32768, 25.1256, 24.3824
65536, 49.7022, 50.5058
131072, 95.8288, 95.803
262144, 194.632, 194.1631
524288, 381.6128, 381.4799
1048576, 762.965, 763.0022
2097152, 1525.4113, 1527.9294

scala> LoopBenchmark.main(null)
Size, while, IntLoop
1, 0.939, 1.4783
2, 1.5019, 2.1634
4, 1.903, 2.6865
8, 2.614, 3.7507
16, 4.063, 5.8215
32, 3.8529, 5.321
64, 0.1372, 0.1328
128, 0.2432, 0.2394
256, 0.4301, 0.4255
512, 0.7974, 0.8506
1024, 1.5462, 1.5431
2048, 3.046, 3.0461
4096, 6.0248, 7.0722
8192, 12.0352, 12.0585
16384, 24.0016, 24.1609
32768, 49.3035, 48.5215
65536, 98.1966, 98.4091
131072, 191.1968, 191.1784
262144, 389.0127, 387.3132
524288, 774.0732, 766.5339
1048576, 1530.3079, 1525.7336
2097152, 3050.924, 3054.0253

scala> LoopBenchmark.main(null)
Size, while, IntLoop
1, 0.9491, 1.4941
2, 1.5141, 2.1804
4, 1.9208, 2.7074
8, 2.6425, 3.782
16, 4.099, 5.8549
32, 3.8983, 5.364
64, 0.2054, 0.1997
128, 0.3644, 0.3592
256, 0.6442, 0.6388
512, 1.1962, 1.3032
1024, 2.3201, 2.3333
2048, 4.6057, 4.6231
4096, 9.0941, 10.0933
8192, 18.034, 18.1189
16384, 36.0821, 36.1985
32768, 73.9898, 73.4623
65536, 147.6996, 148.1828
131072, 287.9733, 290.9254
262144, 579.8721, 578.1591
524288, 1155.842, 1148.2189
1048576, 2293.5976, 2289.0789
2097152, 4584.8553, 4594.8766

The times are execution times in milliseconds. Note how the numbers
are near as dammit identical for non-trivial loops. For small loops
the fact that you have the single extra object creation does make a
difference (although to be honest I wouldn't trust the numbers for
small loops. Too much noise), but hotspot just inlines the
IntAction.apply and cheerfully turns the two loops into almost exactly
the same thing (in the first few runs of this it was a little more
variable with about 30% variation in performance. Amusingly enough, in
one run the IntAction version was twice as fast as the while loop!
Probably a bad optimisation on hotspot's part.

Now, as long as for loops are using type erased forms we obviously
can't get this sort of performance for loops over primitives, but with
a little bit of optimisation from scalac we probably can. And
certainly this should demonstrate to you than when there's no boxing
overhead while loops are *not* faster.

Eastsun | 5 Oct 16:08
Favicon

Re: Re: A suggestion to optimize the for loops


David MacIver wrote:
> 
> On Sun, Oct 5, 2008 at 2:27 PM, David Pollak
> <feeder.of.the.bears <at> gmail.com> wrote:
>> A couple of observations:
> 
>> While your suggestion may work for List and Seq, for comprehensions work
>> with *any* classes that have map, flatMap, filter, and foreach.  If the
>> meaning of "filter" was different for my custom class, removing the
>> filter
>> call from the generated code would change the meaning of my program.
> 
> Dare I say that if you've written code such that the meaning of for
> (...; if (stuff)) { ... } is different from for(...) { if(stuff) {
> ...} } then you've done something stupid and your code deserves to be
> broken. :-)
> 
>> This seems like premature optimization to me.  I've been doing Scala
>> coding
>> for nearly 2 years.  I've rarely had to focus on the performance of a
>> given
>> method.  When I have had to focus on performance, I manually recode
>> things
>> as while loops which gives markedly better performance.  And I've got a
> 
> Um. So let me get this straight:
> 
> When you want performance, for loops are not fast enough. Therefore
> for loop performance should not be improved?
> 
> That seems... odd logic.
> 
> 

That's the very thing I want to say:-)

-----
My scala solutions for  http://projecteuler.net/ Project Euler  problems: 
http://eastsun.javaeye.com/category/34059 Click here 
--

-- 
View this message in context: http://www.nabble.com/A-suggestion-to-optimize-the-for-loops-tp19815224p19824924.html
Sent from the Scala mailing list archive at Nabble.com.

Harshad | 5 Oct 16:42
Gravatar

Re: Re: A suggestion to optimize the for loops

David Pollak wrote:

> A couple of observations:
> 
>    - Benchmarking anything on the JVM requires "warming up" HotSpot...
>    running a few million iterations of what you're trying to test before
>    running the actual benchmark.

It sound too general a statement to me. It only makes sense, when
benchmarking leaf operations, and trying to find the asymptotic upper-bound
of the operation. It doesn't make sense when,
* trying to measure performances of non-leaf operations, and hence,
* trying to compare the constant time factor in the operation.

>    Further, there are material differences 
>    in how HotSpot optimizes between 1.5 and 1.6, desktop and server mode,
>    32 and
>    64 bit mode, and Intel vs. SPARC.  Put another way, stuff on my MacBook
>    Pro (1.5, 32 bit) has radically different (and much worse) performance
>    characteristics than does stuff run on my Linux box (1.6, 64 bit)

Yes, I didn't do a comprehensive benchmark. That said, we should see if the
optimisation helps on any given JVM, regardless if different JVMs don't
compare well to each other.

My JVM is:
(Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_03) on a x86 (Core 2 Duo)

>    - There is an optimization mode for ScalaC... did you use that when
>    compiling the code?

Yes, and that resulted in:
https://lampsvn.epfl.ch/trac/scala/ticket/1402

>    - While your suggestion may work for List and Seq, for comprehensions
>    work with *any* classes that have map, flatMap, filter, and foreach. 
>    If the meaning of "filter" was different for my custom class, removing
>    the filter call from the generated code would change the meaning of my
>    program.

It is a good point. IMHO, possible solutions could be either
* the (new) semantic meaning of for could be documented, or,
* the semantic meaning of filter could be clarified (and needs to be adhered
to)

Either way, it means a semantic change and perhaps only doable in a major
version bump.

Harshad | 5 Oct 20:49
Gravatar

Re: Re: A suggestion to optimize the for loops

Harshad wrote:

> David Pollak wrote:
> 
>> A couple of observations:
>> 
>>    - Benchmarking anything on the JVM requires "warming up" HotSpot...
>>    running a few million iterations of what you're trying to test before
>>    running the actual benchmark.
> 
> It sound too general a statement to me. It only makes sense, when
> benchmarking leaf operations, and trying to find the asymptotic
> upper-bound of the operation. It doesn't make sense when,
> * trying to measure performances of non-leaf operations, and hence,
> * trying to compare the constant time factor in the operation.

Ok, after more thought, I guess filter would count as a leaf operation,
because of the lazy projection.

Anyway, in that case, I _did_ execute the leaf a 100 million times (10000 *
10000), in each benchmark.


Gmane