Greg Ewing | 13 May 02:17 2008

Re: [Cython] Language stability

Stefan Behnel wrote:
> The latest syntax change regarding the
> for loop is not required for Cython

Concerning that, I've decided to remove the deprecation
warning, and continue supporting the old syntax. This will
be done in the next release; in the meantime, you can
apply the following patch:

http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/undeprecate-for-from.patch

> where the (IMHO much more obvious) cdef +
> range() syntax is optimised

Even in the presence of this optimisation, I don't consider
that the integer for-loop syntax is entirely redundant.

It concisely and clearly expresses all the possible combinations
of including or excluding the lower and upper bound, together
with iteration direction. This is something that I don't think
is obvious at all with range() once you get beyond the simplest
cases.
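
As a rough illustration of that point, here is a sketch in plain Python (not taken from Pyrex or Cython; the helper name `bounds_with_range` is made up for this example) listing the range() spelling for each combination of inclusive or exclusive bounds, in both directions:

```python
def bounds_with_range(lo, hi):
    """Hypothetical helper: range() spellings of integer-loop bounds."""
    return {
        # ascending
        "lo <= i < hi":  list(range(lo, hi)),
        "lo < i <= hi":  list(range(lo + 1, hi + 1)),
        "lo <= i <= hi": list(range(lo, hi + 1)),
        "lo < i < hi":   list(range(lo + 1, hi)),
        # descending
        "hi > i >= lo":  list(range(hi - 1, lo - 1, -1)),
        "hi >= i > lo":  list(range(hi, lo, -1)),
        "hi >= i >= lo": list(range(hi, lo - 1, -1)),
        "hi > i > lo":   list(range(hi - 1, lo, -1)),
    }


for spelling, values in bounds_with_range(1, 5).items():
    print(spelling, "->", values)
```

Each descending case needs its own off-by-one adjustment on both arguments, which is the part that tends to be easy to get wrong.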

It's for this reason -- notational clarity -- that I
introduced the integer for-loop syntax, at least as much as
optimisation.

This is also the reason I want to simplify the syntax. The
'for i from...' version was a compromise -- I was originally
thinking of it as a possible addition to Python itself, and
[...]

Stefan Behnel | 13 May 14:15 2008

Re: [Cython] Language stability

Hi Greg,

Greg Ewing wrote:
>> where the (IMHO much more obvious) cdef +
>> range() syntax is optimised
>
> Even in the presence of this optimisation, I don't consider
> that the integer for-loop syntax is entirely redundant.

Not redundant, no. But less calling for improvement.

> The 'for i from...' version was a compromise

I understand that. Still, having two spellings for "for ... in ...", one
for Python, one for C, looks better than a completely different syntax
that just starts with "for". So I vote for

    for x in iterable:

and

    for x in 1 < x <= 5:

instead of the new

    for 1 < x <= 5:

purely for readability (and obviously keeping the old "from" spelling for
compatibility).

[...]

Greg Ewing | 14 May 01:43 2008

Re: [Cython] Language stability

Stefan Behnel wrote:

>     for x in iterable:
> 
> and
> 
>     for x in 1 < x <= 5:

That won't work, because it's ambiguous -- they're both
instances of 'for x in <expression>'.
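
The ambiguity can be checked with the standard `ast` module. This is only a sketch using plain Python's parser, not the Pyrex or Cython one, and it assumes the proposed line is fed to it unchanged:

```python
import ast

source = "for x in 1 < x <= 5:\n    pass\n"
loop = ast.parse(source).body[0]

# The parser accepts the line as an ordinary for-loop whose iterable
# is the chained comparison `1 < x <= 5`.
print(type(loop).__name__)       # For
print(type(loop.iter).__name__)  # Compare
```

So a compiler that also wanted to read this line as an integer loop would have to pick one of two valid interpretations.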

In any case, simply changing 'from' to 'in' doesn't
address the reason I made the change.

--
Greg

Dag Sverre Seljebotn | 13 May 16:28 2008

Re: [Cython] Int looping


>> The 'for i from...' version was a compromise
>>     
>
> I understand that. Still, having two spellings for "for ... in ...", one
> for Python, one for C, looks better than a completely different syntax
> that just starts with "for". So I vote for
>
>     for x in iterable:
>
> and
>
>     for x in 1 < x <= 5:
>   
Is this (int looping) something you tend to do? When writing Python code 
I almost never end up doing it, rather I end up using "enumerate" (or 
for NumPy, "ndenumerate") when I need the indices. I'd rather we worked 
on improved high-level looping than inventing new syntax for low-level 
looping.

For instance, one could implement optimizations for "enumerate" in 
Cython as well as "range":

for idx, value in enumerate(a):
    BLOCK

could be turned into the much more efficient

for idx from 0 <= len(a):
    value = a[idx]
[...]

Dag Sverre Seljebotn | 13 May 16:35 2008

Re: [Cython] Int looping

*sigh* .. slow down...

Corrections:

> could be turned into the much more efficient
>
> for idx from 0 <= len(a):
>   
This should be:

for idx from 0 <= idx < len(a):

> for (T* iter = start; iter != v.end(); ++iter) ...
>
> could be done by something like
>
> for iterator in c_iteration(a.begin(), a.end()):
>   print iterator.get()
>   iterator.set(3)
>   
This should be:

for (T* iter = start; iter != end; ++iter) ...

could be done by something like:

for iterator in c_iteration(start, end):
   print iterator.get()
   iterator.set(3)

[...]

Stefan Behnel | 13 May 17:06 2008

Re: [Cython] Int looping

Dag Sverre Seljebotn wrote:
>
>>> The 'for i from...' version was a compromise
>>
>> I understand that. Still, having two spellings for "for ... in ...", one
>> for Python, one for C, looks better than a completely different syntax
>> that just starts with "for". So I vote for
>>
>>     for x in iterable:
>>
>> and
>>
>>     for x in 1 < x <= 5:
>>
> Is this (int looping) something you tend to do?

It happens.

> When writing Python code

Not in Python code, but in Cython code, especially in low-level C-ish
functions.

> I almost never end up doing it, rather I end up using "enumerate" (or
> for NumPy, "ndenumerate") when I need the indices. I'd rather we worked
> on improved high-level looping than inventing new syntax for low-level
> looping.

As Greg pointed out, it's there because it's convenient.

[...]

Dag Sverre Seljebotn | 13 May 17:35 2008

Re: [Cython] Int looping


>> for idx, value in enumerate(a):
>>     BLOCK
>>
>> could be turned into the much more efficient
>>
>> for idx from 0 <= len(a):
>>     value = a[idx]
>>     BLOCK
>>     
>
> Provided that a is indexable, which is getting less likely in recent
> Python code.
>   
Ahh. True. In fact, even when it is indexable, it would still be 
changing behaviour.
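
A small plain-Python sketch of how the rewrite could change behaviour even for an indexable object. The dict here is just an assumed example of something that supports both `len()` and `a[idx]` but iterates over something other than `a[0], a[1], ...`:

```python
a = {0: "zero", 1: "one"}  # indexable, but iteration yields the keys

# What the original loop sees.
via_enumerate = [(idx, value) for idx, value in enumerate(a)]

# What the index-based rewrite would see.
via_indexing = []
for idx in range(len(a)):
    via_indexing.append((idx, a[idx]))

print(via_enumerate)  # [(0, 0), (1, 1)]
print(via_indexing)   # [(0, 'zero'), (1, 'one')]
```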

Still, the following gives a small speedup (10-30% depending on
the idx type and conversions required) in some simple benchmarks:

cdef int idx
for idx, value in enumerate(a):
    BLOCK

versus

cdef int idx = 0
for value in a:
    BLOCK
    idx += 1
[...]

Robert Bradshaw | 13 May 19:45 2008

Re: [Cython] [Pyrex] Language stability

On May 13, 2008, at 5:15 AM, Stefan Behnel wrote:

> Hi Greg,
>
> Greg Ewing wrote:
>>> where the (IMHO much more obvious) cdef +
>>> range() syntax is optimised
>>
>> Even in the presence of this optimisation, I don't consider
>> that the integer for-loop syntax is entirely redundant.
>
> Not redundant, no. But less calling for improvement.
>
>
>> The 'for i from...' version was a compromise
>
> I understand that. Still, having two spellings for "for ... in ...", one
> for Python, one for C, looks better than a completely different syntax
> that just starts with "for". So I vote for
>
>     for x in iterable:
>
> and
>
>     for x in 1 < x <= 5:
>
> instead of the new
>
>     for 1 < x <= 5:
[...]

Stefan Behnel | 13 May 21:07 2008

Re: [Cython] [Pyrex] Language stability

Robert Bradshaw wrote:
> On May 13, 2008, at 5:15 AM, Stefan Behnel wrote:
>>     for x in iterable:
>>
>> and
>>
>>     for x in 1 < x <= 5:
>>
>> instead of the new
>>
>>     for 1 < x <= 5:
>>
>> purely for readability (and obviously keeping the old "from" spelling for
>> compatibility).
>
> I'm -1 for having lots of multiple ways to do for loops

I agree, but since Greg brought up the third way of doing it because he
didn't like the integer loop syntax, I wanted to discuss what a good
syntax would be here *iff* he wants to change it.

> Also, "from" makes it
> clear that this is a special cython loop--consider the following:
>
> x = 1
>
> class A:
>      def __gt__(self, other):
>          return range(3,7)
[...]

Robert Bradshaw | 14 May 08:14 2008

Re: [Cython] [Pyrex] Language stability

On May 13, 2008, at 12:07 PM, Stefan Behnel wrote:
>
>> I do think optimizing enumerate/zip/etc is feasible and probably
>> worthwhile.
>
> I'm just concerned about too many special cases in the optimiser.
>
> If we start optimising these (and I would prefer giving range(), zip() &
> friends their Py3 iterator semantics in this case), we should come up with
> a generic way to support iterator chaining in C code rather than making
> the looping code even more complicated and special casing.

Yes, I certainly agree. This is an area where the "visitor" paradigm  
makes much more sense.

> When we loop over a chain of iterators, this example
>
>     for i,k in enumerate(range(100)):
>         l += i*k
>
> could become something like this:
>
>     c_range_state = c_range_new(100)
>     c_enumerate_state = c_enumerate_new()
>
>     while 1:
>         temp1 = c_range_next(c_range_state); if (!temp1) ...
[...]

Stefan Behnel | 14 May 15:27 2008

Re: [Cython] [Pyrex] Language stability

Robert Bradshaw wrote:
> I'm not sure how much this would help (it would
> some for sure), as I believe many of these iterators are already
> written in C and it's a series of C calls. I think the biggest
> savings is that
>
>      cdef int i
>      for i, k in enumerate(L):
>          ...
>
> could increment i as a c int, and avoid all the tuple packing/
> unpacking.

... which is a rather big overhead for such a trivial iterator. The thing
is, many iterators really do extremely simple things, but require all the
tuple packing and function call indirection *on each iteration*. That's
different from functions like max() or the Py2 zip(), which are called
once and then do loads of stuff in C.

Just look at these:

http://docs.python.org/lib/itertools-functions.html

The really simple ones are: enumerate, range, chain, count, islice, repeat

count() and range() are actually equivalent in their iterable versions.
But I think enumerate() and islice(), and maybe chain() might be worth
optimising.
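
To make the per-iteration overhead concrete, here is a sketch in plain Python. `enumerate_sketch` is a made-up name for a rough pure-Python stand-in for the built-in; the real enumerate() is written in C, but it still hands back one tuple per item:

```python
def enumerate_sketch(iterable, start=0):
    """Rough pure-Python equivalent of enumerate()."""
    n = start
    for item in iterable:
        yield n, item  # a new 2-tuple is built for every item
        n += 1


def total_with_enumerate(values):
    total = 0
    for i, k in enumerate_sketch(values):  # ...and unpacked again here
        total += i * k
    return total


def total_with_counter(values):
    """Same result, with the index kept in a plain local variable."""
    total = 0
    i = 0
    for k in values:
        total += i * k
        i += 1
    return total


print(total_with_enumerate(range(100)) == total_with_counter(range(100)))
```

The second function is roughly the shape an optimiser would be aiming for, with `i` as a C integer in the Cython case.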

> Likewise, the tuple packing/unpacking could be avoided for
[...]

Robert Bradshaw | 14 May 18:41 2008

Re: [Cython] [Pyrex] Language stability

On May 14, 2008, at 6:27 AM, Stefan Behnel wrote:

> Robert Bradshaw wrote:
>> I'm not sure how much this would help (it would
>> some for sure), as I believe many of these iterators are already
>> written in C and it's a series of C calls. I think the biggest
>> savings is that
>>
>>      cdef int i
>>      for i, k in enumerate(L):
>>          ...
>>
>> could increment i as a c int, and avoid all the tuple packing/
>> unpacking.
>
> ... which is a rather big overhead for such a trivial iterator. The thing
> is, many iterators really do extremely simple things, but require all the
> tuple packing and function call indirection *on each iteration*. That's
> different from functions like max() or the Py2 zip(), which are called
> once and then do loads of stuff in C.
>
> Just look at these:
>
> http://docs.python.org/lib/itertools-functions.html
>
> The really simple ones are: enumerate, range, chain, count, islice, repeat
[...]

Jim Kleckner | 15 May 17:12 2008

Re: [Cython] [Pyrex] Language stability

Robert Bradshaw wrote:

> I'm -1 for having lots of multiple ways to do for loops (including  
> that list of PEPs--we're already up to 3). Also, "from" makes it  
> clear that this is a special cython loop--consider the following:
> 
> x = 1
> 
> class A:
>      def __gt__(self, other):
>          return range(3,7)
> 
> for x in 0 <= x < A():
>      print x
> 
> 
> This is valid Python (prints 3, 4, 5, 6), and would act completely  
> differently under your proposal.

This seems to me to demonstrate quite well that the newer syntax
is less desirable than the old.
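
For reference, a sketch of the quoted example adapted to Python 3 (the loop collects values instead of using the print statement, and the chained comparison is evaluated on its own line so the intermediate result is visible):

```python
class A:
    def __gt__(self, other):
        # Deliberately odd: the comparison returns an iterable.
        return range(3, 7)


x = 1
# `0 <= x < A()` means `(0 <= x) and (x < A())`. int cannot compare
# itself with A, so Python falls back to the reflected A().__gt__(x).
iterable = 0 <= x < A()

seen = []
for x in iterable:
    seen.append(x)

print(seen)  # [3, 4, 5, 6]
```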

Greg Ewing | 16 May 03:22 2008

Re: [Cython] [Pyrex] Language stability

Jim Kleckner wrote:
> Robert Bradshaw wrote:
> 
>> for x in 0 <= x < A():
>>      print x
>>
>> This is valid Python (prints 3, 4, 5, 6), and would act completely  
>> differently under your proposal.
> 
> This seems to me to demonstrate quite well that the newer syntax
> is less desirable than the old.

Not sure I follow -- the new syntax isn't ambiguous.
It can't be an ordinary for-loop, because it doesn't
have an 'in' in it.

--
Greg
