Robert Bradshaw | 19 Aug 06:11

[Cython] Any thoughts on a DEFINED keyword?

Does anyone else have any thoughts on this:
http://trac.cython.org/cython_trac/ticket/27

- Robert

Dag Sverre Seljebotn | 19 Aug 07:27

Re: [Cython] Any thoughts on a DEFINED keyword?

> Does anyone else have any thoughts on this:
> http://trac.cython.org/cython_trac/ticket/27

I'm not sure I get the motivating use case ... it looks like someone is
using include but should be using cimport. Whatever need someone has that
requires such protections, it would be better to find a more high-level way
of going about it IMO (e.g. if using cimport is absolutely impossible,
perhaps that is an argument for include_once or similar).

The feature doesn't really hurt though, so I'm -0.1 (and may be +1 if I
could see a better/more detailed motivating use case).

Dag Sverre

Stefan Behnel | 21 Aug 07:26

Re: [Cython] Any thoughts on a DEFINED keyword?

Hi,

Dag Sverre Seljebotn wrote:
> if using cimport is absolutely impossible,
> perhaps that is an argument for include_once or similar

What would the use case for an "include_once" be?

Stefan

Chuck Blake | 19 Aug 19:59

Re: [Cython] Any thoughts on a DEFINED keyword?

Hey, guys.  Thanks for considering this.

I can think of at least 3 use cases, though there may be more.
As with most language mechanisms (beyond a tiny core), one can
convert each use case into a special higher-level substitute that
can do a more complete job.  There is also charm to just one
easy-to-implement/understand facility.

 1) Idempotent inclusion (already described)

    cimport does not do everything.  "cimport *" not working
    is just one example unlikely to go away.  Another would
    be cross-file inline-ability of tiny little functions.
    On those occasions when a user needs to 'include', it's
    helpful to have protections against multiple inclusion
    types of errors.

    As per my original post, the same "do once" behavior can
    already be achieved through an even uglier, lower-level
    compile-time state comparison:

        DEF FOO_1 = 1
        DEF FOO_2 = 1
        IF FOO_1 == FOO_2:
            DEF FOO_2 = 2
            ...
    Yes, this could be replaced with an "include_once" as per
    Dag's reply, which is better since include-ees need no
    indent.  This is about as easy to implement as DEFINED.
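
    To make the "include_once" behavior concrete, here is a minimal
    sketch in Python (not Cython) of a textual expander that pulls in
    each file at most once.  Everything in it is an assumption made for
    illustration: the names expand and INCLUDE_RE, and the files
    dictionary that stands in for the file system.  It is not how
    Cython handles include.

```python
import re

# Hypothetical pattern: matches lines such as  include "Btree.pxi"
INCLUDE_RE = re.compile(r'^\s*include\s+"([^"]+)"\s*$')


def expand(name, files, seen=None):
    """Textually expand include lines, pulling in each file at most once.

    `files` maps a file name to its text and stands in for the file system.
    """
    if seen is None:
        seen = set()
    seen.add(name)
    out = []
    for line in files[name].splitlines():
        match = INCLUDE_RE.match(line)
        if match is None:
            out.append(line)
        elif match.group(1) not in seen:
            # First sighting: splice the included text in place.
            out.extend(expand(match.group(1), files, seen))
        # Otherwise the file was already included, so the line is dropped.
    return out


files = {
    "main.pyx": 'include "a.pxi"\ninclude "b.pxi"\nprint(FOO)',
    "a.pxi": 'include "common.pxi"',
    "b.pxi": 'include "common.pxi"',
    "common.pxi": "DEF FOO = 1",
}
result = expand("main.pyx", files)
```

    Here common.pxi is reached through both a.pxi and b.pxi, yet its
    DEF ends up in the output only once, with no guard needed inside
    the included file.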

[...]

Stefan Behnel | 19 Aug 20:21

Re: [Cython] Any thoughts on a DEFINED keyword?

Hi,

Chuck Blake wrote:
>  1) Idempotent inclusion (already described)
>  2) Compile-time constant defaulting/overriding
> 
>         In client-module:       # client knows objects in the B-tree
>             DEF M = 128         # are small and wants large branches
>             include "Btree.pxi" # to minimize TLB misses/latencies.
> 
>         In Btree.pxi:           # but the impl module can provide a
>             if not DEFINED M:   # reasonable default definition
>                 DEF M = 10
>             int someArray[M]
> 
>     This general style of thing could also be done even better with
>     a smarter include:
>         defining_include "Btree.pxi" M=2, X=Y, ...

That's a funny idea. We could extend the "include" syntax into

    include("theinc.pxi", **kwargs)

with

    include "theinc.pxi"

being a special case allowed for backward compatibility. The keyword arguments
to the include would override any compile time DEFs done in the include.
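
The override rule itself is small. As a rough sketch in Python (not Cython), assuming the included file's DEFs can be collected into a dictionary of defaults: the function name resolve_defs is made up, and rejecting keywords that the include never defines is my own assumption, not something proposed in this thread.

```python
def resolve_defs(defaults, **overrides):
    """Merge an include file's DEF defaults with the includer's overrides."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Assumption: overriding a name the include never DEFs is an error.
        raise KeyError("no such DEF in the included file: %s" % sorted(unknown))
    merged = dict(defaults)    # copy, so the defaults stay untouched
    merged.update(overrides)   # keyword arguments win over the defaults
    return merged


btree_defaults = {"M": 10}
plain = resolve_defs(btree_defaults)          # like: include "Btree.pxi"
tuned = resolve_defs(btree_defaults, M=128)   # like: include("Btree.pxi", M=128)
```

With this reading, the plain form keeps M at 10 and the keyword form yields M = 128, which covers the "client overrides a default" use case without a DEFINED test inside the included file.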

[...]

Chuck Blake | 19 Aug 22:10

Re: [Cython] Any thoughts on a DEFINED keyword?

>Although a function might imply that include is not a keyword. Maybe
>    include "theinc.pxi" DEF A=1, ...

Yeah.  That's not a bad syntax, either.  It would be nice to have this
kind of local scoping of compile-time constants/parameterized file.

>I'm not totally clear about the semantic implications, BTW, and I'm not sure
>that the feature /itself/ is a good idea. I just think it doesn't look too bad.

If you find this idea interesting, you might also be interested in the idea
of adding just a little more power to compile-time DEFs to get templating
power over included files with very little mechanism.

E.g., define max_tmpl.pxi:

    TYPE def NAMES(x)(TYPE a, TYPE b):
        return a if a > b else b

Then a client can simply

    include "max_tmpl.pxi" TYPE="int", NAMES(x)=("int_" + x)

and just use the defined function int_max() as they like...

If something like NAMES() can refer to its outer definitions in its own,
you can do a nested hierarchy of namespaces:

    include "max_tmpl.pxi" TYPE="int", NAMES(x)=NAMES("int_" + x)
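
The substitution being described can be tried out as plain text processing. This is only a sketch in Python (not Cython), under my own assumptions: the template spells the function as NAMES(max), TYPE is replaced wherever it stands as a whole word, and the names instantiate, TEMPLATE and outer are invented for the example.

```python
import re

# Hypothetical template text, standing in for the contents of max_tmpl.pxi.
TEMPLATE = (
    "cdef TYPE NAMES(max)(TYPE a, TYPE b):\n"
    "    return a if a > b else b\n"
)


def instantiate(template, type_name, names):
    """Fill in TYPE and NAMES(...) placeholders in a template string."""
    # Rewrite every NAMES(identifier) using the caller's naming function.
    text = re.sub(r"NAMES\((\w+)\)", lambda m: names(m.group(1)), template)
    # Replace the TYPE placeholder wherever it stands as a whole word.
    return re.sub(r"\bTYPE\b", type_name, text)


int_max_source = instantiate(TEMPLATE, "int", lambda x: "int_" + x)


# Nested namespaces: an inner naming function defers to an outer one.
def outer(x):
    return "lib_" + x


nested_source = instantiate(TEMPLATE, "int", lambda x: outer("int_" + x))
```

The first call produces a definition of int_max(int a, int b); the second shows the nested form, where the generated name becomes lib_int_max.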

>if you want to use bleeding-edge Cython features in your code, you're pretty
[...]

Dag Sverre Seljebotn | 19 Aug 23:32

Re: [Cython] Any thoughts on a DEFINED keyword?

Chuck Blake wrote:
> E.g., define max_tmpl.pxi:
> 
>     TYPE def NAMES(x)(TYPE a, TYPE b):
>         return a if a > b else b
> 
> Then a client can simply
> 
>     include "max_tmpl.pxi" TYPE="int", NAMES(x)=("int_" + x)
> 
> and just use the defined function int_max() as they like...
> 
> If something like NAMES() can refer to its outer definitions in its own,
> you can do a nested hierarchy of namespaces:
> 
>     include "max_tmpl.pxi" TYPE="int", NAMES(x)=NAMES("int_" + x)
> 

This made my mind up: I do not think we should reinvent the wheel here.

Rather than focusing on improving the DEF, IF and include statements, we 
should look for an existing macro language for our uses (and rather 
deprecate DEF, IF and include than make them more powerful).

We could even find a preprocessing language that is made "official", 
distributed with Cython and automatically invoked and all (and even 
modify it to take care of IF, DEF and include as they are today and 
refactor this out of core Cython) -- I'm just saying that I'd like this 
kind of low-level string manipulation to stay out of the Cython compiler 
core and use an existing package (any existing package) for this. The 
[...]

Chuck Blake | 20 Aug 00:39

Re: [Cython] Any thoughts on a DEFINED keyword?

> This made my mind up: I do not think we should reinvent the wheel here.

This was just following up what Stefan seemed interested in.  It was just
an alternative to using the far simpler use case of DEFINED which was
merely if not DEFINED(foo) and bears little relation to C++ templates.
You do seem to understand that from below, but I wanted to mention it
more explicitly.

> Rather than focusing on improving the DEF, IF and include statements, we
> should look for an existing macro language for our uses (and rather
> deprecate DEF, IF and include than make them more powerful).
> [..]
> C has a clear separation between the language and the preprocessor, with
> great success.

That's fine if someone actually does the integration work of running it from
the tool chain.  Keeping the Cython compiler about more typed files rather
than include-like generic file combination and about typed values/names
rather than token pasting is probably wise separation of concerns.  That
being said, until there is some drop-in replacement for DEF/IF/include,
why not round out the triplet with DEFINED as any replacement will likely
have that sort of feature?  Seems like a practical answer until you have
some more fully functional textual manipulation system integrated.  And
Pyrex has DEF/IF/include as well, so you may be stuck with at least only
deprecating them/providing them via a different mechanism unless Greg
shares your separation biases.  He seemed to have trouble getting rid of
the "for ... from" loop and I suspect "include" would share this trait.

> (Still, anything which is already implemented is, well, already implemented
> and "free" in some sense.).
[...]

Dag Sverre Seljebotn | 20 Aug 00:54

Re: [Cython] Any thoughts on a DEFINED keyword?

Chuck Blake wrote:
>> This made my mind up: I do not think we should reinvent the wheel here.
> 
> This was just following up what Stefan seemed interested in.  It was just
> an alternative to using the far simpler use case of DEFINED which was
> merely if not DEFINED(foo) and bears little relation to C++ templates.
> You do seem to understand that from below, but I wanted to mention it
> more explicitly.

Yes, I was following up myself, I could have been more clear. For the 
record, I'm still +0 on Chuck's proposal.

-- 
Dag Sverre

Dag Sverre Seljebotn | 20 Aug 01:15

Re: [Cython] Any thoughts on a DEFINED keyword?

Dag Sverre Seljebotn wrote:
> Chuck Blake wrote:
>>> This made my mind up: I do not think we should reinvent the wheel here.
>> This was just following up what Stefan seemed interested in.  It was just
>> an alternative to using the far simpler use case of DEFINED which was
>> merely if not DEFINED(foo) and bears little relation to C++ templates.
>> You do seem to understand that from below, but I wanted to mention it
>> more explicitly.
> 
> Yes, I was following up myself, I could have been more clear. For the 
> record, I'm still +0 on Chuck's proposal.
> 

Also, I think it is important at this stage for Cython that patches are 
accepted unless there are good reasons against it, i.e. that without any 
-1 it gets applied. But that is just a personal opinion.

-- 
Dag Sverre

Dag Sverre Seljebotn | 19 Aug 22:01

Re: [Cython] Any thoughts on a DEFINED keyword?

Chuck Blake wrote:
> Hey, guys.  Thanks for considering this.
> 
> I can think of at least 3 use cases, though there may be more.
> As with most language mechanisms (beyond a tiny core), one can
> convert each use case into special higher level substitute that
> can do a more complete job.  There is also charm to just one
> easy to implement/understand facility.

This is going to be interesting :-)

Yes, for most of the use cases I can see better higher-level constructs 
(which I'll list briefly below for completeness, without making those 
the core message of this post I hope). However, you are very right in 
saying that this provides a solution *now*, not later. It can always be 
deprecated in some years.

However, note that another option is using other macro languages (like 
m4; or something more modern) to preprocess your Cython files for the 
same result. This will give you all the power you wish for and more.

On the one hand, your patch is already there. On the other hand, C++ 
meta-programming-through-templates stands as a permanent warning about 
the dangers of introducing too powerful low-level concepts rather than 
the appropriate high-level one :-)

(I'm still about +0 for this).

>  1) Idempotent inclusion (already described)
> 
[...]

Greg Ewing | 20 Aug 03:29

Re: [Cython] Any thoughts on a DEFINED keyword?

Chuck Blake wrote:

>  3) Compile-time function availability testing

A separate compile-time directive would be better
for this. Doing this with DEFINED would require the
runtime built-in functions to be part of the compile-
time namespace, which is not currently the case,
and IMO it would be conceptually wrong to make it so.

-- 
Greg

Robert Bradshaw | 20 Aug 03:47

Re: [Cython] Any thoughts on a DEFINED keyword?

On Aug 19, 2008, at 10:59 AM, Chuck Blake wrote:

> Hey, guys.  Thanks for considering this.
>
> I can think of at least 3 use cases, though there may be more.
> As with most language mechanisms (beyond a tiny core), one can
> convert each use case into special higher level substitute that
> can do a more complete job.  There is also charm to just one
> easy to implement/understand facility.
>
>  1) Idempotent inclusion (already described)
>
>     cimport does not do everything.  "cimport *" not working
>     is just one example unlikely to go away.

"cimport *" works right now.

> Another would
>     be cross-file inline-ability of tiny little functions.
>     On those occasions when a user needs to 'include', it's
>     helpful to have protections against multiple inclusion
>     types of errors.

I think we should rather support inline functions in .pxd files.
Non-inline functions can be cimported. Includes should be used much
less than they are now, but they have their place (when you actually
want to textually include code in multiple places).

>
>     As per my original post, that the same "do once" behavior
[...]

Stefan Behnel | 20 Aug 08:09

Re: [Cython] preprocessor integration

Hi,

Robert Bradshaw wrote:
> I think we should rather support inline functions in .pxd files.
> Non-inline functions can be cimported. Includes should be used much
> less than they are now, but they have their place (when you actually
> want to textually include code in multiple places).
> [...]
> I would *much* rather see a patch that lets distutils run a  
> preprocessor (your choice of the C preprocessor or something more  
> modern) rather than building more preprocessing into the language  
> itself.

Does it have to be distutils running the preprocessor? The use-cases I've seen
so far appeared to be based on an import or include, so I think it might be
enough to enable the preprocessor only for .pxd files (and maybe .pxi files,
depends on the targeted expressiveness of .pxd files). That way, any templates
would automatically end up in a separate file (where reuse is possible) and
could get configured from within Cython source code. IMHO, that sounds better
than inventing a new calling convention from within distutils. And in the
worst case, you can always write an almost empty .pyx file with a single
cimport, and /maybe/ an additional function call. :)

We might also consider making the preprocessor interface pluggable, so that
people could register their own preprocessor for a DSL. That way, we'd keep
Cython itself free from any futuristic niche language extensions and would
just require that whatever people output from their preprocessors is valid
.pxd code. You could then ship your preprocessor in a separate package
directory of your project sources, or distribute it through PyPI and require
it in your setup.py.
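
To give an idea of how little a pluggable interface needs, here is a minimal sketch in Python. It is purely hypothetical: register_preprocessor, preprocess and the "nodebug" example are names made up for this mail, and Cython has no such registry.

```python
# Hypothetical registry mapping a preprocessor name to a callable that
# takes source text and returns the text handed on to the compiler.
_PREPROCESSORS = {}


def register_preprocessor(name, func):
    """Make `func` available under `name`."""
    _PREPROCESSORS[name] = func


def preprocess(name, source):
    """Run the named preprocessor over `source` and return its output."""
    try:
        func = _PREPROCESSORS[name]
    except KeyError:
        raise LookupError("no preprocessor registered as %r" % name)
    return func(source)


def strip_debug_lines(source):
    """Toy preprocessor: drop every line that starts with '#debug'."""
    kept = [line for line in source.splitlines()
            if not line.lstrip().startswith("#debug")]
    return "\n".join(kept)


register_preprocessor("nodebug", strip_debug_lines)
output = preprocess("nodebug", "cdef int x\n#debug print x\ncdef int y")
```

The only contract is "text in, valid .pxd text out", so a project could ship its own callable and register it before the compiler runs.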
[...]

Robert Bradshaw | 20 Aug 20:43

Re: [Cython] preprocessor integration

On Wed, 20 Aug 2008, Stefan Behnel wrote:

> Hi,
>
> Robert Bradshaw wrote:
>> I think we should rather support inline functions in .pxd files.
>> Non-inline functions can be cimported. Includes should be used much
>> less than they are now, but they have their place (when you actually
>> want to textually include code in multiple places).
>> [...]
>> I would *much* rather see a patch that lets distutils run a
>> preprocessor (your choice of the C preprocessor or something more
>> modern) rather than building more preprocessing into the language
>> itself.
>
> Does it have to be distutils running the preprocessor? The use-cases I've seen
> so far appeared to be based on an import or include, so I think it might be
> enough to enable the preprocessor only for .pxd files (and maybe .pxi files,
> depends on the targeted expressiveness of .pxd files). That way, any templates
> would automatically end up in a separate file (where reuse is possible) and
> could get configured from within Cython source code. IMHO, that sounds better
> than inventing a new calling convention from within distutils. And in the
> worst case, you can always write an almost empty .pyx file with a single
> cimport, and /maybe/ an additional function call. :)
>
> We might also consider making the preprocessor interface pluggable, so that
> people could register their own preprocessor for a DSL. That way, we'd keep
> Cython itself free from any futuristic niche language extensions and would
> just require that whatever people output from their preprocessors is valid
> .pxd code. You could then ship your preprocessor in a separate package
[...]

Chuck Blake | 21 Aug 00:08

[Cython] toolchain integration (was Re: preprocessor integration)

Something worth mentioning at this point is that *proper* preprocessor
integration has unelaborated and perhaps unconsidered subtleties.

You should be wary of things like cpp caring about lexical structure
(not expanding macros inside text that looks like a C string,
deleting text you don't want because it looks like a C comment,
as in glob('''foo/*'''), and so on).
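
The comment hazard is easy to reproduce without cpp at all. The sketch below is Python and uses a deliberately naive regex stripper as a stand-in for a C-oriented tool; it is not a model of what any real preprocessor does, and the name naive_strip_c_comments is made up.

```python
import re


def naive_strip_c_comments(text):
    """Remove anything that looks like /* ... */, ignoring string literals."""
    return re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)


# Two harmless glob patterns in Python-like source text.
source = "a = glob('foo/*')\nb = glob('bar/*/baz')\n"
mangled = naive_strip_c_comments(source)
```

The "/*" in the first pattern and the "*/" in the second are taken as comment delimiters, so everything between them disappears and the two statements collapse into a single glob('foobaz') call.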

There are also consequences for the main language parser since one
wants to propagate source coordinates to be a good toolchain citizen.
Pyrexc/Cython are already not such good citizens.  They could but
do not put file and line number directives in the generated C.

Yes, perhaps in Pyrex's/Cython's case every other output C line would
need to be a #line and one may want to turn it off to look at generated
code.  Yet, it seems to me the default should be to generate these
for naive or "regular" users, who would rather see an uninitialized
variable warning or what have you at "foo.pyx:55" than at some line in
the generated code.  Those who want #file/#line off (to debug the code
generator or declutter the generated code) are likely more empowered
to easily disable #file/#line generation than those confused by source
coordinates and wanting "unused variable" and "uninitialized variable"
type warnings. :)
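
For concreteness, the C directive in question has the form #line <number> "<file>", and emitting it from a code generator is cheap. A minimal sketch in Python, assuming the generator already knows which source line each C fragment came from; emit_c and its emit_line_directives switch are made-up names, not Pyrex or Cython options.

```python
def emit_c(fragments, emit_line_directives=True):
    """Join generated C fragments, optionally tagging each with its origin.

    `fragments` is a list of (source_filename, source_lineno, c_code) tuples.
    """
    out = []
    for filename, lineno, c_code in fragments:
        if emit_line_directives:
            # Tell the C compiler which source position the next line maps to.
            out.append('#line %d "%s"' % (lineno, filename))
        out.append(c_code)
    return "\n".join(out)


fragments = [
    ("foo.pyx", 54, "int x;"),
    ("foo.pyx", 55, "y = x + 1;"),
]
with_lines = emit_c(fragments)
without_lines = emit_c(fragments, emit_line_directives=False)
```

With the switch on, a compiler warning about the second statement would be reported against foo.pyx line 55; with it off, the output is the bare generated C.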

Even more significantly than things like data flow gcc warnings, this
would let me set a breakpoint in GDB like "breakpoint foo.pyx:55".
Yes, gdb doesn't yet understand Cython, and I'd have to use "inf loc"
or something to look at the local variables, but it would be nice to
even just make it easier to stop at the right Cython function (rather
than mangling or mucking about in source code). 
[...]

Robert Bradshaw | 21 Aug 01:37

Re: [Cython] toolchain integration (was Re: preprocessor integration)

On Wed, 20 Aug 2008, Chuck Blake wrote:

> Something worth mentioning at this point is that *proper* preprocessor
> integration has unelaborated and perhaps unconsidered subtleties.
>
> You should be wary of things like cpp caring about lexical structure
> (not expanding macros inside text that looks like a C string,
> deleting text you don't want because it looks like a C comment,
> as in glob('''foo/*'''), and so on).

Yes, these are good points. I don't want to go down the path of trying to 
add our own special pre-processing language into Cython...

> There are also consequences for the main language parser since one
> wants to propagate source coordinates to be a good toolchain citizen.
> Pyrexc/Cython are already not such good citizens.  They could but
> do not put file and line number directives in the generated C.
>
> Yes, perhaps in Pyrex's/Cython's case every other output C line would
> need to be a #line and one may want to turn it off to look at generated
> code.  Yet, it seems to me the default should be to generate these
> for naive or "regular" users, who would rather see an uninitialized
> variable warning or what have you at "foo.pyx:55" than at some line in
> the generated code.  Those who want #file/#line off (to debug the code
> generator or declutter the generated code) are likely more empowered
> to easily disable #file/#line generation than those confused by source
> coordinates and wanting "unused variable" and "uninitialized variable"
> type warnings. :)

As a bit of defense, this is because nearly all C-level compile 
[...]

Greg Ewing | 21 Aug 03:04

Re: [Cython] toolchain integration (was Re: preprocessor integration)

Chuck Blake wrote:

> Pyrexc/Cython are already not such good citizens.  They could but
> do not put file and line number directives in the generated C.

There are reasons for that. In theory, Pyrex should never
generate invalid C code, so if there are any C compilation
errors, I usually want to see where they came from in the
C file so that I can figure out what went wrong.

I know this doesn't apply to things like misspelled names
referring to external C functions, but there's no way
of selectively reporting .pyx file positions just for
those, and they're usually easy to track down by searching
for the relevant name.

There's also the problem that some of the code Pyrex
generates isn't clearly attributable to any particular
place in a .pxd or .pyx file.

-- 
Greg

Chuck Blake | 21 Aug 09:13

Re: [Cython] toolchain integration (was Re: preprocessor integration)

I'm not sure you read what I said very carefully { about use case 3, too,
but that use case is admittedly pretty lame :-/ }.

Greg Ewing wrote:
>There are reasons for that. In theory, Pyrex should never
>generate invalid C code, so

I never mentioned compiler errors, but only warnings.  Obviously, errors are
more important for the code generator writer, as are file & line numbers in
the generated code.  Yet, some warnings come from data flow analysis in gcc.
Yes, you could reinvent such analysis.  It sure seems easier to emit #line
and let gcc (or whatever compiler) tell users and try to keep warnings
produced by the generated code to a minimum.  They're already very thin.
I've mostly only gotten incompletely filled-in struct initializer warnings.

>if there are any C compilation errors, I usually want to
>see where they came from in the C file so that I can figure
>out what went wrong.

I noted that more powerful users [ system authors obviously included :) ]
might prefer generated lines but are also more empowered to flip the switch
to get whichever they want.  Of course, I'd just be happy to be able to turn
it on any old way (and don't really expect to be able to any time soon).
Just trying to motivate the best default and motivate the behavior...

>there's no way of selectively reporting .pyx file positions just for those

Well, don't be selective.  :)  Just emit a #line for everything and let
the whole feature be turned on or off.

[...]

Greg Ewing | 21 Aug 09:35

Re: [Cython] toolchain integration (was Re: preprocessor integration)

Chuck Blake wrote:
> I'm not sure you read what I said very carefully { about use case 3, too,
> but that use case is admittedly pretty lame :-/ }.

I didn't mean to say you shouldn't change it. I was
just explaining how it came to be the way it is now.
My original intention was that Pyrex would always
generate correct C code, so the user should never
see an error (or even a warning) from the C compiler.
Given that, there was no need to emit #line directives,
and in fact some reasons not to.

It's become apparent since that it's not feasible
for Pyrex to guarantee correct C code in all cases,
since there's always a possibility of a mismatch
between Pyrex declarations and external .h files.
So there may be some benefit in generating #line
directives after all.

-- 
Greg

Robert Bradshaw | 21 Aug 10:13

Re: [Cython] toolchain integration (was Re: preprocessor integration)

On Aug 21, 2008, at 12:13 AM, Chuck Blake wrote:

> I'm not sure you read what I said very carefully { about use case 3, too,
> but that use case is admittedly pretty lame :-/ }.
>
> Greg Ewing wrote:
>> There are reasons for that. In theory, Pyrex should never
>> generate invalid C code, so
>
> I never mentioned compiler errors, but only warnings.  Obviously, errors are
> more important for the code generator writer, as are file & line numbers in
> the generated code.  Yet, some warnings come from data flow analysis in gcc.
> Yes, you could reinvent such analysis.  It sure seems easier to emit #line
> and let gcc (or whatever compiler) tell users and try to keep warnings
> produced by the generated code to a minimum.  They're already very thin.
> I've mostly only gotten incompletely filled-in struct initializer warnings.

I second the fact that Pyrex should never generate invalid C code.  
Short of invalid extern declarations, have you ever had any C errors/ 
warnings that were due to bad Pyrex/Cython code?

>> if there are any C compilation errors, I usually want to
>> see where they came from in the C file so that I can figure
[...]

Chuck Blake | 21 Aug 17:15

Re: [Cython] toolchain integration (was Re: preprocessor integration)

Robert Bradshaw [robertwb@...] wrote:
>I second the fact that Pyrex should never generate invalid C code.  

:)

>If there are errors, it probably won't be fixed by changing the pyx  
>file, rather the compiler itself needs fixing.

My concern was again more about warnings from things in user-level pyx code
that do not necessarily constitute "invalid code", but *could* constitute it.
You know, cases that end users are ultimately the better judges of, which is
usually close to the definition of a "good warning use case".

Examples I'm thinking of are things like expressions "always being true due
to the range of an unsigned type" (or for other reasons), or "using variables
before they're initialized", and that sort of thing. { I've actually found
a few bugs in my own code this way, though only with mucking about in the
generated code or doing a "cython-demangle" in my brain, but such things are
less friendly to more casual users. }

>(but I'm still not convinced that it  should be on by default).

Yes, for tracking down bugs in Cython you probably don't want #file/#line, but
the typical user (hopefully!) isn't in that business.  Like Linus defaulting
drivers in Linux to what he likes, you surely have the discretionary power to
pick what Cython debuggers particularly want/use most of the time. :)  More
seriously, it might make sense at this stage of Cython to default to not
emitting #line and at a later stage to default to emitting it.  The choice
also interacts with meta-programs that could generate bogus C, too.  It would
probably be best to just import Cython compiler flags from a $CYTHON_FLAGS,
[...]

