Dieter Maurer | 29 Jun 2012 11:25

[Cython] Potential bug: hole in "C <-> Python" conversion

I have

cdef extern from *:
        ctypedef char const_unsigned_char "const unsigned char"

cdef const_unsigned_char *c_data = data

leads to "Cannot convert Python object to 'const_unsigned_char *'"
while "cdef char *c_data = data" works.

Should the "ctypedef char const_unsigned_char" not ensure
that "char" and "const_unsigned_char" are used as synonyms?

--
Dieter

Stefan Behnel | 29 Jun 2012 11:42

Re: [Cython] Potential bug: hole in "C <-> Python" conversion

Dieter Maurer, 29.06.2012 11:25:
> I have
> 
> cdef extern from *:
>         ctypedef char const_unsigned_char "const unsigned char"

This is an incorrect declaration. "char" != "unsigned char".
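
A minimal sketch of a declaration whose base type matches the C name it maps to. It assumes a Cython version that coerces bytes to "unsigned char*" (the behaviour attributed to 0.17pre in this thread); the function name first_byte is hypothetical and exists only for illustration.

```cython
# Sketch only: the typedef's base type now agrees with its C name.
cdef extern from *:
    ctypedef unsigned char const_unsigned_char "const unsigned char"

def first_byte(bytes data):
    # Hypothetical helper, not part of Cython or any library.
    # Assumes bytes -> "unsigned char*" coercion is available.
    cdef const_unsigned_char* c_data = data
    # Read through the pointer into a plain C integer; because the
    # element type is unsigned, the value is in 0..255, never negative.
    cdef unsigned int value = c_data[0]
    return value
```

Cython releases with built-in "const" support accept "const unsigned char*" directly, so the typedef is only needed on older versions.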

> cdef const_unsigned_char *c_data = data
> 
> leads to "Cannot convert Python object to 'const_unsigned_char *'"
> while "cdef char *c_data = data" works.
> 
> Should the "ctypedef char const_unsigned_char" not ensure
> that "char" and "const_unsigned_char" are used as synonyms?

I assume you are not using the latest Cython (0.17pre) from github, are
you? It should have a fix for this.

Also note that libc.string contains declarations for "const char*" and friends.

Stefan

Dieter Maurer | 29 Jun 2012 12:18

Re: [Cython] Potential bug: hole in "C <-> Python" conversion

Stefan Behnel wrote at 2012-6-29 11:42 +0200:
>Dieter Maurer, 29.06.2012 11:25:
>> I have
>>
>> cdef extern from *:
>>         ctypedef char const_unsigned_char "const unsigned char"
>
>This is an incorrect declaration. "char" != "unsigned char".

You are right. I cheat to get Cython to convert between "unsigned char*"
and "bytes" in the same way as it does for "char *".

For this conversion, there is no real difference between
"char *" and "unsigned char *" (apart from a C level warning
about a pointer of a bad type passed to "PyString_FromStringAndSize").
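
The same conversion can be written with an explicit cast instead of a mismatched typedef. A minimal sketch, assuming the buffer length is known; roundtrip is a hypothetical name used only here:

```cython
def roundtrip(bytes data):
    # Hypothetical helper: bytes -> unsigned char* -> bytes.
    cdef Py_ssize_t length = len(data)
    cdef char* c_data = data  # standard bytes -> char* coercion
    # Reinterpret the same memory as unsigned; no bytes are changed.
    cdef unsigned char* u_data = <unsigned char*>c_data
    # Slicing a char* with an explicit length copies the bytes into a
    # new Python bytes object, including any embedded NUL bytes.
    return (<char*>u_data)[:length]
```

Casting back to "char*" before slicing makes the signedness change explicit at the Cython level, which should also avoid the C compiler's pointer type warning.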

>> cdef const_unsigned_char *c_data = data
>>
>> leads to "Cannot convert Python object to 'const_unsigned_char *'"
>> while "cdef char *c_data = data" works.
>>
>> Should the "ctypedef char const_unsigned_char" not ensure
>> that "char" and "const_unsigned_char" are used as synonyms?
>
>I assume you are not using the latest Cython (0.17pre) from github, are
>you? It should have a fix for this.

You are right.

I am using the "cython" version which comes with my operating

Stefan Behnel | 29 Jun 2012 13:07

Re: [Cython] Potential bug: hole in "C <-> Python" conversion

Dieter Maurer, 29.06.2012 12:18:
> Stefan Behnel wrote at 2012-6-29 11:42 +0200:
>> Also note that libc.string contains declarations for "const char*" and friends.
> 
> Unformatunately

Nice word, took me a while to make my brain split the characters correctly. ;)

> I need "const unsigned char*" and "const xmlChar *"
> (where "xmlChar" is defined as "unsigned char").

Ah, right, libxml2 - an excellent example. lxml is still suffering from the
decision of its initial author to ignore C compiler warnings ("for now")
and use plain char* instead. Lesson learned: DON'T DO THAT!

I recently started cleaning that up (which is why Cython now understands
and coerces "unsigned char*" as well), but you wouldn't believe how much
work it is to get "const" right after the fact if you have a sufficiently
large code base. The current (udiff) patch in my patch queue is some 3000
lines and still growing, but at least the compiler warnings look like
they'd soon fit on a single page. That's about the point where I need to
start tackling the really tough problems.

> I used the "libc.string" definitions as a blueprint for mine.

Sure, as long as the types are correct. lxml will have them declared in
tree.pxd at some point.

BTW, you might want to upgrade to a more recent Cython in any case. 0.13 is
almost two years old and lacks a lot of nice language features. lxml 2.4

Stefan Behnel | 29 Jun 2012 14:02

Re: [Cython] Potential bug: hole in "C <-> Python" conversion

Stefan Behnel, 29.06.2012 13:07:
> Dieter Maurer, 29.06.2012 12:18:
>> I need "const unsigned char*" and "const xmlChar *"
>> (where "xmlChar" is defined as "unsigned char").
> 
> Ah, right, libxml2 - an excellent example. lxml is still suffering from the
> decision of its initial author to ignore C compiler warnings ("for now")
> and use plain char* instead. Lesson learned: DON'T DO THAT!

I added a doc section about using "const" with "char*".

https://sage.math.washington.edu:8091/hudson/job/cython-docs/doclinks/1/src/tutorial/strings.html#dealing-with-const

Stefan
