Re: [Cython] Creating python bindings for C library using Cython

Chris Colbert wrote:
> python strings are automagically converted to char* by cython.
I'll beat Stefan to it and note some very important points here. This 
applies when you actually have text (which in Py2 may be stored as 
"unicode"), not when you already have raw byte data ("bytes" in Py3).

1) A C char* is not a string as such, just a sequence of bytes. The C 
library should document which encoding it expects for its strings, and 
one should call the library like this:

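# Encode explicitly and keep the resulting bytes object in a variable.
encoded = mystr.encode(encoding)
# The char* points into the buffer owned by "encoded"; no copy is made.
cdef char* encoded_buf = encoded
c_func(encoded_buf)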
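
As a rough illustration of why the encoding has to be chosen explicitly, here is a minimal sketch in plain Python (not Cython, so it runs without a compiler). The sample string and the helper c_visible_length are made up for this example; the helper only mimics how a C function that expects a NUL-terminated char* would see the bytes.

```python
# Hypothetical sample text; any non-ASCII string shows the effect.
mystr = "caf\u00e9"

# The same text yields different byte sequences under different encodings.
utf8_bytes = mystr.encode("utf-8")        # 5 bytes: the last character takes two
latin1_bytes = mystr.encode("latin-1")    # 4 bytes: one byte per character
utf16_bytes = mystr.encode("utf-16-le")   # 8 bytes, with embedded NUL bytes


def c_visible_length(buf):
    """Number of bytes a C function reading a NUL-terminated char* would see.

    Illustrative helper only; it is not part of Cython or any C library.
    """
    nul = buf.find(b"\x00")
    return len(buf) if nul == -1 else nul


print(c_visible_length(utf8_bytes), c_visible_length(latin1_bytes),
      c_visible_length(utf16_bytes))
```

A library that expects Latin-1 would misread the UTF-8 bytes, and a char* API would stop at the first NUL of the UTF-16 data, which is why the library's expected encoding has to be known before calling it.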

2) Note that a char* carries no information about who owns the memory or 
when it is released. I.e., the following will likely crash:

c_func(mystr.encode(encoding))

because the temporary object returned from the encode method is 
deallocated while the C code may still hold the pointer. *Always* keep a 
reference to the original Python object for as long as anything could 
use the char*.
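
The "keep a reference" rule can be sketched in plain Python with ctypes (again not Cython, and relying on CPython's behaviour that a bytes object's buffer stays at a fixed address while the object is alive). The class EncodedArg and its attribute names are hypothetical, introduced only for this illustration.

```python
import ctypes


class EncodedArg:
    """Hypothetical helper pairing a raw address with the bytes that own it.

    The integer address alone does not keep anything alive; the "owner"
    attribute does. Drop the owner and the address may dangle.
    """

    def __init__(self, text, encoding="utf-8"):
        # Keep the encoded bytes referenced for the lifetime of this object.
        self.owner = text.encode(encoding)
        # Raw address of the buffer, similar to what a C library might store.
        self.address = ctypes.cast(
            ctypes.c_char_p(self.owner), ctypes.c_void_p
        ).value

    def read_back(self):
        # Safe only because self.owner is still alive at this point.
        return ctypes.string_at(self.address)


arg = EncodedArg("hello")
print(arg.read_back())
```

The design point is that whatever object holds the pointer also holds the Python object it points into, so the two cannot go out of scope separately.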

(Perhaps somebody could make this a FAQ entry in the wiki if it is not 
already?)

Dag Sverre
Stefan Behnel | 1 Jun 22:04

Dag Sverre Seljebotn wrote:
> Chris Colbert wrote:
>> python strings are automagically converted to char* by cython.
> I'll beat Stefan to it and note some very important points here.
> [...]
> (Perhaps somebody could make this a FAQ entry in the wiki if it is not 
> already?)

http://wiki.cython.org/FAQ#HowdoIpassaPythonstringparameterontoaClibrary.3F

Now guess who wrote that entry :)

Stefan

Chris Colbert | 1 Jun 21:14

I never mind being corrected/updated/put-in-my-place with such a great answer.

So there you have it :)

Cheers!

On Mon, Jun 1, 2009 at 3:09 PM, Dag Sverre Seljebotn wrote:
> [...]
_______________________________________________
Cython-dev mailing list
http://codespeak.net/mailman/listinfo/cython-dev

