Robert Bradshaw | 28 Jun 2012 10:59

[Cython] Automatic C++ conversions

I've been looking at how painful it is to constantly convert between
Python objects and strings in C++. Yes, it's easy to write a utility,
but this should be as natural as bytes <-> char* (if not more so, since
the length is explicit). Several of the other libcpp classes
(vector, map) have natural Python analogues too.
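
To illustrate the point about the explicit length, here is a minimal C++ sketch. The helper names from_buffer and from_c_string are made up for this example and are not part of Cython or libcpp. A std::string built from a pointer plus a length keeps embedded NUL bytes, whereas one built from a bare char* stops at the first NUL:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: build a std::string from a buffer and an explicit
// length. Embedded NUL bytes survive because nothing scans for a terminator.
std::string from_buffer(const char* data, std::size_t length) {
    return std::string(data, length);
}

// Hypothetical helper: build a std::string from a NUL-terminated char*.
// The copy stops at the first NUL byte, so any later bytes are lost.
std::string from_c_string(const char* data) {
    return std::string(data);
}
```

Given the five-byte buffer {'a', 'b', '\0', 'c', 'd'}, from_buffer keeps all five bytes, while from_c_string keeps only the first two.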

What would people think about making it possible to declare these in a
C++ file? Being able to make arbitrary mappings anywhere between types
is contextless global state that I'd rather avoid, but perhaps special
methods defined on the class such as

cdef extern from "<string>" namespace "std":
    cdef cppclass string:
        def __object__(string s):
            return s.c_str()[:s.size()]
        def __create__(object o):
            return string(<char*>o, len(o))
        ...

(names open to suggestions) Then one could write

cdef extern from *:
    string c_func(string)

def f(x):
    return c_func(x)

- Robert
Stefan Behnel | 28 Jun 2012 11:54

Re: [Cython] Automatic C++ conversions

Robert Bradshaw, 28.06.2012 10:59:
> I've been looking at how painful it is to constantly convert between
> Python objects and strings in C++.

You mean std::string (as I think it's called)? Can't we just special case
that in the same way that we special case char* and friends? Basically just
one type more in that list. And it would give you efficient
encoding/decoding more or less for free.

I mean, well, it would likely break existing code to start doing that (in
the same way that we broke code by enabling type inference for convertible
pointers), but as long as it helps more than it breaks ...

> Yes, it's easy to write a utility,
> but this should be as natural as bytes <-> char* (if not more so, since
> the length is explicit). Several of the other libcpp classes
> (vector, map) have natural Python analogues too.

And you would want to enable coercion to those, too? Have a vector copy
into a Python list automatically? (Although that's trivially done with a
list comprehension, maybe the other way is more interesting...)

I think, as long as there is one obvious mapping for a given type, I
wouldn't mind letting Cython apply it automatically.

> What would people think about making it possible to declare these in a
> C++ file? Being able to make arbitrary mappings anywhere between types
> is contextless global state that I'd rather avoid, but perhaps special
> methods defined on the class such as
> 
[...]

Robert Bradshaw | 28 Jun 2012 12:07

Re: [Cython] Automatic C++ conversions

On Thu, Jun 28, 2012 at 2:54 AM, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
> Robert Bradshaw, 28.06.2012 10:59:
>> I've been looking at how painful it is to constantly convert between
>> Python objects and strings in C++.
>
> You mean std::string (as I think it's called)? Can't we just special case
> that in the same way that we special case char* and friends?

Yes, we could. If we do that it'd make sense to special-case list,
vector, pair, map and set as well, though perhaps those are special
enough to hard-code, and not adding more special methods keeps the
language simpler.

> Basically just
> one type more in that list. And it would give you efficient
> encoding/decoding more or less for free.
>
> I mean, well, it would likely break existing code to start doing that (in
> the same way that we broke code by enabling type inference for convertible
> pointers), but as long as it helps more than it breaks ...

I don't think it'd be backwards incompatible; currently it's just an error.

>> Yes, it's easy to write a utility,
>> but this should be as natural as bytes <-> char* (if not more so, since
>> the length is explicit). Several of the other libcpp classes
>> (vector, map) have natural Python analogues too.
>
> And you would want to enable coercion to those, too? Have a vector copy
> into a Python list automatically? (Although that's trivially done with a
[...]

Stefan Behnel | 28 Jun 2012 14:10

Re: [Cython] Automatic C++ conversions

Robert Bradshaw, 28.06.2012 12:07:
> On Thu, Jun 28, 2012 at 2:54 AM, Stefan Behnel wrote:
>> Robert Bradshaw, 28.06.2012 10:59:
>>> I've been looking at how painful it is to constantly convert between
>>> Python objects and strings in C++.
>>
>> You mean std::string (as I think it's called)? Can't we just special case
>> that in the same way that we special case char* and friends?
> 
> Yes, we could.

Then I think it makes sense to do that. Basically, the std::string type
would set its is_string flag and then we need the actual coercion code for it.

> If we do that it'd make sense to special case list and
> vector and pair and map and set as well, though perhaps those are
> special enough to hard code them, and it makes the language simpler to
> not have more special methods.

Ok, then it's

std::string <=> bytes
std::vector <=> list
std::map <=> dict
std::set <=> set

Potentially also:

std::pair => tuple (maybe 2-tuple => std::pair with a runtime length test?)
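
A runtime length test for the 2-tuple => std::pair direction might look roughly like the following C++ sketch. to_pair is a hypothetical helper, not Cython's actual coercion code, and for simplicity it assumes both items have the same type, which a real coercion would not need to require:

```cpp
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper: turn a runtime-sized sequence into a std::pair,
// rejecting anything that does not hold exactly two items.
template <typename T>
std::pair<T, T> to_pair(const std::vector<T>& items) {
    if (items.size() != 2) {
        throw std::length_error("expected exactly 2 items");
    }
    return std::make_pair(items[0], items[1]);
}
```

The check has to happen at run time because the length of the incoming sequence is not part of its type, unlike std::pair, whose size is fixed at compile time.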

[...]

Robert Bradshaw | 30 Jun 2012 00:38

Re: [Cython] Automatic C++ conversions

On Thu, Jun 28, 2012 at 5:10 AM, Stefan Behnel <stefan_ml@...> wrote:
> Robert Bradshaw, 28.06.2012 12:07:
>> On Thu, Jun 28, 2012 at 2:54 AM, Stefan Behnel wrote:
>>> Robert Bradshaw, 28.06.2012 10:59:
>>>> I've been looking at how painful it is to constantly convert between
>>>> Python objects and strings in C++.
>>>
>>> You mean std::string (as I think it's called)? Can't we just special case
>>> that in the same way that we special case char* and friends?
>>
>> Yes, we could.
>
> Then I think it makes sense to do that. Basically, the std::string type
> would set its is_string flag and then we need the actual coercion code for it.

I just leveraged the object <-> char* conversion in our utility code.

>> If we do that it'd make sense to special case list and
>> vector and pair and map and set as well, though perhaps those are
>> special enough to hard code them, and it makes the language simpler to
>> not have more special methods.
>
> Ok, then it's
>
> std::string <=> bytes
> std::vector <=> list
> std::map <=> dict
> std::set <=> set
>
> Potentially also:
[...]

Stefan Behnel | 30 Jun 2012 01:06

Re: [Cython] Automatic C++ conversions

Robert Bradshaw, 30.06.2012 00:38:
> I implemented
> 
> std::string <=> bytes
> std::map <=> dict
> iterable => std::vector => list
> iterable => std::list => list
> iterable => std::set => set
> 2-iterable => std::pair => 2-tuple

Very cool.

>> What about allowing list(<C++ iterator>) etc.? As long as the item type can
>> be coerced at compile time, this should be doable:
>>
>> <C++ iterator> => Python iterator
>>
>> and it would even be easy to implement in Cython code using a generator
>> function.
> 
> The tricky part is memory management; one would have to make sure the
> iterable is valid as long as the Python object is around (whereas it's
> usually bound to the lifetime of its container).

Ok, that's a problem then. We won't normally have any control over the
container. That makes for-in a much more interesting solution.
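
The lifetime problem can be sketched in plain C++ (C++11 assumed). snapshot and values_from_temporary are hypothetical names used only for illustration. A C++ iterator is only usable while its container is alive, so anything that must outlive the container needs an owning copy of the elements:

```cpp
#include <iterator>
#include <set>
#include <vector>

// Hypothetical helper: copy every element of an iterator range into a vector
// that owns its data, so the result stays valid after the source container
// has been destroyed.
template <typename Iterator>
std::vector<typename std::iterator_traits<Iterator>::value_type>
snapshot(Iterator first, Iterator last) {
    return std::vector<typename std::iterator_traits<Iterator>::value_type>(
        first, last);
}

// Returns data taken from a container that goes out of scope on return.
std::vector<int> values_from_temporary() {
    std::set<int> source = {3, 1, 2};
    // Returning source.begin() here would hand back a dangling iterator;
    // the snapshot is an independent copy of the elements.
    return snapshot(source.begin(), source.end());
}
```

Copying costs time and memory proportional to the container size, which is one reason looping over the container in place (the for-in approach) is attractive.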

> Even more useful, however, would be supporting the "for ... in" syntax
> for C++ iterators, which I plan to implement soon if no one beats me
> to it.
[...]

Robert Bradshaw | 30 Jun 2012 01:20

Re: [Cython] Automatic C++ conversions

On Fri, Jun 29, 2012 at 4:06 PM, Stefan Behnel <stefan_ml@...> wrote:
> Robert Bradshaw, 30.06.2012 00:38:
>> I implemented
>>
>> std::string <=> bytes
>> std::map <=> dict
>> iterable => std::vector => list
>> iterable => std::list => list
>> iterable => std::set => set
>> 2-iterable => std::pair => 2-tuple
>
> Very cool.
>
>
>>> What about allowing list(<C++ iterator>) etc.? As long as the item type can
>>> be coerced at compile time, this should be doable:
>>>
>>> <C++ iterator> => Python iterator
>>>
>>> and it would even be easy to implement in Cython code using a generator
>>> function.
>>
>> The tricky part is memory management; one would have to make sure the
>> iterable is valid as long as the Python object is around (whereas it's
>> usually bound to the lifetime of its container).
>
> Ok, that's a problem then. We won't normally have any control over the
> container. That makes for-in a much more interesting solution.
>
>
[...]

Sturla Molden | 2 Jul 2012 14:49

Re: [Cython] Automatic C++ conversions

On 30.06.2012 01:06, Stefan Behnel wrote:

>> std::string <=> bytes
>> std::map <=> dict
>> iterable => std::vector => list
>> iterable => std::list => list
>> iterable => std::set => set
>> 2-iterable => std::pair => 2-tuple
>
> Very cool.

I think (in C++11) std::unordered_set and std::unordered_map should be 
used instead. They are hash-based with O(1) lookup.

std::set and std::map are binary search trees with average O(log n)
lookup and worst-case O(n**2).

Also beware that C++11 has a std::tuple type.

Sturla Molden
Sturla Molden | 2 Jul 2012 14:52

Re: [Cython] Automatic C++ conversions

On 02.07.2012 14:49, Sturla Molden wrote:

> I think (in C++11) std::unordered_set and std::unordered_map should be
> used instead. They are hash-based with O(1) lookup.
>
> std::set and std::map are binary search trees with average O(log n)
> lookup and worst-case O(n**2).

Sorry typo, that should be worst-case O(n).
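
For comparison, here is a small C++11 sketch of the two container families. The function names are made up for this example. As far as the C++ standard's complexity requirements go, lookup in std::map is logarithmic in the container size, while lookup in std::unordered_map is constant on average and linear in the worst case. The difference that is visible in code is iteration order:

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Collect the keys of any map-like container in iteration order.
template <typename Map>
std::vector<int> keys_in_iteration_order(const Map& m) {
    std::vector<int> keys;
    for (const auto& entry : m) {
        keys.push_back(entry.first);
    }
    return keys;
}

// Tree-based container: iteration always visits keys in sorted order.
std::map<int, std::string> make_ordered() {
    std::map<int, std::string> m = {{3, "c"}, {1, "a"}, {2, "b"}};
    return m;
}

// Hash-based container: same lookups, but no guaranteed iteration order.
std::unordered_map<int, std::string> make_hashed() {
    std::unordered_map<int, std::string> m = {{3, "c"}, {1, "a"}, {2, "b"}};
    return m;
}
```

Both containers answer at(2) with "b"; only the std::map is guaranteed to yield the keys as 1, 2, 3 when iterated.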

Sturla
Robert Bradshaw | 2 Jul 2012 18:09

Re: [Cython] Automatic C++ conversions

On Mon, Jul 2, 2012 at 5:49 AM, Sturla Molden <sturla <at> molden.no> wrote:
> On 30.06.2012 01:06, Stefan Behnel wrote:
>
>>> std::string <=> bytes
>>> std::map <=> dict
>>> iterable => std::vector => list
>>> iterable => std::list => list
>>> iterable => std::set => set
>>> 2-iterable => std::pair => 2-tuple
>>
>>
>> Very cool.
>
>
> I think (in C++11) std::unordered_set and std::unordered_map should be used
> instead. They are hash-based with O(1) lookup.
>
> std::set and std::map are binary search trees with average O(log n) lookup
> and worst-case O(n**2).
>
> Also beware that C++11 has a std::tuple type.

The object => C++ coercion is always explicit, so there's no choice being
made here. It is used to do, e.g.

    cdef map[int, vector[double]] my_cpp_map = o

or

    cdef extern from "mylibrary.h":
[...]

