Daniele Varrazzo | 24 Jan 2011 01:33

Details about the psycopg porting

Hello,

I've written to the Psycopg mailing list about the details of porting
psycopg2 to Python 3. You can also read everything here:
<http://initd.org/psycopg/articles/2011/01/24/psycopg2-porting-python-3-report/>.

There are a couple of points still open, so if you want to take a look
at them I'd be happy to receive comments before releasing the code.

Regards,

-- Daniele
Lennart Regebro | 24 Jan 2011 08:21

Re: Details about the psycopg porting

On Mon, Jan 24, 2011 at 01:33, Daniele Varrazzo
<daniele.varrazzo@...> wrote:
> Hello,
>
> I've written to the Psycopg mailing list about the details of porting
> psycopg2 to Python 3. You can also read everything here:
> <http://initd.org/psycopg/articles/2011/01/24/psycopg2-porting-python-3-report/>.
>
> There are a couple of points still open, so if you want to take a look
> at them I'd be happy to receive comments before releasing the code.

"Is there an interface in Python 3 to know if a file is binary or text?"

You can check if it inherits from io.TextIOBase or not. I think that's
the official way. Correct me if I'm wrong.
For the other issues I guess I would have to know psycopg2 to be able
to help. :-)

//Lennart
Antoine Pitrou | 24 Jan 2011 16:20

Re: Details about the psycopg porting


Hello,

> I've written to the Psycopg mailing list about the details of porting
> psycopg2 to Python 3. You can also read everything here:
> <http://initd.org/psycopg/articles/2011/01/24/psycopg2-porting-python-3-report/>.
>
> There are a couple of points still open, so if you want to take a look
> at them I'd be happy to receive comments before releasing the code.

From your article:

> the data (bytes) from the libpq are passed to file.write() using
> PyObject_CallFunction(func, "s#", buffer, len)

You shouldn't use "s#" as it will implicitly decode the buffer to unicode.
Instead, use "y#" to write bytes.
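
For readers who do not have the C API format units memorized: in Python 3, "s#" builds a str by decoding the C buffer as UTF-8, while "y#" builds a bytes object from the buffer unchanged. A rough Python-level sketch of that difference follows. The helper names `build_s_hash` and `build_y_hash` are made up for illustration and are not C API functions.

```python
def build_s_hash(buffer: bytes) -> str:
    """Roughly what "s#" produces in Python 3: a str decoded as UTF-8."""
    return buffer.decode("utf-8")


def build_y_hash(buffer: bytes) -> bytes:
    """Roughly what "y#" produces: a bytes object, copied untouched."""
    return bytes(buffer)


utf8_row = b"1\tcaf\xc3\xa9\n"    # valid UTF-8, decodes fine
latin1_row = b"1\tcaf\xe9\n"      # Latin-1 encoded, not valid UTF-8

assert build_s_hash(utf8_row) == "1\tcaf\u00e9\n"
assert build_y_hash(latin1_row) == latin1_row

# Data that is not valid UTF-8 cannot go through the "s#" path at all.
try:
    build_s_hash(latin1_row)
except UnicodeDecodeError:
    decode_failed = True
else:
    decode_failed = False
```

So with "s#" the call can fail before file.write() is ever reached, depending on the encoding of the data coming from the database.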

> Is there an interface in Python 3 to know if a file is binary or text?

`isinstance(myfile, io.TextIOBase)` should do the trick. Or the corresponding C
call, using PyObject_IsInstance().
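
A minimal sketch of that check in Python, using only the standard library. The function name `is_text_file` is an illustrative choice, not part of any API.

```python
import io
import os
import tempfile


def is_text_file(f) -> bool:
    """Return True if f is a text-mode file object (reads/writes str)."""
    return isinstance(f, io.TextIOBase)


# In-memory files: StringIO is text, BytesIO is binary.
assert is_text_file(io.StringIO())
assert not is_text_file(io.BytesIO())

# The same holds for real files returned by open().
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "w") as text_f, open(path, "rb") as binary_f:
        text_result = is_text_file(text_f)
        binary_result = is_text_file(binary_f)
finally:
    os.remove(path)
```

Note that a duck-typed file-like object that does not inherit from the io classes is reported as non-text by this check, whatever its write() method actually accepts.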

> In binary mode the file always returns bytes (str in py2, unicode in py3)

I suppose you mean "str in py2, bytes in py3".

> bytea fields are returned as MemoryView, from which is easy to get bytes

Is this because it is easier for you to return a memoryview? Otherwise it would 
[...]

Daniele Varrazzo | 24 Jan 2011 17:10

Re: Details about the psycopg porting

On Mon, Jan 24, 2011 at 3:20 PM, Antoine Pitrou <solipsis@...> wrote:
>
> Hello,
>
>> I've written to the Psycopg mailing list about the details of porting
>> psycopg2 to Python 3. You can also read everything here:
>> <http://initd.org/psycopg/articles/2011/01/24/psycopg2-porting-python-3-report/>.
>>
>> There are a couple of points still open, so if you want to take a look
>> at them I'd be happy to receive comments before releasing the code.
>
> From your article:
>
>> the data (bytes) from the libpq are passed to file.write() using
>> PyObject_CallFunction(func, "s#", buffer, len)
>
> You shouldn't use "s#" as it will implicitly decode the buffer to unicode.
> Instead, use "y#" to write bytes.

Yes, the "s#" is a leftover from before the conversion: I just have to
decide whether it's better to always emit bytes and break on text
files, or to check for the file capability. Because text mode is the
default for open() I think the former would be surprising: I'll go for
the second option if not overly complex (seems trivial if
PyTextIOBase_Type is available in C without the need of importing
anything from Python, annoying otherwise).
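
The second option (check the file's capability, then write str or bytes accordingly) can be sketched in Python as follows. This is only an illustration of the idea under discussion: the helper name `write_copy_data` and its `encoding` parameter are hypothetical and are not psycopg2 API.

```python
import io


def write_copy_data(f, data: bytes, encoding: str = "utf-8") -> None:
    """Write bytes received from the database to a text or binary file.

    Hypothetical helper: the name and the `encoding` parameter are
    illustrative only.
    """
    if isinstance(f, io.TextIOBase):
        # Text files accept only str, so decode first.
        f.write(data.decode(encoding))
    else:
        # Binary files (and anything else) get the raw bytes.
        f.write(data)


row = b"1\thello\n"

text_out = io.StringIO()
write_copy_data(text_out, row)

binary_out = io.BytesIO()
write_copy_data(binary_out, row)

# The first option (always emit bytes) breaks on text files:
try:
    io.StringIO().write(row)
except TypeError:
    bytes_to_text_failed = True
else:
    bytes_to_text_failed = False
```

The sketch assumes the caller knows which encoding the data is in; picking that encoding is a separate question from detecting the file type.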

>> In binary mode the file always returns bytes (str in py2, unicode in py3)
>
> I suppose you mean "str in py2, bytes in py3".
[...]

Antoine Pitrou | 24 Jan 2011 17:21

Re: Details about the psycopg porting

Daniele Varrazzo <daniele.varrazzo <at> ...> writes:
> >> the data (bytes) from the libpq are passed to file.write() using
> >> PyObject_CallFunction(func, "s#", buffer, len)
> >
> > You shouldn't use "s#" as it will implicitly decode the buffer to unicode.
> > Instead, use "y#" to write bytes.
> 
> Yes, the "s#" is a leftover from before the conversion: I just have to
> decide whether it's better to always emit bytes and break on text
> files, or to check for the file capability. Because text mode is the
> default for open() I think the former would be surprising: I'll go for
> the second option if not overly complex (seems trivial if
> PyTextIOBase_Type is available in C without the need of importing
> anything from Python, annoying otherwise).

No, you'll have to import. The actual TextIOBase ABC is declared in Python.
(see Lib/io.py if you are curious)

> >> bytea fields are returned as MemoryView, from which is easy to get bytes
> >
> > Is this because it is easier for you to return a memoryview? Otherwise it
> > would make more sense to return a bytes object.
> 
> In Py2 bytea is converted to buffer objects, passing through a "chunk"
> object implementing the buffer interface. So yes, MemoryView is a more
> direct port.
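
For context, getting bytes out of a memoryview is a one-liner either way. The sketch below uses a bytearray as a stand-in for whatever buffer actually backs the bytea value; that stand-in is an assumption, not how psycopg2 stores the data.

```python
raw = bytearray(b"\x00\x01binary\xff")  # stand-in for a buffer owned elsewhere
view = memoryview(raw)

as_bytes = bytes(view)       # copies the data into an immutable bytes object
also_bytes = view.tobytes()  # same result

assert as_bytes == b"\x00\x01binary\xff"
assert as_bytes == also_bytes

# A memoryview shares memory with its source; bytes is an independent copy.
raw[0] = 0x7F
view_first_byte = view[0]     # sees the change made through `raw`
copy_first_byte = as_bytes[0]  # still the original value
```

The sharing behaviour is the practical difference between the two return types: a memoryview avoids a copy but stays tied to the lifetime and contents of the underlying buffer.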

Well, does it point to some external memory managed by pgsql itself? Otherwise 
bytes or bytearray would still be a better choice IMO (as in better-known and 
[...]

Daniele Varrazzo | 25 Jan 2011 01:24

Re: Details about the psycopg porting

On Mon, Jan 24, 2011 at 4:21 PM, Antoine Pitrou <solipsis@...> wrote:
> Daniele Varrazzo <daniele.varrazzo <at> ...> writes:
>> >> the data (bytes) from the libpq are passed to file.write() using
>> >> PyObject_CallFunction(func, "s#", buffer, len)
>> >
>> > You shouldn't use "s#" as it will implicitly decode the buffer to unicode.
>> > Instead, use "y#" to write bytes.
>>
>> Yes, the "s#" is a leftover from before the conversion: I just have to
>> decide whether it's better to always emit bytes and break on text
>> files, or to check for the file capability. Because text mode is the
>> default for open() I think the former would be surprising: I'll go for
>> the second option if not overly complex (seems trivial if
>> PyTextIOBase_Type is available in C without the need of importing
>> anything from Python, annoying otherwise).
>
> No, you'll have to import. The actual TextIOBase ABC is declared in Python.
> (see Lib/io.py if you are curious)

Annoying, then :) Will give it a try.

>> >> bytea fields are returned as MemoryView, from which is easy to get bytes
>> >
>> > Is this because it is easier for you to return a memoryview? Otherwise it
>> > would make more sense to return a bytes object.
>>
>> In Py2 bytea is converted to buffer objects, passing through a "chunk"
>> object implementing the buffer interface. So yes, MemoryView is a more
>> direct port.
[...]

