Stefan Behnel | 22 Apr 12:57 2012
Picon

[Cython] test crash in Py3.2 (NumPy/memoryview/refcounting related?)

Hi,

I keep seeing test crashes in Py3.2 debug builds on Jenkins with the latest
master, referring to ref-counting problems. Here, for example:

https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/BACKEND=c,PYVERSION=py32-ext/328/console

and likewise in the series of builds starting at #312.

I'm not sure what changed in that corner, it looks like one of those
problems that was there for a while and suddenly starts showing when you
touch that innocent looking brick at the other end of the house that was
holding the balance, until now. In this case, that brick was this commit:

https://github.com/cython/cython/commit/c8f61a668d0ce8af1020f520253e1b5f623cf349

I reverted it here and that fixed the problem at least temporarily:

https://github.com/cython/cython/commit/5aac8caf1ba21933cc85a51af3319c78fc08d675

but it seems to be back now (after my refnanny optimisations). Before
reverting my changes, I was able to reproduce it somewhat reliably on
sage.math by running these three tests together (non-forking):

	memslice numpy_bufacc numpy_memoryview

None of the tests shows the problem when run by itself. I can't tell if
it's also in the latest py3k because I don't have a NumPy lying around that
works there. So 3.2 is basically the latest Python version this can be
tested with, and it doesn't occur in the 3.1 tests. The tests use NumPy
(Continue reading)

mark florisson | 22 Apr 13:54 2012
Picon

Re: [Cython] test crash in Py3.2 (NumPy/memoryview/refcounting related?)

On 22 April 2012 11:57, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
> Hi,
>
> I keep seeing test crashes in Py3.2 debug builds on Jenkins with the latest
> master, referring to ref-counting problems. Here, for example:
>
> https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/BACKEND=c,PYVERSION=py32-ext/328/console
>
> and likewise in the series of builds starting at #312.
>
> I'm not sure what changed in that corner, it looks like one of those
> problems that was there for a while and suddenly starts showing when you
> touch that innocent looking brick at the other end of the house that was
> holding the balance, until now. In this case, that brick was this commit:
>
> https://github.com/cython/cython/commit/c8f61a668d0ce8af1020f520253e1b5f623cf349
>
> I reverted it here and that fixed the problem at least temporarily:
>
> https://github.com/cython/cython/commit/5aac8caf1ba21933cc85a51af3319c78fc08d675
>
> but it seems to be back now (after my refnanny optimisations). Before
> reverting my changes, I was able to reproduce it somewhat reliably on
> sage.math by running these three tests together (non-forking):
>
>        memslice numpy_bufacc numpy_memoryview
>
> None of the tests shows the problem when run by itself. I can't tell if
> it's also in the latest py3k because I don't have a NumPy lying around that
> works there. So 3.2 is basically the latest Python version this can be
(Continue reading)

Stefan Behnel | 22 Apr 14:34 2012
Picon

Re: [Cython] test crash in Py3.2 (NumPy/memoryview/refcounting related?)

mark florisson, 22.04.2012 13:54:
> On 22 April 2012 11:57, Stefan Behnel wrote:
>> I keep seeing test crashes in Py3.2 debug builds on Jenkins with the latest
>> master, referring to ref-counting problems. Here, for example:
>>
>> https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/BACKEND=c,PYVERSION=py32-ext/328/console
>>
>> and likewise in the series of builds starting at #312.
>>
>> I'm not sure what changed in that corner, it looks like one of those
>> problems that was there for a while and suddenly starts showing when you
>> touch that innocent looking brick at the other end of the house that was
>> holding the balance, until now. In this case, that brick was this commit:
>>
>> https://github.com/cython/cython/commit/c8f61a668d0ce8af1020f520253e1b5f623cf349
>>
>> I reverted it here and that fixed the problem at least temporarily:
>>
>> https://github.com/cython/cython/commit/5aac8caf1ba21933cc85a51af3319c78fc08d675
>>
>> but it seems to be back now (after my refnanny optimisations). Before
>> reverting my changes, I was able to reproduce it somewhat reliably on
>> sage.math by running these three tests together (non-forking):
>>
>>        memslice numpy_bufacc numpy_memoryview
>>
>> None of the tests shows the problem when run by itself. I can't tell if
>> it's also in the latest py3k because I don't have a NumPy lying around that
>> works there. So 3.2 is basically the latest Python version this can be
>> tested with, and it doesn't occur in the 3.1 tests. The tests use NumPy
(Continue reading)

mark florisson | 22 Apr 16:31 2012
Picon

Re: [Cython] test crash in Py3.2 (NumPy/memoryview/refcounting related?)

On 22 April 2012 13:34, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
> mark florisson, 22.04.2012 13:54:
>> On 22 April 2012 11:57, Stefan Behnel wrote:
>>> I keep seeing test crashes in Py3.2 debug builds on Jenkins with the latest
>>> master, referring to ref-counting problems. Here, for example:
>>>
>>> https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/BACKEND=c,PYVERSION=py32-ext/328/console
>>>
>>> and likewise in the series of builds starting at #312.
>>>
>>> I'm not sure what changed in that corner, it looks like one of those
>>> problems that was there for a while and suddenly starts showing when you
>>> touch that innocent looking brick at the other end of the house that was
>>> holding the balance, until now. In this case, that brick was this commit:
>>>
>>> https://github.com/cython/cython/commit/c8f61a668d0ce8af1020f520253e1b5f623cf349
>>>
>>> I reverted it here and that fixed the problem at least temporarily:
>>>
>>> https://github.com/cython/cython/commit/5aac8caf1ba21933cc85a51af3319c78fc08d675
>>>
>>> but it seems to be back now (after my refnanny optimisations). Before
>>> reverting my changes, I was able to reproduce it somewhat reliably on
>>> sage.math by running these three tests together (non-forking):
>>>
>>>        memslice numpy_bufacc numpy_memoryview
>>>
>>> None of the tests shows the problem when run by itself. I can't tell if
>>> it's also in the latest py3k because I don't have a NumPy lying around that
>>> works there. So 3.2 is basically the latest Python version this can be
(Continue reading)

mark florisson | 22 Apr 16:41 2012
Picon

Re: [Cython] test crash in Py3.2 (NumPy/memoryview/refcounting related?)

On 22 April 2012 15:31, mark florisson <markflorisson88 <at> gmail.com> wrote:
> On 22 April 2012 13:34, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
>> mark florisson, 22.04.2012 13:54:
>>> On 22 April 2012 11:57, Stefan Behnel wrote:
>>>> I keep seeing test crashes in Py3.2 debug builds on Jenkins with the latest
>>>> master, referring to ref-counting problems. Here, for example:
>>>>
>>>> https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/BACKEND=c,PYVERSION=py32-ext/328/console
>>>>
>>>> and likewise in the series of builds starting at #312.
>>>>
>>>> I'm not sure what changed in that corner, it looks like one of those
>>>> problems that was there for a while and suddenly starts showing when you
>>>> touch that innocent looking brick at the other end of the house that was
>>>> holding the balance, until now. In this case, that brick was this commit:
>>>>
>>>> https://github.com/cython/cython/commit/c8f61a668d0ce8af1020f520253e1b5f623cf349
>>>>
>>>> I reverted it here and that fixed the problem at least temporarily:
>>>>
>>>> https://github.com/cython/cython/commit/5aac8caf1ba21933cc85a51af3319c78fc08d675
>>>>
>>>> but it seems to be back now (after my refnanny optimisations). Before
>>>> reverting my changes, I was able to reproduce it somewhat reliably on
>>>> sage.math by running these three tests together (non-forking):
>>>>
>>>>        memslice numpy_bufacc numpy_memoryview
>>>>
>>>> None of the tests shows the problem when run by itself. I can't tell if
>>>> it's also in the latest py3k because I don't have a NumPy lying around that
(Continue reading)

Stefan Behnel | 23 Apr 10:23 2012
Picon

Re: [Cython] test crash in Py3.2 (NumPy/memoryview/refcounting related?)

mark florisson, 22.04.2012 16:41:
> On 22 April 2012 15:31, mark florisson wrote:
>> I think the problem here
>> is that a single memoryview object is traversed multiple times through
>> different traverse functions, and that the refcount doesn't match the
>> number of traverses. Indeed, the refcount is only one, as the actual
>> count is the acquisition count. So we shouldn't traverse the
>> memoryview objects in memoryview slices, i.e. not
>> _memoryviewslice.from_slice.memview. I'll come up with a commit
>> shortly, would you be willing to test it?
> 
> A fix is here: https://github.com/markflorisson88/cython/commit/cd32184f3f782b6d7275cf430694b59801ce642a
> 
> Lets see what jenkins has to say :)

Seems to like it.

> BTW, tp_clear calls Py_CLEAR on Py_buffer.obj, shouldn't it call
> releasebuffer instead?

Where is that? The memoryview class calls __Pyx_ReleaseBuffer() here.

BTW, why are some of the self arguments in MemoryView.pyx explicitly typed
as "memoryview"?

Stefan
mark florisson | 23 Apr 10:32 2012
Picon

Re: [Cython] test crash in Py3.2 (NumPy/memoryview/refcounting related?)

On 23 April 2012 09:23, Stefan Behnel <stefan_ml@...> wrote:
> mark florisson, 22.04.2012 16:41:
>> On 22 April 2012 15:31, mark florisson wrote:
>>> I think the problem here
>>> is that a single memoryview object is traversed multiple times through
>>> different traverse functions, and that the refcount doesn't match the
>>> number of traverses. Indeed, the refcount is only one, as the actual
>>> count is the acquisition count. So we shouldn't traverse the
>>> memoryview objects in memoryview slices, i.e. not
>>> _memoryviewslice.from_slice.memview. I'll come up with a commit
>>> shortly, would you be willing to test it?
>>
>> A fix is here: https://github.com/markflorisson88/cython/commit/cd32184f3f782b6d7275cf430694b59801ce642a
>>
>> Lets see what jenkins has to say :)
>
> Seems to like it.
>
>
>> BTW, tp_clear calls Py_CLEAR on Py_buffer.obj, shouldn't it call
>> releasebuffer instead?
>
> Where is that? The memoryview class calls __Pyx_ReleaseBuffer() here.

Yes, but ModuleNode generates a tp_clear for Py_buffer cdef class
attributes that clears Py_buffer.obj, which means a subsequent
__dealloc__ calling release buffer cannot call the release buffer
function on the original object.

> BTW, why are some of the self arguments in MemoryView.pyx explicitly typed
(Continue reading)

Stefan Behnel | 23 Apr 10:42 2012
Picon

[Cython] tp_clear() of buffer client objects (was: Re: test crash in Py3.2 (NumPy/memoryview/refcounting related?))

Stefan Behnel, 23.04.2012 10:23:
> mark florisson, 22.04.2012 16:41:
>> On 22 April 2012 15:31, mark florisson wrote:
>>> I think the problem here
>>> is that a single memoryview object is traversed multiple times through
>>> different traverse functions, and that the refcount doesn't match the
>>> number of traverses. Indeed, the refcount is only one, as the actual
>>> count is the acquisition count. So we shouldn't traverse the
>>> memoryview objects in memoryview slices, i.e. not
>>> _memoryviewslice.from_slice.memview. I'll come up with a commit
>>> shortly, would you be willing to test it?
>>
>> BTW, tp_clear calls Py_CLEAR on Py_buffer.obj, shouldn't it call
>> releasebuffer instead?
> 
> Where is that? The memoryview class calls __Pyx_ReleaseBuffer() here.

Ah, found it. Yes, tp_clear() would be called before __dealloc__() in
reference cycles, so that's a problem. I'm not sure tp_clear should do
something as (potentially) involved as freeing the buffer, but if the
Py_buffer is owned by the object, then I guess it just has to do that.
Otherwise, it would leak the buffer.

The problem is that this also impacts user code, though, so a change might
break code, e.g. when it needs to do some cleanup on its own before freeing
the buffer. It would make the code more correct, sure, but it would still
break it. Guess we have to take that route, though...

Stefan
(Continue reading)

mark florisson | 23 Apr 11:10 2012
Picon

Re: [Cython] tp_clear() of buffer client objects (was: Re: test crash in Py3.2 (NumPy/memoryview/refcounting related?))

On 23 April 2012 09:42, Stefan Behnel <stefan_ml@...> wrote:
> Stefan Behnel, 23.04.2012 10:23:
>> mark florisson, 22.04.2012 16:41:
>>> On 22 April 2012 15:31, mark florisson wrote:
>>>> I think the problem here
>>>> is that a single memoryview object is traversed multiple times through
>>>> different traverse functions, and that the refcount doesn't match the
>>>> number of traverses. Indeed, the refcount is only one, as the actual
>>>> count is the acquisition count. So we shouldn't traverse the
>>>> memoryview objects in memoryview slices, i.e. not
>>>> _memoryviewslice.from_slice.memview. I'll come up with a commit
>>>> shortly, would you be willing to test it?
>>>
>>> BTW, tp_clear calls Py_CLEAR on Py_buffer.obj, shouldn't it call
>>> releasebuffer instead?
>>
>> Where is that? The memoryview class calls __Pyx_ReleaseBuffer() here.
>
> Ah, found it. Yes, tp_clear() would be called before __dealloc__() in
> reference cycles, so that's a problem. I'm not sure tp_clear should do
> something as (potentially) involved as freeing the buffer, but if the
> Py_buffer is owned by the object, then I guess it just has to do that.
> Otherwise, it would leak the buffer.
>
> The problem is that this also impacts user code, though, so a change might
> break code, e.g. when it needs to do some cleanup on its own before freeing
> the buffer. It would make the code more correct, sure, but it would still
> break it. Guess we have to take that route, though...
>
> Stefan
(Continue reading)

Stefan Behnel | 23 Apr 11:32 2012
Picon

Re: [Cython] tp_clear() of buffer client objects

mark florisson, 23.04.2012 11:10:
> On 23 April 2012 09:42, Stefan Behnel wrote:
>> Stefan Behnel, 23.04.2012 10:23:
>>> mark florisson, 22.04.2012 16:41:
>>>> On 22 April 2012 15:31, mark florisson wrote:
>>>>> I think the problem here
>>>>> is that a single memoryview object is traversed multiple times through
>>>>> different traverse functions, and that the refcount doesn't match the
>>>>> number of traverses. Indeed, the refcount is only one, as the actual
>>>>> count is the acquisition count. So we shouldn't traverse the
>>>>> memoryview objects in memoryview slices, i.e. not
>>>>> _memoryviewslice.from_slice.memview. I'll come up with a commit
>>>>> shortly, would you be willing to test it?
>>>>
>>>> BTW, tp_clear calls Py_CLEAR on Py_buffer.obj, shouldn't it call
>>>> releasebuffer instead?
>>>
>>> Where is that? The memoryview class calls __Pyx_ReleaseBuffer() here.
>>
>> Ah, found it. Yes, tp_clear() would be called before __dealloc__() in
>> reference cycles, so that's a problem. I'm not sure tp_clear should do
>> something as (potentially) involved as freeing the buffer, but if the
>> Py_buffer is owned by the object, then I guess it just has to do that.
>> Otherwise, it would leak the buffer.
>>
>> The problem is that this also impacts user code, though, so a change might
>> break code, e.g. when it needs to do some cleanup on its own before freeing
>> the buffer. It would make the code more correct, sure, but it would still
>> break it. Guess we have to take that route, though...
>
(Continue reading)


Gmane