Oliver Willekens | 26 Jul 2012 16:53

Returning an array to python numpy gives segmentation fault

Hi,

the numpy.i interface provides most of what I need for a project. However, I have a case where I build an array in C++ and need to pass it to numpy. The size of the array is not known in advance, so I cannot use the numpy.i typemaps for ARGOUT_ARRAY1 (as these require you to supply the dimension of the array to be returned).

Below is my code, which runs into a segmentation fault upon calling test.func(). A gdb backtrace is provided as well.

======= test.cxx =========
#include "test.h"
#include <iostream>
#include <Python.h>
#include <numpy/npy_common.h>
#include <numpy/arrayobject.h>

using namespace std;
void func(double * points, int nbr_points, PyObject **py_A)
{
    npy_intp *dims = new npy_intp[1];
    dims[0] = 3; //makes the length along the first (and only, in this example) dimension equal to 3
    double * data = new double[3];
   
    data[0] = points[0];
    data[1] = (double) nbr_points;
    data[2] = 3.3;

    *py_A = PyArray_SimpleNew(1,dims,NPY_DOUBLE);
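    double * t = (double*)PyArray_DATA((PyArrayObject*) (*py_A));
    for (int i = 0; i < 3; ++i) t[i] = data[i]; // copy into the numpy buffer; "t = data" would only rebind the local pointer
    delete[] dims;
    delete[] data;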
}

void func2(double * points, int nbr_points)
{
    cout << points[0] << " " << nbr_points << endl;
}


============ test.h ===========
#include <Python.h>
//#include <numpy/npy_common.h>
//#include <numpy/arrayobject.h>
 
 
void func(double * points, int nbr_points, PyObject **py_A);
void func2(double * points, int nbr_points);



========== test.i ===========
%module test
 
%{
#define SWIG_FILE_WITH_INIT
#include "test.h"
%}
 
%include "numpy.i"
 
 
%init %{
    import_array();
%}
 
%include "typemaps.i"
 
// make use of the typemap provided by numpy.i
%apply (double* IN_ARRAY1, int DIM1 ) {(double * points, int nbr_points)}
 
// this makes use of typemaps.i, as I see no way to do the above using numpy.i
%apply PyObject **OutValue {PyObject **py_A}
 
 
%include "test.h"
 
============== typemaps.i ==============
// Code originates from M. Fiers
/*
 * This tells SWIG to treat a PyObject ** argument with name 'OutValue' as
 * an output value.  We'll append the value to the current result which
 * is guaranteed to be a List object by SWIG.
 */
 
%typemap(argout) PyObject **OutValue {
    PyObject *o, *o2, *o3;
   
    o = *$1;
   
    if ((!$result) || ($result == Py_None)) {
        $result = o;
    } else {
        if (!PyTuple_Check($result)) {
            PyObject *o2 = $result;
            $result = PyTuple_New(1);
            PyTuple_SetItem($result,0,o2);
        }
        o3 = PyTuple_New(1);
        PyTuple_SetItem(o3,0,o);
        o2 = $result;
        $result = PySequence_Concat(o2,o3);
        Py_DECREF(o2);
        Py_DECREF(o3);
    }
}

// This is used because the arguments are not used as input arguments, but as output arguments.
%typemap(in,numinputs=0) PyObject **OutValue(PyObject *temp) {
    $1 = &temp;
}



==========compilation sequence ========
#!/bin/bash

if [ $# -ne 1 ]; then
    echo "please indicate the filename (without extensions) as the first CL option"
    exit
fi

file=$(basename "$1" )
echo "working on '$file'"
echo "swigging the main interface file..."
swig -c++ -python $file.i

echo "compiling the C/C++ file with python2.7 included..."
g++ -O0 -g -fPIC -I/usr/include/python2.7 -c ${file}.cxx ${file}_wrap.cxx

echo "linking object files..."
g++ -O0 -g -shared $file.o ${file}_wrap.o -o _$file.so

echo "done"

=========testrun.py=========
import numpy as np
import test

a=np.zeros(3)
b=np.array([1., 2. , 3.])

c=test.func2(b)
d=test.func(a)


================ gdb backtrace =========
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Python 2.7.3 (default, Apr 20 2012, 22:39:59)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import testrun
1 3

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff28d89d6 in func (points=0xecabb0, nbr_points=3, py_A=0x7fffffffa6e8) at test.cxx:18
18        *py_A = PyArray_SimpleNew(1,dims,NPY_DOUBLE);
(gdb) bt
#0  0x00007ffff28d89d6 in func (points=0xecabb0, nbr_points=3, py_A=0x7fffffffa6e8) at test.cxx:18
#1  0x00007ffff28dd83a in _wrap_func (args=0x7ffff7f55ed0) at test_wrap.cxx:3454
#2  0x000000000042a485 in PyEval_EvalFrameEx ()
#3  0x00000000004317f2 in PyEval_EvalCodeEx ()
#4  0x000000000054a078 in PyImport_ExecCodeModuleEx ()
#5  0x000000000050d091 in ?? ()
#6  0x000000000050da8b in ?? ()
#7  0x0000000000488793 in ?? ()
#8  0x000000000050e2e4 in ?? ()
#9  0x0000000000432f0b in ?? ()
#10 0x00000000004c7c76 in PyObject_Call ()
#11 0x00000000004c7d36 in PyEval_CallObjectWithKeywords ()
#12 0x000000000042c8a5 in PyEval_EvalFrameEx ()
#13 0x00000000004317f2 in PyEval_EvalCodeEx ()
#14 0x000000000054bd50 in PyRun_InteractiveOneFlags ()
#15 0x000000000054c045 in PyRun_InteractiveLoopFlags ()
#16 0x000000000054ce9f in Py_Main ()
#17 0x00007ffff68e576d in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#18 0x000000000041b931 in _start ()
(gdb)


Does anyone have an idea why this is not working or perhaps a suggestion for an alternative? I would be grateful for either :-)

- Oliver

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Swig-user mailing list
Swig-user <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/swig-user
David Froger | 29 Jul 2012 21:25

Re: Returning an array to python numpy gives segmentation fault

Hi Oliver,

> the numpy.i interface provides us with most of the items I need for a
> project, however, I have a case whereby I build an array in C++ and need to
> pass it to numpy. The size of the array is unknown at the beginning, so I
> cannot use the numpy.i typemaps for ARGOUT_ARRAY1 (as these require you to
> supply the dimension of the array to be returned).

Could argoutview arrays be what you need?
http://docs.scipy.org/doc/numpy/reference/swig.interface-file.html#argoutview-arrays

> ======= test.cxx =========
> #include "test.h"
> #include <iostream>
> #include <Python.h>
> #include <numpy/npy_common.h>
> #include <numpy/arrayobject.h>
> 
> using namespace std;
> void func(double * points, int nbr_points, PyObject **py_A)

I'm not sure what 'func' does. Why is there a PyObject **py_A argument?

Could it be rewritten like this?

    void func(double ** points, int * nbr_points) {
      // C++ allocates the array and assigns *points and *nbr_points
    }

    When this function is called from Python, a typemap converts
    (double **points, int *nbr_points) to a numpy array.
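
For illustration, here is a minimal sketch of the signature suggested above in plain C++, with no SWIG or NumPy involved. The name collect_points, the seed parameter and the rule used to pick the length are all made up for this example; the only point is that the callee decides the size and reports both the buffer and its length through its arguments.

```cpp
#include <cassert>

// Hypothetical callee: the length is decided inside the function, so both the
// buffer and its length are handed back through output arguments.
void collect_points(double **points, int *nbr_points, unsigned seed)
{
    assert(points != nullptr && nbr_points != nullptr);
    int n = 2 + static_cast<int>(seed % 5u);  // stand-in for "only known at run time"
    double *buf = new double[n];
    for (int i = 0; i < n; ++i)
        buf[i] = 0.5 * i;
    *points = buf;       // write through the pointers, not to local copies
    *nbr_points = n;
}
```

A C++ caller would declare "double *p; int n;", call collect_points(&p, &n, 7u), and delete[] p once it is done with the data.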

Regards,
David

Oliver Willekens | 30 Jul 2012 00:04

Re: Returning an array to python numpy gives segmentation fault



2012/7/29 David Froger <david.froger <at> gmail.com>
> Hi Oliver,
>
> > the numpy.i interface provides us with most of the items I need for a
> > project, however, I have a case whereby I build an array in C++ and need to
> > pass it to numpy. The size of the array is unknown at the beginning, so I
> > cannot use the numpy.i typemaps for ARGOUT_ARRAY1 (as these require you to
> > supply the dimension of the array to be returned).
>
> Could argoutview arrays be what you need?
> http://docs.scipy.org/doc/numpy/reference/swig.interface-file.html#argoutview-arrays

I don't think so, unless I'm missing the point of argoutview. From the example at http://www.scipy.org/Cookbook/SWIG_NumPy_examples , it looks like the size of the array to be returned is known at compile time there too, which is not the case for me.


> > ======= test.cxx =========
> > #include "test.h"
> > #include <iostream>
> > #include <Python.h>
> > #include <numpy/npy_common.h>
> > #include <numpy/arrayobject.h>
> >
> > using namespace std;
> > void func(double * points, int nbr_points, PyObject **py_A)
>
> I'm not sure what 'func' does. Why is there a PyObject **py_A argument?

That's actually the array to be returned. The points and nbr_points arguments are merely used to pass one array (points) in, together with its length (nbr_points). Inside func, a new array (py_A) gets created, which should be returned to numpy (probably together with its dimensions, which are known only at runtime). In your comment below you continue with points as the output array, which is just as good; my example wasn't clear.


> Could it be rewritten like this?
>
>     void func(double ** points, int * nbr_points) {
>       // C++ allocates the array and assigns *points and *nbr_points
>     }
>
>     When this function is called from Python, a typemap converts
>     (double **points, int *nbr_points) to a numpy array.
>
> Regards,
> David

Yes, I believe you understand the required functionality. Just to confirm: the function could be called without any arguments from Python; it just needs to return a numpy array (or two, so the C++ code cannot simply return a pointer). Another example of the required function would be (with some pseudo code):

=== Python call ===
(a,b) = test.return_2_random_arrays()

=== C++ ===
#include <cstdlib>

void return_2_random_arrays(double ** points, int * points_dimensions, double ** serial_numbers, int * serial_numbers_dim)
{
// points is an nx2 array; points_dimensions must have room for 2 ints
int rows = rand();
points_dimensions[0] = rows;
points_dimensions[1] = 2;
*points = new double[rows * 2];
// some functionality here to fill this contiguous array with random data

// serial_numbers is an mx1 array; serial_numbers_dim must have room for 2 ints
rows = rand(); // take another number for m
serial_numbers_dim[0] = rows;
serial_numbers_dim[1] = 1;
*serial_numbers = new double[ rows ];
// some functionality here to fill this array
}

Thanks in advance,

Oliver
David Froger | 30 Jul 2012 10:34

Re: Returning an array to python numpy gives segmentation fault

On Mon, 30 Jul 2012 00:04:34 +0200, Oliver Willekens <oliver.willekens <at> elis.ugent.be> wrote:
> I don't think so, unless I'm missing the point of argoutview. From the
> example at http://www.scipy.org/Cookbook/SWIG_NumPy_examples , it looks
> like here too the size of the array to be returned is known at compile
> time, which is not the case.

Actually, in this example, the typemap is not applied to my_ones (which takes
the size of the array as an argument), but to my_set_ones (which does not take
the size of the array as an argument), and then my_set_ones is renamed to
set_ones...

But another question: do you want the C++ array to be copied into the numpy
array (safe; ARGOUT_ARRAY1 plus a helper function should work), or the numpy
array to point to the C++ array (hard to make safe, only required if the array
is very big and cannot be duplicated; ARGOUTVIEW_ARRAY1 should work)?
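
The difference between those two options can be illustrated without NumPy. In the sketch below, std::vector stands in for an array that owns a copy of the data, and a hypothetical ArrayView struct stands in for an array that merely points at C++ memory. Neither is what numpy.i actually generates; the names make_copy and make_view are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-owning view: remembers where the C++ buffer lives, nothing more.
struct ArrayView {
    double *data;
    std::size_t size;
};

// Copy: the result has its own storage, independent of the source buffer.
std::vector<double> make_copy(const double *src, std::size_t n)
{
    assert(src != nullptr || n == 0);
    return std::vector<double>(src, src + n);
}

// View: the result aliases the source buffer and is only valid while the
// buffer itself is alive.
ArrayView make_view(double *src, std::size_t n)
{
    return ArrayView{src, n};
}
```

Writing to the source buffer afterwards changes what the view sees but leaves the copy untouched, which is why the copy is the safe choice and the view is only safe if the buffer is guaranteed to outlive it.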

David Froger | 30 Jul 2012 10:53

Re: Returning an array to python numpy gives segmentation fault

> But another question: do you want the C++ array to be copied into the numpy
> array (safe; ARGOUT_ARRAY1 plus a helper function should work), or the numpy
> array to point to the C++ array (hard to make safe, only required if the array
> is very big and cannot be duplicated; ARGOUTVIEW_ARRAY1 should work)?

It may rather be this third case: the array will no longer be accessed from C++,
and we just want the data to exist in the numpy array, which can be safely
destroyed by Python. So the C++ function allocates the array with a size not
known by the caller, and the caller is responsible for deleting the array. If
C++ is the caller, a (double *array, int size) pair is returned (and because
there are two returned values, they are returned through arguments, so actually
double **array, int *size), and if Python is the caller, (double *array, int
size) is converted to a numpy array by the typemap.

Oliver Willekens | 31 Jul 2012 01:30

Re: Returning an array to python numpy gives segmentation fault



2012/7/30 David Froger <david.froger <at> gmail.com>
> > But another question: do you want the C++ array to be copied into the numpy
> > array (safe; ARGOUT_ARRAY1 plus a helper function should work), or the numpy
> > array to point to the C++ array (hard to make safe, only required if the
> > array is very big and cannot be duplicated; ARGOUTVIEW_ARRAY1 should work)?
>
> It may rather be this third case: the array will no longer be accessed from
> C++, and we just want the data to exist in the numpy array, which can be
> safely destroyed by Python. So the C++ function allocates the array with a
> size not known by the caller, and the caller is responsible for deleting the
> array. If C++ is the caller, a (double *array, int size) pair is returned (and
> because there are two returned values, they are returned through arguments, so
> actually double **array, int *size), and if Python is the caller, (double
> *array, int size) is converted to a numpy array by the typemap.

Hmm, you lost me on this third case. Python is the caller; C++ is responsible for creating and initializing the array and should then pass it (preferably in a safe manner) to numpy/python. Once that's done, C++ no longer needs to take care of the array, so a copy (as in your "other question") would be sufficient, since the array is not that large. Which helper function are you thinking of?
David Froger | 31 Jul 2012 08:56

Re: Returning an array to python numpy gives segmentation fault

On Tue, 31 Jul 2012 01:30:24 +0200, Oliver Willekens <oliver.willekens <at> elis.ugent.be> wrote:
> Hmm, you lost me on this third case. Python is the caller, C++ is
> responsible for the array creation and initialization and should then pass
> it (preferably in a safe manner) to numpy/python. Once that's done, C++ no
> longer needs to take care of the array, so a copy (as in your "other
> question") would be sufficient, the array is not that large. Which helper
> function are you thinking of?

The helper function was not a good idea, sorry... 

But I think ARGOUTVIEW_ARRAY1 is exactly what you need.

An example is attached.

Because once the function has been called "C++ no longer needs to take care of
the array", it's completely safe (and there is no copy of the data).

If the function is "void foo(int *dim1, double **data)", what happens when it is
called from Python with "array = foo()" is:
    int dim1;
    double* data;
    foo(&dim1,&data); // now dim1 and data have values
    create and return the numpy array from dim1 and data
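
The same sequence can be sketched in plain C++. Everything here is a stand-in: foo is a hypothetical function with the signature above, and Result is an invented struct playing the role of the returned array. One deliberate assumption: Result takes ownership of the buffer through std::unique_ptr so that the example does not leak. A plain NumPy view, by contrast, does not free the memory it points to, so in the real wrapper someone still has to release the buffer.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C++ function with the signature discussed above.
void foo(int *dim1, double **data)
{
    assert(dim1 != nullptr && data != nullptr);
    *dim1 = 3;
    *data = new double[3]{1.0, 2.0, 3.0};
}

// Stand-in for the array object handed back to the caller.
struct Result {
    std::unique_ptr<double[]> data;  // takes over the buffer allocated by foo
    int dim1;
};

// Mimics the steps listed above: temporaries, call, wrap.
Result call_foo()
{
    int dim1 = 0;
    double *data = nullptr;
    foo(&dim1, &data);               // now dim1 and data have values
    return Result{std::unique_ptr<double[]>(data), dim1};
}
```

Calling call_foo() takes no arguments, just like "array = foo()" on the Python side, and the caller learns the length only from the returned object.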
Oliver Willekens | 31 Jul 2012 10:25

Re: Returning an array to python numpy gives segmentation fault


2012/7/31 David Froger <david.froger <at> gmail.com>
> The helper function was not a good idea, sorry...
>
> But I think ARGOUTVIEW_ARRAY1 is exactly what you need.
>
> An example is attached.
>
> Because once the function has been called "C++ no longer needs to take care of
> the array", it's completely safe (and there is no copy of the data).
>
> If the function is "void foo(int *dim1, double **data)", what happens when it
> is called from Python with "array = foo()" is:
>     int dim1;
>     double* data;
>     foo(&dim1,&data); // now dim1 and data have values
>     create and return the numpy array from dim1 and data

Thank you David. Indeed, your example (very illuminating) works well. I had no idea the ARGOUTVIEW_ARRAY1 could be used like this.

I'll tinker around with your code and see how far it gets me.
Thanks again,

Oliver