23 Jun 2012 12:02
Support integration with other tree changing libxml2 based libraries
Dieter Maurer <dieter <at> handshake.de>
2012-06-23 10:02:14 GMT
I am working on an integration of `lxml` and `libxmlsec` (the XML security library) and I have hit an important problem: `libxmlsec` functions can change the libxml2 document (tree) and thereby seriously confuse `lxml`.

The major problem is that `libxmlsec` may unlink and release subtrees, leading to a `SIGSEGV` in `lxml` code when it later accesses those subtrees. Fortunately, `libxmlsec` can be told not to release unlinked subtrees but to leave that to the application. But now my application must do it: release the subtree if and only if `lxml` will not do so at a later time (because it holds a reference to some node in the subtree).

Looking at the public `lxml` API, I have not found such a function. I have come up with the following first version of an `lxml_safe_release`:

    cdef int lxml_safe_release(_Document doc, xmlNode* c_node) except -1:
        # we let `lxml` get rid of the subtree by wrapping *c_node* into a
        # proxy and then releasing it again.
        if elementFactory(doc, c_node) == NULL:
            return -1
        return 0

I hope that this will be sufficient to prevent the `SIGSEGV`. However, I doubt that this alone is enough for references into unlinked subtrees to really work correctly. In similar situations, `lxml` calls `moveNodeToDocument` in order to make the namespace references inside the unlinked subtree self-contained. `moveNodeToDocument` is not public and far too complicated for me to want to include a copy in my code.

I propose that future `lxml` versions should include a public [...]
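To make the "if and only if" rule above concrete, here is a toy model in plain Python. It is only a sketch under stated assumptions: `Node`, `unlink` and `safe_release` are made-up names for illustration, and the `proxy` flag merely stands in for "`lxml` holds a Python reference to this node". None of this is `lxml` or libxml2 API, and it says nothing about the namespace problem.

```python
class Node:
    """Hypothetical stand-in for a libxml2 node (not a real API)."""

    def __init__(self, name, children=(), is_document=False):
        self.name = name
        self.parent = None
        self.children = list(children)
        self.is_document = is_document
        self.proxy = False   # True while Python code references this node
        self.freed = False   # True once the node's memory would be released
        for child in self.children:
            child.parent = self


def iter_subtree(node):
    """Yield `node` and all of its descendants."""
    yield node
    for child in node.children:
        yield from iter_subtree(child)


def unlink(node):
    """Detach `node` (with its subtree) from its parent."""
    if node.parent is not None:
        node.parent.children.remove(node)
        node.parent = None


def safe_release(node):
    """Free the tree containing `node` only if it is detached from the
    document and no node in it is still referenced by a proxy.
    Returns True if the subtree was freed, False otherwise."""
    top = node
    while top.parent is not None:
        top = top.parent
    if top.is_document:
        return False                      # still part of the document
    if any(n.proxy for n in iter_subtree(top)):
        return False                      # a live reference points into it
    for n in iter_subtree(top):
        n.freed = True
    return True


# Usage: <b><c/></b> is unlinked while Python still references <c>.
c = Node("c")
b = Node("b", [c])
doc = Node("#document", [Node("root", [b])], is_document=True)
c.proxy = True
unlink(b)
released_early = safe_release(b)   # must not free: <c> is still referenced
c.proxy = False                    # the last reference goes away
released_late = safe_release(b)    # now the subtree can be freed
```

The point of the model is the ownership decision only: whoever unlinks a subtree may free it exactly when no live reference points anywhere into it; otherwise the release has to be left to whoever drops the last reference.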