Alex Klizhentas | 1 May 20:14

Custom Elements question

Hi All,
Got a question:

I've extended the ElementBase object using the approach described in the tutorial, but SubElement does not work as desired:

from lxml import etree

class NodeBase(etree.ElementBase):
    def append(self, child):
        print("aaa")
        return etree.ElementBase.append(self, child)

etree.SubElement(root, "child")  # no "aaa" printed

OK, but when I copy your SubElement code into my own module:

def SubElement(parent, tag, attrib={}, **extra):
    attrib = attrib.copy()
    attrib.update(extra)
    element = parent.makeelement(tag, attrib)
    parent.append(element)
    return element

SubElement(root,"child") # "aaa" is here!

Overriding makeelement() in NodeBase does not help either:

    def makeelement(self, tag, attrib):
        return Node(tag, attrib)

Any advice will be appreciated,
Alex
_______________________________________________
lxml-dev mailing list
lxml-dev <at> codespeak.net
http://codespeak.net/mailman/listinfo/lxml-dev
Stefan Behnel | 1 May 20:28

Re: Custom Elements question

Hi,

Alex Klizhentas wrote:
> I've extended the ElementBase object using the approach described in the
> tutorial, but SubElement does not work as desired:
> 
> class NodeBase(etree.ElementBase):
>     def append(self, child):
>         print("aaa")
>         return etree.ElementBase.append(self, child)
>
> etree.SubElement(root, "child")  # no "aaa" printed

That's because SubElement() does not call .append().

> OK, but when taking your code to the module:
> 
> def SubElement(parent, tag, attrib={}, **extra):
>     attrib = attrib.copy()
>     attrib.update(extra)
>     element = parent.makeelement(tag, attrib)
>     parent.append(element)
>     return element
> 
> SubElement(root,"child") # "aaa" is here!

As expected, as you call .append() explicitly here.

> and overriding
>     def makeelement(self, tag, attrib):
>         return Node(tag, attrib)
> 
> in the NodeBase just does not help,

SubElement() does not call .makeelement() either. It's implemented in plain C.
Could you explain a bit why you want to do this and how your .append() differs
from the normal append code?

Stefan
Alex Klizhentas | 1 May 21:11

Re: Custom Elements question

Thanks for the comments,

The idea behind this is to allow the XML tree to notify observers when its contents are changed: a node is added, removed or moved.

That's why I'm going to override the ElementBase members so that they notify observers when certain actions are performed.

Everything works fine, except that the useful SubElement function did not work as expected. Now you've clarified things.

Thanks
Alex

Stefan Behnel | 2 May 08:49

Re: Custom Elements question


Alex Klizhentas wrote:
>> Alex Klizhentas wrote:
>>> I've extended the ElementBase object using the approach described in the
>>> tutorial, but SubElement does not work as desired:
>>>
>>> class NodeBase(etree.ElementBase):
>>>     def append(self, child):
>>>         print("aaa")
>>>         return etree.ElementBase.append(self, child)
>>>
>>> etree.SubElement(root, "child")  # no "aaa" printed
>> That's because SubElement() does not call .append().
>>
>>
>>> OK, but when taking your code to the module:
>>>
>>> def SubElement(parent, tag, attrib={}, **extra):
>>>     attrib = attrib.copy()
>>>     attrib.update(extra)
>>>     element = parent.makeelement(tag, attrib)
>>>     parent.append(element)
>>>     return element
>
> The idea behind this is to allow the XML tree to notify observers when its
> contents are changed: a node is added, removed or moved.
>
> That's why I'm going to override the ElementBase members so that they
> notify observers when certain actions are performed.
>
> Everything works fine, except that the useful SubElement function did not
> work as expected. Now you've clarified things.

Ah, sure. Then it's best to use a pure Python implementation of SubElement
instead, like the one above.

Stefan
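
For illustration, the observer idea discussed in this exchange could be sketched roughly as follows. This is only a sketch under assumptions, not code from the thread: the names `observers`, `ObservedNode` and `sub_element` are hypothetical, the observer list is kept at module level because lxml may recreate element proxies (so per-instance Python state is not a safe place for it), and the custom class is installed through a parser-level default class lookup.

```python
from lxml import etree

# Hypothetical module-level observer registry (an assumption of this sketch).
# It lives outside the element class because lxml may recreate element
# proxies, so state stored on an element instance could get lost.
observers = []


class ObservedNode(etree.ElementBase):
    """Element class that reports append() calls to the registered observers."""

    def append(self, child):
        etree.ElementBase.append(self, child)
        for notify in observers:
            notify("append", self, child)


def sub_element(parent, tag, attrib=None, **extra):
    """Pure-Python SubElement that goes through the overridable append()."""
    attrib = dict(attrib or {}, **extra)
    element = parent.makeelement(tag, attrib)
    parent.append(element)
    return element


# Make every element created through this parser an ObservedNode.
parser = etree.XMLParser()
parser.set_element_class_lookup(
    etree.ElementDefaultClassLookup(element=ObservedNode))

events = []
observers.append(lambda action, parent, child:
                 events.append((action, parent.tag, child.tag)))

root = parser.makeelement("root")
sub_element(root, "child", name="a")   # goes through ObservedNode.append()
etree.SubElement(root, "other")        # C-level path, append() is not called
```

Only the call through `sub_element()` is recorded in `events`; the child created by `etree.SubElement()` is added to the tree without any notification. Removal and move notifications would need similar overrides of the corresponding methods.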
Stefan Behnel | 2 May 16:30

Re: Custom Elements question

Hi,

another bit of reasoning here.

Stefan Behnel wrote:
> Alex Klizhentas wrote:
>> I've extended the ElementBase object using the approach described in the
>> tutorial, but SubElement does not work as desired:
>>
>> class NodeBase(etree.ElementBase):
>>     def append(self, child):
>>         print("aaa")
>>         return etree.ElementBase.append(self, child)
>>
>> etree.SubElement(root, "child")  # no "aaa" printed
> 
> That's because SubElement() does not call .append().
[...]
> SubElement() does not call .makeelement() either. It's implemented in plain C.

One important reason is that this allows lxml.etree to append the new libxml2
node at the C level *before* the decision is taken which Python class should
be used to represent it. This might have an impact on the class lookup if it
considers the parental relation when taking its decision (lxml.objectify does
that, for example).

But that's the only difference I can see between etree.SubElement() and your
Python implementation. And you could even work around it by doing something
like this:

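def SubElement(parent, tag, attrib={}, **extra):
    attrib = attrib.copy()
    attrib.update(extra)
    element = parent.makeelement(tag, attrib)
    parent.append(element)
    del element        # drop the old proxy so the class lookup runs again
    return parent[-1]  # fresh proxy, looked up with the parent in place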

However, you might want to avoid that if you know you won't need it, e.g. when
using the "namespace" or "default" lookup scheme.

Stefan
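
To make the ordering argument in this message concrete, here is a small sketch. It assumes a hypothetical lookup class, `ParentAwareLookup`, built on `etree.PythonElementClassLookup`, that picks the element class depending on whether the node already has a parent. `TopLevel`, `Nested`, `naive_sub_element` and `relookup_sub_element` are made-up names as well. The sketch also relies on CPython freeing the first proxy as soon as its last reference is dropped.

```python
from lxml import etree


class TopLevel(etree.ElementBase):
    """Class chosen for nodes that have no parent element at lookup time."""


class Nested(etree.ElementBase):
    """Class chosen for nodes that already sit below another element."""


class ParentAwareLookup(etree.PythonElementClassLookup):
    # Hypothetical lookup for this sketch: decide based on the parental relation.
    def lookup(self, document, element):
        return Nested if element.getparent() is not None else TopLevel


def naive_sub_element(parent, tag):
    # The class is chosen inside makeelement(), before the node has a parent.
    element = parent.makeelement(tag, {})
    parent.append(element)
    return element


def relookup_sub_element(parent, tag):
    element = parent.makeelement(tag, {})
    parent.append(element)
    del element          # drop the only reference to the first proxy
    return parent[-1]    # a new proxy is created, so the lookup runs again


parser = etree.XMLParser()
parser.set_element_class_lookup(ParentAwareLookup())
root = parser.makeelement("root")

naive = naive_sub_element(root, "a")      # looked up before being attached
fresh = relookup_sub_element(root, "b")   # looked up after being attached
builtin = etree.SubElement(root, "c")     # attached at the C level first
```

With a lookup like this, `naive` keeps the class that was chosen while the node was still unattached, whereas `fresh` and `builtin` are looked up with the parent already in place. With the "default" or "namespace" lookup schemes the parent plays no role, so the extra step buys nothing.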
Alex Klizhentas | 2 May 19:21

Re: Custom Elements question

Thanks Stefan,

All the nodes in that tree should have the same type, which is why the default class lookup scheme for the parser works fine.

BTW, I have one more question. To set the xml:id I use the following construct:

def xml_id(v):
    # helper function to create name space attributes
    return {'{http://www.w3.org/XML/1998/namespace}id': v}

and the following construct:

N.child1("text",xml_id("some_id"))

following the examples from the site.

to get the id I use:

class NodeBase(etree.ElementBase):
    ...   
    def get_node_id(self,id):
        searched = self.find(".//*[@{http://www.w3.org/XML/1998/namespace}id='%s']" % (id,))
        if searched is None:
            raise NodeNotFoundError(id)
        return searched


I have two questions:

1. Which way is faster to get the element by ID? Should I use find or xpath to achieve better performance?
2. Is there a way to set xml:id using the xml prefix?

Thanks,
Alex

Stefan Behnel | 2 May 19:42

Re: Custom Elements question

Hi,

Alex Klizhentas wrote:
> I have one more question. To set the xml:id I use the following construct:
> 
> def xml_id(v):
>     # helper function to create name space attributes
>     return {'{http://www.w3.org/XML/1998/namespace}id': v}
> 
> and the following construct:
> 
> N.child1("text",xml_id("some_id"))
> 
> following the examples from the site.
> 
> to get the id I use:
> 
> class NodeBase(etree.ElementBase):
>     ...
>     def get_node_id(self,id):
>         searched = self.find(".//*[@{http://www.w3.org/XML/1998/namespace}id='%s']"%(id,))
>         if searched is None:
>             raise NodeNotFoundError(id)
>         return searched
> 
> I have two questions:
> 
> 1. Which way is faster to get the element by ID? Should I use find or xpath
> to achieve better performance?

timeit will tell you that. But it really depends on the data. element.find()
stops short after the first hit, so that's probably faster on average if the
document is large. OTOH, XPath() is implemented in C and could easily beat the
Python code behind find("..@attr...") for smaller documents...

Try this:

    find_id = etree.ETXPath(
        ".//*[@{http://www.w3.org/XML/1998/namespace}id=$id]")
    ...
    def get_node_id(self, id):
        el = find_id(self, id=id)  # XPath result: a list of matching elements

> 2. Is there a way to set xml:id using the xml prefix?

No, but if you know you run single-threaded, you can reuse the attrib dict and
just change the value. That's faster than recreating it each time.

Stefan
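
As a self-contained sketch of the two lookup styles compared in this message: the function names `get_node_by_id` and `get_node_by_id_find`, the `XML_ID` constant, the sample document, and the use of `LookupError` are assumptions for illustration, not part of the thread. The ETXPath expression is the one suggested above, compiled once and reused with `$id` bound per call.

```python
from lxml import etree

# Clark-notation name of the xml:id attribute (hypothetical helper constant).
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Compiled once and reused; $id is bound per call via a keyword argument.
find_by_id = etree.ETXPath(
    ".//*[@{http://www.w3.org/XML/1998/namespace}id=$id]")


def get_node_by_id(root, node_id):
    """Return the first descendant whose xml:id equals node_id."""
    matches = find_by_id(root, id=node_id)   # XPath returns a list of elements
    if not matches:
        raise LookupError(node_id)
    return matches[0]


def get_node_by_id_find(root, node_id):
    """Same lookup through find(); it returns None when nothing matches."""
    return root.find(".//*[@%s='%s']" % (XML_ID, node_id))


doc = etree.fromstring(
    '<root><a xml:id="first"/><b><c xml:id="second"/></b></root>')
```

Both functions should locate the same element here. Which one is faster depends on document size and on where the match sits, so measuring both with timeit on representative data is the way to decide.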
