From: Stefan Behnel <stefan_ml <at> behnel.de>
Subject: Re: .text_content() should leave spaces. Tests included
Newsgroups: gmane.comp.python.lxml.devel
Date: 2008-08-27 09:35:51 GMT
Max Ivanov wrote:
>>> for el in doc.iter():
>>>     if el.text and (el.tag not in self.inlinetags):
>>>         el.text = ''.join((' ',el.text))
>>>     if el.tail and (el.tag not in self.inlinetags):
>>>         el.tail += ' '
>>>     if el.tag == 'br':
>>>         if el.tail and not el.tail.startswith('\n'):
>>>             el.tail = '\n'+el.tail
>>>         else:
>>>             el.tail = '\n'
>>>         el.drop_tag()
>>
>> You're modifying the tree here, which is unacceptable for a function that
>> returns a (partial) string serialisation. Apart from that, this seems
>> like a workable solution to your problem.
>
> What's wrong with modifying the tree?
I was looking at it in the context of the text_content() method, which must
not modify the tree.
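
For illustration only, here is a rough sketch of how the same spacing rules
could be applied while collecting the text, so that the tree is left
untouched. It is not lxml's text_content() implementation. The name
spaced_text() and the contents of INLINE_TAGS are made up for this example
(the thread does not show self.inlinetags), and it uses the standard
library's ElementTree, whose elements have the same .tag/.text/.tail
attributes, only to stay self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline tag set; the real self.inlinetags is not shown in the thread.
INLINE_TAGS = frozenset({"a", "b", "i", "em", "strong", "span"})


def spaced_text(root, inline_tags=INLINE_TAGS):
    """Return the text of root's subtree with spacing added around
    non-inline elements, without modifying the tree."""
    parts = []

    def visit(el, is_root):
        is_block = el.tag not in inline_tags
        # Same rule as the quoted loop: a space before the text of a block element.
        if el.text:
            parts.append(" " + el.text if is_block else el.text)
        for child in el:
            visit(child, False)
        if is_root:
            # The root's own tail is not part of its text content.
            return
        # A <br> contributes a newline instead of being dropped from the tree.
        if el.tag == "br":
            parts.append("\n")
        # Same rule as the quoted loop: a space after the tail of a block element.
        if el.tail:
            parts.append(el.tail + " " if is_block else el.tail)

    visit(root, True)
    return "".join(parts)


doc = ET.fromstring("<div><p>one</p><p>two <b>bold</b></p><br/>three</div>")
print(repr(spaced_text(doc)))
```

Since the function only reads .text and .tail and appends to a local list,
serialising the tree before and after the call should give identical
output.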
Stefan