Stefan Behnel | 4 May 10:59

[Fwd: Re: (no subject)]

[Forwarding to the list...]
From: <mharper3 <at> uiuc.edu>

Also, adding

elem.clear()

into the loop still eventually leads to a memory error, just much later. This
should be clearing every element, so I'm not quite sure if I understand what
clear() actually does. Should I segment the file into smaller pieces so that
the tree is unloaded as each piece finishes?

I apologize if my questions are trivial. I appreciate your responses greatly.

 -- Marc
Stefan Behnel | 4 May 12:53

Re: parsing a large file with iterparse()

Hi,

Stefan Behnel wrote:
> From: <mharper3 <at> uiuc.edu>
> 
> Also, adding
> 
> elem.clear()
> 
> into the loop still eventually leads to a memory error, just much later. This
> should be clearing every element, so I'm not quite sure if I understand what
> clear() actually does.

According to the docs:

"""
clear()
Resets an element. This function removes all subelements, clears all
attributes and sets the text and tail properties to None.
"""

So it does not remove the element itself. I don't know what your XML looks
like, but if it's something like

   <root>
      <a>...</a> * a zillion
   </root>

and you handle the end event of the <a> element and clear() it, you still end
up with a tree that has a zillion empty <a/> children.
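One common way around this is to keep a reference to the root element and clear the root after each <a> has been handled, so the finished children are dropped from the tree rather than merely emptied. The following is only a minimal sketch under some assumptions: it uses the standard library's xml.etree.ElementTree (whose iterparse() follows the same interface), it assumes the <a> elements are direct children of the root, and the function name count_items and the sample data are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def count_items(source, tag="a"):
    """Count <tag> elements while keeping the in-memory tree small.

    Hypothetical helper; assumes the <tag> elements are direct
    children of the root element.
    """
    context = ET.iterparse(source, events=("start", "end"))
    # The first event is the start of the root element; keep a reference.
    _, root = next(context)
    count = 0
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1  # real per-element processing would go here
            # Clearing the root detaches the finished children themselves,
            # instead of leaving a long list of empty <a/> elements behind.
            root.clear()
    # Return the count and the number of children still attached to root.
    return count, len(root)


# Illustrative input: a root with many small <a> children.
xml_data = b"<root>" + b"<a>x</a>" * 1000 + b"</root>"
count, remaining = count_items(io.BytesIO(xml_data))
```

Note that root.clear() also resets the root's own attributes and text, so anything needed from the root should be read before the loop starts.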