17 May 20:58
Re: UIMA internals memory footprint
Kirk,

In this test are you running a CPE or just an AnalysisEngine? If it is a CPE, do you know what your CAS Pool size is?

When a CAS is created, it allocates a large heap, which is then filled as you create annotations. By default I believe this is 500,000 cells (2 MB) per CAS, but this can be overridden (see UIMAFramework.getDefaultPerformanceTuningProperties()). So this can definitely be one source of memory overhead. As you saw, it does not grow with larger documents; it will only grow if you create enough annotations to fill up the allocated space.

-Adam

On 5/17/07, Kirk True <kirk@...> wrote:
> Hi all,
>
> I have begun seeing heavy memory use when processing largish
> documents through a UIMA pipeline. I wanted to make sure what I'm
> seeing with regard to UIMA's internal memory use is on par with
> expectations.
>
> It looks like either for a 1,500,000 byte or a 15,000,000 byte document
> with the same annotations (100,000 10-character annotations), we incur
> a ~13 MB "overhead" for internal UIMA data structures. Is this in line
> with expectations?
>
> Details:
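
As a rough check of the heap figure in the reply above, the arithmetic can be sketched in plain Java. This is only an estimate under stated assumptions, not UIMA code: it assumes each heap cell is a 4-byte int (which is what makes 500,000 cells come to about 2 MB), that every CAS in a pool allocates its own initial heap, and that the tuning key is named "cas_initial_heap_size". The class and method names are hypothetical.

```java
import java.util.Properties;

public class CasHeapEstimate {
    // Assumption: one heap cell is a 4-byte int (500,000 cells ~ 2 MB).
    static final int BYTES_PER_CELL = 4;

    // Assumption: name of the tuning property for the initial heap size.
    static final String HEAP_SIZE_KEY = "cas_initial_heap_size";

    // Bytes taken by the initial heap of a single CAS.
    static long heapBytes(int cells) {
        return (long) cells * BYTES_PER_CELL;
    }

    // Bytes taken by the initial heaps of all CASes in a pool,
    // assuming each pooled CAS allocates its own heap.
    static long poolBytes(Properties tuning, int casPoolSize) {
        int cells = Integer.parseInt(tuning.getProperty(HEAP_SIZE_KEY, "500000"));
        return heapBytes(cells) * casPoolSize;
    }

    public static void main(String[] args) {
        Properties tuning = new Properties();

        // Default of 500,000 cells, a single CAS.
        System.out.println(poolBytes(tuning, 1));

        // Smaller heap of 100,000 cells, pool of three CASes.
        tuning.setProperty(HEAP_SIZE_KEY, "100000");
        System.out.println(poolBytes(tuning, 3));
    }
}
```

With the defaults this gives 2,000,000 bytes for one CAS, so the initial heap alone would account for only part of a ~13 MB overhead unless several CASes are pooled.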
that JCas
objects stick around, while plain old FeatureStructures can get
garbage collected after each annotator has run. So JCas objects behave
like the rest of the CAS in that respect, and unlike FeatureStructure
objects. Not beating on the JCas, just trying to explain sources of
memory consumption in the final analysis, after processing, so to speak.
--Thilo