Re: Current state of garbage collection in Haskell
Thomas Schilling <nominolo <at> googlemail.com>
2012-07-29 18:29:32 GMT
GHC does not provide any form of real-time guarantees (and support for
them is not planned).
That said, it's not as bad as it sounds:
- Collecting the first (young) generation is fast and you can control
the size of that first generation via runtime system (RTS) options.
- The older generation is collected rarely and can be collected in parallel.
- You can explicitly invoke the GC via System.Mem.performGC.
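
A minimal sketch of the first and last points, assuming a program that
has natural idle points between batches of work. processBatch is a
hypothetical stand-in for real work; the -A (allocation area size) and
-s (GC statistics) RTS flags are shown only in comments, and 64m is an
arbitrary example value, not a recommendation.

```haskell
module Main (main) where

import Control.Monad (forM_)
import System.Mem (performGC)

-- Build (assumption: plain ghc, RTS options enabled at link time):
--   ghc -O2 -rtsopts Main.hs
-- Run with a larger first generation and a GC summary on exit:
--   ./Main +RTS -A64m -s -RTS

-- | Hypothetical workload standing in for one unit of real work.
processBatch :: Int -> Int
processBatch k = sum (map (* k) [1 .. 100000])

main :: IO ()
main =
  forM_ [1 .. 5] $ \k -> do
    print (processBatch k)
    -- Idle point: request a collection now, while a pause is
    -- harmless, rather than at some arbitrary later moment.
    performGC
```

Whether an explicit performGC helps depends on the program; comparing
the -s output with and without it is the easiest check.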
In a multi-threaded / multi-core program collecting the first
generation still requires stopping all application threads even though
only one thread (CPU) will perform GC (and having other threads help
out usually doesn't work out due to locality issues). This can be
particularly expensive if the OS decides to deschedule an OS thread,
as then the GHC RTS has to wait for the OS to reschedule that thread.
You can avoid that particular problem by configuring the OS
appropriately (on Linux, the boot parameter isolcpus=... together with
taskset(1)). The GHC team has been working on an independent *local*
GC, but it's unlikely to make it into the main branch at this time. It
turned out to be very difficult to implement, and the gains were not
large enough to justify it. Building a fully concurrent GC is (AFAICT)
even harder.
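
As a rough illustration of the multi-core setup, the following sketch
only reports how many capabilities the RTS is running with, so you can
confirm that the flags you passed took effect. The flag values and the
CPU list in the comments are assumptions chosen for the example; the
isolcpus setting itself is a boot parameter and is not shown.

```haskell
module Main (main) where

import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)

-- Build with the threaded RTS:
--   ghc -O2 -threaded -rtsopts Main.hs
-- Run on 4 capabilities, using parallel GC only for the old
-- generation (-qg1), pinned to a hypothetical set of CPUs:
--   taskset -c 2-5 ./Main +RTS -N4 -qg1 -RTS

main :: IO ()
main = do
  caps <- getNumCapabilities
  putStrLn ("threaded RTS: " ++ show rtsSupportsBoundThreads)
  putStrLn ("capabilities: " ++ show caps)
```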
I don't know how long the pause times for your 500MB live heap would
be. Generally, you want your heap to be about twice the size of your
live data. Other than that it depends heavily on the characteristics
of your heap objects. E.g., if it's mostly arrays of unboxed
non-pointer data, then it'll be very quick to collect (since the GC
doesn't have to do anything with the contents of these arrays). If
it's mostly many small objects with pointers to other objects, GC will
be very expensive and bound by the latency of your RAM. So, I suggest
you run some tests with realistic heaps.
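
One way to run such a test is to build a heap of each shape and time a
forced major collection. This is only a sketch under several
assumptions: it uses GHC.Clock, which needs a much newer GHC than
7.4.1; it uses the array package that ships with GHC; the element
count n is arbitrary and far below 500MB; and timeMajorGC is a helper
introduced here, not part of any library.

```haskell
module Main (main) where

import Control.Exception (evaluate)
import Data.Array.Unboxed (UArray, listArray, (!))
import GHC.Clock (getMonotonicTime)
import System.Mem (performGC)

-- | Wall-clock seconds taken by one forced major collection.
timeMajorGC :: IO Double
timeMajorGC = do
  t0 <- getMonotonicTime
  performGC
  t1 <- getMonotonicTime
  pure (t1 - t0)

main :: IO ()
main = do
  let n = 2000000 :: Int -- arbitrary size; scale up for a real test

  -- Pointer-free live data: one large unboxed array.
  let arr = listArray (0, n - 1) [0 .. n - 1] :: UArray Int Int
  _ <- evaluate (arr ! (n - 1)) -- force construction
  tArr <- timeMajorGC
  putStrLn ("unboxed array: " ++ show tArr ++ " s")
  _ <- evaluate (arr ! 0) -- later use keeps arr live during the GC above

  -- Pointer-rich live data: a fully evaluated list of boxed Ints.
  let xs = [1 .. n] :: [Int]
  _ <- evaluate (length xs) -- force the spine
  tList <- timeMajorGC
  putStrLn ("boxed list:    " ++ show tList ++ " s")
  _ <- evaluate (sum xs) -- later use keeps xs live during the GC above
  pure ()
```

The absolute numbers will vary with machine and GHC version; the
point is to compare the two shapes against data that resembles your
own.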
Regarding keeping up, Simon Marlow is the main person working on GHC's
GC (often collaborating with others) and he keeps a list of papers on
his homepage: http://research.microsoft.com/en-us/people/simonmar/
If you have further questions about GHC's GC, you can ask them on the
glasgow-haskell-users <at> haskell.org mailing list (but please consult the
GHC user's guide section on RTS options first).
On 29 July 2012 08:52, C K Kashyap <ckkashyap <at> gmail.com> wrote:
> I was looking at a video that talks about GC pauses. That got me curious
> about the current state of GC in Haskell - say ghc 7.4.1.
> Would it suffer from lengthy pauses when we talk about memory in the range
> of 500M +?
> What would be a good way to keep abreast with the progress on haskell GC?
Push the envelope. Watch it bend.