Re: Using mutable array after an unsafeFreezeArray, and GC details
Brandon Simmons <brandon.m.simmons <at> gmail.com>
2014-05-10 20:57:40 GMT
On Fri, May 9, 2014 at 2:06 AM, Edward Z. Yang <ezyang <at> mit.edu> wrote:
> Hello Brandon,
> Excerpts from Brandon Simmons's message of 2014-05-08 16:18:48 -0700:
>> I have an unusual application with some unusual performance problems
>> and I'm trying to understand how I might use unsafeFreezeArray to help
>> me, as well as understand in detail what's going on with boxed mutable
>> arrays and GC. I'm using the interface from 'primitive' below.
>> First some basic questions, then a bit more background
>> 1) What happens when I do `newArray s x >>= \a-> unsafeFreezeArray a
>> >> return a` and then use `a`? What problems could that cause?
> Your code as written wouldn't compile, but assuming you're talking about
> the primops newArray# and unsafeFreezeArray#, what this operation does
> is allocate a new array of pointers (initially recorded as mutable), and
> then freezes it in-place (by changing the info-table associated with
> it), but while maintaining a pointer to the original mutable array. Nothing bad
> will happen immediately, but if you use this mutable reference to mutate
> the pointer array, you can cause a crash (in particular, if the array
> makes it to the old generation, it will not be on the mutable list and
> so if you mutate it, you may be missing roots.)
>> 2) And what if a do a `cloneMutableArray` on `a` and likewise use the
>> resulting array?
> If you do the clone before freezing, that's fine for all use-cases;
> if you do the clone after, you will end up with the same result as (1).
>> Background: I've been looking into an issue  in a library in which
>> as more mutable arrays are allocated, GC dominates (I think I verified
>> this?) and all code gets slower in proportion to the number of mutable
>> arrays that are hanging around.
>> I've been trying to understand how this is working internally. I don't
>> quite understand how the "remembered set" works with respect to
>> MutableArray. As best I understand: the remembered set in generation G
>> points to certain objects in older generations, which objects hold
>> references to objects in G. Then for MutableArrays specifically,
>> card-marking is used to mark regions of the array with garbage in some
>> So my hypothesis is the slowdown is associated with the size of the
>> remembered set, and whatever the GC has to do on it. And in my tests,
>> freezing the array seems to make that overhead (at least the overhead
>> proportional to number of arrays) disappear.
> You're basically correct. In the current GC design, mutable arrays of
> pointers are always placed on the mutable list. The mutable list of
> generations which are not being collected are always traversed; thus,
> the number of pointer arrays corresponds to a linear overhead for minor GCs.
> Here is a feature request tracking many of the infelicities that our
> current GC design has: https://ghc.haskell.org/trac/ghc/ticket/7662
> The upshot is that the Haskell GC is very nicely tuned for mostly
> immutable workloads, but there are some bad asymptotics when your
> heap has lots of mutable objects. This is generally a hard problem:
> tuned GC implementations for mutable languages are a lot of work!
> (Just ask the JVM implementors.)
>> Now I'm really lost in the woods though. My hope is that I might be
>> able to safely use unsafeFreezeArray to help me here . Here are the
>> particulars of how I use MutableArray in my algorithm, which are
>> somewhat unusual:
>> - keep around a small template `MutableArray Nothing`
>> - use cloneMutableArray for fast allocation of new arrays
>> - for each array only a *single* write (CAS actually) happens at each position
>> In fact as far as I can reason, there ought to be no garbage to
>> collect at all until the entire array becomes garbage (the initial
>> value is surely shared, especially since I'm keeping this template
>> array around to clone from, right?). In fact I was even playing with
>> the idea of rolling a new CAS that skips the card-marking stuff.
> I don't understand your full workload, but if you have a workload that
> involves creating an array, mutating it over a short period of time,
> and then never mutating it afterwards, you should simply freeze it after
> you are done writing it. Once frozen, the array will no longer be kept
> on the mutable list and you won't pay for it when doing GC. However,
> the fact that you are doing a CAS makes it seem to me that your workflow
> may be more complicated than that...
Another silly question: when card-marking happens after a write or
CAS, does that indicate "this segment maybe contains old-to-new
generation references, so be sure to preserve (scavenge?) them from
collection "? In my initial question I was thinking of the cards as
indicating "here be garbage" (e.g. a previous overwritten array
value), but I think I had the wrong idea about how copying GC works
generally (shouldn't it really be called Non-Garbage Preservation?).