What does unpacking an MVar really mean?
2012-07-31 04:25:51 GMT
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe <at> haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe <at> haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
I'm not sure I totally understand your question about 'unpacking' an MVar, but I'm going to assume you mean data structures that use the {-# UNPACK #-} pragma, like in Control.Concurrent.Future [1] and Control.Concurrent.NamedLock [2].
Here is how MVar is defined in GHC [3]:
data MVar a = MVar (MVar# RealWorld a)
A "MVar# s a" is an unboxed pointer to a structure understood by the GHC runtime.
So yes, you can imagine a MVar as a pointer-to-pointer. The structure it points at likely has another pointer to the embedded boxed "a", so it may even be pointer-to-pointer-to-pointer.
The MVar data structure exists to allow laziness; for example
let x = unsafePerformIO (newMVar ()) in ()
is likely to not allocate an MVar#, just a thunk that would create an
MVar if it was evaluated. Unboxed objects (represented by convetion
with # in GHC), on the other hand, are strict, so if you have an MVar#
RealWorld (), you know it points to a valid MVar#.
From [2] we have
data NLItem = NLItem {-# UNPACK #-} !Int
{-# UNPACK #-} !(MVar ())
All the {-# UNPACK #-} pragma does is embed the contents of a strict single-constructor data declaration *directly* into the structure containing it; it's like you declared NLItem as such:
data NLItem = NLItem Word# (MVar# RealWorld ())
except that if you call functions that want an Int or MVar thunk, the compiler will automatically re-box them in a new I#/MVar constructor.
Many copies of pointers to the same MVar# may exist; they are all 'identical' MVars; equality is defined as such:
instance Eq (MVar a) where
(MVar mvar1#) == (MVar mvar2#) = sameMVar# mvar1# mvar2#
where sameMVar# is a primitive that is probably just raw pointer equality.
Because of this, boxed MVars can be garbage collected without necessarily garbage-collecting the MVar# it holds, if a live reference to that MVar# still exists elsewhere.
-- ryan
[1] http://hackage.haskell.org/packages/archive/future/2.0.0/doc/html/src/Control-Concurrent-Future.html
[2] http://hackage.haskell.org/packages/archive/named-lock/0.1/doc/html/src/Control-Concurrent-NamedLock.html
[3] http://www.haskell.org/ghc/docs/7.4.1/html/libraries/base/src/GHC-MVar.html#MVar
I admit I don't know exactly how MVars are implemented, but given that they can be aliased and have indefinite extent, I would think that they look something vaguely like a cdatatype ** var, basically a pointer to an MVar (which is itself a pointer, modulo some other things such as a thread queue.)And, I would think that "unpacking" such an structure would basically be eliminating one layer of indirection, so it would then look vague like a cdatatype * var. But again, given aliasing and indefinite extent, this would seem to be a difficult thing to do.Actually this isn't too difficult if an MVar only exists in a single unpacked structure: other references to the MVar can simply be pointers into the structure. But the case where an MVar is unpacked into two different structures suggests that, at least in some cases, an "unpacked" MVar is still a cdatatype ** var;So, is my understanding more or less correct? Does anybody have a good, succinct explanation of how MVars are implemented, and how they are unpacked?One final question, assuming that unpacking an MVar really does eliminate a layer of indirection, and that other references to that MVar are simply pointers into a larger structure, what happens to that larger structure when there are no more references to it (but still some references to the MVar?) Given the complications that must arise out of a doubly "unpacked" MVar, I'm going to guess that the larger structure does get garbage collected in this case, and that the MVar becomes dislodged from this structure. Would that MVar then be placed directly inside another unpacked reference, if one is available?Best,Leon
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe <at> haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Let me clarify a bit.
Because of this, boxed MVars can be garbage collected without necessarily garbage-collecting the MVar# it holds, if a live reference to that MVar# still exists elsewhere.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe <at> haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Leon Smith wrote:
> I am familiar with the source of Control.Concurrent.MVar, and I do see {-#
> UNPACK #-}'ed MVars around, for example in GHC's IO manager. What I
> should have asked is, what does an MVar# look like? This cannot be
> inferred from Haskell source; though I suppose I could have tried to read
> the Runtime source.
So let's have a brief look at the source. MVar# is an RTS specific
heap object which contains three pointers:
(from ghc/includes/rts/storage/Closures.h)
typedef struct {
StgHeader header;
struct StgMVarTSOQueue_ *head;
struct StgMVarTSOQueue_ *tail;
StgClosure* value;
} StgMVar;
The 'value' pointer refers to the actual value held by the mutable
variable, if any. The 'head' and 'tail' pointers are used for
managing a linked list of threads blocked on the mutable variable.
An MVar (if evaluated) contains just a pointer the MVar# object.
To access the value of an MVar, one starts with a pointer to the MVar
heap object. Then,
1. Make sure that the MVar is evaluated, using standard lazy
evaluation (follow indirections, enter thunks, ...).
In the best case that's a check of a tag bit in the pointer.
2. Read the pointer to the MVar# in the MVar.
3. access the 'value' field of the StgMVar record, which
results in another pointer to a heap object representing
the actual data held by the MVar.
(In reality the code has to check whether the MVar is full
or not, and block if necessary. This is quite involved; see
stg_takeMVarzh in ghc/rts/PrimOps.cmm)
That's two dereferences and some bookkeeping work.
In loops, the compiler will often unpack the MVar, so that you can
expect the first two steps to be performed just once.
Unpacking an MVar into a bigger record means that the pointer to the
MVar# will be stored in the record directly, rather than a pointer
to an MVar object that holds a pointer to the MVar#.
Note that MVar# itself cannot be unpacked -- the StgMVar record will
always be a separate heap object.
> I was asking the dual question: if the MVar# exists in some data
> structure, can that data structure still be garbage collected when there
> is a reference to the MVar#, but not the data structure it is contained
> within.
Yes, because the data structure only contains a pointer to an MVar#
(StgMVar record) that will live on.
Best regards,
Bertram
Note that MVar# itself cannot be unpacked -- the StgMVar record willalways be a separate heap object.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe <at> haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
RSS Feed