Leon Smith | 31 Jul 06:25 2012
Picon

What does unpacking an MVar really mean?

I admit I don't know exactly how MVars are implemented,  but given that they can be aliased and have indefinite extent,   I would think that they look something vaguely like a  cdatatype ** var,  basically a pointer to an MVar (which is itself a pointer,  modulo some other things such as a thread queue.)

And,  I would think that "unpacking" such an structure would basically be eliminating one layer of indirection,  so it would then look vague like a cdatatype * var.    But again,  given aliasing and indefinite extent, this would seem to be a difficult thing to do.

Actually this isn't too difficult if an MVar only exists in a single unpacked structure:   other references to the MVar can simply be pointers into the structure.   But the case where an MVar is unpacked into two different structures suggests that,  at least in some cases,  an "unpacked" MVar is still a cdatatype ** var;

So, is my understanding more or less correct?  Does anybody have a good, succinct explanation of how MVars are implemented,  and how they are unpacked?

One final question,   assuming that unpacking an MVar really does eliminate a layer of indirection,  and that other references to that MVar are simply pointers into a larger structure,   what happens to that larger structure when there are no more references to it (but still some references to the MVar?)    Given the complications that must arise out of a doubly "unpacked" MVar,  I'm going to guess that the larger structure does get garbage collected in this case,  and that the MVar becomes dislodged from this structure.   Would that MVar then be placed directly inside another unpacked reference, if one is available?

Best,
Leon
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Ryan Ingram | 31 Jul 07:04 2012
Picon

Re: What does unpacking an MVar really mean?

I'm not sure I totally understand your question about 'unpacking' an MVar, but I'm going to assume you mean data structures that use the {-# UNPACK #-} pragma, like in Control.Concurrent.Future [1] and Control.Concurrent.NamedLock [2].

Here is how MVar is defined in GHC [3]:
    data MVar a = MVar (MVar# RealWorld a)

A "MVar# s a" is an unboxed pointer to a structure understood by the GHC runtime.

So yes, you can imagine a MVar as a pointer-to-pointer.  The structure it points at likely has another pointer to the embedded boxed "a", so it may even be pointer-to-pointer-to-pointer.

The MVar data structure exists to allow laziness; for example

   let x = unsafePerformIO (newMVar ()) in ()

is likely to not allocate an MVar#, just a thunk that would create an MVar if it was evaluated.  Unboxed objects (represented by convetion with # in GHC), on the other hand, are strict, so if you have an MVar# RealWorld (), you know it points to a valid MVar#.

From [2] we have
data NLItem = NLItem {-# UNPACK #-} !Int
                     {-# UNPACK #-} !(MVar ())

All the {-# UNPACK #-} pragma does is embed the contents of a strict single-constructor data declaration *directly* into the structure containing it; it's like you declared NLItem as such:

    data NLItem = NLItem Word# (MVar# RealWorld ())

except that if you call functions that want an Int or MVar thunk, the compiler will automatically re-box them in a new I#/MVar constructor.

Many copies of pointers to the same MVar# may exist; they are all 'identical' MVars; equality is defined as such:
    instance Eq (MVar a) where
            (MVar mvar1#) == (MVar mvar2#) = sameMVar# mvar1# mvar2#

where sameMVar# is a primitive that is probably just raw pointer equality.

Because of this, boxed MVars can be garbage collected without necessarily garbage-collecting the MVar# it holds, if a live reference to that MVar# still exists elsewhere.

  -- ryan

[1] http://hackage.haskell.org/packages/archive/future/2.0.0/doc/html/src/Control-Concurrent-Future.html
[2] http://hackage.haskell.org/packages/archive/named-lock/0.1/doc/html/src/Control-Concurrent-NamedLock.html
[3] http://www.haskell.org/ghc/docs/7.4.1/html/libraries/base/src/GHC-MVar.html#MVar

On Mon, Jul 30, 2012 at 9:25 PM, Leon Smith <leon.p.smith <at> gmail.com> wrote:
I admit I don't know exactly how MVars are implemented,  but given that they can be aliased and have indefinite extent,   I would think that they look something vaguely like a  cdatatype ** var,  basically a pointer to an MVar (which is itself a pointer,  modulo some other things such as a thread queue.)

And,  I would think that "unpacking" such an structure would basically be eliminating one layer of indirection,  so it would then look vague like a cdatatype * var.    But again,  given aliasing and indefinite extent, this would seem to be a difficult thing to do.

Actually this isn't too difficult if an MVar only exists in a single unpacked structure:   other references to the MVar can simply be pointers into the structure.   But the case where an MVar is unpacked into two different structures suggests that,  at least in some cases,  an "unpacked" MVar is still a cdatatype ** var;

So, is my understanding more or less correct?  Does anybody have a good, succinct explanation of how MVars are implemented,  and how they are unpacked?

One final question,   assuming that unpacking an MVar really does eliminate a layer of indirection,  and that other references to that MVar are simply pointers into a larger structure,   what happens to that larger structure when there are no more references to it (but still some references to the MVar?)    Given the complications that must arise out of a doubly "unpacked" MVar,  I'm going to guess that the larger structure does get garbage collected in this case,  and that the MVar becomes dislodged from this structure.   Would that MVar then be placed directly inside another unpacked reference, if one is available?

Best,
Leon

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Leon Smith | 31 Jul 07:22 2012
Picon

Re: What does unpacking an MVar really mean?

Let me clarify a bit.


I am familiar with the source of Control.Concurrent.MVar,  and I do see {-# UNPACK #-}'ed MVars around,  for example in GHC's IO manager.     What I should have asked is,  what does an MVar# look like?  This cannot be inferred from Haskell source;  though I suppose I could have tried to read the Runtime source.

Now,  one would hope that and (MVar# RealWorld footype) would  approximately correspond to a "footype * mvar;" variable in C.   The problem is this cannot _always_ be the case,  because you can alias the (MVar# RealWorld footype) by placing a single MVar into two unpacked columns in two different data structures.    So you would need to be able to still sometimes represent an MVar# by a footype ** mvar at runtime,  even though one would hope that it would be represented by a "footype * mvar" in one particular data structure.

On Tue, Jul 31, 2012 at 1:04 AM, Ryan Ingram <ryani.spam <at> gmail.com> wrote:
Because of this, boxed MVars can be garbage collected without necessarily garbage-collecting the MVar# it holds, if a live reference to that MVar# still exists elsewhere.

I was asking the dual question:  if the MVar# exists in some data structure,  can that data structure still be garbage collected when there is a reference to the MVar#,  but not the data structure it is contained within.

Best,
Leon
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Bertram Felgenhauer | 31 Jul 13:37 2012

Re: What does unpacking an MVar really mean?

Leon Smith wrote:
> I am familiar with the source of Control.Concurrent.MVar,  and I do see {-#
> UNPACK #-}'ed MVars around,  for example in GHC's IO manager.     What I
> should have asked is,  what does an MVar# look like?  This cannot be
> inferred from Haskell source;  though I suppose I could have tried to read
> the Runtime source.

So let's have a brief look at the source. MVar# is an RTS specific
heap object which contains three pointers:

(from ghc/includes/rts/storage/Closures.h)

typedef struct {
    StgHeader                header;
    struct StgMVarTSOQueue_ *head;
    struct StgMVarTSOQueue_ *tail;
    StgClosure*              value;
} StgMVar;

The 'value' pointer refers to the actual value held by the mutable
variable, if any. The 'head' and 'tail' pointers are used for
managing a linked list of threads blocked on the mutable variable.

An MVar (if evaluated) contains just a pointer the MVar# object.

To access the value of an MVar, one starts with a pointer to the MVar
heap object. Then,

  1. Make sure that the MVar is evaluated, using standard lazy
     evaluation (follow indirections, enter thunks, ...).

     In the best case that's a check of a tag bit in the pointer.

  2. Read the pointer to the MVar# in the MVar.

  3. access the 'value' field of the StgMVar record, which
     results in another pointer to a heap object representing
     the actual data held by the MVar.

     (In reality the code has to check whether the MVar is full
     or not, and block if necessary. This is quite involved; see
     stg_takeMVarzh  in  ghc/rts/PrimOps.cmm)

That's two dereferences and some bookkeeping work.

In loops, the compiler will often unpack the MVar, so that you can
expect the first two steps to be performed just once.

Unpacking an MVar into a bigger record means that the pointer to the
MVar# will be stored in the record directly, rather than a pointer
to an MVar object that holds a pointer to the MVar#.

Note that MVar# itself cannot be unpacked -- the StgMVar record will
always be a separate heap object.

> I was asking the dual question:  if the MVar# exists in some data
> structure,  can that data structure still be garbage collected when there
> is a reference to the MVar#,  but not the data structure it is contained
> within.

Yes, because the data structure only contains a pointer to an MVar#
(StgMVar record) that will live on.

Best regards,

Bertram
Leon Smith | 31 Jul 19:18 2012
Picon

Re: What does unpacking an MVar really mean?

On Tue, Jul 31, 2012 at 7:37 AM, Bertram Felgenhauer <bertram.felgenhauer <at> googlemail.com> wrote:
Note that MVar# itself cannot be unpacked -- the StgMVar record will
always be a separate heap object.

One could imagine a couple of techniques to unpack the MVar# itself,  and was curious if GHC might employ one of them.

So, really,  unpacking the MVar does not eliminate a layer of indirection,  it just eliminates the need to check a pointer tag (and possibly execute a thunk or follow some redirects if you don't have a pointer to an MVar#).    I think this is what I was ultimately after.

Best,
Leon


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Gmane