Gonçalo Lopes | 6 Jun 2012 20:14

Re: Support for GC.AddMemoryPressure()

Hi again,

Actually, I searched some more, both by googling and by going through the MSDN documentation, and I couldn't find an example of a "proper" use of memory pressure apart from the one I described. The MSDN docs even mention: "In the simplest usage pattern, a managed object allocates unmanaged memory in the constructor and releases it in the Dispose or Finalize method."
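To make the quoted pattern concrete, here is a minimal sketch of it. The `NativeBuffer` type and its members are hypothetical names used only for illustration; the sketch assumes a plain `Marshal.AllocHGlobal` allocation and is not thread-safe.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper (not from this thread): a small managed object that
// owns a block of unmanaged memory and reports its size to the GC.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr data;
    private readonly long size;

    public NativeBuffer(long size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException("size");
        this.size = size;
        data = Marshal.AllocHGlobal(new IntPtr(size));
        // Tell the GC that this object keeps `size` bytes alive off-heap.
        GC.AddMemoryPressure(size);
    }

    public IntPtr Data { get { return data; } }

    public void Dispose()
    {
        Release();
        // The finalizer has nothing left to do once we released explicitly.
        GC.SuppressFinalize(this);
    }

    // Safety net for callers that never call Dispose.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (data != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(data);
            data = IntPtr.Zero;
            // Every AddMemoryPressure call is balanced by exactly one removal.
            GC.RemoveMemoryPressure(size);
        }
    }
}
```

Whether the pressure is released in Dispose or only in the finalizer is precisely the design question being debated in this thread.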

Could you maybe give me a practical example of what you would consider a good and valid use of memory pressure, so I can have a better idea of what you have in mind?

Thanks again,
G

P.S.: I just realized that by accident I was replying directly to a personal e-mail account rather than to the mailing list, so please consider the attached mailing history as well.


On 5 June 2012 18:06, Gonçalo Lopes <goncaloclopes <at> gmail.com> wrote:
Sorry, just noticed an important typo. Where it reads "Conceptually, they are every bit as valid for a GC as a native object..." it should read "Conceptually, they are every bit as valid for a GC as a managed object...".

G

On 5 June 2012 18:03, Gonçalo Lopes <goncaloclopes <at> gmail.com> wrote:
First of all, thanks for your input, I always love to discuss these things.

I agree with most of your points, but there are added difficulties which make the problem harder. First of all, just to make it clear, my specific problem concerns the design of a general-purpose and modular asynchronous data processing framework, so I can't benefit from application-specific or even composition-specific solutions that would address this problem. Second, the asynchronous nature of the framework makes it much harder to design a contract that would work seamlessly across all data types of the framework without making everything too cumbersome for the user. Simplicity and rapid prototyping are among its main goals and tenets, and that's every bit as important for my scenario as performance.

Sorry for not being clear about immutability, and I grant you your point about that property having nothing to do with a GC. However, there is indeed a reason why functional languages do require a GC to be of any use. I just have not seen a single example of a functional programming language that does not rely heavily on collection, even one that operates very close to C++, like D. The reason is that if you want to allow for composition in the functionally elegant style, you simply must let go of handling the transients. I would say that's one of the main points of these languages, actually.

That said, I am in fact concerned with performance, but my earlier statements were meant to claim that the pressured GC can indeed be performant under most of the scenarios in which the framework is and has been used. Again, we're talking about transient intermediate objects, meaning objects that will most likely not survive a gen 0 collection. This effectively means the pressured GC's decision to collect is more likely than not correct and will target the right objects, by virtue of their being short-lived.

Also, as for whether native resource management should be transparent or not, in my scenario the arguments for it are exactly the same as for why it's useful to have a GC in the first place. It is possible to compose applications in C/C++ and other languages where you have to explicitly handle all the memory, but again, there is a reason why people tried to move beyond this, even in the face of exactly this kind of performance-related criticism. My resources are not TCP connections, file handles or structures with other complex side-effects. Conceptually, they are every bit as valid for a GC as a native object, with the exception that they do not live in the managed heap. Granted, this fact alone bears a whole host of potential implications, but following the tenet of "premature optimization is the root of all evil", I have yet to see any performance issues related specifically to this point after 5 years of heavily using this approach in .NET on Windows machines for everything from multi-sensory acquisition to computer graphics and computation-intensive processing. I agree that this doesn't mean the problem is not there, just that it never showed up in the many difficult use cases we put it through.

Finally, I have in fact considered many times in the past moving away from the current functional-oriented paradigm to a more explicit memory model where the user has to provide specific nodes for processing the images. However, this would make image processing such a deviant special case in the framework, and would make everything so much harder for the user, that I just couldn't bring myself to do it until I really see a need for it. But I'm still very much thinking about how it could be done, and I'm sure it can be done, just not in a form that I'm currently happy with. The one thing I'm fairly confident of is that whatever the solution is, it should not leak back into the composer layer. At most, it will imply a custom memory pool for image allocation and deallocation which will have its own GC-like strategy...

All the best and thanks again for all the feedback, really appreciate it :-)
G


On 5 June 2012 16:19, Rodrigo Kumpera <kumpera <at> gmail.com> wrote:


On Tue, Jun 5, 2012 at 11:56 AM, glopes <goncaloclopes <at> gmail.com> wrote:
I understand completely why people would think that, I honestly do, but I confess I'm at a loss as to why it should be a problem at a conceptual level, or why MemoryPressure shouldn't be used this way.

We're talking about highly transient native resources (e.g. images), which are completely tied to a managed representation which I'm using to compose modular high-throughput data processing pipelines. In the end it's not as different from just allocating an array of bytes.

This is precisely the problem. The GC deals with managed resources only and the MemoryPressure API completely unties one from the other.
So, when should a collection be triggered based on the current managed and unmanaged pressure? Will a minor collection be enough to alleviate the current
native one? Or should it perform a major GC?

The only answer you can draw from those design questions is that AddMemoryPressure can increase collection frequency significantly, which
does reduce throughput.
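The effect on collection frequency can be measured rather than assumed. The following is only a rough probe, assuming `GC.CollectionCount` is a good enough proxy: `PressureProbe` and `CountGen0` are hypothetical names, and whether the count changes at all depends on the runtime and on how it implements memory pressure.

```csharp
using System;

// Hypothetical probe (not from this thread): counts gen 0 collections that
// happen while allocating small managed objects, optionally reporting
// `pressureBytes` of simulated unmanaged memory for each of them.
public static class PressureProbe
{
    public static int CountGen0(int iterations, long pressureBytes)
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < iterations; i++)
        {
            // Stand-in for the small managed wrapper of a native resource.
            var placeholder = new byte[64];
            if (pressureBytes > 0)
            {
                // Report and immediately retract the pressure, so the
                // process-wide total stays balanced after the run.
                GC.AddMemoryPressure(pressureBytes);
                GC.RemoveMemoryPressure(pressureBytes);
            }
            GC.KeepAlive(placeholder);
        }
        return GC.CollectionCount(0) - before;
    }
}
```

Comparing `CountGen0(n, 0)` against `CountGen0(n, someSize)` on the target runtime gives a first indication of how much extra collection work the pressure calls cause for a given workload.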

Explicitly disposing is doable in almost all scenarios, given enough thought about the problem. I've seen this same story many times in all sorts
of managed languages, and having user code do its job always results in a better solution.

Also, from a functional perspective of composability, it's not just a mild convenience, as garbage collection is what allows the immutability of objects to be preserved across calls.

I'm lost here. A garbage collector has nothing to do with object immutability. User code that doesn't change such objects does.

In a modular pipeline, there's no one who knows when it's safe to dispose an image, as it depends on how long this image will be passed around, which in turn depends on the specific pipeline you're running it through. It's the same with LINQ queries: when you handle transient intermediate projections during complex queries, you don't really want to hand responsibility to anyone in particular for how that projection will end up being used, as this will break modularity and composability.

This grows from the wrong assumption that native resource management is or should be transparent. If you extend resource management to be
part of the contract you expose, it will compose as well as everything else. This works just fine with iterators, for example.
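One way to read "resource management as part of the contract" is that each pipeline stage takes ownership of its input and disposes it once the output has been produced. The sketch below illustrates that reading with an iterator. `Pipeline` and `SelectOwned` are hypothetical names, and it assumes each item has exactly one consumer, which does not cover pipelines that fan the same item out to several branches.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from this thread): a Select-like operator whose
// contract states that it owns, and therefore disposes, every input item.
public static class Pipeline
{
    public static IEnumerable<TOut> SelectOwned<TIn, TOut>(
        this IEnumerable<TIn> source, Func<TIn, TOut> transform)
        where TIn : IDisposable
    {
        foreach (var item in source)
        {
            TOut result;
            // The input is released as soon as the stage is done with it,
            // even if the transform throws.
            using (item)
            {
                result = transform(item);
            }
            // Ownership of the result passes to whoever consumes it next.
            yield return result;
        }
    }
}
```

Stages built this way chain like ordinary LINQ operators, and the final consumer is responsible for disposing whatever the last stage yields.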
 



_______________________________________________
Mono-gc-list maillist  -  Mono-gc-list <at> lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-gc-list
