Antoine Chambille | 15 Feb 10:48 2013

NUMA-Aware Java Heaps for in-memory databases

I think this community is the right place to start a conversation about NUMA (aren't NUMA nodes to memory what multiprocessors are to processing? ;). I apologize if this is considered off-topic.


We are developing a Java in-memory analytical database (it's called "ActivePivot") that our customers deploy on ever larger datasets. Some ActivePivot instances are deployed on Java heaps close to 1TB, on NUMA servers (typically 4 Xeon processors and 4 NUMA nodes). This is becoming a trend, and we are researching solutions to improve our performance on NUMA configurations.


We understand that in the current state of things (up to and including JDK 8), NUMA support in HotSpot is as follows:
* The young generation heap layout can be NUMA-aware (partitioned per NUMA node, with objects allocated on the same node as the running thread)
* The old generation heap layout is not optimized for NUMA (at best the old generation is interleaved across nodes, which at least makes memory accesses somewhat uniform)
* The parallel garbage collector is NUMA-optimized, with GC threads focusing on objects in their own node.


Yet activating the -XX:+UseNUMA option has almost no impact on the performance of our in-memory database. This is not surprising: the access pattern of a database is to load the data into memory and then run queries against it. The data is promoted to the old generation and stays there, and queries read it from there. Most memory accesses hit the old gen, and most of those are not local.
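For reference, the NUMA-aware young generation is only active with the parallel collector, so an invocation along these lines is what one would test with (the heap sizes and jar name are illustrative, not a recommendation):

```shell
# -XX:+UseNUMA only takes effect together with the parallel collector.
# Heap sizes and jar name are placeholders for illustration.
java -XX:+UseParallelGC -XX:+UseNUMA -Xms512g -Xmx512g -jar activepivot.jar
```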

I guess there is a reason HotSpot does not yet optimize the old generation for NUMA. It must be very difficult to do in the general case, when you have no idea which thread from which node will read the data, and interleaving is a reasonable default. But for an in-memory database this is frustrating, because we know very well which threads will access which piece of data. In ActivePivot, at least, data structures are partitioned and each partition is assigned a thread pool, so the threads that allocated the data in a partition are also the threads that run sub-queries on that partition. We are a few lines of code away from binding thread pools to NUMA nodes, and if the garbage collector left objects promoted to the old generation on their original NUMA node, memory accesses would be close to optimal.
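The partition-per-thread-pool pattern described above can be sketched in pure Java as follows. The NUMA binding itself would need a native call, which is only a placeholder comment here; all class and thread names are ours, not ActivePivot's:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// One single-threaded executor per partition: the thread that loads a
// partition's data is the same thread that later runs sub-queries on it.
public class PartitionedPools {
    public static void main(String[] args) throws Exception {
        int partitions = 4;
        ExecutorService[] pools = new ExecutorService[partitions];
        for (int p = 0; p < partitions; p++) {
            final int node = p;
            pools[p] = Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "partition-" + node);
                // Placeholder: bind t to NUMA node `node` here via JNI/JNA,
                // e.g. a native wrapper around pthread_setaffinity_np().
                return t;
            });
        }
        // "Load" and "query" run on the same thread, hence (ideally) on
        // memory local to the same NUMA node.
        String loader  = pools[2].submit(() -> Thread.currentThread().getName()).get();
        String querier = pools[2].submit(() -> Thread.currentThread().getName()).get();
        System.out.println(loader.equals(querier)); // prints "true"
        for (ExecutorService e : pools) e.shutdown();
    }
}
```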

We have not been able to achieve that. That being said, I read an inspiring 2005 article by Mustafa M. Tikir and Jeffrey K. Hollingsworth that experimented with NUMA layouts for the old generation ("NUMA-aware Java heaps for server applications", http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.6587&rep=rep1&type=pdf ). That motivated me to ask the following questions:


* Are there hidden or experimental HotSpot options that allow NUMA-aware partitioning of the old generation?
* Do you know why there isn't much (visible, generally available) research on NUMA optimizations for the old gen? Is the Java in-memory database use case considered a rare one?
* Maybe we should experiment with, and even contribute, new heap layouts to the OpenJDK project. Can some of you comment on the difficulty of that?


Thanks for reading,

--
Antoine CHAMBILLE
Director Research & Development
Quartet FS
_______________________________________________
Concurrency-interest mailing list
Concurrency-interest <at> cs.oswego.edu
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Stanimir Simeonoff | 15 Feb 11:35 2013

Re: NUMA-Aware Java Heaps for in-memory databases

Just out of curiosity: wouldn't direct ByteBuffers and managing the data yourself be both easier and more efficient?
Technically you could even ship the data straight to the sockets (or disks) without copying it.
I don't know how you store the data itself, but I can only think of tuples, i.e. Object[].
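A minimal sketch of the off-heap idea, assuming a single column of doubles in a direct buffer (sizes and names are made up for illustration):

```java
import java.nio.ByteBuffer;
import java.nio.DoubleBuffer;

// A "price" column stored off-heap: the GC never scans the values,
// only the small buffer object itself lives on the Java heap.
public class OffHeapColumn {
    public static void main(String[] args) {
        int rows = 1_000;
        DoubleBuffer col =
            ByteBuffer.allocateDirect(rows * Double.BYTES).asDoubleBuffer();
        for (int i = 0; i < rows; i++) col.put(i, i * 0.5);

        // A query is a plain scan over the buffer.
        double sum = 0;
        for (int i = 0; i < rows; i++) sum += col.get(i);
        System.out.println(sum); // 0.5 * (0 + 1 + ... + 999) = 249750.0
    }
}
```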

Stanimir

Antoine Chambille | 15 Feb 12:18 2013

Re: NUMA-Aware Java Heaps for in-memory databases

Data is stored in columns to maximize the performance of analytical queries, which commonly scan billions of rows but only a subset of the columns. We support a mix of primitive and object-oriented data (some columns look like double[], others like Object[]).
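The columnar layout described here can be sketched as follows (field and method names are illustrative, not ActivePivot's):

```java
// Columnar layout: a query that aggregates one measure touches only that
// column's array, not whole rows, so the scan stays cache-friendly and
// never boxes primitives.
public class ColumnStore {
    final double[] price;
    final double[] quantity;

    ColumnStore(int rows) {
        price = new double[rows];
        quantity = new double[rows];
    }

    // Scans a single contiguous primitive array.
    double sumPrice() {
        double s = 0;
        for (double v : price) s += v;
        return s;
    }

    public static void main(String[] args) {
        ColumnStore store = new ColumnStore(4);
        store.price[0] = 1.5; store.price[1] = 2.5;
        store.price[2] = 3.0; store.price[3] = 3.0;
        System.out.println(store.sumPrice()); // prints 10.0
    }
}
```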

Using direct buffers would open a door to NUMA-aware memory placement (provided the direct allocation itself can be made on the right node). It's probably more a Pandora's box than a door, though ;) In any case it implies serializing data into byte arrays and deserializing it at each query. That's a serious performance penalty for primitive data, and absolutely prohibitive for plain objects, even Externalizable ones.

-Antoine


Michał Warecki | 15 Feb 16:24 2013

Re: NUMA-Aware Java Heaps for in-memory databases

Hi!

I may be wrong, but:
if you know on which NUMA node the data is stored and which thread will read it, you can use the pthread_setaffinity_np() function.
It will pin a particular thread to a particular CPU with faster access to that NUMA node.
That's not a Pandora's box, but you do have to use JNI.

Cheers,
Michał

Antoine Chambille | 15 Feb 16:58 2013

Re: NUMA-Aware Java Heaps for in-memory databases

> Michal

That was my initial hope: bind threads to a NUMA node and cross my fingers that the objects instantiated by a thread would be allocated on its home NUMA node, and stay there, so that when the same thread later reads the same data it reads memory from its home node.

But that is not how it works. Even with NUMA options activated, the HotSpot JVM moves objects when they get promoted to the old generation, removing them from their home NUMA node and copying them into a big shared, NUMA-oblivious memory area...


I agree that binding threads to NUMA nodes is actually easy with the JNA library. There is even an open-source project by Peter Lawrey that does it quite elegantly, and cross-platform: https://github.com/peter-lawrey/Java-Thread-Affinity/

-Antoine





Michał Warecki | 16 Feb 18:51 2013

Re: NUMA-Aware Java Heaps for in-memory databases

Yes, of course. I didn't mean just simple CPU binding; after evacuation (from young to old) and compaction, threads have to be rebound.
As you wrote, the problem is with the old-gen heap, and therefore JVM modifications are needed. I don't know whether you can ship a custom OpenJDK HotSpot build. You would probably have to experiment with a NUMA-aware old generation and NUMA-aware object reordering during compaction (you also have to take care of the CPU caches and TLB). Which GC are you using? With CMS there is the additional issue of fragmentation after a few collections; I believe "bump the pointer" allocation is better in this case.
If you have time to work on such issues in your day job, you're lucky :-)

These are just my thoughts, but I'm no expert.

Michał
