
August 01, 2007

Deferring ObjectGrid partition distribution to avoid redistribution thrashing

We have scenarios where a customer may want, say, 200 partitions and wants to preload the data into the grid when the partition primaries are initially placed. The customer might want to load 100GB of data, planning on 500MB of primary data and 500MB of replica data per JVM. Obviously, this needs around 200 JVMs with heap sizes of 1.5GB to 2GB before it's even possible to start preloading.
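The sizing above is simple back-of-the-envelope arithmetic. A sketch of it (the variable names are mine; I use 1GB = 1000MB to keep the round numbers from the scenario):

```python
# Back-of-the-envelope sizing for the preload scenario (illustrative
# only; uses 1GB = 1000MB so the numbers stay round).

total_primary_mb = 100_000   # 100GB of data to preload
primary_per_jvm_mb = 500     # planned primary data per JVM
replica_per_jvm_mb = 500     # planned replica data per JVM

jvms_needed = total_primary_mb // primary_per_jvm_mb
data_per_jvm_mb = primary_per_jvm_mb + replica_per_jvm_mb

print(jvms_needed)      # 200 JVMs
print(data_per_jvm_mb)  # 1000MB of grid data per JVM, hence 1.5-2GB heaps
```

The gap between 1000MB of grid data and the 1.5-2GB heap is headroom for the JVM itself plus working space during preload.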

ObjectGrid could start placing partitions as soon as a single JVM starts, but that won't work in this case. We'd place all 200 primaries on that one JVM with a 2GB heap, each primary would start preloading, and we'd run out of memory in under a second trying to load 100GB of data into a 2GB JVM. The lesson here is that naive partition placement like this is very dangerous.

This is why we have the numInitialContainers attribute on an ObjectGrid. It lets the application developer tell ObjectGrid not to start partition placement until at least that number of JVMs have started. We could set it to 200 in this scenario: nothing happens until all 200 JVMs have started, and only then are the partition primaries and replicas placed across them. With all 200 JVMs registered and 2GB of heap per JVM, there is no memory issue.
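For concreteness, here's roughly what that looks like in a deployment policy descriptor. This is a sketch from memory: treat the namespace and the exact element/attribute names (mapSet, numberOfPartitions, numInitialContainers) as assumptions to be checked against the actual ObjectGrid deployment policy schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<deploymentPolicy xmlns="http://ibm.com/ws/objectgrid/deploymentPolicy">
  <objectgridDeployment objectgridName="Grid">
    <!-- Hold placement until all 200 container JVMs have registered -->
    <mapSet name="mapSet" numberOfPartitions="200"
            minSyncReplicas="1" maxSyncReplicas="1"
            numInitialContainers="200">
      <map ref="Map1"/>
    </mapSet>
  </objectgridDeployment>
</deploymentPolicy>
```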

Another issue is that ObjectGrid automatically redistributes, or rebalances, partitions as new JVMs start. Let's suppose a less extreme situation: we keep only 10MB per partition but have 200 of them; that's 2GB, or 4GB with replicas. Imagine the thrashing at initial startup if we placed all the partitions on the first 10 JVMs that started and then started another 90 JVMs. The amount of rebalancing and redundant shifting of data between JVMs would be ridiculous. The numInitialContainers attribute can be used to prevent this kind of silliness as well. It delays initial partition placement until the cluster reaches a stable initial start state, which eliminates the redistribution problem completely.
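To see the scale of that churn, here's a toy model. It is not ObjectGrid's real placement algorithm: I just assign partitions round-robin across however many JVMs exist and rebalance after every JVM start, counting how many partitions change owner along the way.

```python
# Toy model of rebalancing churn (hypothetical round-robin placement,
# NOT ObjectGrid's actual algorithm).

def owner(partition, jvm_count):
    """Assumed placement rule: partitions spread round-robin."""
    return partition % jvm_count

def eager_moves(partitions=200, first_batch=10, final_jvms=100):
    """Count shard moves if placement starts once first_batch JVMs are
    up and the grid rebalances as each remaining JVM starts."""
    moves = 0
    for n in range(first_batch + 1, final_jvms + 1):
        moves += sum(1 for p in range(partitions)
                     if owner(p, n - 1) != owner(p, n))
    return moves

# Eager placement on the first 10 of 100 JVMs causes thousands of
# redundant moves in this model; deferring placement with
# numInitialContainers=100 means one placement pass and zero moves.
print(eager_moves())
```

The exact count depends on the placement rule, but the shape of the result doesn't: rebalancing on every JVM arrival moves most of the data many times over, while deferred placement moves each shard exactly once.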

You can read more about this on our wiki.

