

Partition strategies and embedded spaces

It seems that the selection of a partitioning hash results in a specific number of hash buckets. For example, if I partition based on the last digit of a social security number, I will end up with 10 potential buckets. As an exercise in scalability, I create two spaces and assign 0-4 to one space and 5-9 to the other. Here are a few questions based on this assumption.

  1. What if my two spaces are the result of two PUs being deployed with embedded spaces? It seems that partitioning is then no longer a separate exercise from deploying PUs (to handle load), and that re-partitioning would have to occur each time a new PU was deployed.

  2. Assume I do not use an embedded space. Now I can deploy PUs at will without affecting my space deployments. However, haven't I capped the number of spaces I can run, because I have only 10 hash possibilities?

{quote}This thread was imported from the previous forum. For your reference, the original is [available here|http://forum.openspaces.org/thread.jspa?threadID=2392]{quote}

asked 2008-06-18 22:02:37 -0500 by oravecz

updated 2013-08-08 09:52:00 -0500 by jaissefsfex

1 Answer


James,

The partitioning (or load-balancing) is explained here:
http://www.gigaspaces.com/wiki/display/OLH/Load-BalancingGroup-ClusterSchema

You will find an example that explains how the hash-based load-balancing works, and a "calculator" that mimics the functionality executed on the client side when it routes operations to the relevant target space based on the routing field value.

In essence, the target space for an operation is calculated as:
Target space = routing field hash code % number of active partitions.
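Here is a minimal sketch of that formula as plain Java; the class name (RoutingCalculator) and the numbers are made up, and the product's exact handling of negative hash codes and partition numbering may differ:

{code}
public class RoutingCalculator {

    /**
     * Mirrors the formula above: the routing field's hash code modulo the
     * number of active partitions selects the (zero-based) target partition.
     * Math.abs guards against negative hash codes in this illustration.
     */
    public static int targetPartition(Object routingFieldValue, int activePartitions) {
        return Math.abs(routingFieldValue.hashCode()) % activePartitions;
    }

    public static void main(String[] args) {
        // Routing on the last digit of a social security number with 10 partitions:
        // Integer.valueOf(7).hashCode() == 7, so the operation goes to partition 7.
        System.out.println(targetPartition(7, 10));
    }
}
{code}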

If the routing field is null, the operation is sent to all partitions (in parallel or serially). This is relevant only for read and take operations; write/update operations must have a value for the routing field.

Read/take operations with timeout > 0 (blocking operations) and a null routing field value are not supported (for now). You will need to call these in non-blocking mode and sleep and retry the call if null is returned (no match), as in the sketch below.
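For instance, a minimal sketch of such a sleep-and-retry loop, assuming an injected org.openspaces.core.GigaSpace proxy; the helper name, generic template and retry parameters are illustrative:

{code}
import org.openspaces.core.GigaSpace;

public class NonBlockingReadHelper {

    /**
     * Polls the space with a non-blocking read (timeout 0) and sleeps between
     * attempts, instead of using a blocking read with a null routing value.
     * Returns null if no matching entry is found after maxRetries attempts.
     */
    public static <T> T readWithRetry(GigaSpace gigaSpace, T template,
                                      long retryIntervalMillis, int maxRetries)
            throws InterruptedException {
        for (int i = 0; i < maxRetries; i++) {
            T match = gigaSpace.read(template, 0); // timeout 0 = return immediately
            if (match != null) {
                return match;
            }
            Thread.sleep(retryIntervalMillis);
        }
        return null; // no match within the retry budget
    }
}
{code}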

Re-partitioning would not occur each time a new PU is deployed, since the total number of partitions is fixed. You may, however, change the number of JVMs hosting the partitions at run-time.
For example:
A topology of 2 JVMs, each with a 2 GB heap (4 GB total), hosting 10 partitions (10 PUs) can be changed to 10 JVMs, each with a 2 GB heap (20 GB total). Moving partitions from one JVM (GSC) to another can be done manually or programmatically (see the sketch below).
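As an illustration only, a hedged sketch of relocating a partition programmatically with the Admin API (org.openspaces.admin); it assumes a XAP release that includes this API, an already deployed PU named "myPU", a spare GSC to move to, and a made-up lookup group name:

{code}
import org.openspaces.admin.Admin;
import org.openspaces.admin.AdminFactory;
import org.openspaces.admin.gsc.GridServiceContainer;
import org.openspaces.admin.pu.ProcessingUnit;
import org.openspaces.admin.pu.ProcessingUnitInstance;

public class RelocatePartition {

    public static void main(String[] args) {
        Admin admin = new AdminFactory().addGroup("myLookupGroup").createAdmin();
        try {
            // Locate the deployed PU and pick the partition instance to move.
            ProcessingUnit pu = admin.getProcessingUnits().waitFor("myPU");
            pu.waitFor(1);
            ProcessingUnitInstance instance = pu.getInstances()[0];

            // Pick a target container (the selection logic is up to you).
            GridServiceContainer target = admin.getGridServiceContainers().waitForAtLeastOne();

            // Move the partition to the target GSC.
            instance.relocate(target);
        } finally {
            admin.close();
        }
    }
}
{code}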

The deployment topology (SLA) is decoupled from the PU configuration.
This means you can have your PU deployed using a replicated topology and change it to a partitioned one by changing only the SLA declarations.

If you capped the number of routing field values at 10, it means that only 10 partitions would ever be used as the target space for your operations.
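For example, a sketch of a space class routed on the last digit of the social security number; the class and field names are made up, and the point is that with only 10 distinct routing values, at most 10 partitions are ever targeted:

{code}
import com.gigaspaces.annotation.pojo.SpaceClass;
import com.gigaspaces.annotation.pojo.SpaceId;
import com.gigaspaces.annotation.pojo.SpaceRouting;

@SpaceClass
public class Person {

    private String id;
    private String ssn;
    private Integer routingBucket; // last digit of the SSN: 0-9

    public Person() {
    }

    public Person(String id, String ssn) {
        this.id = id;
        this.ssn = ssn;
        this.routingBucket = Character.getNumericValue(ssn.charAt(ssn.length() - 1));
    }

    @SpaceId
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    public String getSsn() { return ssn; }
    public void setSsn(String ssn) { this.ssn = ssn; }

    // Routing field: only 10 possible values, so operations on this class
    // can be spread across at most 10 partitions.
    @SpaceRouting
    public Integer getRoutingBucket() { return routingBucket; }
    public void setRoutingBucket(Integer routingBucket) { this.routingBucket = routingBucket; }
}
{code}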

It does not matter how you deploy your partitioned space - embedded or remote (i.e. as a separate PU, different from your business logic); remote operations will be load-balanced across the partitions based on the routing field value and the total number of active partitions.

Shay

answered 2008-06-20 19:23:24 -0500 by shay hassidim

