SBA architecture and partitioned spaces scaling

So the two major benefits of space partitioning are: 1. distributing the in-memory data load, and 2. in SBA architectures, accessing data locally rather than remotely.

We currently only benefit from #1, since our application uses a partitioned space that is deployed in its own PU, and the services are distributed across their own respective PUs. The biggest benefit I see from this architecture, and why we chose it originally, is that the service PUs and the space PU can be scaled independently, versus the SBA architecture, where you package space and services together, which forces you to scale them as one unit. First, I'd like to know if my understanding is correct.

The other benefit is that if, say, we have PUs in a particular JVM that are currently not utilized, they can pitch in and contribute to work that may reside in a different physical location under a partitioning scheme. Load can't be evenly distributed over time, so I'd hate to have some space/service PUs backed up while others sit idle awaiting more work.

So ideally, one would be able to deploy the space and the services separately, but somehow without the big penalties of remote object access. Ideally, the polling container would get access to data physically allocated to the same JVM where it is running, and only if there were no more data in that partition would it access remote partitions.
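(To make that concrete, here is a minimal sketch in OpenSpaces Java API terms; it is not from the original thread, and the Order class, the mySpace name, and the URLs are placeholders. It just contrasts the embedded proxy an SBA-style PU would use with the remote clustered proxy a separately deployed service PU would use.)

    import com.gigaspaces.annotation.pojo.SpaceId;
    import com.gigaspaces.annotation.pojo.SpaceRouting;
    import org.openspaces.core.GigaSpace;
    import org.openspaces.core.GigaSpaceConfigurer;
    import org.openspaces.core.space.UrlSpaceConfigurer;

    // Placeholder entry class, reused by the later sketches in this thread.
    class Order {
        private String id;
        private Integer routing;

        @SpaceId(autoGenerate = true)
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }

        @SpaceRouting
        public Integer getRouting() { return routing; }
        public void setRouting(Integer routing) { this.routing = routing; }
    }

    class ProxyStyles {
        // SBA style: the PU embeds its own partition, so this proxy never leaves the JVM.
        GigaSpace embedded() {
            return new GigaSpaceConfigurer(
                    new UrlSpaceConfigurer("/./mySpace").space()).gigaSpace();
        }

        // Separate-PU style: a clustered proxy looked up over Jini; every operation
        // is a network call, routed to a partition by the template's routing field.
        GigaSpace remoteClustered() {
            return new GigaSpaceConfigurer(
                    new UrlSpaceConfigurer("jini://*/*/mySpace").space()).gigaSpace();
        }
    }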

I currently don't know if this is possible without tightly coupling PUs to specific partitions and implementing our own algorithm for accessing data based on location priority.

Is this an unrealistic expectation? Am I somehow able to achieve this with some facilities that I'm currently not aware of?

Thanks.

Ilya

This thread was imported from the previous forum; for your reference, the original is available here.

asked 2008-06-10 12:36:18 -0600 by isterin · updated 2013-08-08 09:52:00 -0600 by jaissefsfex

1 Answer

When your PU includes both the space and the business logic, your biggest added value is much faster data access (reads); remote calls are involved only when destructive operations are replicated to the backup space.

You can scale up to some extent within the PU by having multiple threads of your business logic (when using a polling container).
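For example (a sketch only, reusing the hypothetical Order placeholder from the question above and assuming a GigaSpace proxy is already available; the thread counts are arbitrary), the polling container's consumer threads can be raised like this:

    import org.openspaces.core.GigaSpace;
    import org.openspaces.events.adapter.SpaceDataEvent;
    import org.openspaces.events.polling.SimplePollingContainerConfigurer;
    import org.openspaces.events.polling.SimplePollingEventListenerContainer;

    class OrderListener {
        @SpaceDataEvent
        public Order process(Order order) {
            // business logic here; the returned object is written back to the space
            return order;
        }
    }

    class PollingSetup {
        SimplePollingEventListenerContainer container(GigaSpace gigaSpace) {
            return new SimplePollingContainerConfigurer(gigaSpace)
                    .template(new Order())                 // take matching Order objects
                    .eventListenerAnnotation(new OrderListener())
                    .concurrentConsumers(4)                // several consumer threads in one PU
                    .maxConcurrentConsumers(8)             // allow growth under load
                    .pollingContainer();
        }
    }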

The advantage of partitioning is the ability to have additional memory capacity and to distribute your business logic execution across multiple machines (CPUs).

A remote polling container would need to perform some sort of random or round-robin access to the different partitions in order to make sure all spaces are consumed. This is doable in different ways (modifying the cluster load-balancing config, or Owen's utility found on openspaces.org).
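For illustration only (this is neither the load-balancing config change nor Owen's utility, just the idea behind them), a remote consumer could cycle the routing value of the hypothetical Order template so each take is directed at a different partition:

    import org.openspaces.core.GigaSpace;

    // Illustrative sketch: drains all partitions of a clustered proxy in turn by
    // cycling the template's routing value (partition count assumed known from the
    // deployment topology).
    class RoundRobinTaker {
        private final GigaSpace clusteredProxy;
        private final int partitionCount;
        private int next = 0;

        RoundRobinTaker(GigaSpace clusteredProxy, int partitionCount) {
            this.clusteredProxy = clusteredProxy;
            this.partitionCount = partitionCount;
        }

        Order takeNext(long timeoutMillis) {
            Order template = new Order();
            template.setRouting(next);                       // routing value selects the target partition
            next = (next + 1) % partitionCount;
            return clusteredProxy.take(template, timeoutMillis);
        }
    }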

Remoting provides the ability to enjoy both worlds: the business logic is collocated with the space, yet it can be invoked from remote business logic that does not necessarily have a collocated space, in a very scalable and simple manner. See:
http://www.gigaspaces.com/wiki/display/OLH/OpenSpacesRemotingExample
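A rough sketch of the space-based remoting pattern the link describes (names are illustrative; @RemotingService assumes the annotation-driven export support, while older 6.x configs may export the bean explicitly, and the exporter/proxy wiring lives in pu.xml, so treat the linked example as authoritative):

    import org.openspaces.remoting.RemotingService;

    // Service contract shared by the client and the collocated implementation.
    interface OrderService {
        String process(Order order);
    }

    // Deployed inside the space PU, so each invocation runs next to the partition's
    // data. @RemotingService marks the bean for export by the OpenSpaces remoting
    // service exporter declared in pu.xml; on the client side a remoting proxy bound
    // to the clustered GigaSpace routes each call to a partition and invokes this
    // implementation there.
    @RemotingService
    class OrderServiceImpl implements OrderService {
        public String process(Order order) {
            // touch only this partition's data here
            return "processed " + order.getId();
        }
    }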

Shay

answered 2008-06-10 12:59:55 -0600 by shay hassidim

Comments

Shay, thanks for the response....

> You can scale up to some extent within the PU by having multiple threads of your business logic (when using a polling container).

Yes, but we'd also like to scale up using a different JVM (GSC instance) and possibly a different physical node.

> The advantage of partitioning is the ability to have additional memory capacity and to distribute your business logic execution across multiple machines (CPUs).

I'm not sure I understand the second point, unless you're talking about spaces collocated with the service. The business logic can be distributed even with a non-partitioned space, right? Even if your space is not partitioned, it can be accessed by multiple remote polling containers, and thus your work is distributed. The only downside is the nature of remote operations vs. local ones.

> A remote polling container would need to perform some sort of random or round-robin access to the different partitions in order to make sure all spaces are consumed. This is doable in different ways (modifying the cluster load-balancing config, or Owen's utility found on openspaces.org).

Yeah, I was hoping that remote operations are optimized in the sense that if a space partition is collocated in the same JVM/GSC (though not within the same PU), those objects would be given polling priority, assuming one is not using FIFO.

So basically, if we want to scale the space separately from our services (not embedded), we are stuck with remote operations?

Ilya

isterin (2008-06-10 13:17:13 -0600)

Sure, the business logic can be distributed even without having the data collocated. In this case the cost involved will be higher when dealing with short, fast transactions.

Remoting gives you the ability to have some of the business logic running close to the client and some of it collocated with the data, executed in a distributed manner (à la MapReduce) across the partitions in parallel.

If business logic accesses a clustered partitioned space and one of the partitions is running within the same JVM as the business logic, access to that partition will be done using a collocated space proxy. This is done transparently using our smart proxy technology. In this case, make sure your proxy uses the <cluster> tag as part of the space declaration.
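In Java terms, the point above roughly corresponds to giving the collocated business logic a clustered view of the space rather than just the single embedded member; this is only a sketch and not the literal <cluster> tag syntax, which belongs to the XML space declaration:

    import com.j_spaces.core.IJSpace;
    import org.openspaces.core.GigaSpace;
    import org.openspaces.core.GigaSpaceConfigurer;
    import org.openspaces.core.space.UrlSpaceConfigurer;

    class ClusteredViewSetup {
        GigaSpace clusteredView() {
            // Embedded partition; in a deployed PU the cluster parameters come from the deployment.
            IJSpace embedded = new UrlSpaceConfigurer("/./mySpace").space();

            // Ask for the cluster-aware proxy: operations that hit the collocated
            // partition stay in-JVM, the rest are routed over the network.
            return new GigaSpaceConfigurer(embedded).clustered(true).gigaSpace();
        }
    }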

If you need additional clarifications drop me an email: shay at gigaspaces dot com

Shay

shay hassidim (2008-06-10 13:36:47 -0600)

> If business logic accesses a clustered partitioned space and one of the partitions is running within the same JVM as the business logic, access to that partition will be done using a collocated space proxy. This is done transparently using our smart proxy technology. In this case, make sure your proxy uses the <cluster> tag as part of the space declaration.

This is very good!

When was this feature introduced?

I have discussed exactly this feature as a possible optimization; cool that it actually exists :=)

niklasuhrberg (2008-06-13 13:24:48 -0600)

Agreed, this is a very nice feature. What would be even better is if the polling containers actually gave priority to locally collocated spaces for template selection (in non-FIFO scenarios, of course).

Ilya

isterin (2008-06-13 13:35:55 -0600)

We have had this feature since the early days of our LRMI communication layer, about 5-6 years. We have recently extended it (starting with 6.0) to all other product services: transaction manager, lookup, etc. This means that if you have a lookup service or a distributed transaction manager running within the same JVM as the space or the client, the interaction with these will be done using local references and not via remote calls.

This is one of the reasons why the communication protocol parameters have been moved out of the space schema into a more global, centralized config such as \gigaspaces-xap-6.5-rc1\config\services\services.config. This file includes the settings for all network protocol parameters such as thread pools, ports, etc.

See more here: http://www.gigaspaces.com/wiki/display/OLH/Communication+Protocol

Shay

shay hassidim (2008-06-13 14:06:46 -0600)
