
Replication and polling containers: duplicated messages

Hi,

I've run into trouble with XAP 6.0 while trying to scale out an existing GigaSpaces application. The main workflow looks like this:

* PU A acts as a gateway to the space for an external application. It loads data into the space and writes Request objects for external requests.
* PU B acts as the main processing unit. A polling container receives the requests written by A, performs some (fairly complex) space operations, and returns a Response object.

What I want to do now is distribute the computing load caused by B across a cluster. My first attempt is to use a sync-replicated space because:

* the amount of space data is relatively small (maybe 100-200 MB)
* every request in B potentially needs most of this data
* there is no easy way to partition the data (at least not in a way that would distribute the load evenly among cluster members)

In theory I'd deploy a space container and an instance of B on each cluster node. B would still have a unified view of the space data (because of replication), but several requests could be processed in parallel.

In B's pu.xml, I perform a remote space lookup using <os-core:space id="space" url="jini://*/*/myspace?fifo"/>. A polling container is registered on that space; it receives the Request objects written by A and performs the business logic (the relevant wiring is sketched below).
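Roughly, the wiring looks like this (the Request class and the listener bean are placeholders for the real ones):

<os-core:space id="space" url="jini://*/*/myspace?fifo"/>
<os-core:giga-space id="gigaSpace" space="space"/>

<bean id="requestListener" class="com.example.RequestListener"/>

<os-events:polling-container id="requestContainer" giga-space="gigaSpace">
    <!-- template: take any Request object written by A -->
    <os-core:template>
        <bean class="com.example.Request"/>
    </os-core:template>
    <os-events:listener>
        <os-events:annotation-adapter>
            <os-events:delegate ref="requestListener"/>
        </os-events:annotation-adapter>
    </os-events:listener>
</os-events:polling-container>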

Here my issues arise:

* The space lookup randomly picks one container of the clustered space. I deploy a space container on every GSC, but the PU in that GSC still (sometimes) picks a remote space. Is there any way to prefer the "local" (i.e. same-JVM) clustered space over a remote one?
* If all PUs pick the same space container, everything works: a new request gets processed by a free PU on the grid. But if the PUs pick different space containers, I get flooded with "Wrong space usage" messages. Apparently every request written by A is replicated to all cluster members (ok), but it is also processed by +all+ the polling containers on the grid (not ok). I tried different receive operation handlers (currently SingleTakeReceiveOperationHandler) and also tried to perform a blocking take manually, but I either end up with no requests being processed at all or with duplicated request processing.
* I also tried a notify-container instead of a polling-container, but that stops working after a few requests. Another drawback is that there seems to be no way to limit the number of concurrent workers (in contrast to the polling container).

I read through many of the Wiki pages, but found no real-world example of a similar application. I need both a clustered, replicated space +and+ parallel (but not duplicated) request processing. The space should not be partitioned (for now). Any ideas on this?

Thanks in advance, Daniel

{quote}This thread was imported from the previous forum. For your reference, the original is [available here|http://forum.openspaces.org/thread.jspa?threadID=2386]{quote}

asked 2008-06-17 10:29:08 -0500 by daniel_lichtenberger

updated 2013-08-08 09:52:00 -0500 by jaissefsfex

1 Answer


Remember that sync-replicated means every destructive operation will be replicated to every member.
By default a PU running a collocated space will access only its own space (make sure you have the clustered flag set to false).

Here is what you can do:
- Scale out only the business logic. Do not use collocated spaces. The space runs in a separate PU using the partitioned-sync2backup topology with one primary and one backup.
- Have a local cache in each processing PU to cache reused data and speed up repeated read operations. You might use a different proxy for it than the one that accesses the master space directly. With the amount of data you have, you will probably never evict data from the local cache, so you are in the optimal scenario. You might also consider a local view, which is a read-only client-side cache that is preloaded implicitly, rather than a local cache, which is loaded on demand (see the sketch after this list).
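Roughly, the two client-side caching options would be declared along these lines (space URL, bean ids, class name and query are just placeholders, not taken from your setup):

<os-core:space id="masterSpace" url="jini://*/*/myspace"/>

<!-- local cache: loaded on demand as entries are read -->
<os-core:local-cache id="localCacheSpace" space="masterSpace" update-mode="PULL"/>
<os-core:giga-space id="cachedGigaSpace" space="localCacheSpace" clustered="false"/>

<!-- local view: read-only, preloaded with the entries matching the view query -->
<os-core:local-view id="localViewSpace" space="masterSpace">
    <os-core:view-query class="com.example.ReferenceData" where="active = true"/>
</os-core:local-view>
<os-core:giga-space id="viewGigaSpace" space="localViewSpace" clustered="false"/>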

The notify container is by default multi-threaded on the client side. You can limit the number of threads that handle notifications and trigger the listener on the client side by configuring the LRMI settings. In any case, this does not seem to be relevant for your case.

Would the above work for you?

Shay

answered 2008-06-17 12:39:19 -0500 by shay hassidim

Comments

Thanks for the quick reply.

I now use local caches (defined with os-core:local-cache) in both PUs. The polling containers poll the "master" space, while the space event listeners operate on the local caches. This solved the problem of duplicated messages - but now I cannot use transactions any more. I defined the space and tx-managers as follows:

<os-core:space id="masterSpace" url="jini://*/mycontainer/myspace"/>
<os-core:local-cache id="space" space="masterSpace" update-mode="PUSH"/>

<os-core:local-tx-manager id="transactionManager" space="space"/>
<os-core:giga-space id="gigaSpace" space="space" tx-manager="transactionManager" clustered="false"/>

<os-core:local-tx-manager id="masterTransactionManager" space="masterSpace"/>

<os-core:giga-space id="masterGigaSpace" space="masterSpace" tx-manager="masterTransactionManager" clustered="false"/>

In a polling-container I cannot use +os-events:tx-support+ (neither with the master nor with the local space) because the transaction never gets committed. In the space browser I see hundreds of open connections, each holding a lock on (at least) the polled space object. If I don't use transactions, it basically works, but the update semantics absolutely require transactional processing.
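For context, the transactional wiring I'm trying looks roughly like this (the Request class and listener bean names are placeholders):

<os-events:polling-container id="requestContainer" giga-space="masterGigaSpace">
    <!-- each receive/process cycle should run in a transaction -->
    <os-events:tx-support tx-manager="masterTransactionManager"/>
    <os-core:template>
        <bean class="com.example.Request"/>
    </os-core:template>
    <os-events:listener>
        <os-events:annotation-adapter>
            <os-events:delegate ref="requestListener"/>
        </os-events:annotation-adapter>
    </os-events:listener>
</os-events:polling-container>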

daniel_lichtenberger ( 2008-06-18 09:05:34 -0500 )

Daniel,
You should have two proxies with your polling container:
- One with a local cache, for read activities. This will boost repeated reads of static data.
- A second one without a local cache, for transactional updates. Use optimistic locking to make sure you update the most recent version of the object (see the sketch below).
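A minimal sketch of that setup (bean ids are illustrative; the space URL is the one from your config):

<os-core:space id="masterSpace" url="jini://*/mycontainer/myspace"/>

<!-- proxy 1: backed by a local cache, used for repeated reads of static data -->
<os-core:local-cache id="cachedSpace" space="masterSpace" update-mode="PUSH"/>
<os-core:giga-space id="readGigaSpace" space="cachedSpace" clustered="false"/>

<!-- proxy 2: direct to the master space, used for transactional updates -->
<os-core:local-tx-manager id="masterTxManager" space="masterSpace"/>
<os-core:giga-space id="writeGigaSpace" space="masterSpace" tx-manager="masterTxManager" clustered="false"/>

For the optimistic locking part, the space class would also carry a version field (annotated with @SpaceVersion) so that a stale update is rejected.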

Shay

shay hassidim ( 2008-06-18 17:37:02 -0500 )

So the local cache shouldn't be used for update operations? I was under the impression that only local views are read-only, and that local caches can/should be used also for updates.

daniel_lichtenberger ( 2008-06-20 03:50:52 -0500 )

It could, but there is a cost involved: every update triggers notifications from the master space to all local cache instances so they can update their copy of the data. If you are ok with this, have only one proxy.

Shay

shay hassidim ( 2008-06-20 06:21:49 -0500 )

Thanks! Now I use mostly the local cache and it seems to work. However, I had to use the Jini transaction manager because I had weird issues with the local one (like transactions that never committed).
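Roughly, the change amounts to replacing the local-tx-manager declaration with a Jini transaction manager lookup (ids as in my earlier config; the exact element may differ depending on how the Jini transaction manager is deployed):

<os-core:jini-tx-manager id="transactionManager"/>
<os-core:giga-space id="gigaSpace" space="space" tx-manager="transactionManager" clustered="false"/>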

Daniel

daniel_lichtenberger ( 2008-06-24 02:15:07 -0500 )
