How to wait until all partitions are online on startup

Currently we get org.openspaces.core.UpdateOperationTimeoutException because we try to do clustered writes as soon as the first partition (of two) is online.

Therefore, I think I should wait (either blocking or asynchronously) until all partitions in the current space have started before I begin message processing.

Is this the best way to ensure a more "stable" (and also faster) startup?

If so, what is the easiest way to wait on startup until all partitions are online?


Currently we are using the following cluster schema and space config (we are using multiple spaces but they all have the same deploy config):

<os-sla:sla cluster-schema="partitioned-sync2backup" number-of-instances="2" number-of-backups="1"
            max-instances-per-vm="4"/>

<os-core:space id="space" url="${com.XXX.XXX.conversation.spaceCreateUrl}" lookup-groups="${LOOKUPGROUPS}">
    <os-core:properties>
        <props>
            <prop key="space-config.lease_manager.expiration_time_interval">2000</prop>
            <prop key="cluster-config.groups.group.fail-over-policy.active-election.yield-time">300</prop>
            <prop key="cluster-config.groups.group.fail-over-policy.active-election.fault-detector.invocation-delay">300</prop>
            <prop key="cluster-config.groups.group.fail-over-policy.active-election.fault-detector.retry-count">2</prop>
        </props>
    </os-core:properties>
</os-core:space>

Thanks and Br, David

asked 2014-11-20 11:11:59 -0500

leozilla

1 Answer


A few options:

  • Implement a distributed task that returns 1 from each partition. The reducer should sum these values. The result should match the number of deployed partitions. You can get the expected number of partitions using the Admin API.

  • Use the Admin API to get a proxy to each partition and call the ping API on each one.

  • Use the Admin API to get the data grid cluster state. You should wait until you receive the INTACT state. This means all primary and backup instances have been fully provisioned. This should be the simplest option.
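The waiting logic behind these options can be sketched independently of the XAP API. The following is only a minimal illustration in plain Java, not GigaSpaces code: `PartitionStartup`, `countResponding`, `awaitAllPartitions` and the `probe` parameter are hypothetical names. The probe stands in for whatever call reports how many partitions answered, for example the reduced result of the distributed task from the first option. Treating a failed probe as "not ready yet" is an assumption that may not suit every deployment.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

// Hypothetical helper, not part of the XAP API.
final class PartitionStartup {

    // Reducer step: every partition that ran the task returned 1,
    // so the sum is the number of partitions that answered.
    static int countResponding(List<Integer> perPartitionResults) {
        int sum = 0;
        for (int result : perPartitionResults) {
            sum += result;
        }
        return sum;
    }

    // Polls the probe until it reports the expected number of partitions.
    // Returns true once all partitions answered, false on timeout or interruption.
    static boolean awaitAllPartitions(IntSupplier probe, int expectedPartitions,
                                      long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            int online;
            try {
                online = probe.getAsInt();
            } catch (RuntimeException e) {
                // Assumption: a failing probe means the cluster is not ready yet.
                online = -1;
            }
            if (online == expectedPartitions) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                return false;
            }
        }
    }
}
```

With the configuration from the question (two partitions), a caller would pass 2 as `expectedPartitions` and start message processing only after `awaitAllPartitions` returns true.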

Shay

answered 2014-11-20 11:34:19 -0500

shay hassidim

Comments

Thanks for your fast response! We chose option 1 (the distributed task), and it works fine for us. :-)

leozilla (2014-11-25 02:32:06 -0500)
