Failover time is too long

Hi, I hope someone here knows the answer to this simple question:

We have a partitioned cluster with synchronous replication. When the primary node fails, it takes about 5 seconds for the backup to detect the failure and start working. Is this normal? We really need a shorter failover time. How can we change it?

Thanks, Kate

Do you see the 5-second delay from the client side?
Are you using transactions?
In general, fail-over should take a very short time.
Do you have IWorkers running at the backup space that take time to wake up once their host space becomes active?


It is a testing application, and from my debug messages I can see that the backup's IWorker initializes only after 5 seconds. Before that, there is a log message saying that the backup has not received a heartbeat from the primary for 4500 ms. I assumed the backup waits 4.5 seconds and only then detects that the primary has failed. Is that right? We use only local transactions.

Please lower the <fail-over-find-timeout> value in the cluster schema (located at GigaSpaces Root\config\schemas\sync_replicated-cluster-schema.xsl) and retry.
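
As a rough sketch of what the change might look like: only the <fail-over-find-timeout> element name comes from this thread. The enclosing <fail-over-policy> element and the value 2000 are assumptions for illustration, so check them against the actual contents of your schema file and your version's documentation before editing.

```xml
<!-- Fragment of sync_replicated-cluster-schema.xsl (surrounding structure assumed) -->
<fail-over-policy>
    <!-- Illustrative value only, assumed to be in milliseconds;
         pick a value below the one currently in your schema -->
    <fail-over-find-timeout>2000</fail-over-find-timeout>
</fail-over-policy>
```

After changing the value, restart the cluster members so the schema is reloaded, then repeat the failover test and compare the timestamps in the backup's log.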


