Trouble with Admin re-connecting after a network outage

Hi,

We've started using the Admin object to monitor our clusters under GS 7.0. We start an Admin object as outlined in the GS docs for Admin. After about 20-30 seconds we receive various events about space servers being discovered, coming online, and so on. If we simulate a network outage, for example by unplugging the network connection to a server or client, we get no events for about 20-30 seconds, and then we get events about servers leaving the cluster, space servers becoming unavailable, and so on. However, when we plug the network connection back in, the Admin seems unable to re-establish the connection and reports no further events.

After about 20-30 seconds, though, we can see that the SpaceProxy on the client is able to reconnect to the cluster, and space reads, writes, and lease modifications work fine. It is puzzling that the Admin has trouble reconnecting while the SpaceProxy recovers on its own.

We see the following type of exception after plugging the network connection back in -- this is from the client running the Admin:

2009-10-21 18:06:06,781 WARN DiscoveryService Failed to add GSC with uid a0bc1ffb-cf1f-4eb1-a61e-3ae3b11dad1b

java.rmi.ConnectException: LRMI transport protocol over NIO broken connection with ServerEndPoint: NIO://192.168.3.109:40000/pid[2978/273958353116713915261430978987074]; nested exception is:

java.io.IOException: An existing connection was forcibly closed by the remote host

at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:504)

at com.gigaspaces.lrmi.ConnPoolInvocationHandler.invoke(ConnPoolInvocationHandler.java:57)

at com.gigaspaces.lrmi.DynamicSmartStub.invokeRemote(DynamicSmartStub.java:366)

at com.gigaspaces.lrmi.DynamicSmartStub.invoke(DynamicSmartStub.java:354)

at $Proxy18.isSecured(Unknown Source)

at com.gigaspaces.grid.gsc.GSCProxy.isSecured(GSCProxy.java:252)

at org.openspaces.admin.internal.discovery.DiscoveryService.serviceAdded(DiscoveryService.java:215)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.serviceNotifyDo(ServiceDiscoveryManager.java:2150)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.serviceNotifyDo(ServiceDiscoveryManager.java:2137)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.addServiceNotify(ServiceDiscoveryManager.java:2097)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.access$2500(ServiceDiscoveryManager.java:819)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl$NewOldServiceTask.run(ServiceDiscoveryManager.java:1402)

at com.sun.jini.thread.TaskManager$TaskThread.run(TaskManager.java:397)

Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host

at sun.nio.ch.SocketDispatcher.write0(Native Method)

at sun.nio.ch.SocketDispatcher.write(Unknown Source)

at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)

at sun.nio.ch.IOUtil.write(Unknown Source)

at sun.nio.ch.SocketChannelImpl.write(Unknown Source)

at com.gigaspaces.lrmi.nio.Writer.writeBytesToChannelBlocking(Writer.java:429)

at com.gigaspaces.lrmi.nio.Writer.writeBytesBlocking(Writer.java:369)

at com.gigaspaces.lrmi.nio.Writer.writePacket(Writer.java:219)

at com.gigaspaces.lrmi.nio.Writer.writeRequest(Writer.java:139)

at com.gigaspaces.lrmi.nio.Writer.writeRequest(Writer.java:144)

at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:417)

... 12 more

2009-10-21 18:06:06,797 WARN DiscoveryService Failed to add Processing Unit Instance with uid 71ae40e9-89ca-4686-8098-8e891d77ef60

java.rmi.ConnectException: LRMI transport protocol over NIO broken connection with ServerEndPoint: NIO://192.168.3.109:40000/pid[2978/273958353118333915261430978987074]; nested exception is:

java.io.IOException: An existing connection was forcibly closed by the remote host

at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:504)

at com.gigaspaces.lrmi.ConnPoolInvocationHandler.invoke(ConnPoolInvocationHandler.java:57)

at com.gigaspaces.lrmi.DynamicSmartStub.invokeRemote(DynamicSmartStub.java:366)

at com.gigaspaces.lrmi.DynamicSmartStub.invoke(DynamicSmartStub.java:354)

at $Proxy16.getPUDetails(Unknown Source)

at org.openspaces.admin.internal.discovery.DiscoveryService.serviceAdded(DiscoveryService.java:237)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.serviceNotifyDo(ServiceDiscoveryManager.java:2150)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.serviceNotifyDo(ServiceDiscoveryManager.java:2137)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.addServiceNotify(ServiceDiscoveryManager.java:2097)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.access$2500(ServiceDiscoveryManager.java:819)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl$NewOldServiceTask.run(ServiceDiscoveryManager.java:1402)

at com.sun.jini.thread.TaskManager$TaskThread.run(TaskManager.java:397)

Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host

at sun.nio.ch.SocketDispatcher.write0(Native Method)

at sun.nio.ch.SocketDispatcher.write(Unknown Source)

at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)

at sun.nio.ch.IOUtil.write(Unknown Source)

at sun.nio.ch.SocketChannelImpl.write(Unknown Source)

at com.gigaspaces.lrmi.nio.Writer.writeBytesToChannelBlocking(Writer.java:429)

at com.gigaspaces.lrmi.nio.Writer.writeBytesBlocking(Writer.java:369)

at com.gigaspaces.lrmi.nio.Writer.writePacket(Writer.java:219)

at com.gigaspaces.lrmi.nio.Writer.writeRequest(Writer.java:139)

at com.gigaspaces.lrmi.nio.Writer.writeRequest(Writer.java:144)

at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:417)

... 11 more

2009-10-21 18:06:06,797 WARN DiscoveryService Failed to add Processing Unit Instance with uid 425d4878-170b-4f98-9e5a-78a1a18a36e4

java.rmi.ConnectException: LRMI transport protocol over NIO broken connection with ServerEndPoint: NIO://192.168.3.109:40001/pid[2985/39510785584520778745383797120990614]; nested exception is:

java.io.IOException: An existing connection was forcibly closed by the remote host

at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:504)

at com.gigaspaces.lrmi.ConnPoolInvocationHandler.invoke(ConnPoolInvocationHandler.java:57)

at com.gigaspaces.lrmi.DynamicSmartStub.invokeRemote(DynamicSmartStub.java:366)

at com.gigaspaces.lrmi.DynamicSmartStub.invoke(DynamicSmartStub.java:354)

at $Proxy5.getPUDetails(Unknown Source)

at org.openspaces.admin.internal.discovery.DiscoveryService.serviceAdded(DiscoveryService.java:237)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.serviceNotifyDo(ServiceDiscoveryManager.java:2150)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.serviceNotifyDo(ServiceDiscoveryManager.java:2137)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.addServiceNotify(ServiceDiscoveryManager.java:2097)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl.access$2500(ServiceDiscoveryManager.java:819)

at net.jini.lookup.ServiceDiscoveryManager$LookupCacheImpl$NewOldServiceTask.run(ServiceDiscoveryManager.java:1402)

at com.sun.jini.thread.TaskManager$TaskThread.run(TaskManager.java:397)

Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host

at sun.nio.ch.SocketDispatcher.write0(Native Method)

at sun.nio.ch.SocketDispatcher.write(Unknown Source)

at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)

at sun.nio.ch.IOUtil.write(Unknown Source)

at sun.nio.ch.SocketChannelImpl.write(Unknown Source)

at com.gigaspaces.lrmi.nio.Writer.writeBytesToChannelBlocking ... (more)

asked 2009-10-22 14:29:22 -0500

jazzbutcher

updated 2013-08-08 09:52:00 -0500

jaissefsfex

1 Answer

These are simply network protocol exceptions.
Here is what I suggest:
1. Report the problem to support.
2. Re-create the Admin when such an exception is thrown (sleep for a while before actually creating it).
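
Suggestion 2 could be sketched roughly as follows. This is only an illustration under stated assumptions, not GigaSpaces code: the class `RecreateWithDelay`, the method `recreate`, and its parameters are hypothetical names, and the helper is generic so that it does not depend on the GigaSpaces jars. The comment inside shows how an Admin might be plugged in, assuming the OpenSpaces `AdminFactory` API and a made-up lookup group name.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: waits, then rebuilds an object, retrying if creation fails. */
final class RecreateWithDelay {

    private RecreateWithDelay() {
    }

    /**
     * Sleeps for delayMillis, then calls the factory. If the factory throws,
     * the sleep-and-create step is repeated, up to maxAttempts times in total.
     * When every attempt fails, the last exception is rethrown.
     */
    static <T> T recreate(Callable<T> factory, long delayMillis, int maxAttempts)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Pause first, so the network has time to settle before creation.
            Thread.sleep(delayMillis);
            try {
                return factory.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    // Assumed usage with OpenSpaces (group name "my-group" is made up):
    //
    //   oldAdmin.close();
    //   Admin admin = RecreateWithDelay.recreate(
    //           () -> new AdminFactory().addGroup("my-group").createAdmin(),
    //           5000, 5);
    //
    // Event listeners registered on the old Admin would need to be
    // registered again on the new one.
}
```

The lambda syntax requires Java 8 or later; on an older JVM the factory can be written as an anonymous `Callable`. Note that in the traces above the exception is logged by `DiscoveryService` as a warning, so how the application detects the failure and triggers the re-creation is left open here.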

Shay

answered 2009-10-26 06:56:08 -0500

shay hassidim