NIO issues

Dear Gigaspaces team,

Last week in our production environment we received many exceptions of the following form:

Caused by: java.lang.InterruptedException: Thread was interrupted while waiting on the network
    at com.gigaspaces.lrmi.MethodCachedInvocationHandler.invoke(MethodCachedInvocationHandler.java:84)
    at com.gigaspaces.lrmi.DynamicSmartStub.invokeRemote(DynamicSmartStub.java:428)
    at com.gigaspaces.lrmi.DynamicSmartStub.invoke(DynamicSmartStub.java:403)
    at com.gigaspaces.reflect.$GSProxy2.update(Unknown Source)
    at com.gigaspaces.internal.lrmi.stubs.LRMISpaceImpl.update(LRMISpaceImpl.java:297)
    at com.gigaspaces.internal.client.spaceproxy.actions.SpaceProxyImplWriteUpdateAction.internalUpdateOrWrite(SpaceProxyImplWriteUpdateAction.java:223)
    at com.gigaspaces.internal.client.spaceproxy.actions.SpaceProxyImplWriteUpdateAction.updateOrWrite(SpaceProxyImplWriteUpdateAction.java:183)
    at com.gigaspaces.internal.client.spaceproxy.actions.SpaceProxyImplWriteUpdateAction.access$000(SpaceProxyImplWriteUpdateAction.java:32)
    at com.gigaspaces.internal.client.spaceproxy.actions.SpaceProxyImplWriteUpdateAction$ClusterSpaceWriteProxyAction.write(SpaceProxyImplWriteUpdateAction.java:52)
    at com.gigaspaces.internal.client.spaceproxy.actions.AbstractSpaceProxyActionManager.write(AbstractSpaceProxyActionManager.java:463)
    ... 9 more
Caused by: java.rmi.ConnectException: LRMI transport protocol over NIO broken connection with ServerEndPoint: [NIO://_some address_]; nested exception is: java.nio.channels.ClosedByInterruptException
    at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:607)
    at com.gigaspaces.lrmi.ConnPoolInvocationHandler.invoke(ConnPoolInvocationHandler.java:40)
    at com.gigaspaces.lrmi.MethodCachedInvocationHandler.invoke(MethodCachedInvocationHandler.java:60)
    ... 18 more

This issue locked up many of our server threads, and we were not able to handle any more load.

This started after we updated our whole code base to replace the deprecated SpaceFinder and IJSpace with the GigaSpaceConfigurer. During testing everything was fine. However, our code base uses the Polling Container with a local Jini transaction. The spaces are deployed on 2 partitions, so we create one polling container per partition, with the routing ID of the template object set to that partition.

The big issue is that when we unit test the code that uses the polling container, we are not able to reproduce the error consistently. When it does happen, I see this:

2012-02-27 12:28:17,839 [SimplePollingEventListenerContainer-2675|INFO] LongRunningTaskPollingContainer: onEvent failed orfyassr6500r6_10268356257560_1:orfyassr6500r6_10268356257560_1-89326
com.j_spaces.core.exception.internal.InterruptedSpaceException: java.lang.InterruptedException: Thread was interrupted while waiting on the network
    at com.gigaspaces.internal.client.spaceproxy.actions.AbstractSpaceProxyActionManager.write(AbstractSpaceProxyActionManager.java:468)
    at com.gigaspaces.internal.client.spaceproxy.AbstractSpaceProxy.write(AbstractSpaceProxy.java:420)
    at com.gallup.distributed.computefarm.space.LongRunningTaskPollingContainer$TaskSpaceDataEventListener.onEvent(LongRunningTaskPollingContainer.java:145)
    at com.gallup.distributed.computefarm.space.LongRunningTaskPollingContainer$TaskSpaceDataEventListener.onEvent(LongRunningTaskPollingContainer.java:1)
    at org.openspaces.events.AbstractEventListenerContainer.invokeListener(AbstractEventListenerContainer.java:193)
    at org.openspaces.events.polling.AbstractPollingEventListenerContainer.doReceiveAndExecute(AbstractPollingEventListenerContainer.java:301)
    at org.openspaces.events.polling.AbstractPollingEventListenerContainer.receiveAndExecute(AbstractPollingEventListenerContainer.java:239)
    at org.openspaces.events.polling.SimplePollingEventListenerContainer$AsyncEventListenerInvoker.invokeListener(SimplePollingEventListenerContainer.java:731)
    at org.openspaces.events.polling.SimplePollingEventListenerContainer$AsyncEventListenerInvoker.run(SimplePollingEventListenerContainer.java:678)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.InterruptedException: Thread was interrupted while waiting on the network
    at com.gigaspaces.lrmi.MethodCachedInvocationHandler.invoke(MethodCachedInvocationHandler.java:84)
    at com.gigaspaces.lrmi.DynamicSmartStub.invokeRemote(DynamicSmartStub.java:428)
    at com.gigaspaces.lrmi.DynamicSmartStub.invoke(DynamicSmartStub.java:403)
    at com.gigaspaces.reflect.$GSProxy2.update(Unknown Source)
    at com.gigaspaces.internal.lrmi.stubs.LRMISpaceImpl.update(LRMISpaceImpl.java:297)
    at com.gigaspaces.internal.client.spaceproxy.actions.SpaceProxyImplWriteUpdateAction.internalUpdateOrWrite(SpaceProxyImplWriteUpdateAction.java:223)
    at com.gigaspaces.internal.client.spaceproxy.actions.SpaceProxyImplWriteUpdateAction.updateOrWrite(SpaceProxyImplWriteUpdateAction.java:183)
    at com.gigaspaces.internal.client.spaceproxy.actions.SpaceProxyImplWriteUpdateAction.access$000(SpaceProxyImplWriteUpdateAction.java:32)
    at com.gigaspaces.internal.client.spaceproxy.actions.SpaceProxyImplWriteUpdateAction$ClusterSpaceWriteProxyAction.write(SpaceProxyImplWriteUpdateAction.java:52)
    at com.gigaspaces.internal.client.spaceproxy.actions.AbstractSpaceProxyActionManager.write(AbstractSpaceProxyActionManager.java:463)
    ... 9 more
Caused by: java.rmi.ConnectException: LRMI transport protocol over NIO broken connection with ServerEndPoint: [NIO://localhost:37803/pid[4528]/8394277787926_3_-1095458564349351717]; nested exception is: java.nio.channels.ClosedByInterruptException
    at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:607)
    at com.gigaspaces.lrmi.ConnPoolInvocationHandler.invoke(ConnPoolInvocationHandler.java:40)
    at com.gigaspaces.lrmi.MethodCachedInvocationHandler.invoke(MethodCachedInvocationHandler.java:60)
    ... 18 more
Caused by: java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:270)
    at com.gigaspaces.lrmi.nio.Reader.readBytesFromChannelBlocking(Reader.java:195)
    at com.gigaspaces.lrmi.nio.Reader.readBytesBlocking(Reader.java:629)
    at com.gigaspaces.lrmi.nio.Reader.bytesToPacket(Reader.java:548)
    at com.gigaspaces.lrmi.nio.Reader.readReply(Reader.java:128)
    at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:543)
    ... 20 more
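
The innermost cause in the trace above, java.nio.channels.ClosedByInterruptException, is standard JDK behavior: when a thread is interrupted while blocked in SocketChannel.read(), the JDK closes the channel and throws this exception in the interrupted thread. The following is a minimal JDK-only sketch of that mechanism. It does not use any GigaSpaces classes, and the class name InterruptedReadDemo and the method interruptBlockedRead are made up for illustration. It says nothing about what interrupted the polling container thread in the first place.

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.atomic.AtomicReference;

public class InterruptedReadDemo {

    /**
     * Blocks a thread in SocketChannel.read(), interrupts it, and returns
     * the class name of the exception the blocked thread received.
     */
    public static String interruptBlockedRead() throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Bind to an ephemeral port on the loopback interface.
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            AtomicReference<String> outcome = new AtomicReference<>("no exception");

            Thread reader = new Thread(() -> {
                try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                    // Blocks forever: the server side never writes anything.
                    client.read(ByteBuffer.allocate(16));
                } catch (Exception e) {
                    outcome.set(e.getClass().getName());
                }
            });
            reader.start();

            try (SocketChannel accepted = server.accept()) {
                Thread.sleep(200);   // give the reader time to block in read()
                reader.interrupt();  // the JDK closes the channel in response
                reader.join(5000);
            }
            return outcome.get();
        }
    }

    public static void main(String[] args) throws Exception {
        // prints "java.nio.channels.ClosedByInterruptException"
        System.out.println(interruptBlockedRead());
    }
}
```

Because the channel is closed as a side effect of the interrupt, the connection cannot be reused afterwards, which is consistent with the "broken connection" message in the ConnectException above.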

One more thing to note: when this issue was happening in production, we tailed the GigaSpaces logs and saw this exception:

2012-02-20 07:31:13,408 prod711space.1 [1] SEVERE [com.gigaspaces.lrmi] - LRMI transport protocol over NIO connection [NIO://jsprod2b.gallup.com:37889/pid[16188]/5449595214750489_3_-258529723729479172] caught unexpected exception: com.gigaspaces.lrmi.nio.MarshallingException: Failed to marsh: [RequestPacket: interface com.gigaspaces.cluster.replication.IReplicationTarget.replicate(java.lang.String prod711space_container1:prod711space, java.util.List [SyncPacket=(key=7185588,op=WRITE,uid=819135195^42^Results: .............................. Object contents............................ }^0^0)], boolean false, long 1329542737378), isOneWay = false, isCallBack = false]
    at com.gigaspaces.lrmi.nio.Writer.writePacket(Writer.java:199)
    at com.gigaspaces.lrmi.nio.Writer.writeRequest(Writer.java:139)
    at com.gigaspaces.lrmi.nio.Writer.writeRequest(Writer.java:144)
    at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:524)
    at com.gigaspaces.lrmi.ConnPoolInvocationHandler.invoke(ConnPoolInvocationHandler.java:40)
    at com.gigaspaces.lrmi.MethodCachedInvocationHandler.invoke(MethodCachedInvocationHandler.java:60)
    at com.gigaspaces.lrmi.DynamicSmartStub.invokeRemote(DynamicSmartStub.java:428)
    at com.gigaspaces.lrmi.DynamicSmartStub.invoke(DynamicSmartStub.java:403)
    at com.gigaspaces.reflect.$GSProxy17.replicate(Unknown Source)
    at com.gigaspaces.cluster.replication.sync.UnicastEventDispacher.dispatch(UnicastEventDispacher.java:221)
    at com.gigaspaces.cluster.replication.sync.UnicastEventDispacher.dispatchEvent(UnicastEventDispacher.java:80)
    at com.gigaspaces.cluster.replication.sync.AbstractEventDispacher.dispatchEvent(AbstractEventDispacher.java:150)
    at com.gigaspaces.cluster.replication.sync.SyncReplicationController.dispatch(SyncReplicationController.java:93)
    at com.gigaspaces.cluster.replication.sync.SyncReplicationController.doSyncReplication(SyncReplicationController.java:118)
    at com.gigaspaces.cluster.replication.Replicator.replicateSync(Replicator.java:328)
    at com.gigaspaces.internal.server.space.SpaceEngine.performSyncReplication(SpaceEngine.java:7870)
    at com.gigaspaces.internal.server.space.SpaceEngine.write(SpaceEngine.java:1013)
    at com.gigaspaces.internal.server.space.SpaceEngine.unsafeWrite(SpaceEngine.java:866)
    at com.gigaspaces.internal.server.space.SpaceEngine.write ... (more)

asked 2012-02-29 21:03:49 -0600

ramzi

updated 2013-08-08 09:52:00 -0600

jaissefsfex

2 Answers

Are you using Java version 5?

answered 2012-05-09 05:24:55 -0600

eitany

Comments

We are actually on Java 6, and we plan to upgrade to XAP 9 sometime late this year. Do we know why the java.io.UTFDataFormatException happens in this context?

ramzi (2012-06-13 04:00:01 -0600)
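
For background on the java.io.UTFDataFormatException asked about above: in the plain JDK, one well-known cause is DataOutputStream.writeUTF(), which rejects any string whose modified UTF-8 encoding exceeds 65535 bytes. Whether GigaSpaces marshalling hits this limit in the replication packet from the question is an assumption, not something this thread confirms. A minimal JDK-only sketch of the limit follows; the class name WriteUtfLimitDemo and the method tryWriteUtf are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.util.Arrays;

public class WriteUtfLimitDemo {

    /**
     * Writes an ASCII string of the given length with writeUTF() and reports
     * whether the JDK accepted it. Each ASCII character encodes to one byte.
     */
    public static String tryWriteUtf(int length) throws IOException {
        char[] chars = new char[length];
        Arrays.fill(chars, 'a');
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(new String(chars));
            return "ok";
        } catch (UTFDataFormatException e) {
            // Thrown when the encoded form is longer than 65535 bytes.
            return "UTFDataFormatException";
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tryWriteUtf(65535)); // prints "ok"
        System.out.println(tryWriteUtf(65536)); // prints "UTFDataFormatException"
    }
}
```

If a single String field in a space object can grow past that size, it would be worth mentioning to support when following up.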

I suggest you check this with support.
Shay

shay hassidim (2012-06-13 11:32:56 -0600)

I suggest you upgrade to XAP 9. There is a good chance this issue has been resolved in that version.

Shay

answered 2012-05-09 04:49:51 -0600

shay hassidim
