Issue bringing up PUs in Linux boxes due to NIO exceptions.

We are facing an issue while migrating our GigaSpaces setup from Solaris to Linux in our production environment. The issue occurs while bringing up the processing units (PUs). Our application has 8 PUs: 4 run on one server and 4 on another. A startup script, triggered from one of the servers, brings up the PUs on that server as well as the PUs on the other server. The 4 PUs on the machine where the script is triggered come up, but the PUs on the other server do not.

Our lookup locator configuration looks like this:

export LOOKUPLOCATORS=newserver1.itginc.com:4166,newserver2.itginc.com:4166
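One quick check from each server is whether every host and port in this list accepts a TCP connection. The sketch below is a hypothetical helper, not part of GigaSpaces: the class name `LocatorProbe`, its method names, and the 3-second timeout are assumptions made for illustration. It only reads the `LOOKUPLOCATORS` variable shown above and tries a plain socket connect to each entry.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a GigaSpaces API): probes each host:port in a locator list. */
final class LocatorProbe {

    /** Splits "host1:port1,host2:port2" into host/port pairs without resolving DNS. */
    static List<InetSocketAddress> parseLocators(String locators) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : locators.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Missing port in: " + trimmed);
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }

    /** Returns true if a TCP connection can be opened within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // Resolves the host name here; DNS failures and timeouts are IOExceptions.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String locators = System.getenv("LOOKUPLOCATORS");
        if (locators == null) {
            System.err.println("LOOKUPLOCATORS is not set");
            return;
        }
        for (InetSocketAddress address : parseLocators(locators)) {
            boolean ok = canConnect(address.getHostString(), address.getPort(), 3000);
            System.out.println(address.getHostString() + ":" + address.getPort()
                    + (ok ? " reachable" : " NOT reachable"));
        }
    }
}
```

Running it on both servers shows whether each one can reach both locators. A successful connect only proves the port is open; it says nothing about the other ports the grid uses, such as the 4167 endpoint that appears in the stack trace further down.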

Sometimes the PUs on the remote server come up only partially, or not at all.

Ultimately the application throws a network interruption exception.

Error from logs copied below:

2019-02-07 10:17:25,822 INFO [org.openspaces.pu.container.servicegrid.deploy.Deploy] - Waiting indefinitely for [8] processing unit instances to be deployed...
2019-02-07 10:17:59,718 INFO [org.openspaces.pu.container.servicegrid.deploy.Deploy] - [Space.3] [1] deployed successfully on [10.8.26.239]
2019-02-07 10:17:59,938 INFO [org.openspaces.pu.container.servicegrid.deploy.Deploy] - [Space.1] [1] deployed successfully on [10.8.26.239]
2019-02-07 10:18:05,052 INFO [org.openspaces.pu.container.servicegrid.deploy.Deploy] - [Space.4] [2] deployed successfully on [10.8.26.239]
2019-02-07 10:18:05,561 INFO [org.openspaces.pu.container.servicegrid.deploy.Deploy] - [Space.2] [2] deployed successfully on [10.8.26.239]
2019-02-07 10:19:23,424 INFO [org.openspaces.pu.container.servicegrid.deploy.Deploy] - [Space.2] [1] deployed successfully on [10.40.26.239]
2019-02-07 10:37:15,010 INFO [org.openspaces.pu.container.servicegrid.deploy.Deploy] - [Space.1] [2] deployed successfully on [10.40.26.239]
2019-02-07 10:45:23,466 INFO [org.openspaces.pu.container.servicegrid.deploy.Deploy] - [Space.3] [2] failed to deploy, resubmitted [true]
2019-02-07 11:30:19,599 INFO [org.openspaces.pu.container.servicegrid.deploy.Deploy] - [Space.3] [2] failed to deploy, resubmitted [true]
2019-02-07 11:30:19,793 INFO [org.openspaces.pu.container.servicegrid.deploy.Deploy] - Finished deploying [8] processing unit instances
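The timestamps in this log tell part of the story. The four instances on 10.8.26.239 deploy within roughly 40 seconds of the first line, while the two successful ones on 10.40.26.239 take about 2 and 20 minutes, and [Space.3] [2] fails twice. A small parser can make that gap visible on longer logs. This is a minimal sketch that assumes the exact line format shown above; the class name `DeployLogSummary` and its members are hypothetical and are not part of OpenSpaces.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts per-instance deploy events from the Deploy log shown above. */
final class DeployLogSummary {

    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    // Matches "<timestamp> INFO [logger] - [PU] [instance] deployed successfully on [host]"
    // and "<timestamp> INFO [logger] - [PU] [instance] failed to deploy".
    private static final Pattern EVENT = Pattern.compile(
            "(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}) INFO \\[[^\\]]+\\] - "
            + "\\[([^\\]]+)\\] \\[(\\d+)\\] "
            + "(?:deployed successfully on \\[([^\\]]+)\\]|failed to deploy)");

    static final class Event {
        final LocalDateTime time;
        final String instance;
        final String host; // null when the line reports a failed deployment

        Event(LocalDateTime time, String instance, String host) {
            this.time = time;
            this.instance = instance;
            this.host = host;
        }
    }

    /** Finds every deploy event in the text, whether or not it is split into lines. */
    static List<Event> parse(String log) {
        List<Event> events = new ArrayList<>();
        Matcher m = EVENT.matcher(log);
        while (m.find()) {
            LocalDateTime time = LocalDateTime.parse(m.group(1), TIMESTAMP);
            String instance = m.group(2) + " [" + m.group(3) + "]";
            events.add(new Event(time, instance, m.group(4)));
        }
        return events;
    }

    /** Whole minutes elapsed between two events. */
    static long minutesBetween(Event earlier, Event later) {
        return Duration.between(earlier.time, later.time).toMinutes();
    }

    public static void main(String[] args) {
        // Reads the whole log from standard input.
        Scanner in = new Scanner(System.in, "UTF-8").useDelimiter("\\A");
        List<Event> events = parse(in.hasNext() ? in.next() : "");
        for (Event e : events) {
            String outcome = (e.host == null) ? "FAILED" : "on " + e.host;
            System.out.println("+" + minutesBetween(events.get(0), e) + " min  "
                    + e.instance + "  " + outcome);
        }
    }
}
```

The offsets are measured from the first deploy event found, not from the "Waiting indefinitely" line, which the pattern deliberately skips.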

Caused by: java.lang.InterruptedException: Thread was interrupted while waiting on the network
    at com.gigaspaces.lrmi.MethodCachedInvocationHandler.invoke(MethodCachedInvocationHandler.java:88)
    ... 16 more
Caused by: java.rmi.ConnectException: LRMI transport protocol over NIO broken connection with ServerEndPoint: [NIO://newserver2.itginc.com:4167/pid[26907]/500292502777314_3_4896795498233689600_details[class org.openspaces.pu.container.servicegrid.PUServiceBeanImpl]]; nested exception is:
    java.nio.channels.ClosedByInterruptException
    at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:834)
    at com.gigaspaces.lrmi.ConnPoolInvocationHandler.invoke(ConnPoolInvocationHandler.java:75)
    at com.gigaspaces.lrmi.MethodCachedInvocationHandler.invoke(MethodCachedInvocationHandler.java:71)
    ... 16 more
Caused by: java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:417)
    at com.gigaspaces.lrmi.nio.Reader.readBytesFromChannelBlocking(Reader.java:248)
    at com.gigaspaces.lrmi.nio.Reader.readBytesBlocking(Reader.java:671)
    at com.gigaspaces.lrmi.nio.Reader.bytesToPacket(Reader.java:590)
    at com.gigaspaces.lrmi.nio.Reader.readReply(Reader.java:159)
    at com.gigaspaces.lrmi.nio.CPeer.invoke(CPeer.java:769)
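The innermost cause, `java.nio.channels.ClosedByInterruptException`, is what the JDK throws when a thread is interrupted while it is blocked in an I/O operation on a channel. That suggests the calling thread was interrupted while waiting for a reply from newserver2, rather than the socket being reset by the network. What interrupted it (a timeout, a cancellation, a shutdown) cannot be read from this trace. The sketch below reproduces the same exception with plain JDK classes on the loopback interface. It does not use any GigaSpaces code, and the class name `InterruptedReadDemo` is made up for illustration.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.atomic.AtomicReference;

/** Illustration only: shows when the JDK raises ClosedByInterruptException. */
final class InterruptedReadDemo {

    /**
     * Blocks a reader thread on a connection that never receives data,
     * interrupts that thread, and returns whatever the reader threw.
     */
    static Throwable interruptBlockedRead() {
        AtomicReference<Throwable> thrown = new AtomicReference<>();
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(loopback, 0)); // any free port
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

            Thread reader = new Thread(() -> {
                try (SocketChannel client =
                             SocketChannel.open(new InetSocketAddress(loopback, port))) {
                    // Blocks here: the server side never writes a reply.
                    client.read(ByteBuffer.allocate(16));
                } catch (Throwable t) {
                    thrown.set(t);
                }
            });
            reader.start();

            try (SocketChannel accepted = server.accept()) {
                Thread.sleep(200);   // give the reader time to block in read()
                reader.interrupt();  // closes the channel and wakes the reader
                reader.join(5000);
            }
        } catch (Exception e) {
            throw new IllegalStateException("Demo setup failed", e);
        }
        return thrown.get();
    }

    public static void main(String[] args) {
        System.out.println(interruptBlockedRead());
    }
}
```

In the demo the connection itself is healthy the whole time; the exception appears only because the reading thread is interrupted. For the deployment above, that makes it worth finding out what interrupted the caller and why the reply from the remote side took so long.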

With the help of our system administration team, we verified that the network settings on both new Linux boxes are correct.

asked 2019-02-13 10:31:06 -0600

Gigaspace_user

1 Answer

Hi, please attach the GSM logs and the relevant GSC logs. The GSM log shows where it tried to allocate the PU, and we want to see the log of the GSC where the allocation failed.

Regards, Ester.

answered 2019-02-17 01:25:00 -0600


