Distributed Transaction Problems (Deadlock?)

Two questions here really; any help would be much appreciated.

#1 I have a clustered space, which uses the embedded 'Distributed Jini Transaction Manager' to perform transactions over the partitions.

If my clients (we're talking hundreds here, btw) want to execute transactions, should they also declare a distributed manager in Spring, or use the lookup manager instead? (Or would you suggest doing all transactions remotely using remoting and not letting the clients use transactions directly? I'm thinking of performance here.)

The wiki seems to indicate that I can use the lookup manager to get hold of one of the distributed managers, but I get the following exception:

org.springframework.transaction.TransactionSystemException: Failed to find Jini transaction manager using groups [ALL], locators [null] timeout [5000] and name [null]

#2 This one is by far the odder of the two: I appear to be getting a deadlock. I'm using programmatic transactions on the distributed transaction manager as follows:

transactionTemplate = new TransactionTemplate(ptm); ... transactionTemplate.execute(...);

That works fine: I can see the transactions appear in the gs-ui space browser with no problem (I am of course manually rolling back where needed; commit seems automatic).
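For readers unfamiliar with the callback style used above, here is a minimal, stdlib-only sketch of the pattern. It relies on the documented behaviour of Spring's `TransactionTemplate.execute`: a normal return commits unless the callback marked the transaction rollback-only, and an unchecked exception rolls back. `SketchStatus`, `SketchTemplate` and `TemplateSketchDemo` are hypothetical stand-ins invented for this illustration. They are not Spring or OpenSpaces classes, and they only record what would happen rather than talking to a transaction manager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for Spring's TransactionStatus: it only carries the rollback-only flag.
class SketchStatus {
    private boolean rollbackOnly;

    void setRollbackOnly() { rollbackOnly = true; }

    boolean isRollbackOnly() { return rollbackOnly; }
}

// Hypothetical stand-in for TransactionTemplate: it logs the outcome instead of
// calling a real transaction manager.
class SketchTemplate {
    final List<String> log = new ArrayList<>();

    <T> T execute(Function<SketchStatus, T> action) {
        SketchStatus status = new SketchStatus();
        log.add("begin");
        T result;
        try {
            result = action.apply(status);
        } catch (RuntimeException e) {
            log.add("rollback"); // an unchecked exception rolls back and propagates
            throw e;
        }
        // Normal return: commit unless the callback asked for a rollback.
        log.add(status.isRollbackOnly() ? "rollback" : "commit");
        return result;
    }
}

class TemplateSketchDemo {
    public static void main(String[] args) {
        SketchTemplate template = new SketchTemplate();
        // Callback returns normally: the template commits on its own.
        template.execute(status -> "ok");
        // Callback flags the transaction: the template rolls back instead.
        template.execute(status -> {
            status.setRollbackOnly();
            return "rejected";
        });
        System.out.println(template.log);
    }
}
```

This is why the commit looks automatic from the caller's side: the only explicit step in the callback is the rollback-only flag.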

But when I execute two near-identical transactions very close to each other in time, they both just sit there and never commit or roll back. I'm assuming they're both trying to lock the same objects and getting stuck somehow - crazy, I know. If I add a small delay before the second one, they both execute fine.

Rather worryingly, they don't even appear to time out with the timeout values I set on the transaction managers.

Now, looking at the wiki, perhaps I shouldn't be using the TransactionTemplate but DefaultTransactionDefinition instead?

Update:

If I wait a minute or two, the transactions do finally abort with this exception. Also, changing to the advised 'DefaultTransactionDefinition' method produces the same results :(

org.openspaces.core.transaction.manager.AbstractJiniTransactionManager$1: unexpected exception ; nested exception is net.jini.core.transaction.CannotAbortException
    at org.openspaces.core.transaction.manager.AbstractJiniTransactionManager.convertJiniException(AbstractJiniTransactionManager.java:396)
    at org.openspaces.core.transaction.manager.AbstractJiniTransactionManager.doRollback(AbstractJiniTransactionManager.java:313)
    at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:800)
    at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:777)
    at alecti.common.TransactionalClusteredSpaceAwareBean.executeTransaction(TransactionalClusteredSpaceAwareBean.java:40)
    at alecti.service.order.OrderService.processRequest(OrderService.java:414)
    at alecti.service.order.OrderService.onNotify(OrderService.java:399)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    ...
Caused by: net.jini.core.transaction.CannotAbortException
    at com.sun.jini.mahalo.TxnManagerImpl.abort(TxnManagerImpl.java:952)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at com.gigaspaces.lrmi.DynamicSmartStub._invoke(DynamicSmartStub.java:214)
    at com.gigaspaces.lrmi.DynamicSmartStub.invokeDirect(DynamicSmartStub.java:296)
    at com.gigaspaces.lrmi.DynamicSmartStub.invoke(DynamicSmartStub.java:372)
    at $Proxy62.abort(Unknown Source)
    at com.sun.jini.mahalo.TxnMgrProxy.abort(TxnMgrProxy.java:144)
    at net.jini.core.transaction.server.ServerTransaction.abort(ServerTransaction.java:142)
    at org.openspaces.core.transaction.manager.AbstractJiniTransactionManager.doRollback(AbstractJiniTransactionManager.java:308)
    ... 27 more

Edited by: Andrew Parry on Jan 22, 2009 12:19 PM

h4. Attachments

[DistTxTestCase-Space.zip|/upfiles/13759712743493355.zip]

[gs-ui.jpeg|/upfiles/13759712749255555.jpeg]

[space-console.JPG|/upfiles/13759712743884556.jpeg]

{quote}This thread was imported from the previous forum. For your reference, the original is [available here|http://forum.openspaces.org/thread.jspa?threadID=2821]{quote}

asked 2009-01-22 06:07:01 -0600 by aparry

updated 2013-08-08 09:52:00 -0600 by jaissefsfex

1 Answer

1. Make sure you set the locators or groups when configuring the
'Distributed Jini Transaction Manager'. These should match the space
settings. In general we would recommend using a remote transaction
for relatively small transactions.
2. Can you post a test case?

Shay

answered 2009-01-22 06:35:19 -0600 by shay hassidim

Comments

Thanks.

Will give that a go for the lookup.

Here's a simple test case space for the second problem, source and compiled project included.

If you just deploy it as normal with 2xGSC running, the embedded service will kick in on partition 1 after 30 seconds, write 4 objects to the cluster, then attempt to execute 2 near identical transactions (both read and update the same object) by using 2 threads, resulting in about a minute of deadlock before the exception I mentioned before. The threads are just to simulate 2 notify event threads executing similar logic at the same time - which is what my real project does.

Also attached are a screenshot of the gs-ui during the deadlock and the console output from afterwards.

Thanks

h4. Attachments

[DistTxTestCase-Space.zip|/upfiles/1375971275566050.zip]

[gs-ui.jpeg|/upfiles/1375971275179520.jpeg]

[space-console.JPG|/upfiles/1375971275415030.jpeg]

aparry ( 2009-01-22 08:47:25 -0600 )

Ahh, thanks Guy. That's very interesting. I'd not thought about this; I had forgotten about the exclusive flag, and that reads inside a transaction are not exclusive.

I think this solves my problem: replacing the normal reads on the objects I intend to update with exclusive reads does seem to work well. I guess I just have to take into account these reads timing out, which is no big deal, and, more importantly, track down all the places I need to update in my real app.
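To illustrate the explanation accepted above, here is a minimal single-threaded sketch of the locking model it implies. It assumes that a plain read inside a transaction takes a shared lock, while an update or an exclusive read needs every other transaction to be off the object. `SketchEntryLock` and `LockSketchDemo` are hypothetical names invented for this illustration, not GigaSpaces classes, and a `false` return stands in for "the real call would block until it times out".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the lock on one space object; not GigaSpaces code.
class SketchEntryLock {
    private final Set<String> sharedHolders = new HashSet<>();
    private String exclusiveHolder; // null when no transaction holds the exclusive lock

    // True if a transaction other than tx holds a lock of either kind.
    private boolean heldByOther(String tx) {
        if (exclusiveHolder != null && !exclusiveHolder.equals(tx)) return true;
        for (String holder : sharedHolders) {
            if (!holder.equals(tx)) return true;
        }
        return false;
    }

    // Plain transactional read: shared lock, compatible with other readers.
    boolean tryShared(String tx) {
        if (exclusiveHolder != null && !exclusiveHolder.equals(tx)) return false;
        sharedHolders.add(tx);
        return true;
    }

    // Exclusive read or update: refused while any other transaction holds a lock.
    boolean tryExclusive(String tx) {
        if (heldByOther(tx)) return false;
        exclusiveHolder = tx;
        return true;
    }

    // Commit or abort: drop everything tx holds.
    void release(String tx) {
        sharedHolders.remove(tx);
        if (tx.equals(exclusiveHolder)) exclusiveHolder = null;
    }
}

class LockSketchDemo {
    // Both transactions read with a shared lock, then both try to update.
    static boolean[] sharedReadThenUpdate() {
        SketchEntryLock lock = new SketchEntryLock();
        lock.tryShared("tx1");
        lock.tryShared("tx2");
        boolean tx1Updated = lock.tryExclusive("tx1"); // refused: tx2 holds a shared lock
        boolean tx2Updated = lock.tryExclusive("tx2"); // refused: tx1 holds a shared lock
        return new boolean[] { tx1Updated, tx2Updated };
    }

    // Both transactions read with an exclusive lock, so they run one after the other.
    static boolean[] exclusiveReadThenUpdate() {
        SketchEntryLock lock = new SketchEntryLock();
        boolean tx1Read = lock.tryExclusive("tx1");
        boolean tx2Waits = !lock.tryExclusive("tx2"); // tx2 has to wait for tx1
        lock.release("tx1");                          // tx1 updates and commits
        boolean tx2Read = lock.tryExclusive("tx2");   // now tx2 gets its turn
        return new boolean[] { tx1Read && tx2Waits, tx2Read };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sharedReadThenUpdate()));
        System.out.println(Arrays.toString(exclusiveReadThenUpdate()));
    }
}
```

In the first scenario neither transaction can upgrade, which matches the symptom of both sitting there until they abort. In the second, the later transaction simply waits at its read, which is why the exclusive reads need a sensible timeout.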

Looking at this page

http://www.gigaspaces.com/wiki/displa...

null-transaction reads are unaffected by the exclusive lock, so it shouldn't affect the rest of my app, which doesn't mind these potentially dirty objects whilst a transaction is occurring.

Shay:

I still can't get <os-core:jini-tx-manager id="transactionManager" lookup-timeout="10000"/>

to work from the client where embedded space has

<os-core:distributed-tx-manager id="transactionManager" default-timeout="1000"/>

I'm not using any special group or locators etc. so I'd assume it should be able to find it? Unless I have this wrong and everything has to use jini-tx-manager and I run mahalo separately?

aparry ( 2009-01-22 11:23:30 -0600 )
