Local Cache and memory

When writing a typical application that reads from and writes to a database, it's normal to implement a local cache. Obviously, one of the major reasons is to avoid the performance hit of the round trip to the database. Another major benefit in some kinds of applications is memory: in a multiuser, read-heavy scenario where many users retrieve the same data, a local cache saves a great deal of memory because the cached object used by each user is the "same" object in every sense of the word. It's the same object in memory, not just a clone with the same data. I'm having difficulty achieving that same benefit with GigaSpaces' local cache (master-local space).

Here's the setup:

<os-core:space id="remoteSpace" url="jini://*/*/gvDataSpace"/>
<os-core:local-cache id="cacheSpace" space="remoteSpace" update-mode="PULL"/>
<os-core:local-tx-manager id="transactionManager" space="cacheSpace"/>
<os-core:giga-space id="gigaspace" space="cacheSpace" tx-manager="transactionManager"/>

I can create a simple object and save it to the space; however, each time I retrieve that object I get a different copy of it occupying a different memory location. I'd like to find out whether I'm doing something wrong, or whether the technology simply operates on different principles than I'm used to and I shouldn't expect this particular benefit.

This thread was imported from the previous forum. For your reference, the original is available at http://forum.openspaces.org/thread.jspa?threadID=2456

asked 2008-07-15 20:47:39 -0500

mwelch

updated 2013-08-08 09:52:00 -0500

jaissefsfex

1 Answer

GigaSpaces supports several types of IMDG topologies.

One of the options, called master-local, is optimized for a very specific scenario: the application performs repeated reads of the same objects, and most of the operations it conducts (at least 80%) are single read operations.
When this option is used with the space API, a full-blown space runs within the same memory address space as the client application. This client-side space acts as a cache. The hub-and-spoke architecture is designed to keep the client application from having to access the remote space, which holds all the data, by holding a subset of the data on the client side. Data is loaded on demand into the client's local cache by read operations, whenever there is a local cache miss.

Updates and removals performed by other clients are propagated to the client's local cache so that the local copy is updated or removed as well.
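To make the read path described above concrete, here is a minimal sketch in plain Java. It is not GigaSpaces code: the class ReadThroughCache, its methods, and the Function that stands in for a remote space read are all hypothetical names chosen for illustration. It only shows the general pattern of serving reads locally, going to the master on a miss, and applying updates and removals that arrive from elsewhere.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration of a read-through local cache; not the GigaSpaces implementation.
class ReadThroughCache<K, V> {
    private final Map<K, V> local = new ConcurrentHashMap<>();
    private final Function<K, V> remoteRead; // stands in for a read against the master space
    private int remoteReads = 0;

    ReadThroughCache(Function<K, V> remoteRead) {
        this.remoteRead = remoteRead;
    }

    // Serve from the local copy; go to the master only on a local miss.
    synchronized V read(K key) {
        V value = local.get(key);
        if (value == null) {
            value = remoteRead.apply(key);
            remoteReads++;
            if (value != null) {
                local.put(key, value);
            }
        }
        return value;
    }

    // Called when another client updated the master copy: refresh only entries we hold.
    synchronized void onRemoteUpdate(K key, V newValue) {
        local.computeIfPresent(key, (k, old) -> newValue);
    }

    // Called when another client removed the entry from the master.
    synchronized void onRemoteRemove(K key) {
        local.remove(key);
    }

    synchronized int remoteReads() {
        return remoteReads;
    }

    public static void main(String[] args) {
        Map<String, String> master = new ConcurrentHashMap<>();
        master.put("user:1", "Alice");
        ReadThroughCache<String, String> cache = new ReadThroughCache<>(master::get);
        cache.read("user:1"); // miss: loaded from the master
        cache.read("user:1"); // hit: served locally
        System.out.println("remote reads: " + cache.remoteReads());
    }
}
```

The more often the same keys are read, the fewer remote reads are needed, which is why this kind of topology pays off mainly for repeated reads.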

Every read from the local cache (when a matching object exists) returns an object with the relevant data. It does not return the same object reference, since the local space is not a simple hash map. The object handed to the client is materialized from a universal, proprietary GigaSpaces structure that is translated into the expected object at runtime. So you get the correct data each time, but through a different reference. This capability is the cornerstone of GigaSpaces' interoperability technology: it allows the same space to serve Java, C++, and .NET applications and lets them share the same objects.
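As a rough illustration of why materialization yields a new reference on every read, here is a small plain-Java sketch. The internal GigaSpaces structure is proprietary and is not shown here; NeutralStore and Quote are hypothetical names, and the map of field names to values is only an assumed stand-in for a language-neutral entry format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: entries are kept as field name -> value, not as the caller's Java object.
class NeutralStore {
    private final Map<String, Map<String, Object>> entries = new LinkedHashMap<>();

    // Break the object down into its fields when it is written.
    void write(String id, Quote q) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("symbol", q.symbol);
        fields.put("price", q.price);
        entries.put(id, fields);
    }

    // Each read builds a fresh Quote from the stored fields.
    Quote read(String id) {
        Map<String, Object> fields = entries.get(id);
        if (fields == null) {
            return null;
        }
        return new Quote((String) fields.get("symbol"), (Double) fields.get("price"));
    }

    public static void main(String[] args) {
        NeutralStore store = new NeutralStore();
        store.write("q1", new Quote("ACME", 10.5));
        Quote first = store.read("q1");
        Quote second = store.read("q1");
        System.out.println("same data: " + (first.symbol.equals(second.symbol) && first.price == second.price));
        System.out.println("same reference: " + (first == second));
    }
}

class Quote {
    final String symbol;
    final double price;

    Quote(String symbol, double price) {
        this.symbol = symbol;
        this.price = price;
    }
}
```

Two reads return equal data in two distinct objects, which matches the behavior described in the question.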

This works differently with the Map API local cache, which is a special type of hash map that does return the same reference each time.
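For contrast, a cache that stores the object itself under a key hands back the very same reference on every get. The sketch below is plain Java with a hypothetical ReferenceCache class; it is not the GigaSpaces Map API, only the general hash-map behavior being described.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a key/value cache that keeps the object itself.
class ReferenceCache<V> {
    private final Map<String, V> byKey = new ConcurrentHashMap<>();

    void put(String key, V value) {
        byKey.put(key, value);
    }

    // Returns the stored instance, not a copy.
    V get(String key) {
        return byKey.get(key);
    }

    public static void main(String[] args) {
        ReferenceCache<StringBuilder> cache = new ReferenceCache<>();
        cache.put("report", new StringBuilder("shared data"));
        StringBuilder first = cache.get("report");
        StringBuilder second = cache.get("report");
        System.out.println("same reference: " + (first == second));
    }
}
```

The trade-off is that every caller shares one mutable instance, so a change made by one user is visible to all of them.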

Please note that the local cache is updated via notifications, which are asynchronous by nature, so it is possible to read a stale object from the local cache. Still, an update based on a stale object will fail, since updates are conducted in optimistic locking mode.
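The idea behind optimistic locking can be sketched with a version number per entry: an update is accepted only when the caller's version matches the current one. This is plain Java under assumed names (VersionedStore, Entry, update); the real GigaSpaces versioning API and exception types are not shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of version-based optimistic locking.
class VersionedStore {
    static final class Entry {
        final String value;
        final int version;

        Entry(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    synchronized Entry read(String key) {
        return entries.get(key);
    }

    // First write of a key starts at version 1.
    synchronized Entry write(String key, String value) {
        Entry entry = new Entry(value, 1);
        entries.put(key, entry);
        return entry;
    }

    // Accept the update only if the caller saw the latest version.
    synchronized Entry update(String key, String newValue, int expectedVersion) {
        Entry current = entries.get(key);
        if (current == null || current.version != expectedVersion) {
            throw new IllegalStateException("stale version for key " + key);
        }
        Entry next = new Entry(newValue, current.version + 1);
        entries.put(key, next);
        return next;
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        Entry staleCopy = store.write("k", "a");      // a client caches version 1
        store.update("k", "b", staleCopy.version);    // another client moves it to version 2
        try {
            store.update("k", "c", staleCopy.version); // update based on the stale copy
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A client that hits this failure would typically re-read the entry and retry its change against the fresh version.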

If your application has a write/read mix of about 50%/50% without repeated reads, master-local is not the recommended topology. In that case you should colocate your business logic with the (master) space(s) or perform remote calls against the space. Remote calls will be slower than a colocated space, but much faster than database calls.

Shay

answered 2008-07-15 22:22:49 -0500

shay hassidim

Comments

Once again, thanks for the help Shay. I don't know how you manage to answer questions on these forums so quickly. I'm sure I've been a pest these last few weeks.

Unfortunately, in my app, memory savings are only the initial benefit of shared references. The fewer garbage collections that those memory savings lead to are just as useful. Let's say that I have 10,000 users hitting the system and 90% of them are viewing roughly the same data. If they each get their own individual references to that data, I will have large and frequent garbage collections. The GC algorithms have been massively improved in the past few years, but I still have not been on a single project where they didn't have to be tuned manually. If all of the users accessed the same data objects instead of each having their own copy, the number of collections needed would be reduced dramatically.

So if I'm understanding correctly, I can use the GigaSpace API and have its flexibility and usability, but each call to retrieve an object will create a new copy of that object, which is necessary to support GigaSpaces' cross-language interoperability feature. The alternative is to use the Map API (what about GigaMap?), where I will lose a lot of flexibility and usability (only able to access one item at a time by key; no queries; no templates), but when I retrieve an object by key I will get the same object reference as long as it's in the cache.

mwelch (2008-07-16 10:14:00 -0500)

The GigaMap is the Map API I'm talking about. It wraps the IMap interface. If you need simple key/value-based data access, it provides very fast response times for get operations, both with a local cache and with a colocated space.

Regarding the way we construct the returned object with the space API: I understand your concern. Still, let's remember the following:
- GigaSpaces 6.5 comes with well-tuned settings for the GSC. Take a look at them in GigaSpaces Root\bin\setenv. You might want to apply them on the client side. The GC settings should help a great deal here.
- If you are talking about a web-based application, we support caching HTTP session data and running the web server within the GSC. This is a big plus. These will be part of the 6.6 release.
- On a multi-core machine with a properly tuned JVM GC, you will hardly notice the GC cleanup.
- We do not “copy” the object when materializing it. Primitive fields are placed into the newly created object. Non-primitive fields (collections, references to other objects…) have their references placed. So it is a shallow copy, not a deep copy.
- We have conducted benchmarks using the Sun Real-Time JVM. The results are simply amazing. It runs slower than the regular JVM, but the GC impact is simply zero. If the GC behavior makes you nervous, you should take a look at this JVM.
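To illustrate the shallow versus deep copy distinction mentioned above, here is a small plain-Java sketch. Portfolio and shallowCopy are hypothetical names; this only shows the general meaning of a shallow copy (new outer object, shared non-primitive fields) and is not a statement about how the space builds objects internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a shallow copy.
class Portfolio {
    final int ownerId;          // primitive: the value is copied into the new object
    final List<String> symbols; // reference: the same list instance is shared

    Portfolio(int ownerId, List<String> symbols) {
        this.ownerId = ownerId;
        this.symbols = symbols;
    }

    // New Portfolio instance, but the symbols list itself is not duplicated.
    Portfolio shallowCopy() {
        return new Portfolio(ownerId, symbols);
    }

    public static void main(String[] args) {
        Portfolio original = new Portfolio(7, new ArrayList<>(Arrays.asList("ACME")));
        Portfolio copy = original.shallowCopy();
        System.out.println("distinct outer objects: " + (original != copy));
        System.out.println("shared list: " + (original.symbols == copy.symbols));
    }
}
```

With a shallow copy, the per-read allocation is limited to the outer object, so the extra garbage is smaller than a full deep copy of the object graph would produce.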

You are not bothering us at all. It is our commitment to the community to make you guys successful.

Shay

Edited by: Shay Hassidim on Jul 16, 2008 11:58 AM

shay hassidim (2008-07-16 11:53:58 -0500)
