Memory cost of storing an Object in the space

Hi,

I would like to find out what the GigaSpaces memory overhead is when I store an object in the space.

I have seen from unit testing that when I add a number of objects (myObjects) to the space, GigaSpaces seems to associate the following classes with each one, at a cost of 76-77 bytes per myObject:
- com.j_spaces.kernel.list.StoredListChainSegment$ConcurrentSLObjectInfo
- com.j_spaces.core.cache.PEntry
- com.gigaspaces.internal.server.storage.EntryHolder
- com.gigaspaces.internal.server.storage.FlatEntryData

The objects I wish to add to the space are around 300 bytes each. I will be adding millions of them, and at that scale GigaSpaces' 76-77 bytes of overhead per object adds up to quite a lot.

Is this expected? Is there a way to decrease the size or number of these associated GigaSpaces objects?

Regards and thanks, D

{quote}This thread was imported from the previous forum. For your reference, the original is [available here|http://forum.openspaces.org/thread.jspa?threadID=3354]{quote}

asked 2010-01-28 08:07:07 -0500 by dlehane

updated 2013-08-08 09:52:00 -0500 by jaissefsfex

1 Answer


Hi,

There's no way to reduce that overhead directly in the product.

A possible solution is to store several user objects in the same space entry.
For example:
Let's assume the primary key of the user object is an integer.
The space class will have two properties:
1. an integer which is the space id.
2. An array of 10 user objects.

Let's say you need to read object #163 from the space.
1. Invoke space.readById(16) (163 / 10 = 16).
2. Get item #3 from the array in the retrieved entry (163 % 10 = 3).
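
To make the mapping concrete, here is a minimal sketch of such a bucket entry plus a client-side read helper. The names (ObjectBucket, MyObject, BUCKET_SIZE, readUserObject) are hypothetical, and it assumes the standard GigaSpaces POJO annotations and the GigaSpace.readById API:

{code}
import com.gigaspaces.annotation.pojo.SpaceClass;
import com.gigaspaces.annotation.pojo.SpaceId;
import org.openspaces.core.GigaSpace;

// Hypothetical "bucket" entry: one space entry holds BUCKET_SIZE user objects.
@SpaceClass
public class ObjectBucket {
    public static final int BUCKET_SIZE = 10;

    private Integer bucketId;     // space ID = userKey / BUCKET_SIZE
    private MyObject[] objects;   // item for userKey sits at index userKey % BUCKET_SIZE

    public ObjectBucket() {}      // required no-arg constructor

    @SpaceId(autoGenerate = false)
    public Integer getBucketId() { return bucketId; }
    public void setBucketId(Integer bucketId) { this.bucketId = bucketId; }

    public MyObject[] getObjects() { return objects; }
    public void setObjects(MyObject[] objects) { this.objects = objects; }
}

// Client-side helper: for user object #163 this reads bucket 16 and returns index 3.
class BucketReader {
    public MyObject readUserObject(GigaSpace space, int userKey) {
        ObjectBucket bucket =
            space.readById(ObjectBucket.class, userKey / ObjectBucket.BUCKET_SIZE);
        return bucket == null ? null : bucket.getObjects()[userKey % ObjectBucket.BUCKET_SIZE];
    }
}
{code}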

This is just an example; you can come up with different mapping strategies to map more or fewer objects per entry.
In this case we've cut the overhead to about a tenth of the original (however, you'll need to measure again because the array has a footprint of its own).

This approach has its drawbacks, naturally:
1. Network bandwidth - Operations require more network bandwidth, since a request for a single object returns a whole bucket of objects.
   This should not be a major issue, since the user objects are very small to begin with.
2. Concurrency - Concurrency is potentially degraded, since two objects in the same entry cannot be changed concurrently.
   Depending on the usage patterns of your application, this can be mitigated by distributing the objects across entries more intelligently.
3. Loss of functionality - Some space functionality will be less usable, e.g. readMultiple, SQL query, etc.
   Some of this can be implemented manually via executors (see the sketch below).
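
As an illustration of the executor route, here is a minimal sketch (hypothetical names again, assuming the OpenSpaces Task executor API): a collocated task reads the bucket on the server side and returns only the single requested object, which also addresses the bandwidth concern in point 1.

{code}
import com.gigaspaces.async.AsyncFuture;
import org.openspaces.core.GigaSpace;
import org.openspaces.core.executor.Task;
import org.openspaces.core.executor.TaskGigaSpace;

// Hypothetical task: reads the bucket collocated with the data and
// returns only the requested user object, not the whole array.
public class ReadSingleObjectTask implements Task<MyObject> {
    private static final long serialVersionUID = 1L;

    @TaskGigaSpace
    private transient GigaSpace space;   // collocated proxy injected on the server side

    private final int userKey;

    public ReadSingleObjectTask(int userKey) {
        this.userKey = userKey;
    }

    @Override
    public MyObject execute() throws Exception {
        ObjectBucket bucket =
            space.readById(ObjectBucket.class, userKey / ObjectBucket.BUCKET_SIZE);
        return bucket == null ? null : bucket.getObjects()[userKey % ObjectBucket.BUCKET_SIZE];
    }
}

// Usage from the client - route the task to the partition holding bucket 16:
// AsyncFuture<MyObject> future = gigaSpace.execute(new ReadSingleObjectTask(163), 16);
// MyObject obj = future.get();
{code}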

To conclude: Though this solution has some drawbacks, it might prove useful in some scenarios.
If you're planning to use a lot of complicated queries and/or modify the data, it's probably risky.
If most of your queries are trivial and most of the access is read, it might be useful.

Niv.

answered 2010-01-28 10:42:50 -0500 by niv

Comments

Donnacha, to add to Niv's fine response, you should also consider the operational footprint overhead the JVM needs when running the space or your business logic (when collocated with the space). Usually this might be around 30% of the consumed memory. You can reduce this by tuning garbage collection to run more frequently, but this will impact performance.

If the stored data is string based, you may consider compressing the data. If the data is stored within a user-defined object inside the space object, you might want to consider using the compressed serialization mode. This will obviously impact performance.
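
For the string-based case, a minimal sketch using plain java.util.zip (the class and method names are hypothetical; the product's compressed serialization mode is configured separately and is not shown here): the space object stores the large payload as a gzip-compressed byte[] property instead of a raw String.

{code}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: compress the string payload before writing it to the
// space, and decompress it after reading.
public final class PayloadCompressor {

    public static byte[] compress(String payload) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(bos);
        try {
            gzip.write(payload.getBytes(StandardCharsets.UTF_8));
        } finally {
            gzip.close();
        }
        return bos.toByteArray();
    }

    public static String decompress(byte[] compressed) throws IOException {
        GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed));
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.read(buffer)) != -1) {
                bos.write(buffer, 0, read);
            }
            return new String(bos.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            gzip.close();
        }
    }
}
{code}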

So the raw data footprint is in many cases not the only parameter to consider when performing your capacity planning and memory usage estimates. There are a few additional considerations. Please contact me offline for additional discussion on this important issue. I would be happy to assist.

shay at gigaspaces.com GigaSpaces Deputy CTO

shay hassidim ( 2010-01-28 13:49:49 -0500 )

Hi Niv/Shay,

Thanks for the responses and the ideas to reduce my objects' footprint. I am currently looking into ways to reduce this, both by taking your suggestions into account and by looking at the makeup of the objects we are adding.

I guess you've answered my question about the GigaSpaces cost of holding the objects. I was hoping this could be reduced a little, but alas that is not to be.

I will contact you again, Shay, once I have done a little more research on my side to see if our setup can be tweaked a bit.

Thanks for your help for now, Regards, Donnacha

dlehane ( 2010-01-29 10:08:56 -0500 )

Ordinarily one would also increase the capacity of the space by adding additional GSCs to a clustered space, although there are also synchronization issues between the GSCs to consider in this scenario.

jottinger ( 2010-02-02 07:54:46 -0500 )

We have a utility users can use to "expand" or "shrink" the capacity of a running clustered space. To expand a running space cluster's capacity, the utility spreads the existing partitions across more GSCs, making sure primary spaces are evenly distributed across all the machines. There is no synchronization issue with this process.

Shay

shay hassidim ( 2010-02-02 10:48:12 -0500 )

Heh. Well, Shay, what's the name of the utility? Can you paste an example of its use?

jottinger ( 2010-02-02 14:14:19 -0500 )
