
Is it possible to have LRU use the local hard disk instead of a database?

Is that possible? Or do we have to use an RDBMS when we want the data persisted?

Thanks, Dean

{quote}This thread was imported from the previous forum. For your reference, the original is [available here|http://forum.openspaces.org/thread.jspa?threadID=3569]{quote}

asked 2010-12-19 15:04:10 -0500 by deanhiller

updated 2013-08-08 09:52:00 -0500 by jaissefsfex

2 Answers


It can be done in several ways, and there are users running with this model.

Still, there are a few fundamental known issues with this architecture that you should be aware of:

- Your data on disk will not be highly available unless you use NAS (network-attached storage), which indirectly introduces network activity at the storage layer, or a database that supports high availability out of the box, which also means network calls.
- If you store your data in a file and a query results in a cache miss (assuming you are running with the LRU cache policy), there is no quick way to locate the data in the file without scanning it. For most GigaSpaces users this is unacceptable.
- The machine running the data grid usually performs intensive processing and consumes a relatively large amount of CPU resources. Making it handle persistence as well will increase CPU utilization. The Mirror allows you to offload this activity to a dedicated server, preferably running on the database machine.
- After an elastic scaling event, or a manual move of logical partitions between containers, a partition that was moved to a new location will no longer be able to fetch its data, because that data sits in a file on another machine. Using a database as the disk storage medium avoids this problem.
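To make the cache-miss point concrete, here is a minimal sketch in Python. It is not GigaSpaces code and does not use any GigaSpaces API: the class `FileBackedLruCache`, the file name `space.log`, and the JSON-lines record format are all hypothetical, chosen only to show what happens when an LRU cache falls back to an unindexed local file.

```python
import json
import os
import tempfile
from collections import OrderedDict


class FileBackedLruCache:
    """Hypothetical LRU cache that appends every write to a local, unindexed file."""

    def __init__(self, path, capacity):
        self.path = path
        self.capacity = capacity
        self.entries = OrderedDict()  # in-memory portion, least recently used first
        self.lines_scanned = 0        # file lines read while serving cache misses

    def write(self, key, value):
        # Append the record to the file; there is no index, only a log of records.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry

    def read(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: without an index, every record in the file must be read.
        found = None
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                self.lines_scanned += 1
                record = json.loads(line)
                if record["key"] == key:
                    found = record["value"]  # later records override earlier ones
        return found


def demo():
    path = os.path.join(tempfile.mkdtemp(), "space.log")  # hypothetical file name
    cache = FileBackedLruCache(path, capacity=2)
    for i in range(1000):
        cache.write(f"k{i}", i)
    hit = cache.read("k999")             # still in memory, no file access
    scanned_after_hit = cache.lines_scanned
    miss = cache.read("k0")              # evicted long ago, so the file is scanned
    return hit, scanned_after_hit, miss, cache.lines_scanned
```

In this sketch a hit never touches the file, while a single miss reads every record ever written. A database (or any indexed store) avoids that scan, which is the trade-off described in the second point above.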

GigaSpaces is a mission-critical, low-latency transaction processing grid technology. Its users usually can't afford any data loss when relying on the space as their primary data source (system of record). This means data both in memory and on file must be highly available to survive system failures, especially when running with the LRU cache policy, where the space holds only some of the data in memory.

Take Hadoop, for example: its distributed file system, which is based on local file storage across the nodes, was not designed for low-latency online transaction processing or mission-critical systems. It can afford to lose a node and will continue to operate, assuming eventual consistency or accepting partial results. This is a very different operational model from the GigaSpaces In-Memory Data Grid. That's why you will find Hadoop used mostly with search engine systems (this is how it started, when Doug Cutting created it to support distribution of the Nutch search engine), ad rendering, log file processing, and other non-mission-critical content processing in batch mode.


answered 2010-12-19 16:53:05 -0500 by shay hassidim

When I say local hard disk, I do mean local to the data grid node/container, so there is no network traffic involved. It just writes out to the disk on the local node. Thanks, Dean

answered 2010-12-19 15:04:56 -0500 by deanhiller
