having the right processing units read the right accounts from ext. db

I have a database I am going to pull data in from, and when I do that, I would like 2 nodes to load 100 accounts between them, with each node loading 50. Naturally, I would like the logic that loads the first 50 accounts to run on the node where it will most likely put them, and the logic that loads the other 50 to run close to the data it will write (especially since, when we run really large data sizes in the terabytes, we don't want to generate extra network traffic between nodes).

So, how do you deploy logic with this knowledge?

Also, I am trying to wrap my head around Map/Reduce in GigaSpaces. What is the Map logic...is that the

Some additional questions I have as well
1. The GUI is very memory-centric, and we will 100% scale to 21 TB when done (and 100 TB 8 years after that, based on our data storage trends). We will naturally have 512m to 1024m of memory and will be letting the rest go to disk. How do we configure where on disk the data is put? I.e., we would like to store it in a /data volume separate from other volumes.
2. The GUI always says 1.3g out of 1.9g for memory, but we want the hard disk storage numbers. Is that the same number that applies when data rolls out to disk once you go over, or not?
3. How do I start the client from Windows and tell it the IP of one of the nodes so it can connect? (Multicast won't work between my network and the lab network, and none of the lab OSes have a GUI, so I need to launch from Windows and give it the IP of one node, and from there it should be able to connect to all nodes.)
4. The APIs have the below in terms of memory, but how do I write this in terms of hard disk space?

new ElasticDataGridDeployment("mygrid")
        .capacity("1g", "10g")
And
new ElasticDataGridDeployment("mygrid")
.initialJavaHeapSize("512m").maximumJavaHeapSize("512m")

thanks,
Dean

This thread was imported from the previous forum.

asked 2010-12-17 10:20:47 -0500 by deanhiller

updated 2013-08-08 09:52:00 -0500 by jaissefsfex

3 Answers

deanhiller wrote:
> how do you deploy logic with this knowledge?
See:
http://www.gigaspaces.com/wiki/display/XAP71/ExternalDataSourceInitialLoad

deanhiller wrote:
> I am trying to wrap my head around Map/Reduce into gigaspaces. What is the Map logic?
See:
http://www.gigaspaces.com/wiki/display/XAP71/TaskExecutionovertheSpace

See a simple example:
http://www.gigaspaces.com/wiki/display/SBP/Map-ReducePattern-ExecutorsExample
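
To make the Map/Reduce question a bit more concrete, here is a minimal sketch in plain Java. It does not use the GigaSpaces executor API; the class and method names (MapReduceSketch, sumBalances, reduce, execute) and the sample balances are hypothetical. The assumption is only the general pattern: the "map" logic runs once per partition against that partition's local data, and the "reduce" logic runs once on the caller to combine the partial results.

```java
import java.util.ArrayList;
import java.util.List;

public class MapReduceSketch {

    /** "Map" step: runs once per partition, next to that partition's data. */
    static long sumBalances(List<Long> partitionBalances) {
        long sum = 0;
        for (long balance : partitionBalances) {
            sum += balance;
        }
        return sum;
    }

    /** "Reduce" step: runs once on the caller, combining the partial results. */
    static long reduce(List<Long> partialSums) {
        long total = 0;
        for (long partial : partialSums) {
            total += partial;
        }
        return total;
    }

    /** Simulates sending the map step to every partition, then reducing. */
    static long execute(List<List<Long>> partitions) {
        List<Long> partials = new ArrayList<>();
        for (List<Long> partition : partitions) {
            partials.add(sumBalances(partition));
        }
        return reduce(partials);
    }

    public static void main(String[] args) {
        List<List<Long>> partitions = List.of(
                List.of(10L, 20L, 30L),  // hypothetical data held by partition 0
                List.of(5L, 15L));       // hypothetical data held by partition 1
        System.out.println(execute(partitions)); // prints 80
    }
}
```

In a real deployment the per-partition step would be shipped to the space and executed there, so only the small partial results cross the network; the pages linked above describe how that is done.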

Shay

answered 2010-12-17 22:47:45 -0500 by shay hassidim

Dean,

A few basic concepts first:
- GSA - Process manager. Responsible for the life cycle of the GSC, GSM, Lookup (LUS) and ESM.
- GSC - The container. It hosts the deployed PU. A PU can be a data-grid (space) or a custom user PU.
- GSM - The service grid manager.
- Lookup service (LUS) - this is the directory service. There should be at least one of these running. Once it is running
- ESM - Elastic service manager. Responsible for scaling a deployed data-grid.
- Space cluster - A set of partitions. A partition includes one primary instance and zero or more backup instances.

See details here:
http://www.gigaspaces.com/wiki/display/XAP71/TheRuntimeEnvironment

Basic steps to setup the GigaSpaces environment:
1. set LOOKUPLOCATORS
When starting the GigaSpaces environment you should have at least one GSM and one LUS. When multicast is not supported, set the LOOKUPLOCATORS environment variable on all the machines running the GSA so that they can find the LUS.
export LOOKUPLOCATORS=<LUSMACHINEIP>

2. set NIC_ADDR
When a machine has multiple network cards, set the NIC_ADDR variable on each machine running the GSA to that machine's IP:
export NIC_ADDR=<MACHINE_IP>

3. set max File Descriptors
set the following on all the machines:
ulimit -n 65536

4. Start one LUS, ESM and GSM:
Pick one of the machines to run the LUS, ESM and GSM. Start these via:
./gs-agent.sh gsa.global.lus 0 gsa.lus 1 gsa.global.gsm 0 gsa.gsm 1 gsa.gsc 0 gsa.global.esm 1

5. Deploy an elastic data-grid:
See:
http://www.gigaspaces.com/wiki/display/XAP71/DeploymentSetupExamples#DeploymentSetupExamples-Productionenvironment

See full instructions here:
http://www.gigaspaces.com/wiki/display/SBP/MovingintoProductionChecklist

- Info about Data partitioning can be found here:
http://www.gigaspaces.com/wiki/display/XAP71/Data-Partitioning

- If you would like to get stats about the disk space you should install the SIGAR libraries. See:
http://www.gigaspaces.com/wiki/display/XAP71/InstallingGigaSpaces#InstallingGigaSpaces-UsingtheSIGARLibrarytoMonitorMachineLevelStatistics

- Loading data from external DB:
http://www.gigaspaces.com/wiki/display/XAP71/ExternalDataSourceInitialLoad
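
To illustrate the idea behind having each partition load only its own accounts, here is a rough sketch in plain Java. It does not use the GigaSpaces API. The routing rule shown (hash of the routing value modulo the number of partitions) is an assumption for illustration, and the names PartitionAwareLoad, partitionFor and accountsFor are hypothetical; check the Data-Partitioning and initial load pages above for the actual rule and hooks.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionAwareLoad {

    /** Assumed routing rule: non-negative hash of the routing value, modulo the partition count. */
    static int partitionFor(Object routingValue, int partitionCount) {
        return Math.floorMod(routingValue.hashCode(), partitionCount);
    }

    /** Returns the account ids that the given partition should load for itself. */
    static List<Integer> accountsFor(int partitionIndex, int partitionCount, List<Integer> allAccountIds) {
        List<Integer> mine = new ArrayList<>();
        for (Integer id : allAccountIds) {
            if (partitionFor(id, partitionCount) == partitionIndex) {
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // Hypothetical account ids 1..100, split across 2 partitions.
        List<Integer> accountIds = new ArrayList<>();
        for (int id = 1; id <= 100; id++) {
            accountIds.add(id);
        }
        for (int p = 0; p < 2; p++) {
            System.out.println("partition " + p + " loads "
                    + accountsFor(p, 2, accountIds).size() + " accounts");
        }
    }
}
```

With sequential integer ids and two partitions, each partition ends up with 50 accounts. In practice the same filter would more likely be pushed into the database query so that each node reads only its own rows, rather than reading everything and discarding half.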

Shay

answered 2010-12-17 11:02:05 -0500 by shay hassidim

deanhiller wrote:
> The apis have the below in terms of memory. but how to write this in terms of hard disk space really?

GigaSpaces does not control the maximum hard disk space utilization.

deanhiller wrote:
> How to configure where on disk the data is put? ie. we would like to store it in /data volume separate from other volumes.

The amount of data held in memory is usually controlled via eviction. When running in LRU cache policy mode, data is evicted automatically. See: http://www.gigaspaces.com/wiki/displa...

In some cases, you might want to have a special eviction strategy. To implement this, use any of the options described here: http://www.gigaspaces.com/wiki/displa...
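
For readers unfamiliar with LRU eviction, here is a small plain-Java sketch of the policy itself. It is not how GigaSpaces implements its LRU cache policy; it only shows the behavior using the standard java.util.LinkedHashMap in access order. The class name LruCacheSketch and the sample keys are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCacheSketch(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    /** Called after each insert; returning true evicts the least recently used entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("acct-1", "a");
        cache.put("acct-2", "b");
        cache.get("acct-1");       // touch acct-1, so acct-2 becomes least recently used
        cache.put("acct-3", "c");  // over capacity: acct-2 is evicted
        System.out.println(cache.keySet()); // prints [acct-1, acct-3]
    }
}
```

The point is that eviction bounds what stays in memory; it does not by itself decide where evicted data lives on disk.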

Shay

answered 2010-12-17 22:41:34 -0500 by shay hassidim
