Welcome to the new Gigaspaces XAP forum. To recover your account, please follow these instructions.

Ask Your Question
0

Concurrent take() is a problem?

Hi all, In my GigaSpaces app, i need to know when the data processing is ended in order to do some notifications.

As suggested here, i've developed a new object called GigaCounter, which encapsulates the number of objects processed by the PUs. The feeder knows how many objects have to be processed, and keeps polling the space to read all the GigaCounter objects until it finds that the counters sum is equal to the expected total number.

Basically, every PU calls this method to process any object, so i count how many times this method is called:


public void store (...) {
... some hibernate related operations ...

GigaCounter template = new GigaCounter();
GigaCounter counter = gigaSpace.take(template);
if (counter == null) {
    counter = template;
    counter.setCount(0);
}
    counter.increment();
gigaSpace.write(counter);
}

This is the feeder "polling" part:


public void run() {

    GigaCounter clearTemplate = new GigaCounter();
    gigaSpace.clear(clearTemplate);

    Integer count = 0;   

    try {
    while(! count.equals(employeesCount) ) {
        count = 0;

        GigaCounter template = new GigaCounter();
        GigaCounter[] counters = gigaSpace.readMultiple(template, 1000);

        for (GigaCounter c : counters)
        count+=c.getCount();

        System.out.println("Employees processed: "+count+" of "+employeesCount+".");

        if ( ! count.equals(employeesCount) )
        Thread.sleep(5000);
    }
    } catch (InterruptedException e) {
    e.printStackTrace();
    }
    System.out.println("Process finished");

And this is the GigaCounter object:


@SpaceClass(persist=false) public class GigaCounter implements Serializable {

private Integer count = null;

public GigaCounter() {
}

/*
public GigaCounter(Integer id) {
this.id = id;
this.count = 0;
}



public Integer getId() {
    return this.id;
}


/*
public void setId(Integer id) {
this.id = id;
} */

/**
 * @return the count
 */
@SpaceRouting
@SpaceId(autoGenerate=false)
public Integer getCount() {
    return this.count;
}

/**
 * @param count the count to set
 */
@SpaceId(autoGenerate=false)
public void setCount(Integer count) {
    this.count = count;
}

public void increment() {
count++;
}

}


Let's call X the total expected and C the value calculated with the GigaCounter objects. The problem is that, at the end of the data processing, the counting process is correct with low data loading (X == C, the sum is equal to the total expected), but when the data rate, operations and loads is high, the count is wrong: it's about the 80% of the total expected (C == 0,8*X) , even if i can see that the method is called X times (because i see that X rows are written in the database via Hibernate).

So, what's the matter here? I just can think about some problems when different PUs ask to take the same GigaCounter object simultaneously...

{quote}This thread was imported from the previous forum. For your reference, the original is [available here|http://forum.openspaces.org/thread.jspa?threadID=2981]{quote}

asked 2009-03-27 07:39:21 -0500

kappolo gravatar image

updated 2013-08-08 09:52:00 -0500

jaissefsfex gravatar image
edit retag flag offensive close merge delete

2 Answers

Sort by » oldest newest most voted
0

Hi,

From reviewing your code, I'm not sure exactly how exactly you're implementing a global counter, is it a single entry or multiple entries, each size of one?
In any case, It appears that you are running into concurrency problem. In addition to Shay's suggestions, you need to make sure that the space operations are done under transactions.

Further more, and I'm not sure that this is the case, but if you refer to one single counter feeders and processors, you're limiting the system's scalability and introducing single point of contention.

To solve your problem, notifying clients about notification completion, you may want to consider the following SBA pattern:

1. feeder writes object to space with state -> new
2. Polling Container A processes the object and moves it to state -> processed
3. Polling Container B processes all objects in state 'processed' and sends the notifications

If you need a single for the entire batch:
1. The feeder can write an object to the space with the amount of entries
2. Polling Container B, simply counts till it gets to this number and only when this number is reached, it sends a single notification.

I hope this makes sense to your app, if it doesn't please clarify a bit further.

Guy

answered 2009-03-27 09:49:43 -0500

nirpaz gravatar image
edit flag offensive delete link more

Comments

{quote:title=nirpaz wrote:}{quote} Hi,

From reviewing your code, I'm not sure exactly how exactly you're implementing a global counter, is it a single entry or multiple entries, each size of one?

We'll have multiple instances of GigaCounter: whenever a PU can't take an existing one (because there aren't any or the other ones are busy with other PUs) a new one is created.

In any case, It appears that you are running into concurrency problem. In addition to Shay's suggestions, you need to make sure that the space operations are done under transactions.

How do i do that?

In the pu.xml i already have a local transaction manager, defined as:

<os-core:local-tx-manager id="transactionManager" space="space"/>

And used in the polling container to process the data sent by the feeder.

Further more, and I'm not sure that this is the case, but if you refer to one single counter feeders and processors, you're limiting the system's scalability and introducing single point of contention.

To solve your problem, notifying clients about notification completion, you may want to consider the following SBA pattern:

  1. feeder writes object to space with state -> new
  2. Polling Container A processes the object and moves it to state -> processed
  3. Polling Container B processes all objects in state 'processed' and sends the notifications

If you need a single for the entire batch: 1. The feeder can write an object to the space with the amount of entries 2. Polling Container B, simply counts till it gets to this number and only when this number is reached, it sends a single notification.

I hope this makes sense to your app, if it doesn't please clarify a bit further.

Guy

I understand well these two approches, but the problem is that i need the Feeder to be responsible for sending the notification.

kappolo gravatar imagekappolo ( 2009-03-27 11:41:00 -0500 )edit

See: http://www.gigaspaces.com/wiki/displa... If you are using polling container: http://www.gigaspaces.com/wiki/displa... #PollingContainerComponent-TransactionSupport

Shay

shay hassidim gravatar imageshay hassidim ( 2009-03-27 11:50:03 -0500 )edit

{quote:title=shay hassidim wrote:}{quote} See: http://www.gigaspaces.com/wiki/displa... If you are using polling container: http://www.gigaspaces.com/wiki/displa... #PollingContainerComponent-TransactionSupport

Shay

I've already seen it but... Why do i have to use a Polling Container? The PUs don't need to be called when there's a new GigaCounter object, the PUs just need to take one of them an put it back to the space after increment...

kappolo gravatar imagekappolo ( 2009-03-27 12:23:00 -0500 )edit

Enrico, I would change a bit the design to use the classic master-worker pattern. Have a result object written into the space per “task” executed via a polling container (which process incoming data and writing the results object) and have the feeder taking or counting the amount of result objects for a given task. The results objects will have the same taskID as the Task object. This how you will be able to count the correct results objects.

I’ve used the above pattern with the following: http://blog.gigaspaces.com/2009/02/09... - see the slides for the Credit Risk HPC Calculation Benchmark

Maintaining counters will make your life very complicated.

Shay

shay hassidim gravatar imageshay hassidim ( 2009-03-27 13:02:41 -0500 )edit

Enrico,

If you need the feeder to send the notification, have processor B, once all processing is complete write a complete object into the space which is going to be read by the feeder.

The feeder sequence would be something like that: 1. Write - counter object (e.g 1000) 2. write 1000 entries 3. Take for complete object (block) 4. send notifications

Guy

nirpaz gravatar imagenirpaz ( 2009-03-28 02:13:37 -0500 )edit
0
  • You shoud not modify the SpaceID and the routing field after the object been initialy written into the space. I guess this is the reason for the strange behavior you see. Make sure you update the same object you took each time you increament its counter. I don't think this is what is going on. You update different object each time.

  • You should have one "@SpaceId(autoGenerate=false)" on the getter method.

Shay

answered 2009-03-27 08:49:03 -0500

shay hassidim gravatar image
edit flag offensive delete link more

Comments

I hope that's is the problem. I've changed the GigaCounter behaviour, but i can't test it know.

Anyway, i have a "@SpaceId(autoGenerate=false)" on the getter method.

kappolo gravatar imagekappolo ( 2009-03-27 09:47:11 -0500 )edit

Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

Question Tools

1 follower

Stats

Asked: 2009-03-27 07:39:21 -0500

Seen: 37 times

Last updated: Mar 27 '09