Loading a large data set

Yes, lots of questions... :D I'm preparing for a demo.

I have a SQL table with 3,000,000 records and I would like to load it into a partitioned space.

What is the best way of doing so? I have a couple of machines with 2-4 GB of RAM each. How many machines would I need?

Each row can be about 550 bytes. So if the calculation is as straightforward as...

(550 bytes x 3 million rows) / 1 GB = 15 GB

So I would need 6 or so machines? Or is a POJO a lot smaller when serialized?
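For reference, here is a rough sanity check of that estimate as a small Java sketch. Only the row count and row size come from the question; the class name `SpaceSizing`, the overhead factor, and the usable heap per machine are illustrative assumptions, not measured values. On raw payload alone, 550 bytes x 3,000,000 rows comes to about 1.65 GB, which is well below 15 GB, so the machine count depends mostly on how much per-object overhead and headroom is assumed.

```java
// Back-of-the-envelope sizing. Everything except the row count and row size
// is an assumption chosen for illustration.
final class SpaceSizing {

    // Raw payload in bytes: rows * bytesPerRow (throws on long overflow).
    static long rawBytes(long rows, long bytesPerRow) {
        return Math.multiplyExact(rows, bytesPerRow);
    }

    // Machines needed after applying a hypothetical in-memory overhead factor
    // and dividing by the usable heap per machine, rounded up.
    static long machinesNeeded(long rawBytes, double overheadFactor,
                               long usableHeapBytesPerMachine) {
        double required = rawBytes * overheadFactor;
        return (long) Math.ceil(required / usableHeapBytesPerMachine);
    }

    public static void main(String[] args) {
        long raw = rawBytes(3_000_000L, 550L);
        System.out.printf("raw payload: %.2f GB%n", raw / 1e9);

        // Assumption: roughly 1.5 GB of a 2 GB machine is usable for data.
        long usable = 1_500_000_000L;
        // Assumption: objects take about twice their raw size in memory.
        System.out.println("machines at 2x overhead: "
                + machinesNeeded(raw, 2.0, usable));
    }
}
```

The overhead factor is the number to measure in practice, for example by loading a sample of rows and comparing heap usage before and after.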

asked 2010-11-08 09:28:28 -0500

updated 2013-08-08 09:52:00 -0500

1 Answer

answered 2010-11-08 09:46:21 -0500


