RE: direct Hfile Read and Writes
When a huge amount of data has to be bulk loaded into HBase at one time, it is better to go with
direct HFile writes.
First, the HFiles are written directly into HDFS using the MapReduce framework. HBase provides
utility classes for this, as well as the ImportTsv tool itself.
Then, using LoadIncrementalHFiles, these files are loaded into the regions managed by the region servers.
Once these two steps are over, clients can read the data normally.
Loading this much data the normal way, through HTable#put(), would take a lot of time.
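
The two steps above can be sketched as a pair of command lines. What follows is only a rough illustration in Python that assembles them, not a definitive recipe. The table name, column family, and HDFS paths are hypothetical, and the class names are the ones shipped with HBase releases of roughly this era; later releases relocate or rename the loader, so please check against your version.

```python
import shlex

# Class names as found in HBase releases of this thread's era (assumption:
# your version still ships them under these packages; newer ones may not).
IMPORT_TSV = "org.apache.hadoop.hbase.mapreduce.ImportTsv"
BULK_LOADER = "org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles"


def import_tsv_command(table, columns, input_dir, hfile_dir):
    """Step 1: MapReduce job that writes HFiles into HDFS instead of issuing puts."""
    return [
        "hbase", IMPORT_TSV,
        "-Dimporttsv.columns=" + ",".join(columns),
        # With bulk.output set, ImportTsv generates HFiles rather than writing to the table.
        "-Dimporttsv.bulk.output=" + hfile_dir,
        table, input_dir,
    ]


def bulk_load_command(table, hfile_dir):
    """Step 2: hand the generated HFiles over to the regions of the table."""
    return ["hbase", BULK_LOADER, hfile_dir, table]


if __name__ == "__main__":
    # Hypothetical table, column family, and paths, for illustration only.
    step1 = import_tsv_command(
        "mytable", ["HBASE_ROW_KEY", "cf:value"], "/user/me/input", "/user/me/hfiles"
    )
    step2 = bulk_load_command("mytable", "/user/me/hfiles")
    for cmd in (step1, step2):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Running the script only prints the two commands; they would still have to be executed on a machine with the HBase client installed and configured.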
-Anoop-
________________________________________
From: Jerry Lam [chilinglam@...]
Sent: Wednesday, June 27, 2012 10:52 PM
To: user@...
Subject: Re: direct Hfile Read and Writes
Hi Samar:
I have used LoadIncrementalHFiles successfully in the past. Basically, once
you have written the HFile yourself, you can use LoadIncrementalHFiles to
merge it with the HFiles currently managed by HBase. Once it is loaded into
HBase, the records in the incremental HFile are accessible to clients.
HTH,
Jerry
On Wed, Jun 27, 2012 at 10:33 AM, shixing <paradisehit@...> wrote:
> 1. Since the data we might need would be distributed across regions, how
> would direct reading of HFiles be helpful?