Adrien Mogenet | 13 Jul 09:04 2012

Maximum number of tables ?

Hi there,

I have read some good practices about the number of columns / column families, but
nothing about the number of tables.
What if I need to spread my data among hundreds or thousands of (big) tables?
What should I care about? I guess I should keep the number of
storeFiles per RegionServer small?

-- 
Adrien Mogenet
http://www.mogenet.me
N Keywal | 13 Jul 10:14 2012

Re: Maximum number of tables ?

Hi,

There is no real limit as far as I know. As you will have one region
per table (at least :-), the number of regions will be something to
monitor carefully if you need thousands of tables. See
http://hbase.apache.org/book.html#arch.regions.size.

Don't forget that you can add as many columns as you want, and that an
empty cell costs nothing. For example, a class hierarchy is often
mapped to multiple tables in an RDBMS, while in HBase having a single
table for the same hierarchy makes much more sense. Moreover, there are
no transactions across tables, so sometimes a 'UML composition' will
go into a single table. And so on.

N.

On Fri, Jul 13, 2012 at 9:04 AM, Adrien Mogenet
<adrien.mogenet@...> wrote:
> Hi there,
>
> I read some good practices about number of columns / column families, but
> nothing about the number of tables.
> What if I need to spread my data among hundred or thousand (big) tables ?
> What should I care about ? I guess I should keep a tight number of
> storeFiles per RegionServer ?
>
> --
> Adrien Mogenet
> http://www.mogenet.me

Michael Segel | 13 Jul 17:42 2012

Re: Maximum number of tables ?

Currently there is a hardcoded limit on the number of regions that a region server can manage.
It's 1500.
Note that once you get to around 1000 regions per region server, you take a
performance hit. (YMMV)

So if you have 1 region per table, there's a real limit of 1500 tables * number of RS nodes. 

Note: You will probably die well before hitting this limit, again YMMV.
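
To make the arithmetic above concrete, here is a minimal sketch that estimates how many regions each region server ends up hosting for a given number of tables. The class name, method name, and the cluster figures (3000 tables, 10 regions each, 20 region servers) are hypothetical and chosen only for illustration; the ~1000 figure is the rule of thumb quoted in this message, not a measured value.

```java
public class RegionBudget {
    // Average number of regions each region server hosts, rounded up,
    // assuming regions are balanced evenly across the cluster.
    static long regionsPerServer(long tables, long regionsPerTable, long regionServers) {
        long totalRegions = tables * regionsPerTable;
        return (totalRegions + regionServers - 1) / regionServers;
    }

    public static void main(String[] args) {
        // Hypothetical cluster: 3000 tables, 10 regions per table, 20 region servers.
        long perServer = regionsPerServer(3000, 10, 20);
        System.out.println("Regions per region server: " + perServer);
        // Compare against the ~1000 regions/RS rule of thumb from the thread.
        System.out.println("Above the ~1000 rule of thumb: " + (perServer > 1000));
    }
}
```

Even with only one region per table, the per-server count grows linearly with the number of tables, so it is the region count rather than the table count that is worth tracking.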

On Jul 13, 2012, at 3:14 AM, N Keywal wrote:

> Hi,
> 
> There is no real limits as far as I know. As you will have one region
> per table (at least :-), the number of region will be something to
> monitor carefully  if you need thousands of table. See
> http://hbase.apache.org/book.html#arch.regions.size.
> 
> Don't forget that you can add as many column as you want, and that an
> empty cell cost nothing. For example, a class hierarchy is often
> mapped to multiple tables in a RDBMS, while in HBase having a single
> table for the same hierarchy makes much more sense. Moreover, there is
> no transaction between tables, so sometimes a 'uml composition' will
> go to a single table. And so on.
> 
> N.
> 
> On Fri, Jul 13, 2012 at 9:04 AM, Adrien Mogenet
> <adrien.mogenet@...> wrote:
>> Hi there,
Amandeep Khurana | 13 Jul 17:50 2012

Re: Maximum number of tables ?

I have come across clusters with hundreds of tables, but that is typically
due to a suboptimal table design.

The question here is - why do you need to distribute your data over
lots of tables? What's your access pattern and what kind of data are
you putting in? Or is this just a theoretical question?

On Jul 13, 2012, at 12:05 AM, Adrien Mogenet
<adrien.mogenet@...> wrote:

> Hi there,
>
> I read some good practices about number of columns / column families, but
> nothing about the number of tables.
> What if I need to spread my data among hundred or thousand (big) tables ?
> What should I care about ? I guess I should keep a tight number of
> storeFiles per RegionServer ?
>
> --
> Adrien Mogenet
> http://www.mogenet.me

Kevin O'dell | 13 Jul 19:36 2012

Re: Maximum number of tables ?

Mike,

  I just saw a system with 2500 regions per RS (crazy, I know; we are fixing
that).  I did not think there was a hard-coded limit...

On Fri, Jul 13, 2012 at 11:50 AM, Amandeep Khurana <amansk@...> wrote:

> I have come across clusters with 100s of tables but that typically is
> due to a sub optimal table design.
>
> The question here is - why do you need to distribute your data over
> lots of tables? What's your access pattern and what kind of data are
> you putting in? Or is this just a theoretical question?
>
> On Jul 13, 2012, at 12:05 AM, Adrien Mogenet <adrien.mogenet@...>
> wrote:
>
> > Hi there,
> >
> > I read some good practices about number of columns / column families, but
> > nothing about the number of tables.
> > What if I need to spread my data among hundred or thousand (big) tables ?
> > What should I care about ? I guess I should keep a tight number of
> > storeFiles per RegionServer ?
> >
> > --
> > Adrien Mogenet
> > http://www.mogenet.me
>

Michael Segel | 13 Jul 19:40 2012

Re: Maximum number of tables ?

I'm going from memory. There was a hardcoded number. I'd have to go back and try to find it. 

From a practical standpoint, going over 1000 regions per RS will put you on thin ice. 

Too many regions can kill your system.

On Jul 13, 2012, at 12:36 PM, Kevin O'dell wrote:

> Mike,
> 
>  I just saw a system with 2500 Regions per RS(crazy I know, we are fixing
> that).  I did not think there was a hard coded limit...
> 
> On Fri, Jul 13, 2012 at 11:50 AM, Amandeep Khurana <amansk@...> wrote:
> 
>> I have come across clusters with 100s of tables but that typically is
>> due to a sub optimal table design.
>> 
>> The question here is - why do you need to distribute your data over
>> lots of tables? What's your access pattern and what kind of data are
>> you putting in? Or is this just a theoretical question?
>> 
>> On Jul 13, 2012, at 12:05 AM, Adrien Mogenet <adrien.mogenet@...>
>> wrote:
>> 
>>> Hi there,
>>> 
>>> I read some good practices about number of columns / column families, but
>>> nothing about the number of tables.
>>> What if I need to spread my data among hundred or thousand (big) tables ?
Lars George | 13 Jul 19:47 2012

Re: Maximum number of tables ?

It is basically unset:

    this.regionSplitLimit = conf.getInt("hbase.regionserver.regionSplitLimit",
        Integer.MAX_VALUE);

(from CompactSplitThread.java).

The number of regions is OK until you dilute the available heap share too much. So you can have >1000 regions
(provided the block index, file handles, etc. keep up), but only a few of them can be active at any one time.
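
As a rough illustration of what "diluting the heap share" means, the sketch below divides a region server's memstore budget among its regions. All the numbers are assumptions for illustration, not values from this thread: an 8 GiB heap, 40% of the heap reserved for memstores, a 128 MiB flush size, and one column family per region. Check the settings of your own HBase version before relying on any of them.

```java
public class MemstoreShare {
    // Heap bytes reserved for all memstores on one region server.
    static long memstoreBudget(long heapBytes, int memstorePercent) {
        return heapBytes * memstorePercent / 100;
    }

    // How many regions can hold a full (flush-sized) memstore at the same time.
    static long fullMemstores(long heapBytes, int memstorePercent, long flushSizeBytes) {
        return memstoreBudget(heapBytes, memstorePercent) / flushSizeBytes;
    }

    // Average memstore bytes per region if every region takes writes evenly.
    static long bytesPerRegion(long heapBytes, int memstorePercent, long regions) {
        return memstoreBudget(heapBytes, memstorePercent) / regions;
    }

    public static void main(String[] args) {
        long heap = 8L * 1024 * 1024 * 1024;   // assumed 8 GiB heap
        int percent = 40;                      // assumed memstore share of the heap
        long flushSize = 128L * 1024 * 1024;   // assumed 128 MiB flush size

        System.out.println("Regions with a full memstore at once: "
                + fullMemstores(heap, percent, flushSize));
        System.out.println("Bytes per region with 1000 writing regions: "
                + bytesPerRegion(heap, percent, 1000));
    }
}
```

Under these assumptions only a couple of dozen regions can fill their memstore at once, and with 1000 regions all taking writes each one gets only a few megabytes before flushes are forced, which is one way to read the point that only a few regions can be active.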

Lars

On Jul 13, 2012, at 7:40 PM, Michael Segel wrote:

> I'm going from memory. There was a hardcoded number. I'd have to go back and try to find it. 
> 
> From a practical standpoint, going over 1000 regions per RS will put you on thin ice. 
> 
> Too many regions can kill your system.
> 
> On Jul 13, 2012, at 12:36 PM, Kevin O'dell wrote:
> 
>> Mike,
>> 
>> I just saw a system with 2500 Regions per RS(crazy I know, we are fixing
>> that).  I did not think there was a hard coded limit...
>> 
>> On Fri, Jul 13, 2012 at 11:50 AM, Amandeep Khurana <amansk@...> wrote:
>> 
Adrien Mogenet | 14 Jul 08:27 2012

Re: Maximum number of tables ?

Thanks for these answers; it was a theoretical question. Actually, a
common pattern in other solutions for batch deletion is to organize data in,
for instance, one table per day and drop the oldest table each day.
That is more efficient than finding old rows and then deleting them (due to
locking strategy, fragmentation, blocking compaction, etc.). Not sure it's
relevant for HBase!
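
For readers unfamiliar with the one-table-per-day pattern, here is a minimal sketch of the naming and retention logic only. It does not talk to HBase; the class, the "events" prefix, the yyyyMMdd suffix, and the 7-day retention are all hypothetical choices for illustration. Whether the pattern suits HBase at all is the open question of this message.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DailyTables {
    // Formats a date as yyyyMMdd, e.g. 20120714.
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.BASIC_ISO_DATE;

    // Name of the table that holds the data for the given day.
    static String tableFor(String prefix, LocalDate day) {
        return prefix + "_" + day.format(SUFFIX);
    }

    // Name of the table that falls out of the retention window today.
    // With retentionDays = 7, the 7 most recent days (today included) are kept.
    static String tableToDrop(String prefix, LocalDate today, int retentionDays) {
        return tableFor(prefix, today.minusDays(retentionDays));
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2012, 7, 14);
        System.out.println("Write to: " + tableFor("events", today));
        System.out.println("Drop:     " + tableToDrop("events", today, 7));
    }
}
```

Note that with a 7-day retention this keeps at least 7 tables, hence at least 7 regions, per logical dataset, so the pattern feeds directly into the region-count concerns raised earlier in the thread.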

On Fri, Jul 13, 2012 at 7:47 PM, Lars George <lars.george@...> wrote:

> It is basically unset:
>
>     this.regionSplitLimit =
> conf.getInt("hbase.regionserver.regionSplitLimit",
>         Integer.MAX_VALUE);
>
> (from CompactSplitThread.java).
>
> The number of regions is OK until you dilute the available heap share too
> much. So you can have >1000 regions (given the block index, file handles
> etc. keep up) but only a few them can be active most of the time.
>
> Lars
>
> On Jul 13, 2012, at 7:40 PM, Michael Segel wrote:
>
> > I'm going from memory. There was a hardcoded number. I'd have to go back
> and try to find it.
> >
> > From a practical standpoint, going over 1000 regions per RS will put you
> on thin ice.