Lloyd Standish | 7 Sep 2010 08:38

dnscache memory requirements for large number of server files

Please help me estimate memory requirements to run dnscache with about 769,000 files in the "servers"
directory (/etc/service/dncache/root/servers).

Each file has 9 bytes.  The filenames are the domain names to forward to an "override" nameserver (tinydns
running on 127.0.0.2).  Each file contains the same content: the IP 127.0.0.2.  (Actually, the files are
mostly hardlinks.  Otherwise I would run out of inodes.)

This is part of a project to set up porn-blocking using a list of 769,000 porn domain names.  dnscache should
forward dns queries for the porn domains to tinydns, running on 127.0.0.2 on the same machine.  tinydns
should return a bogus IP (to a page saying access to the pornography has been blocked).  Of course, I got this
working on a few test domains before attempting to load the 769,000 servers entries.

I already loaded the 769,000 (minimal) zones into the tinydns data file, and ran "make."  tinydns seems to be
fine (with zero queries).

However, dnscache cannot load the 769,000 servers files with only 256 megs of physical memory.  I have
raised the CACHESIZE and DATALIMIT up to 20M and 100M, respectively.

How much memory should be necessary to do this (assuming it is possible)?  This is running on a VPS and I could
increase the available memory.
--
Lloyd

Jeff King | 7 Sep 2010 17:20

Re: dnscache memory requirements for large number of server files

On Tue, Sep 07, 2010 at 12:38:56AM -0600, Lloyd Standish wrote:

> Please help me estimate memory requirements to run dnscache with about
> 769,000 files in the "servers" directory
> (/etc/service/dncache/root/servers).

Memory requirements aside, I think this is probably a bad idea. Just
glancing at the code in roots.c, it looks like dnscache will do a linear
search through the 769,000 entries for every query.

As for the memory requirements, I would expect (and again, I just
glanced at the code) it to take only 769,000 * (average_domain_length +
64) bytes, where the "64" comes from the fact that each entry gets a
fixed-size slot for server IPs. That is probably only on the order of
50-60M or so.
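
As a rough check of that estimate, here is a small Python sketch of the arithmetic. The average name length of 15 bytes is an assumption made for illustration, not a measured value; the 64-byte slot is the figure from the reading of roots.c above.

```python
# Back-of-the-envelope size of the in-memory servers table.
ENTRIES = 769_000
SLOT_BYTES = 64       # fixed-size slot for server IPs, per the estimate above
AVG_NAME_BYTES = 15   # ASSUMPTION: average domain length, not measured


def estimate_bytes(entries, avg_name_bytes, slot_bytes=SLOT_BYTES):
    """Return entries * (average name length + fixed slot size)."""
    return entries * (avg_name_bytes + slot_bytes)


total = estimate_bytes(ENTRIES, AVG_NAME_BYTES)
print(f"{total} bytes, about {total / 2**20:.1f} MiB")
```

With longer or shorter average names the total moves by 769,000 bytes per byte of name length, so the result stays well under the 100M DATALIMIT either way.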

> However, dnscache cannot load the 769,000 servers files with only 256
> megs of physical memory.  I have raised the CACHESIZE and DATALIMIT up
> to 20M and 100M, respectively.

I'm not sure, but there may be a leak in roots.c:init2. Your best bet is
probably to try your experiment with 10,000, 20,000, etc, and see how
the memory scales.
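
One way to set up that experiment is to build smaller test directories from the first N names of the list. The Python sketch below is only an illustration: the function name and arguments are hypothetical, and it assumes each file should hold the 9-byte string 127.0.0.2 and be hardlinked where the filesystem allows it, as described in the original post.

```python
import os


def build_servers_dir(dest, domains, content="127.0.0.2"):
    """Create one file per domain in dest, hardlinked to a shared file where possible.

    Returns the number of entries now present in dest.
    """
    os.makedirs(dest, exist_ok=True)
    base = None  # path of the file that later names are linked to
    for name in domains:
        path = os.path.join(dest, name)
        if base is not None:
            try:
                os.link(base, path)  # hardlink: no extra inode is used
                continue
            except OSError:
                pass  # link limit reached or links unsupported: write a new file
        with open(path, "w") as f:
            f.write(content)
        base = path
    return len(os.listdir(dest))
```

Building directories of 10,000, 20,000 and so on, then starting dnscache against each one and noting its memory use, would show whether the growth is linear in the number of entries.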

-Peff

Daryl Tester | 7 Sep 2010 17:17

Re: dnscache memory requirements for large number of server files

Lloyd Standish wrote:

> Please help me estimate memory requirements to run dnscache
> with about 769,000 files in the "servers" directory
> (/etc/service/dncache/root/servers).

Wow.  Even if you could get away with loading this much data, I don't
think you'd want to as the code doesn't appear to be optimised for
such an extreme case.  From a quick look at the relevant code (roots.c)
it would appear to be roughly the length of the domain name (in wire
format) + 64 bytes per domain name.  And the resultant "array" (it's
actually a string) is linearly searched, which could be a killer.
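
To make that layout concrete, here is a simplified Python model of it. This is not dnscache's code: the function names are made up for illustration, and it only assumes what is described above, namely wire-format names each followed by a fixed 64-byte slot, packed into one buffer and scanned from the start.

```python
SLOT = 64  # fixed-size block reserved for server IPs after each name


def to_wire(name):
    """Encode 'example.com' as length-prefixed labels ending in a zero byte."""
    out = bytearray()
    for label in name.strip(".").split("."):
        if label:
            raw = label.lower().encode("ascii")
            out.append(len(raw))
            out += raw
    out.append(0)
    return bytes(out)


def wire_length(buf, pos):
    """Length of the wire-format name starting at pos, including the final zero."""
    start = pos
    while buf[pos] != 0:
        pos += buf[pos] + 1
    return pos + 1 - start


def pack(entries):
    """Pack (name, dotted-quad IP) pairs into one flat buffer."""
    data = bytearray()
    for name, ip in entries:
        data += to_wire(name)
        addr = bytes(int(part) for part in ip.split("."))
        data += addr.ljust(SLOT, b"\0")  # pad the slot to its fixed size
    return bytes(data)


def find(data, name):
    """Scan the buffer record by record; return the matching slot or None."""
    query = to_wire(name)
    i = 0
    while i < len(data):
        n = wire_length(data, i)
        if data[i:i + n] == query:
            return data[i + n:i + n + SLOT]
        i += n + SLOT  # skip this name and its slot
    return None
```

In this model a miss has to walk every record, so the cost of each lookup grows with the number of entries.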

If your C is up to it, I'd look at modifying dnscache to perform a CDB
lookup on the domain+querytype, and if you get a hit return your
fictitious answer, otherwise proceed with the query normally.
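
The shape of that check can be sketched in Python, with an in-memory set standing in for the CDB file (a swap made only to keep the example self-contained). The names, the placeholder address 192.0.2.1, and the choice to also match parent domains are assumptions for illustration, not part of dnscache.

```python
BLOCK_IP = "192.0.2.1"  # ASSUMPTION: placeholder address of the "blocked" page


def normalize(name):
    """Lower-case the name and drop any trailing dot."""
    return name.rstrip(".").lower()


def blocked_answer(blocked, qname, qtype="A"):
    """Return the fixed IP if qname or one of its parents is listed, else None.

    None means: proceed with the query normally.
    """
    if qtype != "A":
        return None
    labels = normalize(qname).split(".")
    for i in range(len(labels)):
        # Each membership test is a hashed lookup, independent of list size.
        if ".".join(labels[i:]) in blocked:
            return BLOCK_IP
    return None
```

For example, with blocked = {"bad.example"}, a query for www.bad.example gets the fixed address, while good.example returns None and would be resolved as usual.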

-- 
Regards,
  Daryl Tester

"It's bad enough to have two heads, but it's worse when one's unoccupied."
  -- Scatterbrain, "I'm with Stupid."

Lloyd Standish | 7 Sep 2010 18:44

Re: dnscache memory requirements for large number of server files

Hi Daryl and Jeff,
Thanks for the information.  I agree that a linear search would not be a good idea.  Adding some sort of hashed
lookup sounds like the way to go here.  I'm not very experienced at C (my experience is primarily in perl and
bash) and I have little time.  If anyone here is interested in making this modification for pay, please
contact me off-list.
--
Lloyd

On Tue, 07 Sep 2010 09:17:07 -0600, Daryl Tester <dt-djb <at> handcraftedcomputers.com.au> wrote:

> Lloyd Standish wrote:
>
>> Please help me estimate memory requirements to run dnscache
>> with about 769,000 files in the "servers" directory
>> (/etc/service/dncache/root/servers).
>
> Wow.  Even if you could get away with loading this much data, I don't
> think you'd want to as the code doesn't appear to be optimised for
> such an extreme case.  From a quick look at the relevant code (roots.c)
> it would appear to be roughly the length of the domain name (in wire
> format) + 64 bytes per domain name.  And the resultant "array" (it's
> actually a string) is linearly searched, which could be a killer.
>
> If your C is up to it, I'd look at modifying dnscache to perform a CDB
> lookup on the domain+querytype, and if you get a hit return your
> fictitious answer, otherwise proceed with the query normally.
>

Scott Gifford | 7 Sep 2010 20:18

Re: dnscache memory requirements for large number of server files

On Tue, Sep 7, 2010 at 12:44 PM, Lloyd Standish <lloyd <at> crnatural.net> wrote:

> Hi Daryl and Jeff,
> Thanks for the information.  I agree that a linear search would not be a good idea.  Adding some sort of hashed lookup sounds like the way to go here.  I'm not very experienced at C (my experience is primarily in perl and bash) and I have little time.

Going a bit off-topic for the list I know, but based on your comments Lloyd, you may want to look at the Perl module Net::DNS::Nameserver:

http://search.cpan.org/search?query=Net::DNS::Nameserver&mode=all

I have had fantastic luck using it to solve small, strange DNS problems like the one you are describing.  You could put an instance of dnscache in front of it to handle the caching and get answers without going to the Perl code, but really for these sorts of things Perl is pretty fast.

Just a random idea, I know, but maybe it will be helpful.

------Scott.

Daryl Tester | 8 Sep 2010 02:33

Re: dnscache memory requirements for large number of server files

(* Reply to /dev/null'd - damn autoresponders *)

Scott Gifford wrote:

> http://search.cpan.org/search?query=Net::DNS::Nameserver&mode=all
> 
> I have had fantastic luck using it to solve small, strange DNS problems like
> the one you are describing.  You could put an instance of dnscache in front
> of it to handle the caching and get answers without going to the Perl code,
> but really for these sorts of things Perl is pretty fast.

Just in case it's a "thinko" - he'd want the Perl lookup to occur before hitting
dnscache, otherwise it's a variant of his original problem (i.e. redirecting to
Perl instead of tinydns).

-- 
Regards,
  Daryl Tester

"It's bad enough to have two heads, but it's worse when one's unoccupied."
  -- Scatterbrain, "I'm with Stupid."

Ask Bjørn Hansen | 22 Oct 2010 06:19

Re: dnscache memory requirements for large number of server files

On Tue, Sep 7, 2010 at 11:18 AM, Scott Gifford <sgifford <at> suspectclass.com> wrote:

> On Tue, Sep 7, 2010 at 12:44 PM, Lloyd Standish <lloyd <at> crnatural.net> wrote:
>> Hi Daryl and Jeff,
>> Thanks for the information.  I agree that a linear search would not be a good idea.  Adding some sort of hashed lookup sounds like the way to go here.  I'm not very experienced at C (my experience is primarily in perl and bash) and I have little time.
>
> Going a bit off-topic for the list I know, but based on your comments Lloyd, you may want to look at the Perl module Net::DNS::Nameserver:
>
> http://search.cpan.org/search?query=Net::DNS::Nameserver&mode=all
>
> I have had fantastic luck using it to solve small, strange DNS problems like the one you are describing.  You could put an instance of dnscache in front of it to handle the caching and get answers without going to the Perl code, but really for these sorts of things Perl is pretty fast.

I know it's been a while, but for what it's worth the pool.ntp.org DNS servers are running a server based on Net::DNS::Nameserver.  It's not super fast, but they - happily - do something like 30 million queries a day.  Making a proxy of sorts that just sends back a fixed response for something on the blacklist and otherwise just forwards it to a real resolving server / cache should be pretty easy.
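
A minimal sketch of such a proxy follows, written in Python with plain sockets rather than Net::DNS::Nameserver. The listen address, upstream address and "blocked" IP are placeholders, the function names are made up, and it handles one UDP query at a time without checking that upstream replies match the query, so treat it as an outline of the idea rather than something to deploy.

```python
import socket
import struct

BLOCK_IP = "192.0.2.1"        # ASSUMPTION: placeholder "blocked" page address
UPSTREAM = ("127.0.0.1", 53)  # ASSUMPTION: the real resolving cache
LISTEN = ("127.0.0.3", 53)    # ASSUMPTION: where the proxy listens


def parse_question(packet):
    """Return (name, qtype, offset just past the first question)."""
    pos, labels = 12, []  # the DNS header is 12 bytes
    while packet[pos] != 0:
        n = packet[pos]
        labels.append(packet[pos + 1:pos + 1 + n].decode("ascii", "replace").lower())
        pos += n + 1
    qtype, = struct.unpack("!H", packet[pos + 1:pos + 3])
    return ".".join(labels), qtype, pos + 5  # zero byte + qtype + qclass


def is_blocked(name, blocked):
    """True if name or any parent domain is in the blocked set."""
    labels = name.split(".")
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))


def fixed_response(query, ip=BLOCK_IP, ttl=60):
    """Build a reply to an A query carrying one fixed address."""
    _, _, end = parse_question(query)
    rd = query[2] & 0x01                     # copy "recursion desired"
    flags = 0x8000 | (rd << 8) | 0x0080      # QR=1, RD as asked, RA=1
    header = query[:2] + struct.pack("!HHHHH", flags, 1, 1, 0, 0)
    # Answer: pointer to the name at offset 12, type A, class IN, ttl, 4 bytes.
    answer = b"\xc0\x0c" + struct.pack("!HHIH", 1, 1, ttl, 4) + socket.inet_aton(ip)
    return header + query[12:end] + answer


def forward(query, upstream=UPSTREAM, timeout=3.0):
    """Relay the query to the real cache and return its reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(query, upstream)
        return s.recvfrom(4096)[0]


def serve(blocked, listen=LISTEN):
    """Answer blacklisted A queries locally; forward everything else."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(listen)
    while True:
        query, client = sock.recvfrom(512)
        try:
            name, qtype, _ = parse_question(query)
            if qtype == 1 and is_blocked(name, blocked):
                reply = fixed_response(query)
            else:
                reply = forward(query)
        except (IndexError, struct.error, OSError):
            continue  # drop malformed queries and upstream failures
        sock.sendto(reply, client)
```

Calling serve() with a set of blocked names loaded from the list would start the loop; the blacklist check is a hashed lookup, so its cost does not depend on the size of the list.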
