Yitzchak Gale | 9 Apr 14:54 2013

Reverse DNS lookups on accept in network

I noticed that the accept function in the network library,
unlike the underlying C function, does a reverse DNS
lookup every time it accepts a connection.

This seems to be the cause of an acute problem:
Hackage is nearly unusable for people whose ISP
has broken reverse DNS, since every request to
the server delays for 30 seconds or more while
waiting for the broken reverse DNS server to time
out. I know, the ISP should fix it, or the user should
switch to a different ISP, but that isn't always practical.

In particular, Roman, our expert from Odessa, is
experiencing this problem. And he is hosting a
Haskell Hackathon, OdHack, in just a few weeks'
time. I am concerned that all participants in the
Hackathon might also be susceptible, which would
be a Very Bad Thing.

I'll note that nowadays it seems to be widely
accepted "best practice" to avoid per-connection
RDNS lookup, e.g., by configuring web servers
to log IP addresses instead of domain names.

So there are two questions here: one is whether
we need a change to the network and/or
cgi packages (and possibly others), and the other
is how to solve the hackage problem promptly.

My first thought on the first question is to add

Yitzchak Gale | 9 Apr 16:24 2013

Re: Reverse DNS lookups on accept in network

Answering my own question:

I retract the proposal for changes here.

Although the function Network.accept
does an implied reverse DNS lookup,
it does so lazily. So the actual lookup
should not happen unless the library client
actually tries to use the host name.

As for the Hackage problem, this problem
is inherent to CGI, which is what Hackage
currently uses. The CGI protocol supplies
the resolved client host name to the web
application in an environment variable. So
the web server (Apache in this case) will
always have to do a reverse DNS lookup by definition.
(Environment variables are strict. Too bad.)

So until we upgrade to a complete rewrite of
Hackage (any day now, right?), I guess the
only solution is to access Hackage via a
proxy on a host whose reverse DNS is
working.

Thanks,
Yitz

On Tue, Apr 9, 2013 at 3:54 PM, Yitzchak Gale <gale <at> sefer.org> wrote:
> I noticed that the accept function in the network library,

Anders Kaseorg | 9 Apr 23:00 2013

Re: Reverse DNS lookups on accept in network

On 04/09/2013 10:24 AM, Yitzchak Gale wrote:
> As for the Hackage problem, this problem is inherent to CGI, which is
> what Hackage currently uses. The CGI protocol supplies the resolved
> client host name to the web application in an environment variable.
> So the web server (Apache in this case) will always have to do a reverse
> DNS lookup by definition. (Environment variables are strict. Too
> bad.)

This is not required by the CGI protocol.  Apache only provides 
REMOTE_HOST if the HostnameLookups directive is set to On (the default 
is Off).  So this should be easily fixable.

Anders
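
As a rough illustration of the point above: a CGI program does not have to rely on REMOTE_HOST at all. The sketch below (Haskell, base only) prefers REMOTE_HOST when the server supplies it and otherwise falls back to the numeric REMOTE_ADDR. The helper name clientLabel and the "unknown" fallback are made up for this example and are not taken from Hackage or the cgi package.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

-- Hypothetical helper: pick a label for the client from the CGI
-- meta-variables. REMOTE_HOST may be absent when the server does no
-- reverse lookup; REMOTE_ADDR carries the numeric address.
clientLabel :: Maybe String -> Maybe String -> String
clientLabel remoteHost remoteAddr =
  fromMaybe "unknown" (remoteHost <|> remoteAddr)

main :: IO ()
main = do
  host <- lookupEnv "REMOTE_HOST"
  addr <- lookupEnv "REMOTE_ADDR"
  -- Minimal CGI response: header block, blank line, body.
  putStr "Content-Type: text/plain\r\n\r\n"
  putStrLn ("client: " ++ clientLabel host addr)
```

With this shape the application keeps working whether or not the server performs the lookup, so the lookup can be switched off on the server side without touching the program.
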
Yitzchak Gale | 10 Apr 00:59 2013

Re: Reverse DNS lookups on accept in network

I wrote:
>> As for the Hackage problem, this problem is inherent to CGI, which is
>> what Hackage currently uses. The CGI protocol supplies the resolved
>> client host name to the web application in an environment variable.

Anders Kaseorg wrote:
> This is not required by the CGI protocol.  Apache only provides REMOTE_HOST
> if the HostnameLookups directive is set to On (the default is Off).  So this
> should be easily fixable.

Interesting, thanks. But it really does seem that
Hackage is doing an RDNS lookup for every connection
to a CGI app, both Haskell and non-Haskell. I don't
have direct access to the server, but the sysadmin
says that Apache is configured not to do lookups.
And HostnameLookups did come up in the conversation.
I'll ask again specifically about HostnameLookups.
Can you think of any other reason that every CGI
connection would trigger an RDNS lookup of the
remote host?

Thanks,
Yitz
Herbert Valerio Riedel | 10 Apr 10:08 2013

Re: Reverse DNS lookups on accept in network

Yitzchak Gale <gale <at> sefer.org> writes:

[...]

> Although the function Network.accept
> does an implied reverse DNS lookup,
> it does so lazily. So the actual lookup
> should not happen unless the library client
> actually tries to use the host name.

I've looked at the source code but I don't see how the laziness is
achieved w.r.t. the RDNS lookup. Here's the relevant source fragment
from [1]:

    accept sock@(MkSocket _ AF_INET _ _ _) = do
        ~(sock', (SockAddrInet port haddr)) <- Socket.accept sock
        peer <- catchIO
              (do
                 (HostEntry peer _ _ _) <- getHostByAddr AF_INET haddr
                 return peer
              )
              (\_e -> inet_ntoa haddr)
        handle <- socketToHandle sock' ReadWriteMode
        return (handle, peer, port)

the blocking operation would be 'getHostByAddr' but I don't see any
measure to turn that into a lazy I/O operation. What am I overlooking?

[1]: http://hackage.haskell.org/packages/archive/network/2.4.1.2/doc/html/src/Network.html#accept
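
To make the question concrete, here is a small sketch of what a deferred lookup would have to look like. Note that the lazy pattern `~(...)` in the fragment only makes the pattern match irrefutable; it does not postpone any I/O. Something like unsafeInterleaveIO would be needed for that. Everything below is an illustration using base only: slowLookup is a hypothetical stand-in for getHostByAddr that merely counts its calls, and acceptStrict / acceptDeferred are invented names, not functions from the network package.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical stand-in for a blocking reverse lookup such as
-- getHostByAddr. It counts its invocations so we can see when it runs.
slowLookup :: IORef Int -> String -> IO String
slowLookup counter addr = do
  modifyIORef' counter (+ 1)
  return ("host-for-" ++ addr)

-- Strict variant, analogous to the quoted fragment: the lookup runs
-- before the result is returned, whether or not the name is ever used.
acceptStrict :: IORef Int -> String -> IO String
acceptStrict = slowLookup

-- Deferred variant: unsafeInterleaveIO postpones the lookup until the
-- returned String is first demanded.
acceptDeferred :: IORef Int -> String -> IO String
acceptDeferred counter addr = unsafeInterleaveIO (slowLookup counter addr)

-- Returns the lookup count after the strict call, after the deferred
-- call, and after the deferred name has been forced.
demo :: IO (Int, Int, Int)
demo = do
  counter <- newIORef 0
  _ <- acceptStrict counter "192.0.2.1"
  afterStrict <- readIORef counter
  name <- acceptDeferred counter "192.0.2.1"
  afterDeferred <- readIORef counter
  _ <- evaluate (length name)        -- forcing the name triggers the lookup
  afterForce <- readIORef counter
  return (afterStrict, afterDeferred, afterForce)

main :: IO ()
main = demo >>= print
```

In the deferred variant the counter only moves once the name is forced, which is the behaviour a lazy accept would need. The quoted fragment has the shape of the strict variant.
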

Roman Cheplyaka | 9 Apr 16:40 2013

Re: Reverse DNS lookups on accept in network

(Moving the discussion back to haskell-infrastructure)

Thanks for looking into this, Yitz.

It's interesting that I face this problem only on the hackage.h.o-hosted
services (hackage itself and trac), and not, say, on haskellwiki or
trac.h.o. 

Is it true that only hackage.h.o-hosted services use CGI?

Roman

* Yitzchak Gale <gale@...> [2013-04-09 17:24:08+0300]
> Answering my own question:
> 
> I retract the proposal for changes here.
> 
> Although the function Network.accept
> does an implied reverse DNS lookup,
> it does so lazily. So the actual lookup
> should not happen unless the library client
> actually tries to use the host name.
> 
> As for the Hackage problem, this problem
> is inherent to CGI, which is what Hackage
> currently uses. The CGI protocol supplies
> the resolved client host name to the web
> application in an environment variable. So
> the web server (Apache in this case) will
> always have to do a reverse DNS lookup by definition.

Darius Jahandarie | 9 Apr 16:30 2013

Re: Reverse DNS lookups on accept in network

On Apr 9, 2013, at 8:54 AM, Yitzchak Gale <gale <at> sefer.org> wrote:

> This seems to be the cause of an acute problem:
> Hackage is nearly unusable for people whose ISP
> has broken reverse DNS, since every request to
> the server delays for 30 seconds or more while
> waiting for the broken reverse DNS server to time
> out. 

So THAT's what was making Hackage totally unusable for me... So many hours (days?) blown on just waiting for it.

-- 
Darius Jahandarie
Peter Simons | 10 Apr 08:46 2013

Re: Reverse DNS lookups on accept in network

Hi Darius,

 >> Hackage is nearly unusable for people whose ISP has broken reverse
 >> DNS [...].
 >
 > So THAT's what was making Hackage totally unusable for me... So many
 > hours (days?) blown on just waiting for it.

you should complain to your ISP about this issue.

Take care,
Peter
Jeffrey Shaw | 10 Apr 20:57 2013

Re: Reverse DNS lookups on accept in network

Is it an option for you to change to OpenDNS or Google's DNS servers?


On Tue, Apr 9, 2013 at 10:30 AM, Darius Jahandarie <djahandarie <at> gmail.com> wrote:
> On Apr 9, 2013, at 8:54 AM, Yitzchak Gale <gale <at> sefer.org> wrote:
>
> > This seems to be the cause of an acute problem:
> > Hackage is nearly unusable for people whose ISP
> > has broken reverse DNS, since every request to
> > the server delays for 30 seconds or more while
> > waiting for the broken reverse DNS server to time
> > out.
>
> So THAT's what was making Hackage totally unusable for me... So many hours (days?) blown on just waiting for it.
>
> --
> Darius Jahandarie
_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries

Darius Jahandarie | 10 Apr 21:05 2013

Re: Reverse DNS lookups on accept in network

On Wed, Apr 10, 2013 at 2:57 PM, Jeffrey Shaw <shawjef3 <at> gmail.com> wrote:
> Is it an option for you to change to OpenDNS or Google's DNS servers?

That unfortunately would be of no help for reverse DNS lookups. I've
since switched ISPs though, so it hasn't been a problem for the
past few months, thankfully. My old ISP was AT&T, which is rather
large, so I imagine this issue could be affecting a lot of people.

--
Darius Jahandarie
Gregory Collins | 10 Apr 20:50 2013

Re: Reverse DNS lookups on accept in network


On Tue, Apr 9, 2013 at 2:54 PM, Yitzchak Gale <gale <at> sefer.org> wrote:
> My first thought on the first question is to add
> a new function acceptRaw or accept' to network
> that skips the lookup

Network.Socket.accept doesn't do the reverse lookup.

G


--
Gregory Collins <greg <at> gregorycollins.net>
Johan Tibell | 10 Apr 20:56 2013

Re: Reverse DNS lookups on accept in network

On Wed, Apr 10, 2013 at 11:50 AM, Gregory Collins
<greg <at> gregorycollins.net> wrote:
> Network.Socket.accept doesn't do the reverse lookup.

That is what I use. The Network module itself doesn't get any love
because I think converting Sockets to Handles is a bad idea (because I
think the current Handle design is wrong). Hence all improvements to
the API tend to go in Network.Socket.
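
For readers who want to see the Network.Socket route in practice, the following is a minimal sketch, assuming a reasonably recent version of the network package. Network.Socket.accept returns the peer as a SockAddr, so the numeric address is available without any host-name resolution. The helper names acceptNumeric and loopbackPeer, the loopback address, and the use of port 0 (let the OS pick a free port) are choices made for this example only.

```haskell
import Network.Socket

-- Accept one connection and return the peer's numeric address as text.
-- Network.Socket.accept yields a SockAddr; no host-name lookup is involved.
acceptNumeric :: Socket -> IO (Socket, String)
acceptNumeric listener = do
  (conn, peer) <- accept listener
  return (conn, show peer)

-- Open a listener on the loopback interface, connect to it from the same
-- process so the example terminates, and report the accepted peer.
loopbackPeer :: IO String
loopbackPeer = withSocketsDo $ do
  let hints = defaultHints { addrSocketType = Stream }
  -- "127.0.0.1" and port "0" are example values.
  addr:_ <- getAddrInfo (Just hints) (Just "127.0.0.1") (Just "0")
  listener <- socket (addrFamily addr) (addrSocketType addr) (addrProtocol addr)
  bind listener (addrAddress addr)
  listen listener 1
  bound  <- getSocketName listener   -- the address the OS actually bound
  client <- socket (addrFamily addr) (addrSocketType addr) (addrProtocol addr)
  connect client bound
  (conn, peerText) <- acceptNumeric listener
  mapM_ close [conn, client, listener]
  return peerText

main :: IO ()
main = do
  peerText <- loopbackPeer
  putStrLn ("accepted connection from " ++ peerText)
```

A server that wants a host name for some connections can still resolve the SockAddr explicitly later, which keeps the cost of the lookup out of the accept path.
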
