Pekka Savola | 16 Jul 07:50 2010

miredo relay performance/EAGAIN on linux

Hi,

We've been running a miredo relay on freebsd where it could handle 
about 200Mbit/s of traffic with O(100K)+ users. I had to increase the 
default tun output queue in the kernel (50pkts) to something like 
10000 to avoid output drops.

I've been looking into moving the relay to Linux (RHEL5 with 2.6.18), 
miredo 1.2.3, and performance is poor (20-30Mbit/s max, with about 10% 
CPU usage).  On the relay when I try to ping a teredo host (e.g. 
mire.remlab.net), quite often I get:

$ ping6 mire.remlab.net
PING mire.remlab.net(2001:0:5b79:69d6:0:f226:a486:9629) 56 data bytes
ping: sendmsg: Invalid argument

Looking at this with strace shows:

4749  sendto(3, 
"\200\0\0\0\215\22\0\0Z\354?L\0\0\0\0003j\1\0\0\0\0\0\20\21\22\23\24\25\26\27"..., 
64, 0, {sa_family=AF_INET6, sin6_port=htons(58), inet_pton(AF_INET6, 
"2001:0:5b79:69d6:0:f226:a486:9629", &sin6_addr), sin6_flowinfo=0, 
sin6_scope_id=0}, 28) = -1 EINVAL (Invalid argument)
4749  recvmsg(3, 0x7fffdef30a00, MSG_ERRQUEUE|MSG_DONTWAIT) = -1 
EAGAIN (Resource temporarily unavailable)

.. so I suppose something in the kernel/tun interface is the 
bottleneck here, reporting EAGAIN.

Any ideas on how to improve the situation?

Rémi Denis-Courmont | 16 Jul 10:02 2010

Re: miredo relay performance/EAGAIN on linux


   Hi,

On Fri, 16 Jul 2010 08:50:55 +0300 (EEST), Pekka Savola <pekkas <at> netcore.fi>
wrote:
> Looking at this with strace shows:
> 
> 4749  sendto(3,
>
"\200\0\0\0\215\22\0\0Z\354?L\0\0\0\0003j\1\0\0\0\0\0\20\21\22\23\24\25\26\27"...,
> 64, 0, {sa_family=AF_INET6, sin6_port=htons(58), inet_pton(AF_INET6,
> "2001:0:5b79:69d6:0:f226:a486:9629", &sin6_addr), sin6_flowinfo=0,
> sin6_scope_id=0}, 28) = -1 EINVAL (Invalid argument)
> 4749  recvmsg(3, 0x7fffdef30a00, MSG_ERRQUEUE|MSG_DONTWAIT) = -1
> EAGAIN (Resource temporarily unavailable)
> 
> .. so I suppose something in the kernel/tun interface is the
> bottleneck here, reporting EAGAIN.

If I read this strace right, the kernel returns EINVAL on sending. Then it
returns EAGAIN when polling the socket error queue for incoming ICMP errors
(MSG_ERRQUEUE). EAGAIN simply means that no ICMP errors are queued, which
seems fine.

The real problem is, why is it returning EINVAL on sendto?!
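
The EAGAIN half of that trace can be reproduced in isolation. The following is a minimal sketch, not code from ping6 or miredo: it assumes Linux and uses a fresh, unprivileged IPv4 UDP socket in place of the raw ICMPv6 socket seen in the strace. The function name poll_empty_errqueue is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not part of ping6 or miredo): poll the error queue
 * of a fresh UDP socket once, without blocking.
 * Returns the errno set by recvmsg(), 0 if a queued error was read,
 * or -1 if the socket could not be created. */
int poll_empty_errqueue(void)
{
    /* Assumption: an unprivileged UDP socket stands in for the raw
     * ICMPv6 socket from the strace; the error queue works the same way. */
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    char data[64];
    char control[256];
    struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    /* Same flags as in the strace: read from the error queue, never block. */
    ssize_t n = recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT);
    int err = (n < 0) ? errno : 0;   /* save errno before close() */

    close(fd);
    return err;
}
```

Nothing has been sent on this socket, so no error can be queued, and on Linux a check such as assert(poll_empty_errqueue() == EAGAIN) should hold. This only shows that the recvmsg result is unrelated to throughput; it says nothing about why sendto fails with EINVAL.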

> Any ideas on how to improve the situation?

The sockets are already in blocking mode to minimize context switch
overhead and to avoid output buffer congestion causing EAGAIN and packet
loss. The only thing I can think of is to change the queuing discipline of
[...]


Re: miredo relay performance/EAGAIN on linux

Hi,

> We've been running a miredo relay on freebsd where it could handle
> about 200Mbit/s of traffic with O(100K)+ users. I had to increase the
> default tun output queue in the kernel (50pkts) to something like
> 10000 to avoid output drops.

May I share the patch?  I'm interested in the performance enhancement
on FreeBSD box.

-- Sumikawa

Pekka Savola | 21 Jul 11:23 2010

Re: miredo relay performance/EAGAIN on linux

On Wed, 21 Jul 2010, Munechika SUMIKAWA / 角川宗近 wrote:
>> We've been running a miredo relay on freebsd where it could handle
>> about 200Mbit/s of traffic with O(100K)+ users. I had to increase the
>> default tun output queue in the kernel (50pkts) to something like
>> 10000 to avoid output drops.
>
> May I share the patch?  I'm interested in the performance enhancement
> on FreeBSD box.

Sure.. Here it is...

At least on FreeBSD 7.2 tun(4) kernel driver has insufficient locking 
which is causing kernel panics. This has manifested at least if you do 
'ifconfig teredointerface destroy' under some circumstances.  I've 
also seen a large number of memory allocation panics. These may be 
related.  A developer has looked into this briefly but ran out of 
time.

-- 
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings
--- /sys/net/if_tun.c~	2008-09-10 04:44:12.000000000 +0300
+++ /sys/net/if_tun.c	2008-09-10 04:44:12.000000000 +0300
@@ -377,7 +377,9 @@
 	ifp->if_start = tunstart;
 	ifp->if_flags = IFF_POINTOPOINT | IFF_MULTICAST;
 	ifp->if_softc = sc;
-	IFQ_SET_MAXLEN(&ifp->if_snd, ifqmaxlen);
[...]


Re: miredo relay performance/EAGAIN on linux

> Sure.. Here it is...
> 
> At least on FreeBSD 7.2 tun(4) kernel driver has insufficient locking
> which is causing kernel panics. This has manifested at least if you do
> 'ifconfig teredointerface destroy' under some circumstances.  I've
> also seen a large number of memory allocation panics. These may be
> related.  A developer has looked into this briefly but ran out of
> time.

Thanks for the patch and the info!

-- Sumikawa

Rémi Denis-Courmont | 21 Jul 13:03 2010

Re: miredo relay performance/EAGAIN on linux


On Wed, 21 Jul 2010 16:44:10 +0900 (JST), Munechika SUMIKAWA / 角川宗近
<sumikawa@...> wrote:
> Hi,
> 
>> We've been running a miredo relay on freebsd where it could handle
>> about 200Mbit/s of traffic with O(100K)+ users. I had to increase the
>> default tun output queue in the kernel (50pkts) to something like
>> 10000 to avoid output drops.
> 
> May I share the patch?  I'm interested in the performance enhancement
> on FreeBSD box.

On that topic, the __FreeBSD__ ifdefs near line 310 at
http://git.remlab.net/cgi-bin/gitweb.cgi?p=miredo.git;a=blob;f=libteredo/teredo.c;h=39ad1ba5fbc0342031650955f1611ef0bd9b3592;hb=HEAD
should be removed. That will spare one system call per decapsulated packet.

-- 
Rémi Denis-Courmont
http://www.remlab.net
http://fi.linkedin.com/in/remidenis

