Robert Olsson | 2 Jun 2003 12:58

Re: Route cache performance under stress


Simon Kirby writes:
 > Full profile output available here:
 > 
 > 	http://blue.netnation.com/sim/ref/
 > 	readprofile.full_route_table_hash_fixed_napi.*
 > 
 > Note that if I increase the packet rate and NAPI kicks in, all of the
 > handle_IRQ and similar overhead basically disappears because it no longer
 > uses IRQs.  Pretty spiffy.  Here is a profile of that:
 > Full profile output available as:

  8896 rt_garbage_collect                         9.4237
  8959 ip_route_input_slow                        3.8885
 10516 dst_alloc                                 73.0278
 10666 kmem_cache_free                           66.6625
 15339 tg3_rx                                    16.2489
 16553 ipt_do_table                              14.9937
 20193 fn_hash_lookup                            70.1146
 26833 rt_intern_hash                            34.9388
 64803 ip_route_input                           150.0069

 From a DoS perspective this is a more interesting experiment than the one where you
 limited the input rate to keep 30% idle CPU.

 A new dst is coming all the time: it is first searched for in the hash (ip_route_input)
 and not found, so the ip_route_input_slow/fn_hash_lookup/dst_alloc/rt_intern_hash path
 is taken to add a new dst entry...
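
 A minimal user-space sketch of the hit/miss path described above. It is not
 kernel code: the toy_* names, the bucket count, and the hash function are
 hypothetical, and only the shape of the path (look up in the hash, miss,
 allocate, insert) is taken from the message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_HASH_SIZE 256u          /* hypothetical bucket count */

struct toy_dst {
    uint32_t daddr;                 /* destination address, used as the key */
    struct toy_dst *next;           /* hash chain */
};

struct toy_cache {
    struct toy_dst *bucket[TOY_HASH_SIZE];
    unsigned long hits;
    unsigned long misses;           /* each miss costs an allocation and an insert */
};

static unsigned toy_hash(uint32_t daddr)
{
    return (daddr ^ (daddr >> 16)) % TOY_HASH_SIZE;
}

/* Fast path: walk one hash chain (stands in for the ip_route_input lookup). */
static struct toy_dst *toy_lookup(struct toy_cache *c, uint32_t daddr)
{
    struct toy_dst *d;

    for (d = c->bucket[toy_hash(daddr)]; d; d = d->next)
        if (d->daddr == daddr)
            return d;
    return NULL;
}

/* Slow path: allocate a new entry and link it at the head of its chain. */
static struct toy_dst *toy_insert(struct toy_cache *c, uint32_t daddr)
{
    struct toy_dst *d = malloc(sizeof(*d));
    unsigned h = toy_hash(daddr);

    if (!d)
        return NULL;
    d->daddr = daddr;
    d->next = c->bucket[h];
    c->bucket[h] = d;
    return d;
}

/* Try the fast path first; fall back to the slow path on a miss. */
static struct toy_dst *toy_route_input(struct toy_cache *c, uint32_t daddr)
{
    struct toy_dst *d = toy_lookup(c, daddr);

    if (d) {
        c->hits++;
        return d;
    }
    c->misses++;
    return toy_insert(c, daddr);
}
```

 When every packet carries a destination not seen before, every call takes the
 slow path, so the cache only adds cost and the entries pile up until something
 removes them.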

 And later GC has to remove all entries with spin_lock_bh held (no packet processing 

David S. Miller | 9 Jun 2003 19:19

Re: Route cache performance under stress

   From: Robert Olsson <Robert.Olsson <at> data.slu.se>
   Date: Mon, 2 Jun 2003 12:58:31 +0200

    And later GC has to remove all entries with spin_lock_bh held (no
    packet processing runs). I see packet drops exactly when GC
    runs. Tuning GC might help but it's something to observe.

Please note, in 2.5.x, holding this lock on one cpu does
not prevent packet processing (even for routes on the same hash
chain) on another cpu because we use RCU there.

Simon Kirby | 2 Jun 2003 17:18

Re: Route cache performance under stress

On Mon, Jun 02, 2003 at 12:58:31PM +0200, Robert Olsson wrote:

>  A new dst is coming all the time: it is first searched for in the hash (ip_route_input)
>  and not found, so the ip_route_input_slow/fn_hash_lookup/dst_alloc/rt_intern_hash path
>  is taken to add a new dst entry...
> 
>  And later GC has to remove all entries with spin_lock_bh held (no packet processing 
>  runs). I see packet drops exactly when GC runs. Tuning GC might help but it's something 
>  to observe.
> 
>  I had some idea to rate-limit new flows and try to isolate the device causing the DoS.
>  Something like (ip_route_input):
...
>                         if (net_ratelimit())
>                                 printk(KERN_WARNING "dst creation throttled\n");
> 
>                         return -ECONNREFUSED;

This reminds me of the situation we experienced with the dst cache
overflowing in early 2.2 kernels.  This was a long time ago, when our
traffic was only about 10 Mbits/second.  We had recently upgraded from a
2.0 kernel.  The dst cache was overflowing due to a bug in the garbage
collector, and at the time, no messages were printed.  It took me a
_long_ time to figure out why connections to a server I hadn't previously
connected to in a while would only work every so often, and not
immediately like they should.  I'm afraid this approach will have a
similar effect, albeit (hopefully) only under an attack.
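
For illustration only, one way to rate-limit the creation of new entries is a
token bucket.  The quoted fragment elides its actual condition, so the sketch
below is not Robert's patch: the toy_limiter type, its fields, and the
tick-based clock are all assumptions.  Only the -ECONNREFUSED return value is
taken from the quote.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical token bucket limiting how many new entries may be created. */
struct toy_limiter {
    unsigned long rate;     /* tokens added per tick */
    unsigned long burst;    /* maximum number of saved-up tokens */
    unsigned long tokens;   /* tokens currently available */
    unsigned long last;     /* tick of the last refill */
};

static void toy_limiter_init(struct toy_limiter *l, unsigned long rate,
                             unsigned long burst, unsigned long now)
{
    l->rate = rate;
    l->burst = burst;
    l->tokens = burst;      /* start with a full bucket */
    l->last = now;
}

/*
 * Returns 0 if a new entry may be created at tick `now`, or -ECONNREFUSED
 * if the allowance is used up.  `now` must never go backwards.
 */
static int toy_dst_create_allowed(struct toy_limiter *l, unsigned long now)
{
    unsigned long elapsed = now - l->last;
    unsigned long room = l->burst - l->tokens;

    /* Refill, capping at `burst` without overflowing the multiplication. */
    if (l->rate != 0 && elapsed > room / l->rate)
        l->tokens = l->burst;
    else
        l->tokens += elapsed * l->rate;
    l->last = now;

    if (l->tokens == 0)
        return -ECONNREFUSED;
    l->tokens--;
    return 0;
}
```

Such a limiter cannot tell a legitimate new flow from an attack packet, so once
the bucket is empty both are refused alike, which is the effect described
above.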

Is it possible to have a dst LRU or a simpler approximation of such and
recycle dst entries rather than deallocating/reallocating them?  This

Robert Olsson | 2 Jun 2003 18:36

Re: Route cache performance under stress


Simon Kirby writes:

 > This reminds me of the situation we experienced with the dst cache
 > overflowing in early 2.2 kernels.  This was a long time ago, when our
 > traffic was only about 10 Mbits/second.  We had recently upgraded from a
 > 2.0 kernel.  The dst cache was overflowing due to a bug in the garbage
 > collector, and at the time, no messages were printed.  It took me a
 > _long_ time to figure out why connections to a server I hadn't previously
 > connected to in a while would only work every so often, and not
 > immediately like they should.  I'm afraid this approach will have a
 > similar effect, albeit (hopefully) only under an attack.

 We are given more work than we have resources for (max_size); what else can we
 do but refuse?  But yes, we have invested quite a lot of work already. 

 Also remember we are looking at runs where 100% of incoming traffic has one 
 new dst for every packet. So how is the situation in "real life"? 
 In the case of multiple devices, at least NAPI gives every dev its share. 

 > Is it possible to have a dst LRU or a simpler approximation of such and
 > recycle dst entries rather than deallocating/reallocating them?  This
 > would relieve a lot of work from the garbage collector and avoid the
 > periodic large garbage collection latency.  It could be tuned to only
 > occur in an attack (I remember Alexey saying that the deferred garbage
 > collection was implemented to reduce latency in normal operation).

 I don't see how this can be done. Others may?

 Cheers.

David S. Miller | 9 Jun 2003 19:21

Re: Route cache performance under stress

   From: Robert Olsson <Robert.Olsson <at> data.slu.se>
   Date: Mon, 2 Jun 2003 18:36:37 +0200

   Simon Kirby writes:

    > Is it possible to have a dst LRU or a simpler approximation of such and
    > recycle dst entries rather than deallocating/reallocating them?  This
    > would relieve a lot of work from the garbage collector and avoid the
    > periodic large garbage collection latency.  It could be tuned to only
    > occur in an attack (I remember Alexey saying that the deferred garbage
    > collection was implemented to reduce latency in normal operation).

    I don't see how this can be done. Others may?

Full recycle is very doable in 2.4.x; in 2.5.x it is an enormously hard
problem because we use RCU there (readers run completely without
locks).

Simon Kirby | 2 Jun 2003 20:05

Re: Route cache performance under stress

On Mon, Jun 02, 2003 at 06:36:37PM +0200, Robert Olsson wrote:

>  We are given more work than we have resources for (max_size); what else can we
>  do but refuse?  But yes, we have invested quite a lot of work already. 

Well, this is the problem.  We do not and cannot know which entries we
really want to remember (legitimate traffic).  Adding code to actually
refuse new dst entries is just going to make the DoS effective, which is
NOT what we want.

>  Also remember we are looking at runs where 100% of incoming traffic has one 
>  new dst for every packet. So how is the situation in "real life"? 
>  In the case of multiple devices, at least NAPI gives every dev its share. 

Right, so, when we are traffic saturated, we want to make sure the whole
route cache and route path is as fast as possible.  Recycling dst entries
by simply rewriting and rehashing them rather than allocating new and
eventually freeing them all in the garbage collection cycle should reduce
allocator overhead.  If this is only done when the table is full, I don't
see any downside...if this is in fact doable, that is. :)
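
A rough single-threaded sketch of the recycling idea, assuming a fixed pool of
entries and a round-robin victim pointer as a crude stand-in for an LRU.  The
r_* names and the sizes are hypothetical.  The sketch also ignores locking and
the RCU readers David mentions, which is exactly where the real difficulty
would lie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4u    /* hypothetical max_size, kept tiny for illustration */
#define NBUCKETS  8u    /* hypothetical number of hash buckets */

struct r_dst {
    uint32_t daddr;             /* destination address, used as the key */
    struct r_dst *next;         /* hash chain */
};

struct r_cache {
    struct r_dst pool[POOL_SIZE];   /* all entries live here; no malloc/free */
    struct r_dst *bucket[NBUCKETS];
    unsigned used;                  /* pool entries handed out so far */
    unsigned hand;                  /* next victim, advanced round-robin */
    unsigned long recycled;         /* how many times an entry was reused */
};

static unsigned r_hash(uint32_t daddr)
{
    return daddr % NBUCKETS;
}

static struct r_dst *r_lookup(struct r_cache *c, uint32_t daddr)
{
    struct r_dst *d;

    for (d = c->bucket[r_hash(daddr)]; d; d = d->next)
        if (d->daddr == daddr)
            return d;
    return NULL;
}

/* Remove an entry from the chain its current key hashes to. */
static void r_unlink(struct r_cache *c, struct r_dst *d)
{
    struct r_dst **pp = &c->bucket[r_hash(d->daddr)];

    while (*pp && *pp != d)
        pp = &(*pp)->next;
    if (*pp)
        *pp = d->next;
}

/* Look up daddr; on a miss take a free entry, or recycle one if the pool is full. */
static struct r_dst *r_get(struct r_cache *c, uint32_t daddr)
{
    struct r_dst *d = r_lookup(c, daddr);
    unsigned h;

    if (d)
        return d;

    if (c->used < POOL_SIZE) {
        d = &c->pool[c->used++];            /* table not full yet */
    } else {
        d = &c->pool[c->hand];              /* table full: pick a victim */
        c->hand = (c->hand + 1) % POOL_SIZE;
        r_unlink(c, d);
        c->recycled++;
    }

    d->daddr = daddr;                       /* rewrite the entry ... */
    h = r_hash(daddr);
    d->next = c->bucket[h];                 /* ... and rehash it */
    c->bucket[h] = d;
    return d;
}
```

With POOL_SIZE set to 4, inserting destinations 1 through 4 fills the pool; a
lookup for 5 then reuses the entry that held 1, with no call into the allocator
and nothing left for a later garbage collection pass to free.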

Simon-

