Cong Wang | 27 Sep 10:41 2012

[RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

Some customers have requested this feature. As they stated:

	"This parameter is necessary, especially for software that continually 
        creates many ephemeral processes which open sockets, to avoid socket 
        exhaustion. In many cases, the risk of the exhaustion can be reduced by 
        tuning reuse interval to allow sockets to be reusable earlier.

        In commercial Unix systems, this kind of parameters, such as 
        tcp_timewait in AIX and tcp_time_wait_interval in HP-UX, have 
        already been available. Their implementations allow users to tune 
        how long they keep TCP connection as TIME-WAIT state on the 
        millisecond time scale."

We do have "tcp_tw_reuse" and "tcp_tw_recycle", but these tunables
are not equivalent: they cannot be tuned directly on a time scale, nor
are they safe, since some combinations of them can still cause problems
behind NAT. I also think second granularity is enough; we don't have to
support a millisecond time scale.
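
As a rough illustration of the exhaustion the customer describes, the
sketch below estimates how many new outgoing connections per second a
host can sustain to a single remote address and port before every local
port is held in TIME-WAIT. The port range 32768-60999 and the 60-second
TIME-WAIT length are assumptions chosen for illustration, not figures
from this thread; the 240-second case corresponds to 2*MSL with the
2-minute MSL given in RFC 793. The function names are hypothetical.

```python
# Back-of-the-envelope sketch: how the TIME-WAIT length bounds the rate of
# new outgoing connections to ONE remote (address, port) pair.
# Assumed values (not from the thread): the port range and the 60 s figure.


def usable_ports(low: int, high: int) -> int:
    """Number of ports in the inclusive range [low, high]."""
    return high - low + 1


def max_sustained_rate(ports: int, time_wait_s: float) -> float:
    """Connections per second before all local ports sit in TIME-WAIT."""
    return ports / time_wait_s


if __name__ == "__main__":
    ports = usable_ports(32768, 60999)  # assumed ephemeral port range
    for time_wait_s in (240.0, 60.0, 10.0):
        rate = max_sustained_rate(ports, time_wait_s)
        print(f"TIME-WAIT {time_wait_s:5.0f} s -> {rate:8.1f} conn/s")
```

The same arithmetic shows that widening the port range raises the
sustainable rate without shortening TIME-WAIT at all.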

See also: https://lkml.org/lkml/2008/11/15/80

Any comments?

Cc: "David S. Miller" <davem <at> davemloft.net>
Cc: Alexey Kuznetsov <kuznet <at> ms2.inr.ac.ru>
Cc: Patrick McHardy <kaber <at> trash.net>
Cc: Eric Dumazet <edumazet <at> google.com>
Cc: Neil Horman <nhorman <at> tuxdriver.com>
Signed-off-by: Cong Wang <amwang <at> redhat.com>


Neil Horman | 27 Sep 16:23 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On Thu, Sep 27, 2012 at 04:41:01PM +0800, Cong Wang wrote:
> Some customer requests this feature, as they stated:
> 
> 	"This parameter is necessary, especially for software that continually 
>         creates many ephemeral processes which open sockets, to avoid socket 
>         exhaustion. In many cases, the risk of the exhaustion can be reduced by 
>         tuning reuse interval to allow sockets to be reusable earlier.
> 
>         In commercial Unix systems, this kind of parameters, such as 
>         tcp_timewait in AIX and tcp_time_wait_interval in HP-UX, have 
>         already been available. Their implementations allow users to tune 
>         how long they keep TCP connection as TIME-WAIT state on the 
>         millisecond time scale."
> 
> We indeed have "tcp_tw_reuse" and "tcp_tw_recycle", but these tunings
> are not equivalent in that they cannot be tuned directly on the time
> scale nor in a safe way, as some combinations of tunings could still
> cause some problem in NAT. And, I think second scale is enough, we don't
> have to make it in millisecond time scale.
> 
I think I have a little difficulty seeing how this does anything other than
pay lip service to actually having sockets spend time in TIME_WAIT state.  That
is to say, I see users using this just to make the pain stop.  If we wait
less time than it takes to be sure that a connection isn't being reused (either
by waiting two segment lifetimes, or by checking timestamps), then you might as
well not wait at all.  I see how it's tempting to be able to say "Just don't wait
as long", but it seems that there's no difference between waiting half as long as
the RFC mandates, and waiting no time at all.  Neither is a good idea.

Given the problem you're trying to solve here, I'll ask the standard question in

Rick Jones | 27 Sep 19:02 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On 09/27/2012 07:23 AM, Neil Horman wrote:
> The code looks fine, but the idea really doesn't seem like a good plan to me.
> I'm sure HPUX/Solaris/AIX/etc have done this in response to customer demand, but
> that doesn't make it the right solution.

In the case of HP-UX at least, while the rope is indeed there, the 
advice is to not wrap it around one's neck unless one *really* has a 
handle on the environment.  Instead things suggested, in no particular 
order:

*) The aforementioned SO_REUSEADDR to address the "I can't restart the 
server quickly enough." issue

*) Tuning the size of the anonymous/ephemeral port range.

*) Making explicit bind() calls using the entire non-privileged port range

*) Making the connections longer-lived.  Especially if the comms are 
between a fixed set of IP addresses.
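
Two of the suggestions above, SO_REUSEADDR on the server side and
explicit bind() calls on the client side, can be sketched as follows.
This is a minimal illustration under stated assumptions, not a
recommended implementation: the helper names, the loopback address, and
the port range passed in by the caller are all hypothetical.

```python
import socket


def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Setting this before bind() lets a restarted server rebind its port
    # while connections from the previous instance are still in TIME-WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(16)
    return sock


def connect_with_explicit_bind(remote, local_host, port_range):
    """Connect to remote, choosing the local port from port_range ourselves."""
    for local_port in port_range:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((local_host, local_port))  # explicit bind() call
            sock.connect(remote)
            return sock
        except OSError:
            # Port busy (for example still in TIME-WAIT): try the next one.
            sock.close()
    raise OSError("no usable local port in the given range")
```

A caller would pass something like range(1024, 65536) to use the whole
non-privileged range instead of only the default ephemeral range.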

rick jones

Cong Wang | 28 Sep 08:33 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On Thu, 2012-09-27 at 10:23 -0400, Neil Horman wrote:
> On Thu, Sep 27, 2012 at 04:41:01PM +0800, Cong Wang wrote:
> > Some customer requests this feature, as they stated:
> > 
> > 	"This parameter is necessary, especially for software that continually 
> >         creates many ephemeral processes which open sockets, to avoid socket 
> >         exhaustion. In many cases, the risk of the exhaustion can be reduced by 
> >         tuning reuse interval to allow sockets to be reusable earlier.
> > 
> >         In commercial Unix systems, this kind of parameters, such as 
> >         tcp_timewait in AIX and tcp_time_wait_interval in HP-UX, have 
> >         already been available. Their implementations allow users to tune 
> >         how long they keep TCP connection as TIME-WAIT state on the 
> >         millisecond time scale."
> > 
> > We indeed have "tcp_tw_reuse" and "tcp_tw_recycle", but these tunings
> > are not equivalent in that they cannot be tuned directly on the time
> > scale nor in a safe way, as some combinations of tunings could still
> > cause some problem in NAT. And, I think second scale is enough, we don't
> > have to make it in millisecond time scale.
> > 
> I think I have a little difficulty seeing how this does anything other than
> pay lip service to actually having sockets spend time in TIME_WAIT state.  That
> is to say, I see users using this just to make the pain stop.  If we wait
> less time than it takes to be sure that a connection isn't being reused (either
> by waiting two segment lifetimes, or by checking timestamps), then you might as
> well not wait at all.  I see how it's tempting to be able to say "Just don't wait
> as long", but it seems that there's no difference between waiting half as long as
> the RFC mandates, and waiting no time at all.  Neither is a good idea.


David Miller | 28 Sep 08:43 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

From: Cong Wang <amwang <at> redhat.com>
Date: Fri, 28 Sep 2012 14:33:07 +0800

> I don't think reducing TIME_WAIT is a good idea either, but there must
> be some reason behind as several UNIX provides a microsecond-scale
> tuning interface, or maybe in non-recycle mode, their RTO is much less
> than 2*MSL?

Yes, there is a reason.  It's there for retaining multi-million-dollar
customers.

There are no other reasons these other systems provide these
facilities, they are simply there in an attempt to retain a dwindling
customer base.

Any other belief is extremely naive.

Rick Jones | 28 Sep 19:30 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On 09/27/2012 11:43 PM, David Miller wrote:
> From: Cong Wang <amwang <at> redhat.com>
> Date: Fri, 28 Sep 2012 14:33:07 +0800
>
>> I don't think reducing TIME_WAIT is a good idea either, but there must
>> be some reason behind as several UNIX provides a microsecond-scale
>> tuning interface, or maybe in non-recycle mode, their RTO is much less
>> than 2*MSL?

Microsecond?  HP-UX uses milliseconds for the units of the tunable, 
though that does not necessarily mean it will actually be implemented to 
millisecond accuracy.

> Yes, there is a reason.  It's there for retaining multi-million-dollar
> customers.
>
> There are no other reasons these other systems provide these
> facilities, they are simply there in an attempt to retain a dwindling
> customer base.
>
> Any other belief is extremely naive.

HP-UX's TIME_WAIT interval tunability goes back to HP-UX 11.0, which 
first shipped in 1997.  It got it by virtue of using a "Mentat-based" 
stack which had that functionality.  I may not have my history 
completely correct, but Solaris 2 also got their networking bits from 
Mentat, and I believe shipped before HP-UX 11.

To my recollection, neither was faced with a dwindling customer base at 
the time.

Neil Horman | 28 Sep 15:16 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On Fri, Sep 28, 2012 at 02:33:07PM +0800, Cong Wang wrote:
> On Thu, 2012-09-27 at 10:23 -0400, Neil Horman wrote:
> > On Thu, Sep 27, 2012 at 04:41:01PM +0800, Cong Wang wrote:
> > > Some customer requests this feature, as they stated:
> > > 
> > > 	"This parameter is necessary, especially for software that continually 
> > >         creates many ephemeral processes which open sockets, to avoid socket 
> > >         exhaustion. In many cases, the risk of the exhaustion can be reduced by 
> > >         tuning reuse interval to allow sockets to be reusable earlier.
> > > 
> > >         In commercial Unix systems, this kind of parameters, such as 
> > >         tcp_timewait in AIX and tcp_time_wait_interval in HP-UX, have 
> > >         already been available. Their implementations allow users to tune 
> > >         how long they keep TCP connection as TIME-WAIT state on the 
> > >         millisecond time scale."
> > > 
> > > We indeed have "tcp_tw_reuse" and "tcp_tw_recycle", but these tunings
> > > are not equivalent in that they cannot be tuned directly on the time
> > > scale nor in a safe way, as some combinations of tunings could still
> > > cause some problem in NAT. And, I think second scale is enough, we don't
> > > have to make it in millisecond time scale.
> > > 
> > > I think I have a little difficulty seeing how this does anything other than
> > > pay lip service to actually having sockets spend time in TIME_WAIT state.  That
> > > is to say, I see users using this just to make the pain stop.  If we wait
> > > less time than it takes to be sure that a connection isn't being reused (either
> > > by waiting two segment lifetimes, or by checking timestamps), then you might as
> > > well not wait at all.  I see how it's tempting to be able to say "Just don't wait
> > > as long", but it seems that there's no difference between waiting half as long as
> > > the RFC mandates, and waiting no time at all.  Neither is a good idea.

Cong Wang | 2 Oct 09:04 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On Fri, 2012-09-28 at 09:16 -0400, Neil Horman wrote:
> On Fri, Sep 28, 2012 at 02:33:07PM +0800, Cong Wang wrote:
> > On Thu, 2012-09-27 at 10:23 -0400, Neil Horman wrote:
> > > On Thu, Sep 27, 2012 at 04:41:01PM +0800, Cong Wang wrote:
> > > > Some customer requests this feature, as they stated:
> > > > 
> > > > 	"This parameter is necessary, especially for software that continually 
> > > >         creates many ephemeral processes which open sockets, to avoid socket 
> > > >         exhaustion. In many cases, the risk of the exhaustion can be reduced by 
> > > >         tuning reuse interval to allow sockets to be reusable earlier.
> > > > 
> > > >         In commercial Unix systems, this kind of parameters, such as 
> > > >         tcp_timewait in AIX and tcp_time_wait_interval in HP-UX, have 
> > > >         already been available. Their implementations allow users to tune 
> > > >         how long they keep TCP connection as TIME-WAIT state on the 
> > > >         millisecond time scale."
> > > > 
> > > > We indeed have "tcp_tw_reuse" and "tcp_tw_recycle", but these tunings
> > > > are not equivalent in that they cannot be tuned directly on the time
> > > > scale nor in a safe way, as some combinations of tunings could still
> > > > cause some problem in NAT. And, I think second scale is enough, we don't
> > > > have to make it in millisecond time scale.
> > > > 
> > > > I think I have a little difficulty seeing how this does anything other than
> > > > pay lip service to actually having sockets spend time in TIME_WAIT state.  That
> > > > is to say, I see users using this just to make the pain stop.  If we wait
> > > > less time than it takes to be sure that a connection isn't being reused (either
> > > > by waiting two segment lifetimes, or by checking timestamps), then you might as
> > > > well not wait at all.  I see how it's tempting to be able to say "Just don't wait
> > > > as long", but it seems that there's no difference between waiting half as long as

Neil Horman | 2 Oct 14:09 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On Tue, Oct 02, 2012 at 03:04:39PM +0800, Cong Wang wrote:
> On Fri, 2012-09-28 at 09:16 -0400, Neil Horman wrote:
> > On Fri, Sep 28, 2012 at 02:33:07PM +0800, Cong Wang wrote:
> > > On Thu, 2012-09-27 at 10:23 -0400, Neil Horman wrote:
> > > > On Thu, Sep 27, 2012 at 04:41:01PM +0800, Cong Wang wrote:
> > > > > Some customer requests this feature, as they stated:
> > > > > 
> > > > > 	"This parameter is necessary, especially for software that continually 
> > > > >         creates many ephemeral processes which open sockets, to avoid socket 
> > > > >         exhaustion. In many cases, the risk of the exhaustion can be reduced by 
> > > > >         tuning reuse interval to allow sockets to be reusable earlier.
> > > > > 
> > > > >         In commercial Unix systems, this kind of parameters, such as 
> > > > >         tcp_timewait in AIX and tcp_time_wait_interval in HP-UX, have 
> > > > >         already been available. Their implementations allow users to tune 
> > > > >         how long they keep TCP connection as TIME-WAIT state on the 
> > > > >         millisecond time scale."
> > > > > 
> > > > > We indeed have "tcp_tw_reuse" and "tcp_tw_recycle", but these tunings
> > > > > are not equivalent in that they cannot be tuned directly on the time
> > > > > scale nor in a safe way, as some combinations of tunings could still
> > > > > cause some problem in NAT. And, I think second scale is enough, we don't
> > > > > have to make it in millisecond time scale.
> > > > > 
> > > > > I think I have a little difficulty seeing how this does anything other than
> > > > > pay lip service to actually having sockets spend time in TIME_WAIT state.  That
> > > > > is to say, I see users using this just to make the pain stop.  If we wait
> > > > > less time than it takes to be sure that a connection isn't being reused (either
> > > > > by waiting two segment lifetimes, or by checking timestamps), then you might as
> > > > > well not wait at all.  I see how it's tempting to be able to say "Just don't wait

Cong Wang | 8 Oct 05:17 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On Tue, 2012-10-02 at 08:09 -0400, Neil Horman wrote:
> No, it's not very friendly, but the people using this are violating the RFC,
> which isn't very friendly. :)

Could you be more specific? In RFC 793, AFAIK, it is allowed to be
changed:

http://tools.ietf.org/html/rfc793

" To be sure that a TCP does not create a segment that carries a
  sequence number which may be duplicated by an old segment remaining in
  the network, the TCP must keep quiet for a maximum segment lifetime
  (MSL) before assigning any sequence numbers upon starting up or
  recovering from a crash in which memory of sequence numbers in use was
  lost.  For this specification the MSL is taken to be 2 minutes.  This
  is an engineering choice, and may be changed if experience indicates
  it is desirable to do so."

or I must still be missing something here... :)

Neil Horman | 8 Oct 16:07 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On Mon, Oct 08, 2012 at 11:17:37AM +0800, Cong Wang wrote:
> On Tue, 2012-10-02 at 08:09 -0400, Neil Horman wrote:
> > No, it's not very friendly, but the people using this are violating the RFC,
> > which isn't very friendly. :)
> 
> Could you be more specific? In RFC 793, AFAIK, it is allowed to be
> changed:
> 
> http://tools.ietf.org/html/rfc793
> 
> " To be sure that a TCP does not create a segment that carries a
>   sequence number which may be duplicated by an old segment remaining in
>   the network, the TCP must keep quiet for a maximum segment lifetime
>   (MSL) before assigning any sequence numbers upon starting up or
>   recovering from a crash in which memory of sequence numbers in use was
>   lost.  For this specification the MSL is taken to be 2 minutes.  This
>   is an engineering choice, and may be changed if experience indicates
>   it is desirable to do so."
> 
It's the length of time that represents an MSL that was the choice, not the fact
that reusing a TCP before the expiration of the MSL is a bad idea.

> or I must still be missing something here... :)
> 
Next paragraph down:
	This specification provides that hosts which "crash" without
    retaining any knowledge of the last sequence numbers transmitted on
    each active (i.e., not closed) connection shall delay emitting any
    TCP segments for at least the agreed Maximum Segment Lifetime (MSL)
    in the internet system of which the host is a part.  In the

Cong Wang | 9 Oct 05:42 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On Mon, 2012-10-08 at 10:07 -0400, Neil Horman wrote:
> On Mon, Oct 08, 2012 at 11:17:37AM +0800, Cong Wang wrote:
> > On Tue, 2012-10-02 at 08:09 -0400, Neil Horman wrote:
> > > No, it's not very friendly, but the people using this are violating the RFC,
> > > which isn't very friendly. :)
> > 
> > Could you be more specific? In RFC 793, AFAIK, it is allowed to be
> > changed:
> > 
> > http://tools.ietf.org/html/rfc793
> > 
> > " To be sure that a TCP does not create a segment that carries a
> >   sequence number which may be duplicated by an old segment remaining in
> >   the network, the TCP must keep quiet for a maximum segment lifetime
> >   (MSL) before assigning any sequence numbers upon starting up or
> >   recovering from a crash in which memory of sequence numbers in use was
> >   lost.  For this specification the MSL is taken to be 2 minutes.  This
> >   is an engineering choice, and may be changed if experience indicates
> >   it is desirable to do so."
> > 
> It's the length of time that represents an MSL that was the choice, not the fact
> that reusing a TCP before the expiration of the MSL is a bad idea.
> 
> > or I must still be missing something here... :)
> > 
> Next paragraph down:
> 	This specification provides that hosts which "crash" without
>     retaining any knowledge of the last sequence numbers transmitted on
>     each active (i.e., not closed) connection shall delay emitting any
>     TCP segments for at least the agreed Maximum Segment Lifetime (MSL)

David Miller | 27 Sep 19:05 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

From: Cong Wang <amwang <at> redhat.com>
Date: Thu, 27 Sep 2012 16:41:01 +0800

>         In commercial Unix systems, this kind of parameters, such as 
>         tcp_timewait in AIX and tcp_time_wait_interval in HP-UX, have 
>         already been available. Their implementations allow users to tune 
>         how long they keep TCP connection as TIME-WAIT state on the 
>         millisecond time scale."

This statement only makes me happy that these systems are not as
widely deployed as Linux is.

Furthermore, the mere existence of a facility in another system
is never an argument for why we should have it too.  Often it's
instead a huge reason for us not to add it.

Without appropriate confirmation that an early time-wait reuse is
valid, decreasing this interval can only be dangerous.

Cong Wang | 28 Sep 08:39 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

On Thu, 2012-09-27 at 13:05 -0400, David Miller wrote:
> 
> Without appropriate confirmation that an early time-wait reuse is
> valid, decreasing this interval can only be dangerous.

Yeah, would proper documentation cure this? Something like what we did for
other tunables:

"It should not be changed without advice/request of technical experts."

David Miller | 28 Sep 08:44 2012

Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

From: Cong Wang <amwang <at> redhat.com>
Date: Fri, 28 Sep 2012 14:39:59 +0800

> On Thu, 2012-09-27 at 13:05 -0400, David Miller wrote:
>> 
>> Without appropriate confirmation that an early time-wait reuse is
>> valid, decreasing this interval can only be dangerous.
> 
> Yeah, would proper documentation cure this? Something like what we did for
> other tunables:
> 
> "It should not be changed without advice/request of technical experts."

No, we're not adding this facility.
