erik quanstrom | 18 Aug 2012 22:11

tcp!

since it came up, i put my working copy of tcp along with some testing
scripts in /n/sources/contrib/quanstro/tcp.

there are a number of fixes rolled into this, but the main fixes are
- add support for new reno,
- properly handle zero-window probes (on both ends),
- don't confuse the cwind with the receiver's advertised window.  this
particular condition can lead to livelock.
- don't confuse the window scale with the amount of local buffering
we'd like to do.
- and, don't queue tcp infinitely, which can crash kernels.  :-)
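
The distinction between cwind and the receiver's advertised window can be sketched as below. This is a minimal illustration of the standard rule that a sender is bounded by the smaller of the two windows, not code from the patch; the function name usable and its parameters are hypothetical.

```c
/*
 * Sketch only: usable() and its parameter names are hypothetical,
 * not taken from the tcp source discussed in this thread.
 *
 * cwnd     congestion window, maintained by the sender
 * rcvwnd   window most recently advertised by the receiver
 * inflight bytes sent but not yet acknowledged
 */
unsigned long
usable(unsigned long cwnd, unsigned long rcvwnd, unsigned long inflight)
{
	unsigned long wnd;

	/* the two windows are independent limits; the smaller one wins */
	wnd = cwnd < rcvwnd ? cwnd : rcvwnd;
	if(inflight >= wnd)
		return 0;	/* nothing more may be sent right now */
	return wnd - inflight;
}
```

With a zero advertised window the result is 0 regardless of cwnd, which is the case where a sender has to fall back on zero-window probes.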

i don't have the numbers for the old tcp handy, but i think you'll
be surprised at how much difference there can be.  i saw differences
of 20x when the sender was limited in how fast it could read by the
read rate from user space.

i've included "testscript."  for the two machines i have handy, i get
the following results with new and old tcp.

machine        stack  kernel  0ms delay  1ms delay
ideal          -      386     unlimited  8.19mb/s

xeon x5550     old    386     138mb/s    0.49mb/s  (!)
intel atom     old    386     37.2mb/s   0.10mb/s

amd x4 964     new    386     145mb/s    8.03mb/s
intel e31220   new    amd64   303mb/s    8.15mb/s
intel atom     new    386     67mb/s     8.03mb/s
	# note: i can get up to 80mb/s using forsyth's qmalloc.
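
The post does not say how the "ideal" row was computed. One reading consistent with the numbers, offered here only as an assumption, is a single 8192-byte transfer per 1ms of delay. The function name idealrate is hypothetical.

```c
/*
 * Sketch only: assumes the ideal figure means nbytes delivered once
 * per delayms milliseconds.  The thread does not confirm this.
 * Returns bytes per second, or -1 when there is no delay and the
 * model places no limit on the rate.
 */
long
idealrate(long nbytes, long delayms)
{
	if(delayms <= 0)
		return -1;
	return nbytes * 1000 / delayms;
}
```

Under that assumption idealrate(8192, 1) gives 8192000 bytes per second, which rounds to the 8.19mb/s in the table.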

erik quanstrom | 18 Aug 2012 22:26

Re: tcp!

> - add support for new reno,

i apologize for not mentioning that the new reno work
was part of the nix/9k tcp.  i'm not sure who wrote it.

sorry!

also i forgot to mention that this version of qread can
potentially cut the number of reads on tcp channels by up
to 1/2.  one might as well completely satisfy the read,
if possible.  especially since typical iounits (8192) do not
divide up into typical mss-sized (1460) packets evenly.

[...]
	/* if we get here, there's at least one block in the queue */
	if(q->state & Qcoalesce){
		/* when coalescing, 0 length blocks just go away */
		b = q->bfirst;
		if(BLEN(b) <= 0){
			freeb(qremove(q));
			goto again;
		}

		/*
		 * grab the first block and as many following
		 * blocks as will partially fit in the read
		 */
		n = 0;
		l = &first;
		for(;;) {
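
The difference between taking only blocks that fit completely and also taking a partial block can be modeled roughly as follows. This is a simplified user-space model, not the kernel's qread: it assumes a queue of nblk equal-sized blocks and a reader that always asks for len bytes. The names countreads, Complete and Partial are hypothetical.

```c
enum {
	Complete,	/* take only following blocks that fit completely */
	Partial,	/* also take part of the next block to fill the read */
};

/*
 * Sketch only: count the reads of len bytes needed to drain
 * nblk queued blocks of blen bytes each under the given policy.
 */
long
countreads(long nblk, long blen, long len, int policy)
{
	long head, n, reads;

	if(nblk <= 0 || blen <= 0 || len <= 0)
		return 0;
	head = blen;	/* bytes left in the first queued block */
	reads = 0;
	while(nblk > 0){
		n = 0;	/* bytes gathered for this read */
		for(;;){
			if(head > len - n){
				/* block does not fit; the first block is always split */
				if(n == 0 || policy == Partial){
					head -= len - n;
					n = len;
				}
				break;
			}
			n += head;	/* whole block consumed */
			if(--nblk == 0)
				break;
			head = blen;
			if(n == len)
				break;
		}
		reads++;
	}
	return reads;
}
```

For 100 blocks of 1460 bytes and 8192-byte reads, the model gives 20 reads with Complete (5 blocks, 7300 bytes, per read) and 18 with Partial. The saving approaches one half when a block is just over half the read size: 100 blocks of 4097 bytes take 100 reads with Complete and 51 with Partial.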

Richard Miller | 19 Aug 2012 15:37

Re: tcp!

> also i forgot to mention that this version of qread can
> potentially cut the number of reads on tcp channels by up
> to 1/2.  one might as well completely satisfy the read,
> if possible.

This looks like a good idea for tcp.  But there are other
users of qread, with stricter assumptions.  Aren't you in danger
of breaking the contract of pipe(3) which uses qwrite/qread:

          Writes are atomic up to a certain size, typically 32768
          bytes, that is, each write will be delivered in a single
          read by the recipient, provided the receiving buffer is
          large enough.

To preserve the atomicity of qread/qwrite, maybe tcp should be
coalescing the blocks itself by multiple calls to qread.

cinap_lenrek | 19 Aug 2012 15:55

Re: tcp!

it's only done on queues that have this flag set, i think:

Qcoalesce	= (1<<4),	/* coallesce packets on read */

--
cinap

Richard Miller | 19 Aug 2012 16:05

Re: tcp!

> it's only done on queues that have this flag set, i think:

... and it won't be set for pipes, of course.  Sorry Erik, I should
have studied this more carefully.

I'll try it.

erik quanstrom | 19 Aug 2012 17:07

Re: tcp!

> ... and it won't be set for pipes, of course.  Sorry Erik, I should
> have studied this more carefully.
> 
> I'll try it.

no problems.  i'm glad you're double-checking.  nobody i know is immune
from error.  and there's me, myself and i.  so i am 3x as likely to screw up.

i'd be curious to know if this makes a noticeable difference on slower machines
like the π with tcptest to self.

- erik

erik quanstrom | 19 Aug 2012 16:48

Re: tcp!

> This looks like a good idea for tcp.  But there are other
> users of qread, with stricter assumptions.  Aren't you in danger
> of breaking the contract of pipe(3) which uses qwrite/qread:
> 
>           Writes are atomic up to a certain size, typically 32768
>           bytes, that is, each write will be delivered in a single
>           read by the recipient, provided the receiving buffer is
>           large enough.

this change only applies to Qcoalesce queues.

the only users of Qcoalesce are the kprintoq and tcp.  both
should be okay with this change.  

; g qopen port/devpipe.c
port/devpipe.c:68: 	p->q[0] = qopen(conf.pipeqsize, 0, 0, 0);
port/devpipe.c:73: 	p->q[1] = qopen(conf.pipeqsize, 0, 0, 0);

- erik
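
The point of the grep output above is that devpipe passes 0 as the second argument to qopen, so Qcoalesce is never set on pipe queues. A toy model of that check follows. The Queue struct here is a one-field stand-in, and mkqueue and maycoalesce are hypothetical helpers rather than kernel functions; it also assumes that qopen's second argument becomes the queue's initial state flags. Only the Qcoalesce value is taken from the thread.

```c
enum {
	Qcoalesce	= (1<<4),	/* value quoted earlier in the thread */
};

/* stand-in for the kernel's Queue; only the flag word is modeled */
typedef struct Queue Queue;
struct Queue {
	int	state;
};

/* hypothetical stand-in for qopen: flags become the initial state */
Queue
mkqueue(int flags)
{
	Queue q;

	q.state = flags;
	return q;
}

/* may a read on this queue merge data from several blocks? */
int
maycoalesce(Queue *q)
{
	return (q->state & Qcoalesce) != 0;
}
```

In this model a queue made with mkqueue(0), as for a pipe, never coalesces, so each write is still delivered in a single read; a queue made with mkqueue(Qcoalesce) takes the coalescing path.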

Richard Miller | 19 Aug 2012 15:17

Re: tcp!

Within the last month or so I've been having trouble copying large
files to remote servers e.g. sources.  The cp process hangs for
many minutes and eventually ends in 'mount rpc error'.  I was
hoping this tcp patch might solve it, but alas no.

Has anyone else been observing this?

erik quanstrom | 19 Aug 2012 17:43

Re: tcp!

On Sun Aug 19 09:19:23 EDT 2012, 9fans <at> hamnavoe.com wrote:
> Within the last month or so I've been having trouble copying large
> files to remote servers e.g. sources.  The cp process hangs for
> many minutes and eventually ends in 'mount rpc error'.  I was
> hoping this tcp patch might solve it, but alas no.

could you send a snoopy capture?  -M100 and just the tail should
be good enough.  also a capture of /net/log with 'set tcp' during the
issue could be helpful. also, could you point to a particular large
file on sources?  i'd like to try to replicate.

- erik

Richard Miller | 21 Aug 2012 20:32

Re: tcp!

I reported:

> Within the last month or so I've been having trouble copying large
> files to remote servers e.g. sources.  The cp process hangs for
> many minutes and eventually ends in 'mount rpc error'.

Thanks to a hint from Erik ("... an mss problem of some sort"), I've
managed to make the problem go away, by doing
  echo mtu 1496 >/net/ipifc/1/ctl

I hope to come back to this when I have more time, because I don't
like not understanding why this works.  As nobody else has said they
have the same trouble, there may be something amiss in my adsl gateway.
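
The link between an interface mtu and the tcp mss is plain header arithmetic, sketched below. The function name mssfor is hypothetical. Whether a given mtu setting already includes the 14-byte ethernet header depends on the stack, so the caller passes that in as linkhdr; no claim is made here about which case applies to the command above.

```c
enum {
	Ip4hdr		= 20,	/* IPv4 header without options */
	Tcphdr		= 20,	/* tcp header without options */
	Etherhdr	= 14,	/* ethernet header */
};

/*
 * Sketch only: largest tcp payload per packet for a given mtu.
 * linkhdr is 0 if the mtu counts only the IP packet, or the link
 * header size if the mtu counts the whole frame.
 */
int
mssfor(int mtu, int linkhdr)
{
	int mss;

	mss = mtu - linkhdr - Ip4hdr - Tcphdr;
	return mss > 0 ? mss : 0;
}
```

mssfor(1500, 0) gives the 1460 mentioned earlier in the thread. An mtu of 1496 gives 1456 if it counts only the IP packet, or 1442 if it also counts the ethernet header.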

Gorka Guardiola | 22 Aug 2012 19:18

Re: tcp!

I had this problem several years ago with an adsl router (9fans
archive may know about this). There was a bug in my adsl router
(which seems to be common, I have seen it since more than once) that
dropped ethernet frames of size greater than 1480 (someone counted a
header twice probably). Linux adapts the mss to 1480 if there are
problems so it works in this case.

G.

On Aug 21, 2012, at 8:32 PM, Richard Miller <9fans <at> hamnavoe.com> wrote:

> I reported:
> 
>> Within the last month or so I've been having trouble copying large
>> files to remote servers e.g. sources.  The cp process hangs for
>> many minutes and eventually ends in 'mount rpc error'.
> 
> Thanks to a hint from Erik ("... an mss problem of some sort"), I've
> managed to make the problem go away, by doing
>  echo mtu 1496 >/net/ipifc/1/ctl
> 
> I hope to come back to this when I have more time, because I don't
> like not understanding why this works.  As nobody else has said they
> have the same trouble, there may be something amiss in my adsl gateway.
> 
> 

Steven Stallion | 22 Aug 2012 20:29

Re: tcp!

On Wed, Aug 22, 2012 at 10:18 AM, Gorka Guardiola <paurea <at> gmail.com> wrote:
> I had this problem several years ago with an adsl router (9fans
> archive may know about this). There was a bug in my adsl router
> (which seems to be common, I have seen it since more than once) that
> dropped ethernet frames of size greater than 1480 (someone counted a
> header twice probably). Linux adapts the mss to 1480 if there are
> problems so it works in this case.

Not so much a bug as ATM overhead.
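
For readers unfamiliar with the overhead in question, the AAL5 cell arithmetic can be sketched as below. ATM carries 48 payload bytes in each 53-byte cell, and AAL5 appends an 8-byte trailer and pads to a whole number of cells. The function names are hypothetical, and encap (the bytes added by whatever encapsulation the adsl link uses) is a parameter because the thread does not say which one is in play.

```c
enum {
	Cellsize	= 53,	/* atm cell, header included */
	Cellpayload	= 48,	/* payload bytes per cell */
	Aal5trailer	= 8,	/* aal5 trailer appended to each packet */
};

/* sketch only: cells needed to carry one packet of pktlen bytes */
long
aal5cells(long pktlen, long encap)
{
	/* round up to a whole number of cells */
	return (pktlen + encap + Aal5trailer + Cellpayload - 1) / Cellpayload;
}

/* sketch only: bytes on the wire for that packet */
long
aal5wire(long pktlen, long encap)
{
	return aal5cells(pktlen, encap) * Cellsize;
}
```

With no extra encapsulation, a 1480-byte packet plus the trailer is 1488 bytes, exactly 31 cells, while a 1500-byte packet needs 32 cells. This only illustrates the arithmetic; it does not show what the routers discussed above actually do.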

