Dan Miller | 25 Jul 18:40

rtl8139 - another repeating error (??)

We're still running our test system as defined previously: WebStone with 500 connections...  We find that
periodically through the night (about once every 1-3 minutes) we get the following message on the receive side:

eth0: Transmit timeout, status 0d 0000 media 10. 

Is this a sign of something actually failing?  Or is it just an "I'm really busy" type of message?

soooo... if Nasdaq drops below $1, will it delist itself??
Donald Becker | 25 Jul 20:38

Re: rtl8139 - another repeating error (??)

On Thu, 25 Jul 2002, Dan Miller wrote:

> We're still running our test system as defined previously: WebStone
> with 500 connections...  We find that periodically through the night
> (about once every 1-3 minutes) we get the following message on the
> receive side: 
> 
> eth0: Transmit timeout, status 0d 0000 media 10. 

The older driver versions, from over a year ago, had a false-trigger
window for this message on SMP machines.  The interrupt handler could be
delayed long enough for another processor to think that the interrupt
had not been handled.

What version are you using?

-- 
Donald Becker				becker <at> scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993
Dan Miller | 25 Jul 20:44

RE: rtl8139 - another repeating error (??)

We're using version 1.17, from the Scyld website.
We're not in a multiprocessor environment; should this message be controlled by an "#ifdef __SMP" or some
such control??

> -----Original Message-----
> From: Donald Becker [mailto:becker <at> scyld.com]
> Sent: Thursday, July 25, 2002 11:38
> To: Dan Miller
> Cc: realtek-bug <at> scyld.com; David Peavey (Work) (E-mail)
> Subject: Re: [realtek-bug] rtl8139 - another repeating error (??)
> 
> 
> On Thu, 25 Jul 2002, Dan Miller wrote:
> 
> > We're still running our test system as defined previously: WebStone
> > with 500 connections...  We find that periodically through the night
> > (about once every 1-3 minutes) we get the following message on the
> > receive side: 
> > 
> > eth0: Transmit timeout, status 0d 0000 media 10. 
> 
> The older driver versions, from over a year ago, had a false-trigger
> window for this message on SMP machines.  The interrupt handler could be
> delayed long enough for another processor to think that the interrupt
> had not been handled.
> 
> What version are you using?

Donald Becker | 25 Jul 22:08

RE: rtl8139 - another repeating error (??)

On Thu, 25 Jul 2002, Dan Miller wrote:

> We're using version 1.17, from the Scyld website.
> We're not in a multiprocessor environment; should this message be
> controlled by an "#ifdef __SMP" or some such control??

That version should not have the false-trigger window.

What is the link partner?  Is it possible that this is the link beat
dropping? 

Dan Miller | 25 Jul 22:26

RE: rtl8139 - another repeating error (??)

The source code *does* still have the message, in

static void rtl8129_tx_timeout(struct net_device *dev);

The link partner is a Netgear FS108 Switch.  I don't know whether the link beat is dropping, though webstone
*does* put a heavy load on the system (both in amount of traffic and in number of TCP connections) ...

> -----Original Message-----
> From: Donald Becker [mailto:becker <at> scyld.com]
> Sent: Thursday, July 25, 2002 13:09
> To: Dan Miller
> Cc: realtek-bug <at> scyld.com; 'David Peavey (Work) (E-mail)'
> Subject: RE: [realtek-bug] rtl8139 - another repeating error (??)
> 
> 
> On Thu, 25 Jul 2002, Dan Miller wrote:
> 
> > We're using version 1.17, from the Scyld website.
> > We're not in a multiprocessor environment; should this message be
> > controlled by an "#ifdef __SMP" or some such control??
> 
> That version should not have the false-trigger window.
> 
> What is the link partner?  Is it possible that this is the link beat
> dropping? 

Donald Becker | 25 Jul 23:17

RE: rtl8139 - another repeating error (??)

On Thu, 25 Jul 2002, Dan Miller wrote:

> The source code *does* still have the message, in
> static void rtl8129_tx_timeout(struct net_device *dev);

Yes, the message is there.  It is an important check.
The false-trigger window was when the timer-based consistency check
triggered incorrectly after an interrupt was not immediately handled.

Any message from the current driver is reporting a real problem.
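
To make the false-trigger window concrete, here is a minimal user-space
sketch of a timer-based transmit check.  It is an illustration only, not
code from the rtl8139 driver: the struct, its fields, the function names
and the timeout value are all hypothetical.  The naive check fires when
the interrupt handler is merely late; the stricter one also looks at
whether the hardware has finished, which is one possible way to close
such a window and not necessarily what the driver does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-device state; these are not the
 * rtl8139 driver's actual fields. */
struct fake_nic {
    unsigned long trans_start; /* tick at which the last packet was queued */
    unsigned int  tx_pending;  /* packets queued, not yet reaped by the handler */
    bool          hw_tx_done;  /* hardware reports the transmit as complete */
};

/* Assumed timeout in ticks, chosen only for illustration. */
#define TX_TIMEOUT_TICKS 400UL

/* Naive timer check: fires whenever work has been pending too long,
 * even if the hardware finished and only the interrupt handler is late. */
bool tx_timed_out_naive(const struct fake_nic *nic, unsigned long now)
{
    return nic->tx_pending > 0 &&
           (now - nic->trans_start) > TX_TIMEOUT_TICKS;
}

/* Stricter check: also requires that the hardware itself has not
 * completed the transmit, so a late handler alone does not trigger it. */
bool tx_timed_out_checked(const struct fake_nic *nic, unsigned long now)
{
    return tx_timed_out_naive(nic, now) && !nic->hw_tx_done;
}
```

With a packet queued at tick 1000 and the hardware already done, the
naive check reports a timeout at tick 1500 while the stricter check does
not.  If the hardware has not completed the transmit, both report it.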

> The link partner is a Netgear FS108 Switch.  I don't know whether the
> link beat is dropping, though webstone *does* put a heavy load on the
> system (both in amount of traffic and in number of TCP connections)
> ... 

