Bob Hoffman | 10 Feb 00:54

oops, or how to bring a datacenter router down with one setting

so I gave up on bonding.
I found about 300 posts showing eth0 and eth1 both pointing to br0 (bridge)
as interfaces.
I followed them correctly, or so I thought.
I pointed both ethx to the bridge, restarted network and bam...!!!

entire ip block went out.

when I called datacenter they told me the router was under attack and I 
was like 'uh oh' and told them to just shut off my computer I would be 
there to fix it. They did not believe me.
An hour later I was there and deleted the eth1 point to the br0 and all 
was fine.
Meanwhile they were all around the router trying to stop the attack.
(it was just the router for me and others in that room....oops)

I wonder if they will boot me from the center now?
How is it possible that it did that so quickly?
Such an easy way to bring down routers, wow, a hacker could have a field 
day.

Apparently there is more to making two eth ports go to the same bridge 
than a simple point.
I have since tried bridge_ports command as listed online, however that 
must be deprecated.
I think I am just gonna stay with multiple bridges with one eth on each 
for a while until
I can test this stuff in a safe environ.

I never had a chance to recover, the second the network came up I lost 

Tony Mountifield | 10 Feb 11:18

Re: oops, or how to bring a datacenter router down with one setting

In article <4F345CD3.4060604@...>,
Bob Hoffman <bob@...> wrote:
> so I gave up on bonding.
> I found about 300 posts showing eth0 and eth1 both pointing to br0 (bridge)
> as interfaces.
> I followed them correctly, or so I thought.
> I pointed both ethx to the bridge, restarted network and bam...!!!
> 
> entire ip block went out.
> 
> [...]
> 
> Feb  9 04:22:41 main kernel: __ratelimit: 100807 callbacks suppressed
> Feb  9 04:22:41 main kernel: eth1: received packet with own address as 
> source address

I think to do this you also need to be connected to a managed switch
which supports interface bonding. You would have to tell it that the two
switch ports are bonded to the same machine. That should prevent it from
forwarding packets received on one of the ports out via the other port.

The key phrase to look for appears to be "IEEE 802.3ad Dynamic Link
Aggregation".

Cheers
Tony
-- 
Tony Mountifield
Work: tony@... - http://www.softins.co.uk
Play: tony@... - http://tony.mountifield.org

Re: oops, or how to bring a datacenter router down with one setting

On 02/10/2012 11:18 AM, Tony Mountifield wrote:
> In article <4F345CD3.4060604@...>,
> Bob Hoffman <bob@...> wrote:
>> so I gave up on bonding.
>> I found about 300 posts showing eth0 and eth1 both pointing to br0 (bridge)
>> as interfaces.
>> I followed them correctly, or so I thought.
>> I pointed both ethx to the bridge, restarted network and bam...!!!
>>
>> entire ip block went out.
>>
>> [...]
>>
>> Feb  9 04:22:41 main kernel: __ratelimit: 100807 callbacks suppressed
>> Feb  9 04:22:41 main kernel: eth1: received packet with own address as
>> source address
>
> I think to do this you also need to be connected to a managed switch
> which supports interface bonding. You would have to tell it that the two
> switch ports are bonded to the same machine. That should prevent it from
> forwarding packets received on one of the ports out via the other port.
>
> The key phrase to look for appears to be "IEEE 802.3ad Dynamic Link
> Aggregation".

Yes, Linux supports LACP, but it's just one of the possible bonding modes.
Several of the other modes work without special switch support, e.g.
"active-backup" uses only one port at a time, and the other only comes
into play when the first one fails.
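
For reference, an active-backup bond on CentOS might look roughly like the
sketch below. This is not from the thread: the device names (eth0, eth1,
bond0), the miimon=100 value and the 192.0.2.10 address are all placeholder
assumptions, and older releases may want the bonding options in
modprobe.conf rather than BONDING_OPTS. The files are written to a scratch
directory instead of /etc/sysconfig/network-scripts so nothing live is
touched.

```shell
#!/bin/sh
# Sketch only: writes example ifcfg files to a scratch directory.
# Review them, then copy to /etc/sysconfig/network-scripts by hand.
OUT=$(mktemp -d)

# The bond itself. active-backup needs no switch-side configuration.
# Address and miimon value are placeholders.
cat > "$OUT/ifcfg-bond0" <<'EOF'
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.0.2.10
NETMASK=255.255.255.0
NM_CONTROLLED=no
BONDING_OPTS="mode=active-backup miimon=100"
EOF

# Each physical NIC is enslaved to the bond and carries no IP of its own.
for nic in eth0 eth1; do
    cat > "$OUT/ifcfg-$nic" <<EOF
DEVICE=$nic
ONBOOT=yes
BOOTPROTO=none
NM_CONTROLLED=no
MASTER=bond0
SLAVE=yes
EOF
done

echo "example files written to $OUT"
```

Once a bond like this is up, /proc/net/bonding/bond0 should show which
slave is currently active.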


Re: oops, or how to bring a datacenter router down with one setting

On 02/10/2012 12:54 AM, Bob Hoffman wrote:
> so I gave up on bonding.
> I found about 300 posts showing eth0 and eth1 both pointing to br0 (bridge)
> as interfaces.
> I followed them correctly, or so I thought.
> I pointed both ethx to the bridge, restarted network and bam...!!!

Bonding and bridging are completely different things. If you want to start 
bonding then you should first start with simply bonding the two interfaces 
and only once you got that going add the bridge and then add the bond0 
device to it.
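
The order described above (bond first, then the bridge on top) could look
roughly like this on CentOS. A hedged sketch only: the device names, the
192.0.2.10 address and miimon=100 are placeholders of mine, not anything
from the thread, and the files land in a scratch directory so a typo cannot
take the network down.

```shell
#!/bin/sh
# Sketch only: example files go to a scratch directory, not the live config.
OUT=$(mktemp -d)

# The bridge owns the IP address (placeholder); guests attach to br0.
cat > "$OUT/ifcfg-br0" <<'EOF'
DEVICE=br0
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.0.2.10
NETMASK=255.255.255.0
NM_CONTROLLED=no
DELAY=0
EOF

# The bond is the ONLY port added to the bridge.
cat > "$OUT/ifcfg-bond0" <<'EOF'
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
NM_CONTROLLED=no
BONDING_OPTS="mode=active-backup miimon=100"
BRIDGE=br0
EOF

# eth0 and eth1 point at the bond, never at br0 directly.
for nic in eth0 eth1; do
    printf 'DEVICE=%s\nONBOOT=yes\nBOOTPROTO=none\nNM_CONTROLLED=no\nMASTER=bond0\nSLAVE=yes\n' \
        "$nic" > "$OUT/ifcfg-$nic"
done

# Sanity check: exactly one file should name br0 as its bridge.
grep -l '^BRIDGE=br0$' "$OUT"/ifcfg-*
```

The point of the layout is that the upstream switch only ever sees one
logical link from br0, so the bridge cannot forward between the two NICs.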

Regards,
   Dennis

Bob Hoffman | 10 Feb 14:54

Re: oops, or how to bring a datacenter router down with one setting


---------------------------------------------------------
Dennis Jacobfeuerborn wrote
Fri Feb 10 06:47:22 EST 2012

On 02/10/2012 12:54 AM, Bob Hoffman wrote:
> so I gave up on bonding.
> I found about 300 posts showing eth0 and eth1 both pointing to br0 (bridge)
> as interfaces.
> I followed them correctly, or so I thought.
> I pointed both ethx to the bridge, restarted network and bam...!!!

Bonding and bridging are completely different things. If you want to start
bonding then you should first start with simply bonding the two interfaces
and only once you got that going add the bridge and then add the bond0
device to it.

Regards,
    Dennis

-----------------------------------------------------------

Yea, I gave up on bonding, ended up just using eth1. But every tutorial 
I found had added eth0 and eth1 as interfaces to br0, thus sharing the 
bridge so to speak.
All the tutorials were for debian though, all the centos ones ended up 
pointing each eth to a different bridge (br0 and br1).
So I tried it....bam, took down router in less than a second.

I did not add a domain= setting in the bridge though. With network 

Janez Kosmrlj | 10 Feb 15:02

Re: oops, or how to bring a datacenter router down with one setting

i have several centos 5.x servers with bonding enabled. And none of them
have any problems.

I used this tutorial:
http://www.howtoforge.com/network_card_bonding_centos

I use mode=6.

On Fri, Feb 10, 2012 at 2:54 PM, Bob Hoffman <bob@...> wrote:

>
> ---------------------------------------------------------
> Dennis Jacobfeuerborn wrote
> Fri Feb 10 06:47:22 EST 2012
>
> On 02/10/2012 12:54 AM, Bob Hoffman wrote:
> > so I gave up on bonding.
> > I found about 300 posts showing eth0 and eth1 both pointing to br0
> > (bridge) as interfaces.
> > I followed them correctly, or so I thought.
> > I pointed both ethx to the bridge, restarted network and bam...!!!
>
> Bonding and bridging are completely different things. If you want to start
> bonding then you should first start with simply bonding the two interfaces
> and only once you got that going add the bridge and then add the bond0
> device to it.
>
> Regards,
>    Dennis

m.roth | 10 Feb 15:18

Re: oops, or how to bring a datacenter router down with one setting

Bob Hoffman wrote:
> Dennis Jacobfeuerborn wrote
> Fri Feb 10 06:47:22 EST 2012
>
> On 02/10/2012 12:54 AM, Bob Hoffman wrote:
>> so I gave up on bonding.
>> I found about 300 posts showing eth0 and eth1 both pointing to br0
>> (bridge) as interfaces.
>> I followed them correctly, or so I thought.
>> I pointed both ethx to the bridge, restarted network and bam...!!!
<snip>
>
> I was not bonding at this time. I am wondering though why the network
> manager overwrites resolv.conf if NM is off, all ifcfg files say
> nm_controlled=no, and chkconfig NetworkManager off was run.

dhcp running? That will update resolv.conf; NM not needed.
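
If DHCP does turn out to be the cause, the usual knob is PEERDNS=no in the
ifcfg file of the interface doing DHCP. A small sketch, assuming that
interface is br0 (my guess; the thread does not say which one it is), and
writing to a scratch directory rather than the live config:

```shell
#!/bin/sh
# Sketch only: writes an example file to a scratch directory.
OUT=$(mktemp -d)

# br0 as the DHCP interface is an assumption, not something from the thread.
cat > "$OUT/ifcfg-br0" <<'EOF'
DEVICE=br0
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=dhcp
NM_CONTROLLED=no
# Keep the DHCP client from rewriting /etc/resolv.conf.
PEERDNS=no
EOF

cat "$OUT/ifcfg-br0"
```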

       mark

Re: oops, or how to bring a datacenter router down with one setting

On 02/10/2012 02:54 PM, Bob Hoffman wrote:
>
> ---------------------------------------------------------
> Dennis Jacobfeuerborn wrote
> Fri Feb 10 06:47:22 EST 2012
>
> On 02/10/2012 12:54 AM, Bob Hoffman wrote:
>> so I gave up on bonding.
>> I found about 300 posts showing eth0 and eth1 both pointing to br0 (bridge)
>> as interfaces.
>> I followed them correctly, or so I thought.
>> I pointed both ethx to the bridge, restarted network and bam...!!!
>
> Bonding and bridging are completely different things. If you want to start
> bonding then you should first start with simply bonding the two interfaces
> and only once you got that going add the bridge and then add the bond0
> device to it.
>
> Regards,
>      Dennis
>
> -----------------------------------------------------------
>
> Yea, I gave up on bonding, ended up just using eth1. But every tutorial
> I found had added eth0 and eth1 as interfaces to br0, thus sharing the
> bridge so to speak.
> All the tutorials were for debian though, all the centos ones ended up
> pointing each eth to a different bridge (br0 and br1)

What are you actually trying to accomplish? You still seem to mix bonding
and bridging willy nilly as if they are somehow related. They are not.

Regards,
   Dennis

Bob Hoffman | 10 Feb 16:25

Re: oops, or how to bring a datacenter router down with one setting

=================================
Dennis Jacobfeuerborn wrote

> Yea, I gave up on bonding, ended up just using eth1. But every tutorial
> I found had added eth0 and eth1 as interfaces to br0, thus sharing the
> bridge so to speak.
> All the tutorials were for debian though, all the centos ones ended up
> pointing each eth to a different bridge (br0 and br1)

What are you actually trying to accomplish? You still seem to mix bonding
and bridging willy nilly as if they are somehow related. They are not.

Regards,
    Dennis
==================================

Nothing at all to do with bonding. Not at all.
eth1 to br0 , eth0 to br0....that's all.
If that is possible, I see no reason for a bond at all.
I just want to make sure if an NIC fails, the other one is still working
while I am asleep and not a care in the world.

Re: oops, or how to bring a datacenter router down with one setting

On 02/10/2012 04:25 PM, Bob Hoffman wrote:
> =================================
> Dennis Jacobfeuerborn wrote
>
> > Yea, I gave up on bonding, ended up just using eth1. But every tutorial
> > I found had added eth0 and eth1 as interfaces to br0, thus sharing the
> > bridge so to speak.
> > All the tutorials were for debian though, all the centos ones ended up
> > pointing each eth to a different bridge (br0 and br1)
>
> What are you actually trying to accomplish? You still seem to mix bonding
> and bridging willy nilly as if they are somehow related. They are not.
>
> Regards,
>      Dennis
> ==================================
>
> Nothing at all to do with bonding. Not at all.
> eth1 to br0 , eth0 to br0....that's all.
> If that is possible, I see no reason for a bond at all.
> I just want to make sure if an NIC fails, the other one is still working
> while I am asleep and not a care in the world.

Bridging doesn't do that. You need bonding for this.

Regards,
   Dennis

Bob Hoffman | 10 Feb 16:53

Re: oops, or how to bring a datacenter router down with one setting


> =================================
> Dennis Jacobfeuerborn wrote
>
> Nothing at all to do with bonding. Not at all.
> eth1 to br0 , eth0 to br0....that's all.
> If that is possible, I see no reason for a bond at all.
> I just want to make sure if an NIC fails, the other one is still working
> while I am asleep and not a care in the world.

Bridging doesn't do that. You need bonding for this.

Regards,
    Dennis

====================================
That may be true, I am no expert at all, but I can find you literally 
hundreds of how-tos out there all specifically adding two or more ethx 
interfaces to the same bridge. hundreds.
So, I thought it would be safe to do.
But obviously it is dangerous or I messed up real well..lol

https://www.google.com/search?q=brctl+eth0+eth1+br0&btnG=Search&oe=utf-8&rls=org.mozilla%3Aen-US%3Aofficial&client=firefox-a&gbv=1&sei=-zs1T47wJJCd0gGctMSaAg

google search with a lot of the how-tos i was following.

Devin Reade | 10 Feb 17:22

Re: oops, or how to bring a datacenter router down with one setting

Bob,

I'd suggest you do some more reading on the purpose behind bonding
and bridging.  It *sounds* like what you functionally need is
to have a server with a single route upstream, not acting as
a gateway, but where you want to be able to take a failure on
one of the upstream network connections without losing connectivity.

If that is true, then look at bonding.

Bridging is typically used if you want to have a machine, perhaps
acting as a transparent firewall, join two physical network segments
as if they are one logical network. It has nothing to do with 
network redundancy.

Note that bonding will only solve the redundancy problem if your
upstream switches are redundant and all the upstream connections
from there are redundant as well.  (Bonding can have other purposes
as well, such as increasing throughput, but I don't think that's
relevant here.)

As an aside (and in case you run into it in your reading), multihoming
is another way to achieve redundancy, but unless you are an expert
(or at least very experienced) in networking including DNS, routing,
and exterior gateway protocols, as well as having your own ASN and
directly assigned network blocks, then Don't Go There.  And this
type of multihoming is typically used only on border gateways.
(Also, if you do multihoming wrong and start flapping then your
peer networks will typically blacklist you and you lose *all* 
connectivity.)

Les Mikesell | 10 Feb 20:49

Re: oops, or how to bring a datacenter router down with one setting

On Fri, Feb 10, 2012 at 9:25 AM, Bob Hoffman <bob@...> wrote:

>
> Nothing at all to do with bonding. Not at all.
> eth1 to br0 , eth0 to br0....that's all.
> If that is possible, I see no reason for a bond at all.
> I just want to make sure if an NIC fails, the other one is still working
> while I am asleep and not a care in the world.
>

I suppose it is possible for a NIC to fail, but I can't recall actually
ever seeing it.  I've seen lots of complicated failover schemes introduce
new problems and their own failure modes though, including a bad cable that
kept flipping the primary/backup links at approximately the same rate that
spanning-tree would let them switch.

-- 
    Les Mikesell
       lesmikesell@...

Devin Reade | 10 Feb 22:33

Re: oops, or how to bring a datacenter router down with one setting

--On Friday, February 10, 2012 01:49:05 PM -0600 Les Mikesell
<lesmikesell@...> wrote:

> I suppose it is possible for a NIC to fail, but I can't recall actually
> ever seeing it.  I've seen lots of complicated failover schemes introduce
> new problems and their own failure modes [...]

+1.

Redundancy is cool.  Redundancy, when needed and properly implemented,
can work and can save your bacon.  However, it is expensive, time
consuming, and significantly increases both the complexity of a
system and the skill needed to analyze problems (or for that matter
predict them and plan for mitigation strategies).  It also needs
to be exercised on a regular basis or, when you need it, you'll 
find that someone has made a bad configuration change that prohibits
failover.

I, also, have not seen a properly tested NIC fail in quite a few years.
(I'm discounting bad NIC models that don't pass evaluation.) Of course,
just because I've not seen it doesn't mean it can't happen, but I also
don't usually worry about having a redundant SERIAL back-channel for
cluster heartbeat operations, which used to be considered as the only
reasonable way to do things.

I do have clusters where bonding is in use but those have helped not so
much in avoiding NIC failures as they do in allowing the machines
to continue operating as the network team brings down part of the
redundant switch network for maintenance (or to replace a failed switch,
or when some fool decides that they can unplug a network cable
briefly so that they can move other cables around).

m.roth | 10 Feb 22:40

Re: oops, or how to bring a datacenter router down with one setting

Devin Reade wrote:
<snip>
> I do have clusters where bonding is in use but those have helped not so
> much in avoiding NIC failures as they do in allowing the machines
> to continue operating as the network team brings down part of the
> redundant switch network for maintenance (or to replace a failed switch,
> or when some fool decides that they can unplug a network cable
> briefly so that they can move other cables around).
>
Now wait a minute - I would dearly love to disconnect some cables we have
in a shared rack downstairs in the datacenter - it's a rats' nest, and
more than half ain't ours, and every single time I have to do something in
the back, I'm deathly afraid I'm going to pull out somebody's power,
or....

        mark

Devin Reade | 10 Feb 22:49

Re: oops, or how to bring a datacenter router down with one setting

--On Friday, February 10, 2012 04:40:59 PM -0500 m.roth@... wrote:

> Devin Reade wrote:
> <snip>
>> or when some fool decides that they can unplug a network cable
>> briefly so that they can move other cables around).
>> 
> Now wait a minute - I would dearly love to disconnect some cables we have
> in a shared rack downstairs in the datacenter [...]

My complaint is not with moving cables, it's in doing so without
having proper change control.

Clean data centers == good
Arbitrarily moving hardware without planning and authorization == bad

Devin

m.roth | 10 Feb 23:02

Re: oops, or how to bring a datacenter router down with one setting

Devin Reade wrote:
> --On Friday, February 10, 2012 04:40:59 PM -0500 m.roth@... wrote:
>
>> Devin Reade wrote:
>> <snip>
>>> or when some fool decides that they can unplug a network cable
>>> briefly so that they can move other cables around).
>>>
>> Now wait a minute - I would dearly love to disconnect some cables we
>> have in a shared rack downstairs in the datacenter [...]
>
> My complaint is not with moving cables, it's in doing so without
> having proper change control.
>
Change control? We're talking about a datacenter that provides racks,
power, and connectivity, and the responsible folks from various Institutes
get to rack, connect, etc them all....

> Clean data centers == good
> Arbitrarily moving hardware without planning and authorization == bad

The racks are locked, so no one who doesn't have access to the rack can do
anything, but this shared rack!

       mark

Les Mikesell | 10 Feb 22:58

Re: oops, or how to bring a datacenter router down with one setting

On Fri, Feb 10, 2012 at 3:40 PM, <m.roth@...> wrote:

> Devin Reade wrote:
> <snip>
> > I do have clusters where bonding is in use but those have helped not so
> > much in avoiding NIC failures as they do in allowing the machines
> > to continue operating as the network team brings down part of the
> > redundant switch network for maintenance (or to replace a failed switch,
> > or when some fool decides that they can unplug a network cable
> > briefly so that they can move other cables around).
> >
> Now wait a minute - I would dearly love to disconnect some cables we have
> in a shared rack downstairs in the datacenter - it's a rats' nest, and
> more than half ain't ours, and every single time I have to do something in
> the back, I'm deathly afraid I'm going to pull out somebody's power,
> or....
>

Do you really want to double the size of the mess to make it a little safer
to move one thing?   Redundant power connections normally do work with only
a little attention to grounding and that the connections really do go to
separate circuits/UPSs.  But with NICs, you have to be very careful that
the switch ports are configured to match so you are even more likely to
break things by moving them around.    It's not impossible, but rarely
worthwhile if you don't need the combined bandwidth.   But the real lesson
here is to not do something for the first time in a place where mistakes
will cause big trouble.

-- 
   Les Mikesell

Gordon Messmer | 14 Feb 01:11

Re: oops, or how to bring a datacenter router down with one setting

On 02/10/2012 05:54 AM, Bob Hoffman wrote:
> Yea, I gave up on bonding, ended up just using eth1. But every tutorial
> I found had added eth0 and eth1 as interfaces to br0, thus sharing the
> bridge so to speak.

Those tutorials were documenting the manner in which you can set up a 
transparent Linux firewall.  That's not what you want to do with a KVM 
server.

Creating an Ethernet bridge and adding two interfaces to it effectively 
makes a Linux host into a two-port switch with firewalling.

If you connect multiple ports from one switch to ports on a second 
switch (two bridged Linux Ethernet ports to a switch) you create a 
switch loop.  Switch loops will endlessly replay broadcast traffic (such 
as ARP), creating a broadcast storm.
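
A quick way to spot this setup before it bites is to count how many
physical NICs are ports of the same bridge. The sketch below is only an
illustration: the function names are made up, and it relies on physical
NICs having a "device" link under /sys/class/net/<nic>/ while bonds and
tap/vnet devices do not. SYSFS can be pointed at a fake tree to try it out
safely.

```shell
#!/bin/sh
# Sketch: warn when a Linux bridge has more than one physical NIC as a port.
# SYSFS is overridable so the check can be tried against a fake tree.
SYSFS=${SYSFS:-/sys}

# Count bridge ports backed by real hardware (they have a "device" link).
physical_ports() {
    count=0
    for port in "$SYSFS/class/net/$1/brif"/*; do
        [ -e "$port" ] || continue
        nic=$(basename "$port")
        if [ -e "$SYSFS/class/net/$nic/device" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Print a verdict for one bridge; returns 1 when a loop is possible.
check_bridge() {
    n=$(physical_ports "$1")
    if [ "$n" -gt 1 ]; then
        echo "WARNING: $1 has $n physical ports; this can loop the upstream switch"
        return 1
    fi
    echo "OK: $1 has $n physical port(s)"
}

# Only check the live system when a bridge name is given, e.g. "sh check.sh br0".
if [ -n "${1:-}" ]; then
    check_bridge "$1"
fi
```

A bridge whose only port is bond0 passes this check, since the bond itself
is not a physical device.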

Yes, that can consume all of a router's CPU cycles quite easily.  That 
is why data centers should always run spanning tree on their switches. 
STP will shut off ports that get looped.

Lamar Owen | 10 Feb 21:01

Re: oops, or how to bring a datacenter router down with one setting


On Feb 9, 2012, at 6:54 PM, Bob Hoffman wrote:
> entire ip block went out.
>
> when I called datacenter they told me the router was under attack and I
> was like 'uh oh' and told them to just shut off my computer I would be
> there to fix it. They did not believe me.
> An hour later I was there and deleted the eth1 point to the br0 and all
> was fine.
> Meanwhile they were all around the router trying to stop the attack.
> (it was just the router for me and others in that room....oops)
>
> I wonder if they will boot me from the center now?
> How is it possible that it did that so quickly?
> Such an easy way to bring down routers, wow, a hacker could have a field
> day.

If you weren't running spanning tree on your Linux bridge, and their
switch ports aren't sending you BPDUs for STP, then you found out
what happens when you activate a bridging loop (from the point of view of
the switch, not the Linux bridge). Been there, done that.
Most monitoring tools are written to track layer-3 happenings, and
this is happening at layer 2. And it will take down that whole layer-2
broadcast domain, that's for sure.
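
For what it's worth, spanning tree on a Linux bridge can be turned on with
brctl. The sketch below assumes the bridge is called br0 and only prints
the commands unless RUN=1 is set; the run helper is made up for
illustration. Whether it helps depends on the other side too: some
providers configure their ports to shut down on receiving BPDUs, so ask
before trying it in a datacenter.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them unless RUN=1.
BR=${BR:-br0}   # assumed bridge name

# Hypothetical dry-run wrapper: echo by default, execute when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

run brctl stp "$BR" on      # enable spanning tree on the Linux bridge
run brctl showstp "$BR"     # inspect per-port state (forwarding/blocking)
```

On CentOS the persistent equivalent should be STP=on in the bridge's ifcfg
file, which the network scripts pass to brctl when the bridge comes up.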

And since many, if not most, tools are working at layer 3 and dealing  
with IP flows and not actual ethernet traffic, none of the typical  