heartbeat and bonded NICs

Matthew Sackman matthew@sackman.co.uk
Thu, 28 Mar 2002 20:33:23 +0000


Hi all.

I've just got myself a 2nd pair of 100base NICs for my two machines
and have successfully got them bonded together. The specs are:
Machine1: PIII, 2.4.18+sched-K3. 192MB, eth0=Intel EthernetPro100,
	  eth1=Realtek 8139
Machine2: PII, 2.4.18+sched-K4. 128MB, eth0=3Com 905C-TX-M,
	  eth1=Realtek 8139

Commands were (the interesting ones anyway):
one$> ifconfig bond0 192.168.1.1 netmask 255.255.255.0
one$> ifenslave -E bond0 eth0 eth1

two$> ifconfig bond0 192.168.1.2 netmask 255.255.255.0
two$> ifenslave -E bond0 eth0 eth1

I couldn't get the bonding to work without the -E option. Dunno
why. Both the Intel and the 3Com are in Bus Master slots.

The network is just a pair of X-overs: Intel <=> 3Com, 
Realtek <=> Realtek

The problem is that when heartbeat starts up, it tries to create
aliases on bond0 instead of eth0. However, I suspect that there
is nothing wrong with heartbeat: it is following its rules
correctly. In haresources:
two 192.168.1.10 mysql
one 192.168.1.11

and in ha.cf:
node one
node two

Of course, in hosts one = 192.168.1.1 and two = 192.168.1.2.
Thus heartbeat is finding 192.168.1.1 on bond0 on one, and
192.168.1.2 on bond0 on two, and aliasing on those interfaces. The
problem is that by aliasing bond0 to bond0:0, eth0 and eth1
don't get aliased and the bonding doesn't get set up.
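
For illustration, the interface lookup described above can be sketched in
shell. This is not heartbeat's actual code: the function names ip_to_int
and find_iface are made up for this example, the input format (one
"name address netmask" triple per line) is an assumption, and it relies on
the shell doing arithmetic in more than 32 bits, as bash does.

```shell
# Convert a dotted-quad IPv4 address to a decimal integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print the first interface whose network contains the target address.
# $1    = address to place (e.g. a service IP from haresources)
# stdin = one "name address netmask" triple per line
find_iface() {
    target=$(ip_to_int "$1")
    while read -r name addr mask; do
        a=$(ip_to_int "$addr")
        m=$(ip_to_int "$mask")
        # Same network if both addresses agree on the masked bits.
        if [ $(( a & m )) -eq $(( target & m )) ]; then
            echo "$name"
            return 0
        fi
    done
    return 1
}
```

Fed the interfaces of machine two in ifconfig order (bond0, then eth0 and
eth1, all 192.168.1.2/255.255.255.0), find_iface 192.168.1.10 picks bond0,
since it is the first interface listed on that network.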

After starting up heartbeat, ifconfig gives:
two$> ifconfig           
bond0     Link encap:Ethernet  HWaddr 00:01:02:10:5B:A7  
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:11915119 errors:256 dropped:510 overruns:939 frame:0
          TX packets:11975358 errors:0 dropped:0 overruns:19 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:3921533499 (3.6 GiB)  TX bytes:4198879121 (3.9 GiB)

bond0:0   Link encap:Ethernet  HWaddr 00:01:02:10:5B:A7  
          inet addr:192.168.1.11  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1

bond0:1   Link encap:Ethernet  HWaddr 00:01:02:10:5B:A7  
          inet addr:192.168.1.10  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1

eth0      Link encap:Ethernet  HWaddr 00:01:02:10:5B:A7  
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:5957975 errors:0 dropped:0 overruns:701 frame:0
          TX packets:5988187 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:1960364532 (1.8 GiB)  TX bytes:2129981466 (1.9 GiB)
          Interrupt:11 Base address:0xe800 

eth1      Link encap:Ethernet  HWaddr 00:01:02:10:5B:A7  
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:5957144 errors:256 dropped:510 overruns:238 frame:0
          TX packets:5987171 errors:0 dropped:0 overruns:19 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:1961168967 (1.8 GiB)  TX bytes:2068897655 (1.9 GiB)
          Interrupt:10 Base address:0xe000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:22308 errors:0 dropped:0 overruns:0 frame:0
          TX packets:22308 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:4184450 (3.9 MiB)  TX bytes:4184450 (3.9 MiB)

Apart from IPs, both machines have entries for bond0:0 and bond0:1. So
what needs to happen is for heartbeat to recognise that a bond is a special
network type and to work out which eths are joined to the bond, then
alias the eths, then alias the bond, and then run ifenslave to get the new
bond to work.

Unfortunately, I don't know where to start, apart from the fact that the
info on the working bond (*after* ifenslave has been issued) is available
in /proc/net/bond0/info:

two$> cat /proc/net/bond0/info 
Bonding Mode: load balancing
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
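
The slave list can be pulled out of that file with a few lines of shell.
A minimal sketch, assuming the file layout shown above; the function names
is_bond and bond_slaves and the optional second argument (an alternative
base directory, handy for testing against a saved copy) are hypothetical
and not part of heartbeat or the bonding driver.

```shell
# Succeed if the named interface has a readable bonding info file.
# $1 = interface name, $2 = base directory (default /proc/net)
is_bond() {
    [ -r "${2:-/proc/net}/$1/info" ]
}

# Print the slave interfaces of a bond, one per line, in file order.
# $1 = bond name, $2 = base directory (default /proc/net)
bond_slaves() {
    awk -F': *' '/^Slave Interface:/ {
        sub(/[ \t]+$/, "", $2)   # drop any trailing whitespace
        print $2
    }' "${2:-/proc/net}/$1/info"
}
```

With the file shown above, bond_slaves bond0 would print eth1 and eth0,
which is the list a bond-aware script would need to work from.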

I guess not a lot of people will need this, but it would be nice for it to
work. If someone could kick me in the right direction then I *may* be able
to do the coding for it, but I'm not hot on C so it may look like a dog's
dinner!

Matthew

-- 

Matthew Sackman
Nottingham
England

BOFH Excuse Board:
You need to upgrade your VESA local bus to a MasterCard local bus.