[Linux-HA] arp not resolving to backup node after takeover
Steven A. Sullam
steve at stevesullam.com
Tue Feb 27 12:06:34 MST 2007
Hi,
I just thought I would throw this out there and see what comes back.
I am using heartbeat with two nodes running Fedora. Everything seems
to be working the way it should.
When I shut down heartbeat on the primary node, the backup node
immediately adds the IP address of the main node and broadcasts ARPs
telling LAN nodes to find the IP address at the new MAC address. Yet
when I access my network from the internet, my router sends me to the
dead main node when it should be sending me to the backup node.
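For reference, here is a minimal sketch of what such a gratuitous ARP broadcast looks like on the wire, assuming the standard Ethernet/IPv4 ARP layout. The function name and the MAC address in the usage note are hypothetical; the target hardware address is set to all ones to match the ffffffffffff argument in the send_arp command in the log further down. This only builds the frame bytes; it is not how heartbeat's send_arp is implemented.

```python
import socket
import struct


def build_gratuitous_arp(ip: str, mac: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`.

    `mac` is the MAC address of the node that now owns `ip`
    (colon-separated hex, e.g. "00:11:22:33:44:55").
    """
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: broadcast destination, our source MAC, EtherType ARP.
    eth = broadcast + mac_bytes + struct.pack("!H", 0x0806)

    # ARP fixed part: Ethernet (1), IPv4 (0x0800), 6-byte MAC, 4-byte IP,
    # opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # Sender MAC/IP followed by target MAC/IP. In a gratuitous ARP the
    # sender IP and target IP are the same address.
    arp += mac_bytes + ip_bytes + broadcast + ip_bytes

    return eth + arp
```

Calling build_gratuitous_arp("192.168.1.10", "00:11:22:33:44:55") returns a 42-byte frame. A device that honors gratuitous ARP should replace its cached MAC for 192.168.1.10 with the sender MAC when it receives this; whether a given consumer router does so is exactly what is in question here.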
How can I see if the ARP table has been updated on my consumer-grade
Linksys router? I am running a program called linksysmon, but this only
shows IP addresses coming in and out.
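As a point of comparison, the ARP cache of any Linux host on the LAN can be checked directly. The sketch below assumes the usual /proc/net/arp column layout (IP address, HW type, Flags, HW address, Mask, Device); the function names are hypothetical. It shows only what that one host has cached and says nothing about the router's own table.

```python
def parse_arp_table(text: str) -> dict:
    """Map each IP address to its cached MAC, given /proc/net/arp-style text."""
    table = {}
    # Skip the header line; each remaining line is one cache entry.
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 6:
            # Column 0 is the IP address, column 3 the hardware address.
            table[fields[0]] = fields[3].lower()
    return table


def read_local_arp_table(path: str = "/proc/net/arp") -> dict:
    """Read and parse this host's ARP cache (Linux only)."""
    with open(path) as f:
        return parse_arp_table(f.read())
```

Running read_local_arp_table().get("192.168.1.10") on a third LAN machine before and after the takeover should show the MAC changing from the main node's to the backup node's if that machine accepted the gratuitous ARP.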
Here is my setup:
backup node
192.168.1.11----------\
172.16.1.2             \
    ^                   \
    |                    \----------------------lan interface--router ------wan interface
    ^                    /
172.16.1.1              /
192.168.1.10-----------/
main node
Here is an excerpt from the log on the backup node when I shut down
heartbeat on the main node. There are entries from both nodes because I
am using remote logging. wolf01 is the backup node.
11:24:41 wolf00 atd: atd shutdown failed
11:24:54 wolf01 heartbeat: [7112]: info: Received shutdown notice from 'mydomain.com'.
11:24:54 wolf01 heartbeat: [7112]: info: Resources being acquired from mydomain.com.
11:24:55 wolf01 heartbeat: [7495]: info: acquire local HA resources (standby).
11:24:55 wolf01 heartbeat: [7496]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys wolf01] to acquire.
11:24:55 wolf01 heartbeat: [7495]: info: local HA resource acquisition completed (standby).
11:24:55 wolf01 heartbeat: [7112]: info: Standby resource acquisition done [foreign].
11:24:55 wolf01 harc[7515]: [7518]: info: Running /etc/ha.d/rc.d/status status
11:24:55 wolf01 mach_down[7521]: [7536]: info: Taking over resource group 192.168.1.10/24/eth0/192.168.1.255
11:24:55 wolf01 ResourceManager[7537]: [7545]: info: Acquiring resource group: mydomain.com 192.168.1.10/24/eth0/192.168.1.255 atd
11:24:55 wolf01 ResourceManager[7537]: [7583]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.10/24/eth0/192.168.1.255 start
11:24:56 wolf01 IPaddr[7585]: [7633]: info: /sbin/ifconfig eth0:0 192.168.1.10 netmask 255.255.255.0 broadcast 192.168.1.255
11:24:56 wolf01 avahi-daemon[2145]: Registering new address record for 192.168.1.10 on eth0.
11:24:56 wolf01 IPaddr[7585]: [7638]: info: Sending Gratuitous Arp for 192.168.1.10 on eth0:0 [eth0]
11:24:56 wolf01 IPaddr[7585]: [7639]: /usr/lib/heartbeat/send_arp -i 500 -r 10 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.1.10 eth0 192.168.1.10 auto 192.168.1.10 ffffffffffff
11:24:56 wolf01 send_arp: [7642]: info: Enable using logging daemon
11:24:56 wolf01 ResourceManager[7537]: [7664]: info: Running /etc/init.d/atd start
11:24:56 wolf01 mach_down[7521]: [7674]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
11:24:56 wolf01 heartbeat: [7112]: info: mach_down takeover complete.
11:24:56 wolf01 mach_down[7521]: [7677]: info: mach_down takeover complete for node mydomain.com.
11:24:58 wolf00 last message repeated 10 times
11:24:58 wolf00 logd: [26666]: info: logd_term_write_action: received SIGTERM
11:24:58 wolf00 logd: [26666]: info: ha_logd: Exiting write process
11:24:58 wolf00 logd: [29041]: info: Waiting for pid=26665 to exit
11:24:59 wolf00 logd: [29041]: info: Pid 26665 exited
11:25:27 wolf01 ipfail: [7136]: info: Status update: Node mydomain.com now has status dead
11:25:27 wolf01 heartbeat: [7112]: WARN: node mydomain.com: is dead
11:25:27 wolf01 heartbeat: [7112]: info: Dead node mydomain.com gave up resources.
11:25:27 wolf01 heartbeat: [7112]: info: Link mydomain.com:eth2 dead.
11:25:27 wolf01 ipfail: [7136]: info: NS: We are still alive!
11:25:28 wolf01 ipfail: [7136]: info: Link Status update: Link mydomain.com/eth2 now has status dead
11:25:28 wolf01 ipfail: [7136]: info: Asking other side for ping node count.
11:25:28 wolf01 ipfail: [7136]: info: Checking remote count of ping nodes.
Thanks much in advance!!