[Linux-HA] Heartbeat kills my lo:0, and therefore my LVS-DR
Nic Pottier
nicpottier at yahoo.com
Fri Dec 8 19:45:39 MST 2006
A bit more digging finally yielded a solution, the key being to put my
arp-answer resource before the IPaddr resource:
n01-bot.trileet.com arp-answer 192.168.0.120 ldirectord::ldirectord.cf
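The ordering matters because, as far as I can tell, heartbeat starts the resources on a haresources line left to right and stops them in the reverse order, so a resource listed before the VIP gets its stop call after IPaddr has already taken lo:0 down. A minimal sketch of a sanity check for that ordering follows. The function name `arp_answer_before_vip` is made up for illustration and is not part of heartbeat, and the patterns it accepts for the VIP entry (bare address, address/netmask/interface, `IPaddr::` or `IPaddr2::` prefix) are my assumption about the forms one is likely to use:

```shell
#!/bin/bash
# Hypothetical helper (not part of heartbeat): succeeds only when the
# arp-answer resource appears before the VIP on one haresources line.
#   $1 = a single haresources line, $2 = the VIP
arp_answer_before_vip() {
    local line="$1" vip="$2" word pos=0 arp_pos=0 vip_pos=0
    local -a words
    # split the line on whitespace without glob expansion
    read -r -a words <<< "$line"
    for word in "${words[@]}"; do
        pos=$((pos + 1))
        case "$word" in
            arp-answer)
                # remember the first position of the arp-answer resource
                [ "$arp_pos" -eq 0 ] && arp_pos=$pos
                ;;
            "$vip"|"$vip"/*|IPaddr::"$vip"*|IPaddr2::"$vip"*)
                # remember the first position of the VIP resource
                [ "$vip_pos" -eq 0 ] && vip_pos=$pos
                ;;
        esac
    done
    # both must be present, and arp-answer must come first
    [ "$arp_pos" -gt 0 ] && [ "$vip_pos" -gt 0 ] && [ "$arp_pos" -lt "$vip_pos" ]
}
```

Called with the line above and 192.168.0.120 it succeeds; called with my earlier haresources line (VIP first, arp-answer second) it fails.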
For anybody who might find such a thing useful, I've included my
arp-answer resource script which takes care of the various requirements
for LVS-DR setup on RHEL or CentOS.
-Nic
#!/bin/bash
# arp-answer
#
# Handles the various init procedures required on RHEL4 systems to
# use LVS-DR.
#
# Specifically:
# - enables forwarding via sysctl
# - kills lo:0 before starting heartbeat to avoid complaints
# - handles turning on/off arp responses via arptables
# - brings up lo:0 when a node is NOT the director so it can be a
# real server.
#
# You'll want to list this resource before IPaddr or IPaddr2 in your
# haresources, as otherwise the lo:0 link will be killed by IPaddr.
# ie: node1 arp-answer 192.168.0.120 ldirectord::ldirectord.cf
#
# You can also use this script as the /etc/ha.d/resource.d/startstop
# script to make sure the system is in an ok state when heartbeat starts
# and a valid LVS-DR node when heartbeat stops. (IE, has a VIP on lo:0
# and doesn't answer arps)
#_____________________________________________________________________
# !!!!! You must set the VIP address to use here !!!!!
#---------------------------------------------------------------------
VIP=192.168.0.120
host=`/bin/hostname`
case "$1" in
    # if linked as startstop this is triggered when heartbeat first loads
    pre-start)
        # turn on forwarding (assumes sysctl.conf has been modified to
        # have net.ipv4.ip_forward = 1)
        /sbin/sysctl -p
        # give up lo:0
        /sbin/ifdown lo:0
        # don't answer arps yet
        /etc/ha.d/rc.d/arptables-noarp-addr_giveip "$VIP"
        ;;
    # called when this resource becomes active
    start)
        # give up lo:0
        /sbin/ifdown lo:0
        # we want to answer arps, do so
        /etc/ha.d/rc.d/arptables-noarp-addr_takeip "$VIP"
        ;;
    # startstop hook, noop
    post-start)
        ;;
    # startstop hook, noop
    pre-stop)
        ;;
    # called either when heartbeat exits completely, or a resource
    # is taken down by heartbeat.
    stop|post-stop)
        # bring up lo:0 again, this will be our loopback with our VIP
        /sbin/ifup lo:0
        # but do not answer arps anymore
        /etc/ha.d/rc.d/arptables-noarp-addr_giveip "$VIP"
        ;;
    # required by the resource contract
    status)
        # make sure that eth0:0 is up AND lo:0 is down
        islothere=`/sbin/ifconfig lo:0 | grep "$VIP"`
        iseththere=`/sbin/ifconfig eth0:0 | grep "$VIP"`
        # are we ignoring arps on our VIP?
        isarpignore=`/sbin/arptables -L | grep "$VIP"`
        if [ -n "$islothere" ] || [ -n "$isarpignore" ] || [ -z "$iseththere" ]; then
            # eth0:0 isn't there or lo:0 is there or we are ignoring arps
            echo "LVS-DR director Stopped."
        else
            echo "LVS-DR director Running."
        fi
        ;;
    *)
        # Invalid entry.
        echo "$0: Usage: $0 {pre-start|start|post-start|pre-stop|stop|post-stop|status}"
        exit 1
        ;;
esac
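The status branch is the easiest part to get subtly wrong, so here is a minimal sketch of just its decision logic, pulled out into a function that can be exercised without touching any interfaces. The name `director_state` is hypothetical and not part of the script above. Each argument stands for the grep output of one of the three checks, where an empty string means "no match":

```shell
#!/bin/bash
# Hypothetical helper mirroring the decision in the status) branch.
#   $1 = match for the VIP on lo:0      (non-empty: lo:0 holds the VIP)
#   $2 = match for the VIP on eth0:0    (non-empty: eth0:0 holds the VIP)
#   $3 = match for the VIP in arptables (non-empty: arps are ignored)
director_state() {
    local lo_has_vip="$1" eth_has_vip="$2" arp_ignored="$3"
    # the node is a director only if eth0:0 holds the VIP, lo:0 does
    # not, and arps for the VIP are not being ignored
    if [ -n "$lo_has_vip" ] || [ -n "$arp_ignored" ] || [ -z "$eth_has_vip" ]; then
        echo "Stopped"
    else
        echo "Running"
    fi
}
```

Only the combination "nothing on lo:0, VIP on eth0:0, no arptables entry" reports Running; every other combination reports Stopped.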
Nic Pottier wrote:
> Sorry, forgot to include some other relevant info.
>
> The heartbeat log as it kills my poor lo:0
> Dec 8 18:29:24 n02-bot ResourceManager[30021]: info: Running
> /etc/ha.d/resource.d/IPaddr 192.168.0.120/24/eth0 stop
> Dec 8 18:29:24 n02-bot IPaddr[30286]: INFO: /sbin/route -n del -host
> 192.168.0.120
> Dec 8 18:29:24 n02-bot IPaddr[30286]: INFO: /sbin/ifconfig lo:0
> 192.168.0.120 down
> Dec 8 18:29:24 n02-bot IPaddr[30286]: INFO: IP Address 192.168.0.120
> released
> Dec 8 18:29:24 n02-bot IPaddr[30204]: INFO: IPaddr Success
> Dec 8 18:29:24 n02-bot heartbeat: [30011]: info: foreign HA resource
> release completed (standby).
>
> And my haresources:
> n01-bot.trileet.com 192.168.0.120/24/eth0 arp-answer
> ldirectord::/etc/ha.d/ldirectord.cf
>
> Note that 'arp-answer' is a resource I wrote to turn on/off arp requests
> for the VIP as RHEL must use arptables to ignore arp requests on lo.
>
> Thanks again,
>
> -Nic
>
> Nic Pottier wrote:
>>
>> I've been trying to set up a two node LVS-DR setup on CentOS 4.4 such
>> as the one outlined here:
>> <http://koto.ultramonkey.org/3/topologies/sl-ha-lb-eg.html>
>>
>> I feel I have my head wrapped around things rather well after a day of
>> mucking with this and a considerable amount of reading, but I have one
>> last gotcha which despite many workarounds I can't get through.
>>
>> On my secondary node (heartbeat controlling ldirectord), heartbeat
>> insists on killing my loopback with the VIP. This of course prevents
>> that box from accepting any traffic on the VIP, making it unable to
>> answer requests. Worse still, the primary (live) director believes it
>> to be up since the RIP happily answers.
>>
>> This must be something dumb on my part, but I swear I've followed
>> every howto, searched every archive and read every book I could find
>> and I'm not having any luck.
>>
>> If I manually bring up lo:0 AFTER heartbeat kills it, sure enough
>> things work fine.
>>
>> I even tried writing my own resource to bring lo:0 up when Heartbeat
>> releases it, but the killing takes place AFTER the stop so it doesn't
>> do any good.
>>
>> What's the two line answer to this? I know there must be one, but for
>> the life of me I can't figure it out.
>>
>> I'm using the CentOS 4.4 extras packages, namely:
>> heartbeat.x86_64 2.0.7-1.c4 installed
>> heartbeat-ldirectord.x86_64 2.0.7-1.c4 installed
>>
>> I promise I'll write a CentOS howto on this when I figure it out.
>>
>> Many thanks,
>>
>> -Nic
>