[Linux-HA] Heartbeat kills my lo:0, and therefore my LVS-DR

Nic Pottier nicpottier at yahoo.com
Fri Dec 8 19:45:39 MST 2006


A bit more digging finally yielded a solution, the key being to put my 
arp-answer resource before the IPaddr resource:

n01-bot.trileet.com arp-answer 192.168.0.120 ldirectord::ldirectord.cf
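
The ordering matters because heartbeat starts the resources on a haresources line left to right and stops them right to left. With arp-answer listed first, its stop action (which brings lo:0 back up) runs after IPaddr has released the address. Below is a minimal sketch for checking an entry, assuming the VIP appears bare, as VIP/mask/interface, or as IPaddr::VIP. check_order is a hypothetical helper, not part of heartbeat:

```shell
#!/bin/bash
# Sketch: report whether arp-answer is listed before the VIP resource
# on a single haresources line. check_order is a hypothetical helper.
check_order() {
    local line="$1" vip="$2" seen_arp=no field
    # split the line into fields and drop the first one (the node name)
    set -- $line
    shift
    for field in "$@"; do
        case "$field" in
            arp-answer)
                seen_arp=yes
                ;;
            "$vip"|"$vip"/*|IPaddr::"$vip"|IPaddr::"$vip"/*|IPaddr2::"$vip"|IPaddr2::"$vip"/*)
                # first VIP resource found: was arp-answer already seen?
                if [ "$seen_arp" = yes ]; then
                    echo "ok"
                else
                    echo "wrong-order"
                fi
                return
                ;;
        esac
    done
    echo "no-vip"
}

# Usage:
#   check_order "node1 arp-answer 192.168.0.120 ldirectord::ldirectord.cf" 192.168.0.120
```

Run against the original entry quoted further down (VIP first, then arp-answer), the helper reports the problematic order.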

For anybody who might find such a thing useful, I've included my 
arp-answer resource script which takes care of the various requirements 
for LVS-DR setup on RHEL or CentOS.

-Nic
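
The script relies on /sbin/ifup lo:0 and /sbin/ifdown lo:0, which on RHEL or CentOS expect an interface definition for the alias. That file is not shown in this post. A minimal sketch of what it might contain follows, assuming the usual /etc/sysconfig/network-scripts/ifcfg-lo:0 location and a host (255.255.255.255) netmask for the VIP. print_ifcfg_lo0 is a hypothetical helper that only prints the file contents:

```shell
#!/bin/bash
# Sketch: print an ifcfg-lo:0 definition for the given VIP.
# print_ifcfg_lo0 is a hypothetical helper; ONBOOT=yes is an assumption
# (the node should come up as a real server before heartbeat starts).
print_ifcfg_lo0() {
    local vip="$1"
    cat <<EOF
DEVICE=lo:0
IPADDR=$vip
NETMASK=255.255.255.255
ONBOOT=yes
NAME=loopback
EOF
}

# Usage (as root, path assumed):
#   print_ifcfg_lo0 192.168.0.120 > /etc/sysconfig/network-scripts/ifcfg-lo:0
```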

#!/bin/bash
# arp-answer
#
# Handles the various init procedures required on RHEL4 systems to
# use LVS-DR.
#
# Specifically:
#     - enables forwarding via sysctl
#     - kills lo:0 before starting heartbeat to avoid complaints
#     - handles turning on/off arp responses via arptables
#     - brings up lo:0 when a node is NOT the director so it can be a
#       real server.
#
# You'll want to list this resource before IPaddr or IPaddr2 in your
# haresources file, as otherwise the lo:0 link will be killed by IPaddr.
#    e.g.: node1 arp-answer 192.168.0.120 ldirectord::ldirectord.cf
#
# You can also use this script as the /etc/ha.d/resource.d/startstop
# script to make sure the system is in an ok state when heartbeat starts
# and a valid LVS-DR node when heartbeat stops (i.e., has the VIP on lo:0
# and doesn't answer arps).
#_____________________________________________________________________
# !!!!! You must set the VIP address to use here !!!!!
#---------------------------------------------------------------------
VIP=192.168.0.120

host=`/bin/hostname`
case "$1" in

# if linked as startstop this is triggered when heartbeat first loads
pre-start)
         # turn on forwarding (assumes sysctl.conf has been modified to
         # have net.ipv4.ip_forward = 1)
         /sbin/sysctl -p

         # give up lo:0
         /sbin/ifdown lo:0

         # don't answer arps yet
         /etc/ha.d/rc.d/arptables-noarp-addr_giveip $VIP
;;

# called when this resource becomes active
start)
         # give up lo:0
         /sbin/ifdown lo:0

         # we want to answer arps, do so
         /etc/ha.d/rc.d/arptables-noarp-addr_takeip $VIP
;;

# startstop hook, noop
post-start)
;;

# startstop hook, noop
pre-stop)
;;

# called either when heartbeat exits completely, or a resource
# is taken down by heartbeat.
stop|post-stop)
         # bring up lo:0 again, this will be our loopback with our VIP
         /sbin/ifup lo:0

         # but do not answer arps anymore
         /etc/ha.d/rc.d/arptables-noarp-addr_giveip $VIP
;;

# required by the resource contract
status)
         # make sure that eth0:0 is up AND lo:0 is down
         islothere=`/sbin/ifconfig lo:0 2>/dev/null | grep -F "$VIP"`
         iseththere=`/sbin/ifconfig eth0:0 2>/dev/null | grep -F "$VIP"`

         # are we ignoring arps on our VIP?
         isarpignore=`/sbin/arptables -L | grep -F "$VIP"`

         if [ -n "$islothere" ] || [ -n "$isarpignore" ] || [ -z "$iseththere" ]; then
             # eth0:0 isn't there or lo:0 is there or we are ignoring arps
             echo "LVS-DR director Stopped."
         else
             echo "LVS-DR director Running."
         fi
;;
*)
         # Invalid entry.
         echo "$0: Usage: $0 {pre-start|start|post-start|pre-stop|stop|post-stop|status}"
         exit 1
;;
esac
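
The arptables-noarp-addr_giveip and arptables-noarp-addr_takeip helpers called above are not included in this post. As a rough idea of what turning ARP answers off and on could involve, here is a dry-run sketch based on the commonly documented arptables rules for LVS-DR real servers on RHEL. The noarp_rules function, the RIP argument, and the example address 192.168.0.12 are assumptions for illustration, and the real helpers may work differently. The sketch only prints the commands:

```shell
#!/bin/bash
# Dry-run sketch: print the arptables commands rather than running them.
# noarp_rules is a hypothetical helper; RIP is the node's own real IP.
noarp_rules() {
    local action="$1" vip="$2" rip="$3" flag
    case "$action" in
        giveip) flag=-A ;;   # add rules: stop answering ARP for the VIP
        takeip) flag=-D ;;   # delete rules: answer ARP for the VIP again
        *)
            echo "usage: noarp_rules {giveip|takeip} VIP RIP" >&2
            return 1
            ;;
    esac
    # drop incoming ARP requests that ask for the VIP
    echo "/sbin/arptables $flag IN -d $vip -j DROP"
    # rewrite outgoing ARP packets so they carry the RIP, not the VIP
    echo "/sbin/arptables $flag OUT -s $vip -j mangle --mangle-ip-s $rip"
}

# Usage (pipe to sh as root only after reviewing the output):
#   noarp_rules giveip 192.168.0.120 192.168.0.12
```

The DROP rule on the VIP is also what the status) branch would find when it greps the arptables listing.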



Nic Pottier wrote:
> Sorry, forgot to include some other relevant info.
> 
> The heartbeat log as it kills my poor lo:0
> Dec  8 18:29:24 n02-bot ResourceManager[30021]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.0.120/24/eth0 stop
> Dec  8 18:29:24 n02-bot IPaddr[30286]: INFO: /sbin/route -n del -host 192.168.0.120
> Dec  8 18:29:24 n02-bot IPaddr[30286]: INFO: /sbin/ifconfig lo:0 192.168.0.120 down
> Dec  8 18:29:24 n02-bot IPaddr[30286]: INFO: IP Address 192.168.0.120 released
> Dec  8 18:29:24 n02-bot IPaddr[30204]: INFO: IPaddr Success
> Dec  8 18:29:24 n02-bot heartbeat: [30011]: info: foreign HA resource release completed (standby).
> 
> And my haresources:
> n01-bot.trileet.com 192.168.0.120/24/eth0 arp-answer ldirectord::/etc/ha.d/ldirectord.cf
> 
> Note that 'arp-answer' is a resource I wrote to turn ARP replies for the 
> VIP on and off, as RHEL must use arptables to ignore ARP requests on lo.
> 
> Thanks again,
> 
> -Nic
> 
> Nic Pottier wrote:
>>
>> I've been trying to set up a two node LVS-DR setup on CentOS 4.4 such 
>> as the one outlined here:
>> <http://koto.ultramonkey.org/3/topologies/sl-ha-lb-eg.html>
>>
>> I feel I have my head wrapped around things rather well after a day of 
>> mucking with this and a considerable amount of reading, but I have one 
>> last gotcha which despite many workarounds I can't get through.
>>
>> On my secondary node (heartbeat controlling ldirectord), heartbeat 
>> insists on killing my loopback with the VIP.  This of course prevents 
>> that box from accepting any traffic on the VIP, making it unable to 
>> answer requests.  Worse still, the primary (live) director believes it 
>> to be up since the RIP happily answers.
>>
>> This must be something dumb on my part, but I swear I've followed 
>> every howto, searched every archive and read every book I could find 
>> and I'm not having any luck.
>>
>> If I manually bring up lo:0 AFTER heartbeat kills it, sure enough 
>> things work fine.
>>
>> I even tried writing my own resource to bring lo:0 up when Heartbeat 
>> releases it, but the killing takes place AFTER the stop so it doesn't 
>> do any good.
>>
>> What's the two line answer to this?  I know there must be one, but for 
>> the life of me I can't figure it out.
>>
>> I'm using the CentOS 4.4 extras packages, namely:
>>    heartbeat.x86_64              2.0.7-1.c4     installed
>>    heartbeat-ldirectord.x86_64   2.0.7-1.c4     installed
>>
>> I promise I'll write a CentOS howto on this when I figure it out.
>>
>> Many thanks,
>>
>> -Nic
> 


