[Linux-HA] Re: Failover affecting other services on HA hosts
robin-lists at robinbowes.com
Thu Oct 11 06:58:22 MDT 2007
David Lee wrote:
> On Wed, 10 Oct 2007, Robin Bowes wrote:
>> We have a simple heartbeat setup failing over between two hosts running
>> Essentially, we have two mysql server instances. maindb runs on db0 and
>> leafdb runs on db1. The system is configured so that if db0 fails, db1
>> takes over maindb and if db1 fails, db0 takes over leafdb.
>> haresources looks like this:
>> db0 IPaddr2::172.28.28.10/32 Filesystem::-Lsan0::/mnt/san0::ext3 mysql-main
>> db1 IPaddr2::172.28.28.9/32 IPaddr2::172.28.28.11/32 Filesystem::-Lsan1::/mnt/san1::ext3 mysql-leaf
>> On the same machines, we also have instances of dnscache and tinydns
>> running on different IP addresses. dnscache runs on 172.28.28.6 on
>> machine db0 and 172.28.28.5 on db1.
>> What we are seeing is that the DNS service stops working after a
>> failover until the dnscache/tinydns services are restarted.
>> I have no idea why this might be - any ideas?
> By default the 'bind' DNS daemon seems (please correct me if I'm wrong!)
> to find its local interfaces at start-up and listen explicitly and only on
> those. If other things happen, such as heartbeat adding and removing
> interfaces, then that doesn't get picked up by 'bind'. (I think this
> behaviour of bind is a deliberate feature, not a bug.) So although
> 'heartbeat' might migrate that public IP address for you onto a new
> machine, 'bind' (by default) won't listen on it.
> There is a 'bind' option called 'interface-interval' which tells it to
> re-scan for changes in interfaces every 'n' minutes. If heartbeat adjusts
> the interfaces (e.g. importing the 'public' IP address) then 'bind'
> should pick it up within that interval.
> It might be worth investigating that.
Thanks for the reply.
We're actually using djb's dnscache and tinydns.
dnscache is on a different IP/interface to the one used for the mysql
failover. tinydns is on 127.0.0.1
So, I'm guessing that something happens when heartbeat re-jigs the
interfaces during a failover that screws up the interface on which
dnscache is running.
>> Regardless, I can think of a couple of options we can implement to
>> prevent this scenario:
>> 1. Have heartbeat restart dnscache and tinydns after a failover. How can
>> I do this?
>> 2. Add a new resource to fail over the dnscache/tinydns services.
>> 3. Move dnscache and tinydns off the HA hosts onto a couple of "normal"
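For option 1, a minimal sketch of a heartbeat (v1-style) resource script
might look like the following. It assumes dnscache and tinydns run under
daemontools with service directories /service/dnscache and
/service/tinydns; those paths, the script name "dns-restart", and the
SVC_CMD/SERVICES override variables are illustrative and not taken from
our actual setup.

```shell
#!/bin/sh
# Hypothetical resource script, e.g. /etc/ha.d/resource.d/dns-restart.
# Assumption: both daemons are supervised by daemontools under /service.

SERVICES="${SERVICES:-/service/dnscache /service/tinydns}"

restart_services() {
    # "svc -t" sends TERM; supervise then starts the daemon again.
    # SVC_CMD can be overridden (e.g. SVC_CMD=echo) for a dry run.
    ${SVC_CMD:-svc} -t $SERVICES
}

main() {
    case "$1" in
        start)
            # Called when this node acquires the resource group.
            restart_services
            ;;
        stop)
            # Nothing to undo: the DNS daemons stay up on this node.
            :
            ;;
        status)
            # Always report "stopped" so a takeover is not skipped as
            # already running (an assumption about how heartbeat uses
            # status; verify against your version).
            echo "stopped"
            ;;
        *)
            echo "usage: $0 {start|stop|status}" >&2
            return 1
            ;;
    esac
}

# Only dispatch when called with an argument, as heartbeat would do.
if [ "$#" -gt 0 ]; then
    main "$@"
fi
```

The script would then be appended to the end of a resource line in
haresources (e.g. "... mysql-main dns-restart"), since heartbeat starts
the resources on a line from left to right, so the restart would run
after the IP address, filesystem and mysql are up.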
> Another aspect to consider is this: Suppose you have two public IP
> addresses (/etc/resolv.conf on your clients); then host one address on
> each machine in your pair, active/active, and let heartbeat handle both
> addresses. Perhaps include failback, so that when normal service resumes
> it reverts to active/active.
> If one is master and the other secondary (as distinct from two
> secondaries), then presumably some sort of resource would be needed to
> restart with the revised named.conf configuration (which would need to be
> maintained and available across both machines).
We are considering something like that, i.e. having the interfaces on
which dnscache runs managed by heartbeat.
> When you get it working, it might be worth writing it up as an example for
> the heartbeat wiki or other documentation.
Sure - I'll see what I can do.