[Linux-HA] Ipfail support for heartbeat 2.0.x

Lars Marowsky-Bree lmb at suse.de
Tue Oct 18 13:58:28 MDT 2005


On 2005-10-18T14:33:06, Guochun Shi <gshi at ncsa.uiuc.edu> wrote:

> I think the cheap version is good.
> If ping nodes are unstable, then we cannot do anything about it but
> move resources accordingly.

Note that this would be a feature regression; ipfail does not do this.
It "votes" on which side can see the most nodes.
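
A rough sketch of what such a vote could look like, in Python. This is
not ipfail's actual code; pick_owner, the node names, and the tie-break
rule (resources stay where they are) are assumptions for illustration.

```python
def pick_owner(current_owner, visible_ping_nodes):
    """Return the node that should hold the resources.

    visible_ping_nodes maps a cluster node name to the set of ping
    nodes that node can currently reach (hypothetical data layout).
    """
    best = current_owner
    # Sorted iteration keeps the result deterministic.
    for node, seen in sorted(visible_ping_nodes.items()):
        # Move only to a node that sees strictly MORE ping nodes;
        # on a tie the resources stay put.
        if len(seen) > len(visible_ping_nodes[best]):
            best = node
    return best


# Ping node "gw" dies for everyone: nobody is better off, nothing moves.
print(pick_owner("node1", {"node1": set(), "node2": set()}))
# Only node1 loses "gw": node2 sees more, so the resources move.
print(pick_owner("node1", {"node1": set(), "node2": {"gw"}}))
```

The point of comparing counts is that a ping node dying for both sides
causes no migration at all.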

Customers will hate you if you bounce resources around like that and
cause more observed downtime (due to the unneeded migrations) than
necessary.

> As for normal cases where a ping node dies, we can avoid moving a
> resource if ipfail delays reporting the node by one heartbeat interval
> -- by that time all nodes should have noticed that the ping node is dead.

This doesn't work.

0,0     heartbeat interval
1,0     heartbeat interval
1,9     node1 notices that ping node is dead, delays by one interval
2,0     heartbeat interval
2,1     node2 notices, delays by one interval
2,9     node1 reports
...

Remember they are not running in lockstep.
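
The timeline above can be replayed in a few lines of Python. This is
only a sketch: the notice times are the ones from the example, and
report_times / disagreement_window are made-up helper names. It shows
that a fixed delay shifts every report by the same amount, so the window
in which the two nodes disagree does not shrink.

```python
INTERVAL = 1.0  # one heartbeat interval, arbitrary time units


def report_times(notice_times, delay=INTERVAL):
    """Each node reports `delay` after it noticed the dead ping node."""
    return {node: round(t + delay, 3) for node, t in notice_times.items()}


def disagreement_window(times):
    """Span between the first and the last node reaching the same view."""
    return round(max(times.values()) - min(times.values()), 3)


# When each node notices the ping node is dead (from the timeline above).
notice = {"node1": 1.9, "node2": 2.1}
reports = report_times(notice)

print(reports)                      # node1 reports at 2.9, node2 at 3.1
print(disagreement_window(notice))  # gap without any delay
print(disagreement_window(reports)) # same gap with the delay applied
```

At 2,9 node1 reports the ping node dead while node2 is still sitting on
its own delay, so the views differ exactly as long as they did before.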


Sincerely,
    Lars Marowsky-Brée <lmb at suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge"
	 -- Charles Darwin



