More Linux-HA heartbeat thoughts

alanr at bell-labs.com
Thu Mar 19 15:17:30 MST 1998


> alanr at bell-labs.com wrote:
> > 
> > I can't argue with what you said here, but I still claim that you want
> > one version of the heartbeat that is based on a very reliable
> > communication medium.  This helps better and more reliably diagnose the
> > nature of the failure, and allows the cluster to transition smoothly more
> > often. 
> 
> Not necessarily. The way I described gives you redundant message paths
> which means you have no need for a single, highly reliable HB path. If this
> HB path is single, it is a single point of failure, and you don't want
> that.

If you implement a "bidirectional ring" network with serial ports and you have
a two- or three-machine network, you *have* redundancy in your communication
path.  And since the underlying hardware is much simpler and less failure-prone
than ethernet (and doesn't require any additional slots), you have a very
inexpensive, reliable (low-bandwidth) communication medium with no single point
of failure.

Although a ring of 4 or more machines is potentially exposed to network
isolation, the ring still doesn't have any *single* point of failure.  I
certainly believe that clusters of 3 or fewer machines are a very important
special case.
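
To make the ring argument concrete, here is a rough sketch in Python that
brute-forces the failure cases.  It rests on assumptions that are mine and not
part of any existing heartbeat code: each machine has one serial cable to each
ring neighbour (so a two-machine ring has two cables), and "network isolation"
is read as "two machines fail and the survivors can no longer reach each
other".  All function names are made up for illustration.

```python
from itertools import combinations


def ring_links(n):
    """Cables of an n-node ring as (a, b) pairs; n == 2 yields two cables."""
    if n < 2:
        raise ValueError("a ring needs at least two machines")
    return [(i, (i + 1) % n) for i in range(n)]


def survivors_connected(n, dead_nodes=(), dead_links=()):
    """True if all surviving nodes can still reach each other.

    dead_links holds indexes into ring_links(n).
    """
    alive = set(range(n)) - set(dead_nodes)
    if len(alive) <= 1:
        return True  # nothing left to be isolated from
    links = [
        (a, b)
        for idx, (a, b) in enumerate(ring_links(n))
        if idx not in dead_links and a in alive and b in alive
    ]
    start = next(iter(alive))
    seen, stack = {start}, [start]
    while stack:  # depth-first walk over the surviving cables
        node = stack.pop()
        for a, b in links:
            for src, dst in ((a, b), (b, a)):
                if src == node and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
    return seen == alive


def tolerates_any_single_failure(n):
    """No single node or cable failure separates the survivors."""
    nodes_ok = all(survivors_connected(n, dead_nodes=[i]) for i in range(n))
    links_ok = all(survivors_connected(n, dead_links=[i]) for i in range(n))
    return nodes_ok and links_ok


def two_node_failures_can_split(n):
    """Some pair of node failures leaves survivors unable to talk."""
    return any(
        not survivors_connected(n, dead_nodes=pair)
        for pair in combinations(range(n), 2)
    )


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, tolerates_any_single_failure(n), two_node_failures_can_split(n))
```

Under these assumptions every ring size survives any single failure, while a
double node failure can only separate the survivors once the ring has 4 or more
machines.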

Regarding the reliability of the heartbeat mechanism, software is still the
most difficult (and unreliable) part of an HA system.  The more kinds of
failure modes the software is likely to run into, the more likely it is to
function incorrectly.  It's difficult enough to fail over the functions of the
system gracefully when the software has full knowledge of the real
configuration of the system.  It is undesirable to "push" the matter further by
also exercising the software's ability to infer the current system
configuration over less-reliable communications.

This is not to say that the software *shouldn't* be able to infer the system
state in the presence of unreliable communications, or that this ability of the
system shouldn't be well-tested.  It should be able to do this, and it should
be well-tested.  But -- as a user of an HA system -- if you give me the choice
of using reliable communications, or relying on ethernet, I'll opt for the
reliable comm and let *someone else* fully test the more sophisticated form of
recovery.  And I suspect my system will likely have a slight edge in
reliability in practice -- particularly for the 3-or-fewer node case.

	-- Alan Robertson
	   alanr at bell-labs.com


