Cluster manager structure for Linux-HA (?)

alanr at bell-labs.com
Fri Mar 20 06:53:01 MST 1998


> > 
> > You can read more about mon at:
> >         http://consult.ml.org/~trockij/mon/
> > 
> > Here's the basics of what mon can do:
> > 
> >         - Run "monitors" at scheduled intervals,
> >         - manage dependencies between monitors,
> >         - trigger "alerts" when monitors go off
> > 
> > It seems to have a great deal of overlap in framework and basic structure
> > with some of our needs in Linux-HA.  I have not yet run it in my
> > environment.
> 
> mon is wonderful. I'm already a committed mon fan and user. You can also
> run multiple mon instances on different boxes so that mon itself isn't a
> single point of failure.

My assumption is that you would run "mon" on every cluster element, but that
the "alerts" that perform cluster reconfiguration would still be quite complex:
they would have to negotiate or vote on how to make the change, following some
as-yet-undesigned process.  Perhaps they would use the heartbeat communication
facility to do this -- perhaps not.

But the ability to monitor "things", detect non-working facilities and
services, invoke recovery actions, and (in a future release) handle
dependencies seems like a good place to start, rather than re-inventing the
wheel.
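
As a rough illustration only, the monitor/alert pattern described above might
be sketched in Python as follows. This is not mon's actual code or
configuration format; the names Monitor and run_due, and the idea of passing
the current time in explicitly, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Monitor:
    """One scheduled check plus the action to take when it fails (hypothetical)."""
    name: str
    interval: float                 # seconds between checks
    check: Callable[[], bool]       # returns True when the service is healthy
    alert: Callable[[str], None]    # called with the monitor name on failure
    next_run: float = 0.0           # time at which the check is next due


def run_due(monitors: List[Monitor], now: float) -> List[str]:
    """Run every monitor that is due at `now`; return the names that failed."""
    failed = []
    for m in monitors:
        if now < m.next_run:
            continue                # not due yet
        m.next_run = now + m.interval
        if not m.check():
            failed.append(m.name)
            m.alert(m.name)         # e.g. kick off a recovery action
    return failed
```

A caller would build a list of Monitor objects and call run_due(monitors,
time.time()) in a loop. In a cluster, the alert callable is where the
negotiation or voting step would have to hook in; that part is deliberately
left out here because it has not been designed.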

	-- Alan Robertson
	   alanr at bell-labs.com


