[Linux-HA] failed resources
Matthias Dahl
mdmlha at designassembly.de
Tue Sep 12 14:59:56 MDT 2006
Hello...
This is actually a combined post about failed resources. I hope nobody minds,
but it is all connected somehow. :-)
First of all, I am looking for a way to have Heartbeat check a failed
resource from time to time to see whether it is functional again and, if it
is, restart the resource. Does Heartbeat have such a feature?
I have read http://www.linux-ha.org/v2/faq/forced_failover carefully, yet,
being new to Heartbeat, I just don't get it. :) From what I understand,
Heartbeat detects a failure through, for example, the monitor functionality
of an OCF resource agent. If a failure happens, it tries once to restart the
resource and, if that fails, stops it. Now what the FAQ says is that I can
set a failure stickiness and, based on some formula, a resource can fail up
to X times before it gets migrated to a new node. That's the point: how can
a resource fail several times on one node if Heartbeat tries just once to
restart it and keeps the resource stopped if that fails?
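To make the question concrete, here is a minimal sketch of how a
failure-stickiness threshold could work. The scoring rule is an assumption
for illustration and is not quoted from the FAQ: a node's score for a
resource is taken to be a base preference, plus a stickiness bonus if the
resource currently runs there, plus the fail count multiplied by a (negative)
failure stickiness. The names node_score and failures_before_migration are
hypothetical and are not part of Heartbeat.

```python
def node_score(base, stickiness, failure_stickiness, fail_count, is_current):
    """Assumed scoring rule (illustrative only, not Heartbeat's actual code)."""
    # The stickiness bonus only applies on the node currently running the resource.
    bonus = stickiness if is_current else 0
    return base + bonus + fail_count * failure_stickiness


def failures_before_migration(current_base, other_base, stickiness,
                              failure_stickiness, max_failures=1000):
    """Return the smallest fail count at which the other node outscores the
    current node, or None if that never happens within max_failures."""
    # The other node has no recorded failures and does not run the resource.
    other = node_score(other_base, 0, failure_stickiness, 0, False)
    for fail_count in range(max_failures + 1):
        current = node_score(current_base, stickiness, failure_stickiness,
                             fail_count, True)
        if current < other:
            return fail_count
    return None


if __name__ == "__main__":
    # Equal base preference, stickiness 200, failure stickiness -100.
    print(failures_before_migration(100, 100, 200, -100))
```

Under this assumed rule the resource would have to accumulate several
failures on the same node before it moves, which only makes sense if it is
restarted on that node after each failure.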
Thanks to anyone who can shed some light on this. :-)
Best regards,
Matthias Dahl