[Linux-HA] unpredictable failover of resources
Andrew Beekhof
beekhof at gmail.com
Thu Dec 28 02:15:27 MST 2006
On 12/21/06, Harald Haspl <harald.haspl at apus.co.at> wrote:
> Hello,
>
> I have a two-node Heartbeat cluster on which I want to monitor the
> running resources. If one resource fails, all resources should be
> relocated to the other node.
>
> What I have encountered is that the first time one of the resources
> fails, the resources are simply restarted on the current node instead of
> failing over. Only on subsequent failures is the failover performed.
>
>
> I am using heartbeat-2.0.7 with patch 62f1b3607975 (Andrew Beekhof
> TE: Update resource failcount in all required cases)
>
> Why are the resources simply restarted, and how can I get them
> relocated to the other node?
>
>
> Example:
> Node "A", Node "B", Resource "res"
> default_resource_stickiness=INFINITY
> default_resource_failure_stickiness=-INFINITY
>
> * cib.xml was loaded into a clean CIB.
> * resource is cleaned up:
> crm_resource -C -r res
> * resource "res" is started on node A.
>
> After a while, "res" fails on node A: "monitor" returns "$OCF_ERR_GENERIC".
> This causes Heartbeat to "stop" and afterwards "start" the resource
> on node A again. ("start" and "stop" work and return "$OCF_SUCCESS".)
> - fail-count is not yet set in the CIB.
Can you provide the configuration and logs (zipped)?
Otherwise I'm working blind.
>
> After the second time the resource fails, it is stopped on node A
> and failed over to node B.
> - at this time, fail-count is set to 1.
>
> I would have preferred the resource to fail over already the first
> time it failed.
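To illustrate why the fail-count matters in the example above, here is a rough Python sketch of how stickiness and failure stickiness might combine into a per-node preference. It is a simplified model under stated assumptions, not the actual Heartbeat policy engine: the function names are hypothetical, INFINITY is assumed to be 1000000, and -INFINITY is assumed to win when it meets +INFINITY. Under this model, a failure that does not increment fail-count leaves node A preferred (restart in place), while a fail-count of 1 pushes the resource to node B, which matches the behaviour reported.

```python
# Simplified model only; not the actual Heartbeat policy engine code.
# Assumption: scores saturate at +/-1000000, which stands in for INFINITY.
INFINITY = 1_000_000


def add_scores(a, b):
    """Add two scores with saturation; -INFINITY is assumed to win."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


def node_score(is_current, fail_count, stickiness, failure_stickiness):
    """Score of one node: stickiness applies only to the node running the
    resource; failure stickiness is added once per recorded failure."""
    score = stickiness if is_current else 0
    for _ in range(fail_count):
        score = add_scores(score, failure_stickiness)
    return score


def preferred_node(current, fail_counts,
                   stickiness=INFINITY, failure_stickiness=-INFINITY):
    """Return the node with the highest score (ties broken by name)."""
    scores = {
        node: node_score(node == current, count, stickiness, failure_stickiness)
        for node, count in fail_counts.items()
    }
    return max(sorted(scores), key=scores.get)


# First failure, fail-count not yet recorded: the resource stays on A.
print(preferred_node("A", {"A": 0, "B": 0}))
# Fail-count of 1 recorded on A: the resource moves to B.
print(preferred_node("A", {"A": 1, "B": 0}))
```

If this model is roughly right, the question becomes why the first monitor failure did not update fail-count, which is what the configuration and logs would help to show.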
>
>
> best regards,
> Harald.
>
> --
> Harald Haspl
>
> apus | Software GmbH
> Bahnhofstrasse 1, A-8074 Graz-Raaba
> T | +43 316 401629 0 F | +43 316 401629 9
> http://www.apus.co.at harald.haspl at apus.co.at