[Linux-HA] Fail Count problem
Zachár Balázs
zachar at direkt-kfki.hu
Wed Sep 13 07:53:28 MDT 2006
I don't understand what is happening, so I will describe the whole problem.
I tried to simulate a resource failure with these steps:
1. I have an alias IP address (IPaddr) on eth0 of a-node, managed by
heartbeat and monitored with the standard OCF agent. I brought eth0 down
(ifdown eth0) to test the monitoring. That worked well: the resources
were stopped on a-node and started on b-node.
2. I brought eth0 back up on a-node and wanted to migrate the resources
back there with the crm_resource command, like this:
crm_resource -M -U a-linux -r group_1
(It didn't work. It stopped the resources on b-node but did not start
them on a-node.)
3. I read some documentation on the HA web page and found that I must
clear the fail counters before failing the cluster back. OK, I did this:
crm_failcount -G -U a-node -r IPaddr_192_168_66_101
(as far as I could tell, this is the resource that detected the error
first)
The counter value was: 1
crm_failcount -D -U a-node -r IPaddr_192_168_66_101
(this succeeded)
4. At this point I thought the cluster was ready to fail back, so I ran
the same command again:
crm_resource -M -U a-linux -r group_1
But it still does not work!
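
For reference, steps 3 and 4 can be written down as a small shell script. This is only a dry-run sketch: the `run` helper, the `failback` function and the `DRY_RUN` variable are made up for illustration, and the crm commands are copied exactly as typed above (node names included), not checked against any particular heartbeat version.

```shell
#!/bin/sh
# Dry-run sketch of steps 3 and 4 from this post.
# With DRY_RUN=1 (the default) the commands are only printed, never executed.
# The run/failback helpers and DRY_RUN are hypothetical names for illustration.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        # Print the command instead of running it.
        echo "+ $*"
    else
        "$@"
    fi
}

failback() {
    # Step 3: query, then delete, the fail count of the IP resource on a-node.
    run crm_failcount -G -U a-node -r IPaddr_192_168_66_101
    run crm_failcount -D -U a-node -r IPaddr_192_168_66_101
    # Step 4: the migrate command, exactly as it was typed in the post.
    run crm_resource -M -U a-linux -r group_1
}

failback
```

Setting DRY_RUN=0 would execute the commands for real, which only makes sense on a cluster node.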
What could be the problem?
Thanks for your help,
Balázs