[Linux-HA] pingd in V2
Serge Dubrouski
sergeyfd at gmail.com
Mon Jun 4 16:05:21 MDT 2007
Hello -
I have been experimenting with pingd in Heartbeat v2 and found some
problems (or at least inconveniences) there:
My configuration includes a group of resources and an rsc_location rule
for a primary node. If I configure pingd in ha.cf and add an
rsc_location rule with score -INFINITY for the case where the pingd
attribute is not defined or is less than or equal to 0, everything works
as it should. My group starts on the primary node and fails over to the
backup node if the primary loses its network connection.
Problems start when I move pingd from ha.cf to cib.xml and configure a
clone for it there. It looks like (I'm not absolutely sure of this)
pingd doesn't have enough time after it starts up to update the CIB
before Heartbeat starts the other resources. Because of that, Heartbeat
complains that there are no nodes available for the resources, or that
the resources can't run on any node in the cluster. On the second check
Heartbeat sees nodes available, but by then there is no guarantee
that the resources will be started on the desired primary node.
I hope I have explained the problem correctly. A possible fix would be
to add a short delay (OCF_RESKEY_dampen + 3s, for example) to the
start function of the pingd RA, so that start does not return before
pingd has had a chance to update the CIB.
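
As a rough sketch of that idea (this is not the actual pingd RA code):
the helper name pingd_start_delay is made up for illustration, the
fallback of 1 second when OCF_RESKEY_dampen is unset is an assumption,
and the value is assumed to be a whole number of seconds, optionally
written with a trailing "s" (e.g. "5s").

```shell
#!/bin/sh
# Hypothetical helper, not part of the shipped pingd RA.
# Prints the number of seconds the start action could wait:
# the dampen interval plus 3 seconds of slack.
pingd_start_delay() {
    # Assumed fallback of 1 second when the parameter is unset or empty.
    dampen="${OCF_RESKEY_dampen:-1}"
    # Accept "5" as well as "5s" by stripping one trailing "s".
    dampen="${dampen%s}"
    # Anything that is not a plain non-negative integer counts as 0.
    case "$dampen" in
        ''|*[!0-9]*) dampen=0 ;;
    esac
    echo $((dampen + 3))
}

# Intended use inside the RA's start action, after pingd is launched:
#   sleep "$(pingd_start_delay)"
```

The extra 3 seconds is only the figure suggested above; the right
amount of slack probably depends on how quickly the attribute update
reaches the CIB on a given cluster.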
There were also some mistakes in the v2/faq/pingd document, which I
have corrected on wiki.linux-ha.org.
Serge.