[Linux-HA] pingd in V2
Andrew Beekhof
beekhof at gmail.com
Tue Jun 5 02:12:11 MDT 2007
On 6/5/07, Serge Dubrouski <sergeyfd at gmail.com> wrote:
> Hello -
>
> I played with pingd in Heartbeat v2 and ran into some problems (or at
> least inconveniences):
>
> My configuration includes a group of resources and an rsc_location rule
> for a primary node. If I configure pingd in ha.cf and add an
> rsc_location rule with score -INFINITY for when the pingd attribute is
> not defined or is less than or equal to 0, everything works as it
> should. My group starts on the primary node and fails over to the
> backup node if the primary loses its network connection.
>
> Problems start when I move pingd from ha.cf to cib.xml and configure a
> clone for it there. It looks like (I'm not absolutely sure about this)
> pingd doesn't have enough time after it starts up to update the CIB
> before Heartbeat starts the other resources.
do you have ordering constraints between the pingd resource and the
other resources?
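
For reference, a rough sketch of what I mean, using the v2 rsc_order
syntax. The ids "pingd_clone" and "my_group" are placeholders I made up,
not names from your configuration, so adjust them to match your CIB:

```xml
<!-- Sketch only: "my_group" and "pingd_clone" are hypothetical ids. -->
<!-- Reads as: start my_group after pingd_clone has been started.   -->
<rsc_order id="my_group_after_pingd"
           from="my_group" action="start"
           type="after" to="pingd_clone"/>
```

Note that this only orders the start actions; by itself it does not wait
for pingd to publish its attribute.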
> Because of that, Heartbeat
> complains that there are no nodes available for the resources or that
> the resources can't run on any node in the cluster.
presumably because there are no pingd scores yet - that's perfectly normal so far
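
To illustrate, I'm assuming your location rule looks roughly like the
one in the pingd FAQ. This is a sketch, not your actual config; the ids
and the resource name "my_group" are placeholders:

```xml
<!-- Sketch only: ids and "my_group" are hypothetical. -->
<rsc_location id="my_group_connected" rsc="my_group">
  <rule id="my_group_connected_rule" score="-INFINITY" boolean_op="or">
    <!-- Matches while pingd has not yet written its attribute -->
    <expression id="my_group_connected_undef"
                attribute="pingd" operation="not_defined"/>
    <!-- Matches when no ping nodes are reachable -->
    <expression id="my_group_connected_zero"
                attribute="pingd" operation="lte" value="0"/>
  </rule>
</rsc_location>
```

With a rule like this, the not_defined expression matches on every node
until pingd has set the attribute there, so every node gets -INFINITY
and nothing can be placed.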
> On the second check
> Heartbeat sees available nodes, but at that point there is no guarantee
> that the resources will be started on the desired primary node.
I'm not sure I understand this bit.
do you mean the pingd scores haven't stabilized?
or that they're equal and you can't make the resource start on a
particular node?
>
> I hope I have explained the problem correctly. A possible fix would
> be to implement a short wait (OCF_RESKEY_dampen + 3s, for example)
> in the start function of the pingd RA.
>
> There were also some mistakes in the v2/faq/pingd document, which I
> corrected on wiki.linux-ha.org.
thanks!