[Linux-HA] querying resource failure count fails

Zachár Balázs zachar at direkt-kfki.hu
Wed Sep 13 04:09:32 MDT 2006


Hello!

What should I do with this fix?
If I am right, it is enough to copy events.c into my heartbeat source
and rebuild it...

Thanks,
Balázs


Andrew Beekhof wrote:
> On 9/12/06, Matthias Dahl <mdmlha at designassembly.de> wrote:
>> On Tuesday 12 September 2006 22:56, John R Mocho wrote:
>>
>> > I usually use: cibadmin -Ql -o status | grep fail-count
>> > to search the whole cluster for any fail-count entries
>>
>> Returns nothing even though I just failed two resources.
>>
>> > From my experience, only failures due to monitor actions that return an
>> > exit code of 1 (resource failure) will cause the fail-count to increment
>> > (as opposed to exit code of 7, which simply means that the resource is not
>> > running, very different from an error). Start and (aah-hem) stop errors
>> > will not affect the fail-count.
>>
>> During my failure tests, the OCF resource agent returns OCF_ERR_GENERIC,
>> which is 1. Nevertheless, no failure count gets started or increased. :-(
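
For reference, the two exit codes discussed above can be sketched as a
monitor action. This is only a minimal illustration: OCF_SUCCESS (0),
OCF_ERR_GENERIC (1) and OCF_NOT_RUNNING (7) are the standard OCF values,
but demo_monitor, its pid-file argument and its probe argument are
hypothetical names invented here and are not taken from any real agent
or from the cib.xml in this thread.

```sh
#!/bin/sh
# Standard OCF exit codes used by resource agents.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical monitor action (illustration only).
#   $1: path to a pid file (assumed to exist only while the resource runs)
#   $2: a health-check command (assumed to exit 0 when the resource is healthy)
demo_monitor() {
    pidfile=$1
    probe=$2
    # No pid file: the resource is cleanly stopped, which is not a failure.
    [ -f "$pidfile" ] || return $OCF_NOT_RUNNING
    # Pid file present but the health check fails: report a real failure.
    "$probe" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}
```

For example, "demo_monitor /var/run/demo.pid true" returns 7 when the pid
file is absent, while the same call with "false" as the probe and an
existing pid file returns 1, the code that is expected to count as a
failure.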
>
> I just fixed a bug that could be the cause here.
> There were some scenarios in which the failure count was not incremented.
> The fix can be found in http://hg.beekhof.net/lha/crm-stable
>
> The exact change was:
>    http://hg.beekhof.net/lha/crm-stable?cmd=changeset;node=62f1b3607975
>
>>
>> Could this somehow be related to my "adventurous" cib.xml? (see attached)
>>
>> Best regards,
>> Matthias Dahl
>>
>>
>> _______________________________________________
>> Linux-HA mailing list
>> Linux-HA at lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems


