[Linux-HA] failcount for master/slave resource
Andrew Beekhof
beekhof at gmail.com
Mon Apr 21 06:04:40 MDT 2008
On Mon, Apr 21, 2008 at 11:07 AM, Junko IKEDA <ikedaj at intellilink.co.jp> wrote:
> Hi,
>
> I have one master/slave resource.
> (Heartbeat 2.2.0 + Pacemaker 0.6.2)
>
> Master/Slave Set: ms-sf
> stateful-1:0 (ocf::heartbeat:Stateful):Master node-b
> stateful-1:1 (ocf::heartbeat:Stateful):Started node-a
>
> If stateful-1:0 fails, crm_mon shows the following:
>
> Master/Slave Set: ms-sf
> stateful-1:0 (ocf::heartbeat:Stateful):Stopped
> stateful-1:1 (ocf::heartbeat:Stateful):Master node-a
>
> Failed actions:
> stateful-1:0_demote_0 (node=node-b, call=7, rc=7): complete
>
> I tried to clear the failcount of stateful-1:0 with crm_failcount.
That doesn't remove the failed operation, though. It only resets the
counter that tracks how many times the resource has failed.
Perhaps try crm_resource -C, which cleans up the operations recorded
for the resource.
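
As a rough sketch of that suggestion, using the resource and node names
from the report and assuming the -r (resource) and -H (host) options of
crm_resource. The run helper and the DRY_RUN variable are hypothetical
additions for illustration; by default the script only prints the
commands instead of running them:

```shell
#!/bin/sh
# Names taken from the crm_mon output quoted in this thread.
rsc="stateful-1:0"
node="node-b"

# Hypothetical helper: print the command unless DRY_RUN=0 is set,
# so the sketch can be read and checked without touching a cluster.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Reset the failure counter (the command from the original report).
run crm_failcount -r "$rsc" -U "$node" -D

# Clean up the recorded operations, including the failed ones
# (assumed invocation; check crm_resource --help on your version).
run crm_resource -C -r "$rsc" -H "$node"
```

Run it with DRY_RUN=0 to execute the commands on a cluster node, then
check crm_mon to see whether the "Failed actions" entries are gone.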
>
> # crm_failcount -r stateful-1:0 -U node-b -D
>
> After that, crm_mon says,
>
> Master/Slave Set: ms-sf
> stateful-1:0 (ocf::heartbeat:Stateful):Master node-b
> stateful-1:1 (ocf::heartbeat:Stateful):Slave node-a
>
> Failed actions:
> stateful-1:0_demote_0 (node=node-b, call=7, rc=7): complete
>
> This looks fine.
> But if stateful-1:0 fails again, the result looks wrong:
>
> Master/Slave Set: ms-sf
> stateful-1:0 (ocf::heartbeat:Stateful):Master node-b FAILED
> stateful-1:1 (ocf::heartbeat:Stateful):Slave node-a
>
> Failed actions:
> stateful-1:0_demote_0 (node=node-b, call=7, rc=7): complete
> stateful-1:0_monitor_10000 (node=node-b, call=11, rc=7): complete
>
> Is this expected?
> How can I rescue stateful-1:0?
>
> Best Regards,
> Junko Ikeda
>
> NTT DATA INTELLILINK CORPORATION