[Linux-HA] failcount for master/slave resource

Junko IKEDA ikedaj at intellilink.co.jp
Mon Apr 21 03:07:24 MDT 2008


Hi,

I have one master/slave resource, running on Heartbeat 2.2.0 + Pacemaker 0.6.2.
Its normal state is:

Master/Slave Set: ms-sf
stateful-1:0 (ocf::heartbeat:Stateful):Master node-b
stateful-1:1 (ocf::heartbeat:Stateful):Started node-a

If stateful-1:0 fails, crm_mon shows the following:

Master/Slave Set: ms-sf
stateful-1:0 (ocf::heartbeat:Stateful):Stopped
stateful-1:1 (ocf::heartbeat:Stateful):Master node-a
 
Failed actions:
    stateful-1:0_demote_0 (node=node-b, call=7, rc=7): complete

I tried to clear the failcount of stateful-1:0 with crm_failcount.

# crm_failcount -r stateful-1:0 -U node-b -D
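
For reference, the step above can be wrapped so that the current failcount is printed before it is deleted. This is only a minimal sketch: it assumes crm_failcount also accepts -G to query the value, and the function name clear_failcount and the DRY_RUN variable are hypothetical, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: query the failcount of a resource on a node, then delete it.
# Assumption: crm_failcount supports -G (query) in addition to -D (delete).
# With DRY_RUN=1 the commands are only printed, not executed.
clear_failcount() {
    rsc=$1
    node=$2
    for action in -G -D; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "crm_failcount -r $rsc -U $node $action"
        else
            crm_failcount -r "$rsc" -U "$node" "$action" || return 1
        fi
    done
}
```

With DRY_RUN=1 set, calling `clear_failcount stateful-1:0 node-b` only prints the two crm_failcount invocations, which makes it easy to check the arguments before touching the cluster.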

After that, crm_mon shows:

Master/Slave Set: ms-sf
stateful-1:0 (ocf::heartbeat:Stateful):Master node-b
stateful-1:1 (ocf::heartbeat:Stateful):Slave node-a

Failed actions:
    stateful-1:0_demote_0 (node=node-b, call=7, rc=7): complete

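This looks fine: stateful-1:0 is Master on node-b again, although the old demote failure is still listed.
But when stateful-1:0 fails a second time, something goes wrong. It stays on node-b in the FAILED state, and stateful-1:1 is not promoted: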

Master/Slave Set: ms-sf
stateful-1:0 (ocf::heartbeat:Stateful):Master node-b FAILED
stateful-1:1 (ocf::heartbeat:Stateful):Slave node-a

Failed actions:
    stateful-1:0_demote_0 (node=node-b, call=7, rc=7): complete
    stateful-1:0_monitor_10000 (node=node-b, call=11, rc=7): complete
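
To make the entries above easier to compare, each "Failed actions" line can be split into its fields. The sketch below assumes the key has the form `<resource>_<operation>_<interval in ms>`, as the lines above suggest; the function name parse_failed_action is hypothetical. In the OCF return code convention, rc=7 means "not running".

```shell
#!/bin/sh
# Hypothetical helper: split one crm_mon "Failed actions" line into fields.
# Assumed line format: <resource>_<op>_<interval> (node=<node>, call=<n>, rc=<rc>): <status>
parse_failed_action() {
    line=$1
    key=${line%%" ("*}                                  # text before " (node=..."
    key=$(echo "$key" | sed 's/^[[:space:]]*//')        # drop leading indentation
    interval=${key##*_}                                 # last "_" field: interval in ms
    rest=${key%_*}
    op=${rest##*_}                                      # second-to-last field: operation
    rsc=${rest%_*}                                      # remainder: resource id
    node=$(echo "$line" | sed -n 's/.*node=\([^,]*\),.*/\1/p')
    rc=$(echo "$line" | sed -n 's/.*rc=\([0-9]*\)).*/\1/p')
    echo "resource=$rsc op=$op interval_ms=$interval node=$node rc=$rc"
}
```

Because the key is split from the right, a resource id that itself contains underscores is still handled, as long as the operation name and interval do not.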

Is this expected?
How can I recover stateful-1:0?

Best Regards,
Junko Ikeda

NTT DATA INTELLILINK CORPORATION

-------------- next part --------------
A non-text attachment was scrubbed...
Name: hb_report.tar.gz
Type: application/octet-stream
Size: 70093 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20080421/8f0daf28/hb_report.tar-0001.obj

