[Linux-HA] R2 Two-node apache cluster with STONITH
Andrew Beekhof
beekhof at gmail.com
Thu Mar 8 05:57:40 MST 2007
It's very unlikely to shoot anything if the STONITH agents can't start:
Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Processing failed op (test-1_drac_DoFencing:0_start_0) for test-1_drac_DoFencing:0 on ldap-1.domain
Either the RA is broken or your configuration is.
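
To see at a glance which resources never started, something like the following rough sketch could help. It is only an illustration, not a Heartbeat tool: the function name failed_starts and the sample list are made up here, and it assumes each log entry sits on a single unwrapped line and that the "Processing failed op (...) for ... on ..." wording matches the pengine line quoted above.

```python
import re
from collections import defaultdict

# Matches the pengine message text seen in this thread, e.g.
# "Processing failed op (<op>) for <resource> on <node>".
FAILED_OP = re.compile(
    r"Processing failed op\s+\((?P<op>[^)\s]+)\)\s+for\s+(?P<rsc>\S+)\s+on\s+(?P<node>\S+)"
)


def failed_starts(log_lines):
    """Return {node: sorted list of resources whose start op failed}.

    Hypothetical helper: assumes one log entry per line (no mail wrapping).
    """
    by_node = defaultdict(set)
    for line in log_lines:
        match = FAILED_OP.search(line)
        # Only start failures matter here; monitor failures are skipped.
        if match and match.group("op").endswith("_start_0"):
            by_node[match.group("node")].add(match.group("rsc"))
    return {node: sorted(rscs) for node, rscs in by_node.items()}


# Two entries taken from the logs in this thread, unwrapped by hand.
sample = [
    "Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c "
    "Processing failed op (test-1_drac_DoFencing:0_start_0) for "
    "test-1_drac_DoFencing:0 on ldap-1.domain",
    "Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c "
    "Processing failed op (test_IP_monitor_5000) for test_IP on ldap-2.domain",
]

for node, resources in failed_starts(sample).items():
    print(node, resources)
```

If every STONITH clone instance shows up in that summary for both nodes, nothing is available to do the fencing, which would fit the repeated "Failed to STONITH" messages.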
On 3/7/07, Bjorn Oglefjorn <sys.mailing at gmail.com> wrote:
> On 3/6/07, Alan Robertson <alanr at unix.sh> wrote:
> >
> > Bjorn Oglefjorn wrote:
> > > Hello,
> > >
> > > I have tried at length to follow the documentation and peruse this
> > > mailing list, but as of yet I am unable to have this work properly.
> > > Is there anyone who can provide me with some direction here?
> > >
> > > The STONITH plugin (external/drac4) is a custom one that I have created.
> > > I've tested it with the /usr/sbin/stonith command and it works as
> > > outlined in the documentation.
> > >
> > > Below are my test config files. Thanks in advance for any help you all
> > > can provide.
> >
> >
> > "unable to have this work properly"...
> >
> > Could you be a little more specific on exactly what your
> > problems/symptoms are?
> >
> >
> > --
> > Alan Robertson <alanr at unix.sh>
> >
> > "Openness is the foundation and preservative of friendship... Let me
> > claim from you at all times your undisguised opinions." - William
> > Wilberforce
> > _______________________________________________
> > Linux-HA mailing list
> > Linux-HA at lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
> >
>
> Hello Alan,
>
> Thanks for the quick reply. Specifically, I can tell you that my cluster
> does seem to start up properly; however, when I simulate node failure,
> STONITH does not succeed. The STONITH operation is attempted over and over
> again, but is never actually performed (i.e. the failed node is never shot
> in the head).
>
> I'm sorry to say that I'm not really sure what to make of the log entries,
> as they seem very cryptic. What I can tell you is that these lines are from
> the host which should have been fenced, which I assume is wrong. Here are
> some excerpts:
>
> Mar 7 09:30:19 ldap-2 stonithd: [5698]: info: client tengine [pid: 5711] want a STONITH operation RESET to node ldap-2.domain.
> Mar 7 09:30:19 ldap-2 tengine: [5711]: info: te_fence_node:actions.c Executing reboot fencing operation (21) on ldap-2.domain (timeout=2500)
> Mar 7 09:30:19 ldap-2 stonithd: [5698]: info: Broadcasting the message succeeded: require others to stonith node ldap-2.domain.
> Mar 7 09:30:19 ldap-2 tengine: [5711]: info: te_pseudo_action:actions.c Pseudo action 12 confirmed
> Mar 7 09:30:19 ldap-2 tengine: [5711]: info: te_pseudo_action:actions.c Pseudo action 9 confirmed
> Mar 7 09:30:22 ldap-2 stonithd: [5698]: info: Failed to STONITH the node ldap-2.domain: optype=1, op_result=2
> Mar 7 09:30:22 ldap-2 tengine: [5711]: info: tengine_stonith_callback:callbacks.c call=-173, optype=1, node_name=ldap-2.domain, result=2, node_list=, action=21;175:d1784142-1161-4f9b-8865-731e40b59e13
> Mar 7 09:30:22 ldap-2 tengine: [5711]: ERROR: tengine_stonith_callback:callbacks.c Stonith of ldap-2.domain failed (2)... aborting transition.
>
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Processing failed op (test-1_drac_DoFencing:0_start_0) for test-1_drac_DoFencing:0 on ldap-1.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Handling failed start for test-1_drac_DoFencing:0 on ldap-1.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Processing failed op (test-1_drac_DoFencing:1_start_0) for test-1_drac_DoFencing:1 on ldap-1.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Handling failed start for test-1_drac_DoFencing:1 on ldap-1.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Processing failed op (test-2_drac_DoFencing:0_start_0) for test-2_drac_DoFencing:0 on ldap-1.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Handling failed start for test-2_drac_DoFencing:0 on ldap-1.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Processing failed op (test-2_drac_DoFencing:1_start_0) for test-2_drac_DoFencing:1 on ldap-1.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Handling failed start for test-2_drac_DoFencing:1 on ldap-1.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: determine_online_status:unpack.c Node ldap-2.domain is online
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Processing failed op (test-1_drac_DoFencing:0_start_0) for test-1_drac_DoFencing:0 on ldap-2.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Handling failed start for test-1_drac_DoFencing:0 on ldap-2.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Processing failed op (test-2_drac_DoFencing:0_start_0) for test-2_drac_DoFencing:0 on ldap-2.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Handling failed start for test-2_drac_DoFencing:0 on ldap-2.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Processing failed op (test_IP_monitor_5000) for test_IP on ldap-2.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Handling failed start for test-2_drac_DoFencing:0 on ldap-2.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: unpack_rsc_op:unpack.c Processing failed op (test_IP_monitor_5000) for test_IP on ldap-2.domain
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: Resource Group: test_group
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: test_IP (heartbeat::ocf:IPaddr): Started ldap-2.domain FAILED
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: httpd (lsb:httpd): Stopped
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: Clone Set: test-1_drac
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: test-1_drac_DoFencing:0 (stonith:external/drac4): Stopped
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: test-1_drac_DoFencing:1 (stonith:external/drac4): Stopped
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: Clone Set: test-2_drac
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: test-2_drac_DoFencing:0 (stonith:external/drac4): Stopped
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: test-2_drac_DoFencing:1 (stonith:external/drac4): Stopped
> Mar 7 09:30:22 ldap-2 pengine: [5712]: ERROR: text2task:common.c Unsupported action: status
> Mar 7 09:30:22 ldap-2 pengine: [5712]: notice: NoRoleChange:native.c Move resource test_IP (ldap-2.domain -> ldap-1.domain)
> Mar 7 09:30:22 ldap-2 pengine: [5712]: notice: Recurring:native.c ldap-1.domain test_IP_monitor_5000
> Mar 7 09:30:22 ldap-2 pengine: [5712]: notice: StartRsc:native.c ldap-1.domain Start httpd
> Mar 7 09:30:22 ldap-2 pengine: [5712]: ERROR: text2task:common.c Unsupported action: status
> Mar 7 09:30:22 ldap-2 pengine: [5712]: notice: Recurring:native.c ldap-1.domain httpd_status_5000
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: stage6:allocate.c Scheduling Node ldap-2.domain for STONITH
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: native_stop_constraints:native.c Stop of failed resource test_IP is implict after ldap-2.domain is fenced
> Mar 7 09:30:22 ldap-2 pengine: [5712]: info: native_stop_constraints:native.c Re-creating actions for test_group
> Mar 7 09:30:22 ldap-2 pengine: [5712]: notice: NoRoleChange:native.c Move resource test_IP (ldap-2.domain -> ldap-1.domain)
> Mar 7 09:30:22 ldap-2 pengine: [5712]: notice: Recurring:native.c ldap-1.domain test_IP_monitor_5000
> Mar 7 09:30:22 ldap-2 pengine: [5712]: notice: StartRsc:native.c ldap-1.domain Start httpd
> Mar 7 09:30:22 ldap-2 pengine: [5712]: ERROR: text2task:common.c Unsupported action: status
> Mar 7 09:30:22 ldap-2 pengine: [5712]: notice: Recurring:native.c ldap-1.domain httpd_status_5000
> Mar 7 09:30:22 ldap-2 pengine: [5712]: ERROR: text2task:common.c Unsupported action: status
> Mar 7 09:30:22 ldap-2 pengine: [5712]: ERROR: text2task:common.c Unsupported action: status
> Mar 7 09:30:22 ldap-2 pengine: [5712]: notice: stage8:allocate.c Created transition graph 176.
> Mar 7 09:30:22 ldap-2 pengine: [5712]: WARN: process_pe_message:pengine.c No value specified for cluster preference: pe-error-series-max
> Mar 7 09:30:22 ldap-2 pengine: [5712]: ERROR: process_pe_message:pengine.c Transition 176: ERRORs found during PE processing. PEngine Input stored in: /var/lib/heartbeat/pengine/pe-error-1408.bz2
>
> --BO