Resource doesn't start (was: [Linux-HA] Version 2 resource groups and more)
Andrew Beekhof
beekhof at gmail.com
Mon Jul 11 05:40:26 MDT 2005
Per lmb's suggestion, CVS will shortly complain and assign a default
value for ID if a tag has attributes but no ID field (it's working
locally, I just want to refine it a little).
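
In the meantime, a rough sketch of how one might check a cib.xml fragment for the same condition (a tag with attributes but no id) before loading it. This is not the CRM's own check: the function names and the "tag-N" default id scheme are made up for illustration, and it only uses Python's standard xml.etree module.

```python
import xml.etree.ElementTree as ET

# The <primitive> from the cib.xml quoted below, with the ids still missing.
SAMPLE = """<primitive id="WebServerIP" class="ocf" type="IPaddr"
    provider="heartbeat" on_stopfail="ignore">
  <instance_attributes>
    <attributes>
      <nvpair name="nic" value="eth0"/>
      <nvpair name="ip" value="192.168.1.100"/>
    </attributes>
  </instance_attributes>
</primitive>"""


def elements_missing_id(xml_text):
    """Return the tag names of elements that have attributes but no "id"."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if el.attrib and "id" not in el.attrib]


def add_default_ids(xml_text):
    """Give every such element a unique id (hypothetical "tag-N" scheme)."""
    root = ET.fromstring(xml_text)
    for position, el in enumerate(root.iter()):
        if el.attrib and "id" not in el.attrib:
            # The document-order position keeps the generated ids unique.
            el.set("id", f"{el.tag}-{position}")
    return ET.tostring(root, encoding="unicode")
```

Calling elements_missing_id(SAMPLE) reports the two nvpair tags, and the output of add_default_ids(SAMPLE) passes the same check cleanly. What the ids contain does not matter, only that they are unique.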
On 7/10/05, Jack Forester, Jr. <linux_wiz at verizon.net> wrote:
> Hm...as soon as I modified my hb_res script to include an id field for
> the <nvpair> flags, the resources would start. I guess I need to go
> back and read the DTD a little more closely. Seems a few things have
> changed between 1.99.5 and 1.99.6 :)
Well, you picked up on the CIB name changes, so you did pretty well :)
>
> On Sun, 2005-07-10 at 10:49 +0200, Andrew Beekhof wrote:
> > Can you include the complete logs (not just the PE) please?
> >
> > It looks like the PE thinks it should start the resource and the TE
> > does too. Indeed the TE makes the first step towards starting it but
> > then... well, I don't know, because I need the logs.
> >
> > Before you send them though, can you make sure all <nvpair> tags have
> > an "id" field. It doesn't really matter what they contain, as long as
> > they are unique.
> >
> > Andrew
> >
> > On 7/10/05, Jack Forester, Jr. <linux_wiz at verizon.net> wrote:
> > > Sorry it's taken me so long to get back around to this. I've grabbed
> > > the latest bits off of CVS and built it. The delete is working for me
> > > now. Thanks, Andrew!
> > >
> > > Unfortunately, now I'm unable to get heartbeat to start a resource.
> > > I've been looking at logs and code and the closest I can determine is
> > > that heartbeat is marking the resource un-runnable. I'm only running on
> > > one cluster node at the moment, but I get the same result if I have both
> > > nodes up.
> > >
> > > I don't see where the resource script is ever invoked. Any pointers to
> > > where I should look?
> > >
> > > Here's my cib.xml:
> > >
> > > <cib generated="true" cib_feature_revision="1" admin_epoche="0"
> > > epoche="1" num_updates="10" have_quorum="true"
> > > last_written="Sat Jul 9 16:19:40 2005" ccm_transition="1" origin="persian"
> > > num_peers="1" dc_uuid="21510af9-f998-420d-8fe8-7258329ccb4a"
> > > debug_source="finalize_join">
> > > <configuration>
> > > <crm_config/>
> > > <nodes>
> > > <node id="21510af9-f998-420d-8fe8-7258329ccb4a" uname="persian"
> > > type="member"/>
> > > </nodes>
> > > <resources>
> > > <primitive id="WebServerIP" class="ocf" type="IPaddr"
> > > provider="heartbeat" on_stopfail="ignore">
> > > <instance_attributes>
> > > <attributes>
> > > <nvpair name="nic" value="eth0"/>
> > > <nvpair name="ip" value="192.168.1.100"/>
> > > </attributes>
> > > </instance_attributes>
> > > </primitive>
> > > </resources>
> > > <constraints/>
> > > </configuration>
> > > <status>
> > > <node_state id="21510af9-f998-420d-8fe8-7258329ccb4a"
> > > uname="persian" in_ccm="true" join="member" origin="do_lrm_query"
> > > crmd="online" ha="active" expected="member">
> > > <lrm>
> > > <lrm_resources>
> > > <lrm_resource id="WebServerIP" rsc_state="starting"
> > > op_status="-1" rc_code="-1" last_op="start">
> > > <lrm_rsc_op id="WebServerIP_start_0" operation="start"
> > > op_status="-1" rc_code="-1" origin="cib_action_update"
> > > transition_id="4" transition_magic="4:-1"/>
> > > </lrm_resource>
> > > </lrm_resources>
> > > </lrm>
> > > </node_state>
> > > </status>
> > > </cib>
> > >
> > >
> > > Here's some of my ha-debug log:
> > >
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(unpack.c:unpack_nodes):
> > > Begining unpack... nodes
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(unpack.c:unpack_resources): Begining unpack... resources
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(unpack.c:unpack_resources [2]): Begining unpack... primitive
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > Processing resource input...:
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > <primitive id="WebServerIP" class="ocf" type="IPaddr"
> > > provider="heartbeat" on_stopfail="ignore">
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > <instance_attributes>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > <attributes>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > <nvpair name="nic" value="eth0"/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > <nvpair name="ip" value="192.168.1.100"/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > </attributes>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > </instance_attributes>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > </primitive>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
> > > [2]): Options for WebServerIP
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
> > > [2]): Dependancy restart handling: ignore
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
> > > [2]): Multiple running resource recovery: stop/start
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
> > > [2]): Placement: optimal (default)
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > contents %s:
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm_resources/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > </lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > found in %s:
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <node_state id="21510af9-f998-420d-8fe8-7258329ccb4a"
> > > uname="persian" in_ccm="true" join="member" origin="do_lrm_query"
> > > crmd="online" ha="active" expected="member">
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm_resources/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > </lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > </node_state>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > contents %s:
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm_resources/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > found in %s:
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm_resources/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > </lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(unpack.c:determine_online_status [2]): Node persian is online
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(unpack.c:unpack_constraints): Begining unpack... constraints
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(stages.c:choose_node_from_list [2]): Could not allocate a node for
> > > color 0
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(stages.c:stage4 [2]): No
> > > node available for color 0
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:custom_action
> > > [2]): Creating action 1: start for WebServerIP on persian
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(utils.c:find_rsc_op_entry [2]): No matching for WebServerIP_start_0
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:unpack_operation
> > > [2]): Action start requires: quorum (default)
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:unpack_operation
> > > [2]): start failure results in: resource stop (default)
> > > pengine[32492]: 2005/07/09_16:16:55 info:
> > > mask(native.c:native_create_actions): Start resource WebServerIP
> > > (persian)
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(native.c:create_recurring_actions [2]): Creating recurring actions
> > > for WebServerIP
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(graph.c:update_action_states [2]): Updating 1 actions
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(pengine.c:do_calculations [2]): =#=#=#=#= Summary =#=#=#=#=
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(pengine.c:do_calculations [2]): ========= All Actions =========
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:log_action
> > > [2]): : (Provisional) Action 1: start WebServerIP @ persian
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:log_action [3]):
> > > ====== Preceeding Actions
> > > pengine[32492]: 2005/07/09_16:16:56 debug: mask(utils.c:log_action [3]):
> > > ====== Subsequent Actions
> > > pengine[32492]: 2005/07/09_16:16:56 debug: mask(utils.c:log_action [3]):
> > > ====== End
> > > pengine[32492]: 2005/07/09_16:16:56 debug:
> > > mask(pengine.c:do_calculations [2]): ========= Set -1
> > > (Un-runnable) =========
> > > pengine[32492]: 2005/07/09_16:16:56 info: mask(stages.c:stage8):
> > > Creating transition graph 1.
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(unpack.c:determine_online_status [2]): Node persian is online
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(unpack.c:unpack_rsc_op
> > > [2]): Unpacking task WebServerIP/start (-1) on persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:custom_action
> > > [2]): Creating action 1: start for WebServerIP on <NULL>
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(utils.c:find_rsc_op_entry [2]): No matching for WebServerIP_start_0
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
> > > [2]): Action start requires: quorum (default)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
> > > [2]): start failure results in: resource stop (default)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:custom_action
> > > [2]): Action 1 marked manditory
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(unpack.c:unpack_constraints): Begining unpack... constraints
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > rsc action: : !!Non-Startable!! Action 1:
> > > start WebServerIP @ (null)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > rsc action: : !!Non-Startable!! Action 1:
> > > start WebServerIP @ (null)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > rsc action: : !!Non-Startable!! Action 1:
> > > start WebServerIP @ (null)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > rsc action: : !!Non-Startable!! Action 1:
> > > start WebServerIP @ (null)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > rsc action: : !!Non-Startable!! Action 1:
> > > start WebServerIP @ (null)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(stages.c:choose_node_from_list [2]): Could not allocate a node for
> > > color 0
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(stages.c:stage4 [2]): No
> > > node available for color 0
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(native.c:native_create_actions [2]): Stop and restart of
> > > WebServerIP
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:custom_action
> > > [2]): Creating action 2: stop for WebServerIP on persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(utils.c:find_rsc_op_entry [2]): No matching for WebServerIP_stop_0
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
> > > [2]): Action stop requires: nothing (default)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
> > > [2]): stop failure results in: nothing
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:find_actions
> > > [2]): While looking for WebServerIP_start_0 action on
> > > persian, found an unallocated one. Assigning it to the requested
> > > node...
> > > pengine[32492]: 2005/07/09_16:17:39 info:
> > > mask(native.c:native_create_actions): Leave resource WebServerIP
> > > (persian)
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(native.c:create_recurring_actions [2]): Creating recurring actions
> > > for WebServerIP
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(complex.c:order_actions
> > > [2]): Ordering 1: Action 2 before 1
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
> > > LH (order_actions): Optional Action 2: stop WebServerIP @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
> > > RH (order_actions): (Provisional) Action 1: start
> > > WebServerIP @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(graph.c:update_action_states [2]): Updating 2 actions
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(pengine.c:do_calculations [2]): =#=#=#=#= Summary =#=#=#=#=
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(pengine.c:do_calculations [2]): ========= All Actions =========
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
> > > [2]): : (Provisional) Action 1: start WebServerIP @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== Preceeding Actions
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
> > > [3]): : Optional Action 2: stop WebServerIP @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=1)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== Subsequent Actions
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== End
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
> > > [2]): : Optional Action 2: stop WebServerIP @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== Preceeding Actions
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== Subsequent Actions
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
> > > [3]): : (Provisional) Action 1: start WebServerIP @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=1, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== End
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(pengine.c:do_calculations [2]): ========= Set -1
> > > (Un-runnable) =========
> > >
> > >
> > > On Mon, 2005-06-27 at 10:45 +0200, Andrew Beekhof wrote:
> > > > Hi Jack,
> > > >
> > > > I've just finished testing and fixing delete (/me makes a note to add
> > > > regression tests for cibadmin one day). If you're confident in using
> > > > CVS, I would recommend giving it a try. This page will explain how to
> > > > obtain it: http://wwnew.linux-ha.org/CVS
> > > >
> > > > Thanks for pointing out the problem!
> > >
> > >
> > > _______________________________________________
> > > Linux-HA mailing list
> > > Linux-HA at lists.linux-ha.org
> > > http://lists.linux-ha.org/mailman/listinfo/linux-ha