Resource doesn't start (was: [Linux-HA] Version 2 resource groups and more)

Andrew Beekhof beekhof at gmail.com
Mon Jul 11 05:40:26 MDT 2005


Per lmb's suggestion, the CVS version will shortly complain and assign a
default value for ID if a tag has attributes but no ID field (it's
working locally, I just want to refine it a little).
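
For anyone following along, here is a minimal sketch of that behaviour, in
Python for illustration only. It is not the CRM code: the function name
assign_missing_ids and the tag-plus-counter naming scheme are assumptions
made for this example, and the XML is a trimmed copy of the <primitive>
from the cib.xml quoted further down.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <primitive> from the cib.xml quoted in this thread.
CIB_FRAGMENT = """
<primitive id="WebServerIP" class="ocf" type="IPaddr" provider="heartbeat">
  <instance_attributes>
    <attributes>
      <nvpair name="nic" value="eth0"/>
      <nvpair name="ip" value="192.168.1.100"/>
    </attributes>
  </instance_attributes>
</primitive>
"""


def assign_missing_ids(root):
    """Give every element that has attributes but no "id" a unique id.

    Returns the list of ids that were generated. The naming scheme
    (tag name plus a running counter) is an assumption of this sketch.
    """
    used = {el.get("id") for el in root.iter() if el.get("id") is not None}
    generated = []
    counter = 0
    for el in root.iter():
        if el.attrib and "id" not in el.attrib:
            # Keep counting until the candidate does not collide
            # with an id that is already present in the document.
            while True:
                counter += 1
                candidate = f"{el.tag}-{counter}"
                if candidate not in used:
                    break
            print(f"warning: <{el.tag}> has no id, assigning {candidate}")
            el.set("id", candidate)
            used.add(candidate)
            generated.append(candidate)
    return generated


root = ET.fromstring(CIB_FRAGMENT)
new_ids = assign_missing_ids(root)
print(ET.tostring(root, encoding="unicode"))
```

Elements without any attributes (<instance_attributes> and <attributes>
here) are left alone, and running the function a second time generates
nothing, since every <nvpair> then already carries an id.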

On 7/10/05, Jack Forester, Jr. <linux_wiz at verizon.net> wrote:
> Hm...as soon as I modified my hb_res script to include an id field for
> the <nvpair> flags, the resources would start.  I guess I need to go
> back and read the DTD a little more closely.  Seems a few things have
> changed between 1.99.5 and 1.99.6 :)

Well, you picked up on the CIB name changes, so you did pretty well :)

> 
> On Sun, 2005-07-10 at 10:49 +0200, Andrew Beekhof wrote:
> > Can you include the complete logs (not just the PE) please?
> >
> > It looks like the PE thinks it should start the resource and the TE
> > does too.  Indeed the TE makes the first step towards starting it but
> > then... well, I don't know, because I need the logs.
> >
> > Before you send them though, can you make sure all <nvpair> tags have
> > an "id" field.  It doesn't really matter what they contain, as long as
> > they are unique.
> >
> > Andrew
> >
> > On 7/10/05, Jack Forester, Jr. <linux_wiz at verizon.net> wrote:
> > > Sorry it's taken me so long to get back around to this.  I've grabbed
> > > the latest bits off of CVS and built it.  The delete is working for me
> > > now.  Thanks, Andrew!
> > >
> > > Unfortunately, now I'm unable to get heartbeat to start a resource.
> > > I've been looking at logs and code and the closest I can determine is
> > > that heartbeat is marking the resource un-runnable.  I'm only running on
> > > one cluster node at the moment, but I get the same result if I have both
> > > nodes up.
> > >
> > > I don't see where the resource script is ever invoked.  Any pointers to
> > > where I should look?
> > >
> > > Here's my cib.xml:
> > >
> > >  <cib generated="true" cib_feature_revision="1" admin_epoche="0"
> > > > epoche="1" num_updates="10" have_quorum="true"
> > > > last_written="Sat Jul  9 16:19:40 2005" ccm_transition="1"
> > > > origin="persian" num_peers="1"
> > > > dc_uuid="21510af9-f998-420d-8fe8-7258329ccb4a"
> > > > debug_source="finalize_join">
> > >    <configuration>
> > >      <crm_config/>
> > >      <nodes>
> > >        <node id="21510af9-f998-420d-8fe8-7258329ccb4a" uname="persian"
> > > type="member"/>
> > >      </nodes>
> > >      <resources>
> > >        <primitive id="WebServerIP" class="ocf" type="IPaddr"
> > > provider="heartbeat" on_stopfail="ignore">
> > >          <instance_attributes>
> > >            <attributes>
> > >              <nvpair name="nic" value="eth0"/>
> > >              <nvpair name="ip" value="192.168.1.100"/>
> > >            </attributes>
> > >          </instance_attributes>
> > >        </primitive>
> > >      </resources>
> > >      <constraints/>
> > >    </configuration>
> > >    <status>
> > >      <node_state id="21510af9-f998-420d-8fe8-7258329ccb4a"
> > > > uname="persian" in_ccm="true" join="member" origin="do_lrm_query"
> > > > crmd="online" ha="active" expected="member">
> > >        <lrm>
> > >          <lrm_resources>
> > >            <lrm_resource id="WebServerIP" rsc_state="starting"
> > > op_status="-1" rc_code="-1" last_op="start">
> > >              <lrm_rsc_op id="WebServerIP_start_0" operation="start"
> > > op_status="-1" rc_code="-1" origin="cib_action_update"
> > >  transition_id="4" transition_magic="4:-1"/>
> > >            </lrm_resource>
> > >          </lrm_resources>
> > >        </lrm>
> > >      </node_state>
> > >    </status>
> > >  </cib>
> > >
> > >
> > > Here's some of my ha-debug log:
> > >
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(unpack.c:unpack_nodes):
> > > Begining unpack... nodes
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(unpack.c:unpack_resources): Begining unpack... resources
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(unpack.c:unpack_resources [2]): Begining unpack... primitive
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > Processing resource input...:
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > <primitive id="WebServerIP" class="ocf" type="IPaddr"
> > >  provider="heartbeat" on_stopfail="ignore">
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > <instance_attributes>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > <attributes>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > <nvpair name="nic" value="eth0"/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > <nvpair name="ip" value="192.168.1.100"/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > </attributes>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > </instance_attributes>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
> > > </primitive>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
> > > [2]): Options for WebServerIP
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
> > > [2]):   Dependancy restart handling: ignore
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
> > > [2]):   Multiple running resource recovery: stop/s
> > > tart
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
> > > [2]):   Placement: optimal (default)
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > contents    %s:
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm_resources/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > </lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > found in    %s:
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <node_state id="21510af9-f998-420d-8fe8-7258329ccb4a"
> > >  uname="persian" in_ccm="true" join="member" origin="do_lrm_query"
> > > crmd="online" ha="active" expected="member">
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm_resources/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > </lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > </node_state>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > contents    %s:
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm_resources/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > found in    %s:
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > <lrm_resources/>
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
> > > </lrm>
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(unpack.c:determine_online_status [2]): Node persian is online
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(unpack.c:unpack_constraints): Begining unpack... constraints
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(stages.c:choose_node_from_list [2]): Could not allocate a node for
> > > color 0
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(stages.c:stage4 [2]): No
> > > node available for color 0
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:custom_action
> > > > [2]): Creating action 1: start for WebServerIP on persian
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(utils.c:find_rsc_op_entry [2]): No matching for WebServerIP_start_0
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:unpack_operation
> > > [2]):  Action start requires: quorum (default)
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:unpack_operation
> > > [2]):  start failure results in: resource stop (d
> > > efault)
> > > pengine[32492]: 2005/07/09_16:16:55 info:
> > > mask(native.c:native_create_actions): Start resource WebServerIP
> > > (persian)
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(native.c:create_recurring_actions [2]): Creating recurring actions
> > > > for WebServerIP
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(graph.c:update_action_states [2]): Updating 1 actions
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(pengine.c:do_calculations [2]): =#=#=#=#= Summary =#=#=#=#=
> > > pengine[32492]: 2005/07/09_16:16:55 debug:
> > > mask(pengine.c:do_calculations [2]): ========= All Actions =========
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:log_action
> > > [2]):        : (Provisional) Action 1: start WebServerI
> > > P @ persian
> > > pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:log_action [3]):
> > > ====== Preceeding Actions
> > > pengine[32492]: 2005/07/09_16:16:56 debug: mask(utils.c:log_action [3]):
> > > ====== Subsequent Actions
> > > pengine[32492]: 2005/07/09_16:16:56 debug: mask(utils.c:log_action [3]):
> > > ====== End
> > > pengine[32492]: 2005/07/09_16:16:56 debug:
> > > mask(pengine.c:do_calculations [2]):         ========= Set -1
> > > > (Un-runnable) =========
> > > pengine[32492]: 2005/07/09_16:16:56 info: mask(stages.c:stage8):
> > > Creating transition graph 1.
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(unpack.c:determine_online_status [2]): Node persian is online
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(unpack.c:unpack_rsc_op
> > > > [2]): Unpacking task WebServerIP/start (-1) on persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:custom_action
> > > > [2]): Creating action 1: start for WebServerIP on <NULL>
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(utils.c:find_rsc_op_entry [2]): No matching for WebServerIP_start_0
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
> > > [2]):  Action start requires: quorum (default)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
> > > [2]):  start failure results in: resource stop (d
> > > efault)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:custom_action
> > > [2]): Action 1 marked manditory
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(unpack.c:unpack_constraints): Begining unpack... constraints
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > rsc action: : !!Non-Startable!! Action 1:
> > > start WebServerIP @ (null)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > rsc action: : !!Non-Startable!! Action 1:
> > > start WebServerIP @ (null)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > rsc action: : !!Non-Startable!! Action 1:
> > > start WebServerIP @ (null)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > rsc action: : !!Non-Startable!! Action 1:
> > > start WebServerIP @ (null)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > rsc action: : !!Non-Startable!! Action 1:
> > > start WebServerIP @ (null)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(stages.c:choose_node_from_list [2]): Could not allocate a node for
> > > color 0
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(stages.c:stage4 [2]): No
> > > node available for color 0
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(native.c:native_create_actions [2]): Stop and restart of
> > > WebServerIP
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:custom_action
> > > > [2]): Creating action 2: stop for WebServerIP on persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(utils.c:find_rsc_op_entry [2]): No matching for WebServerIP_stop_0
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
> > > [2]):  Action stop requires: nothing (default)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
> > > [2]):  stop failure results in: nothing
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:find_actions
> > > [2]): While looking for WebServerIP_start_0 action on
> > >  persian, found an unallocated one.  Assigning it to the requested
> > > node...
> > > pengine[32492]: 2005/07/09_16:17:39 info:
> > > mask(native.c:native_create_actions): Leave resource WebServerIP
> > > (persian)
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(native.c:create_recurring_actions [2]): Creating recurring actions
> > > > for WebServerIP
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(complex.c:order_actions
> > > [2]): Ordering 1: Action 2 before 1
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
> > > > LH (order_actions): Optional Action 2: stop WebServerIP @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
> > > RH (order_actions): (Provisional) Action 1: start
> > >  WebServerIP @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
> > > (seen=0, before=0, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(graph.c:update_action_states [2]): Updating 2 actions
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(pengine.c:do_calculations [2]): =#=#=#=#= Summary =#=#=#=#=
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(pengine.c:do_calculations [2]): ========= All Actions =========
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
> > > [2]):        : (Provisional) Action 1: start WebServerI
> > > P @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== Preceeding Actions
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
> > > [3]):                : Optional Action 2: stop WebServe
> > > rIP @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=0, after=1)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== Subsequent Actions
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== End
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
> > > [2]):        : Optional Action 2: stop WebServerIP @ pe
> > > rsian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== Preceeding Actions
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== Subsequent Actions
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
> > > [3]):                : (Provisional) Action 1: start We
> > > bServerIP @ persian
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > (seen=0, before=1, after=0)
> > > pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
> > > ====== End
> > > pengine[32492]: 2005/07/09_16:17:39 debug:
> > > mask(pengine.c:do_calculations [2]):         ========= Set -1
> > > > (Un-runnable) =========
> > >
> > >
> > > On Mon, 2005-06-27 at 10:45 +0200, Andrew Beekhof wrote:
> > > > Hi Jack,
> > > >
> > > > I've just finished testing and fixing delete (/me makes a note to add
> > > > regression tests for cibadmin one day).  If you're confident in using
> > > > CVS, I would recommend giving it a try.  This page will explain how to
> > > > obtain it: http://wwnew.linux-ha.org/CVS
> > > >
> > > > Thanks for pointing out the problem!
> > >
> > >
> > > _______________________________________________
> > > Linux-HA mailing list
> > > Linux-HA at lists.linux-ha.org
> > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > >