Resource doesn't start (was: [Linux-HA] Version 2 resource groups and more)

Jack Forester, Jr. linux_wiz at verizon.net
Sat Jul 9 18:16:36 MDT 2005


Sorry it's taken me so long to get back around to this.  I've grabbed
the latest bits from CVS and built them.  The delete is working for me
now.  Thanks, Andrew!

Unfortunately, now I'm unable to get heartbeat to start a resource.
I've been looking at logs and code, and the most I can determine is
that heartbeat is marking the resource un-runnable.  I'm only running
one cluster node at the moment, but I get the same result if I have both
nodes up.

I don't see where the resource script is ever invoked.  Any pointers to
where I should look?

Here's my cib.xml:

 <cib generated="true" cib_feature_revision="1" admin_epoche="0"
epoche="1" num_updates="10" have_quorum="true"
last_written="Sat Jul  9 16:19:40 2005" ccm_transition="1" origin="persian"
num_peers="1" dc_uuid="21510af9-f998-420d-8fe8-7258329ccb4a"
debug_source="finalize_join">
   <configuration>
     <crm_config/>
     <nodes>
       <node id="21510af9-f998-420d-8fe8-7258329ccb4a" uname="persian"
type="member"/>
     </nodes>
     <resources>
       <primitive id="WebServerIP" class="ocf" type="IPaddr"
provider="heartbeat" on_stopfail="ignore">
         <instance_attributes>
           <attributes>
             <nvpair name="nic" value="eth0"/>
             <nvpair name="ip" value="192.168.1.100"/>
           </attributes>
         </instance_attributes>
       </primitive>
     </resources>
     <constraints/>
   </configuration>
   <status>
     <node_state id="21510af9-f998-420d-8fe8-7258329ccb4a"
uname="persian" in_ccm="true" join="member" origin="do_lrm_query"
crmd="online" ha="active" expected="member">
       <lrm>
         <lrm_resources>
           <lrm_resource id="WebServerIP" rsc_state="starting"
op_status="-1" rc_code="-1" last_op="start">
             <lrm_rsc_op id="WebServerIP_start_0" operation="start"
op_status="-1" rc_code="-1" origin="cib_action_update"
 transition_id="4" transition_magic="4:-1"/>
           </lrm_resource>
         </lrm_resources>
       </lrm>
     </node_state>
   </status>
 </cib>
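
For anyone who wants to pull the same information out of their own CIB, here is a minimal sketch that lists every lrm_rsc_op recorded in the status section along with its op_status and rc_code.  The function name pending_ops and the trimmed-down SAMPLE document are only illustrative and are not part of heartbeat; the element and attribute names are the ones that appear in the cib.xml above.

```python
import xml.etree.ElementTree as ET


def pending_ops(cib_xml):
    """Return (node, resource, operation, op_status, rc_code) for each lrm_rsc_op."""
    root = ET.fromstring(cib_xml)
    rows = []
    for node in root.iter("node_state"):
        for rsc in node.iter("lrm_resource"):
            for op in rsc.iter("lrm_rsc_op"):
                rows.append((
                    node.get("uname"),
                    rsc.get("id"),
                    op.get("operation"),
                    op.get("op_status"),
                    op.get("rc_code"),
                ))
    return rows


# Hypothetical, trimmed-down copy of the status section shown above.
SAMPLE = """<cib>
  <status>
    <node_state id="n1" uname="persian">
      <lrm>
        <lrm_resources>
          <lrm_resource id="WebServerIP" rsc_state="starting">
            <lrm_rsc_op id="WebServerIP_start_0" operation="start"
                        op_status="-1" rc_code="-1"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
    </node_state>
  </status>
</cib>"""

if __name__ == "__main__":
    for row in pending_ops(SAMPLE):
        print(row)
```

Run against the full cib.xml, this should report the single start operation for WebServerIP on persian with both values at -1.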


Here's some of my ha-debug log:

pengine[32492]: 2005/07/09_16:16:55 debug: mask(unpack.c:unpack_nodes):
Begining unpack... nodes
pengine[32492]: 2005/07/09_16:16:55 debug:
mask(unpack.c:unpack_resources): Begining unpack... resources
pengine[32492]: 2005/07/09_16:16:55 debug:
mask(unpack.c:unpack_resources [2]): Begining unpack... primitive
pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
Processing resource input...:
pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
<primitive id="WebServerIP" class="ocf" type="IPaddr"
 provider="heartbeat" on_stopfail="ignore">
pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
<instance_attributes>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
<attributes>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
<nvpair name="nic" value="eth0"/>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
<nvpair name="ip" value="192.168.1.100"/>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
</attributes>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
</instance_attributes>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(common_unpack [2]):
</primitive>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
[2]): Options for WebServerIP
pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
[2]):   Dependancy restart handling: ignore
pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
[2]):   Multiple running resource recovery: stop/start
pengine[32492]: 2005/07/09_16:16:55 debug: mask(complex.c:common_unpack
[2]):   Placement: optimal (default)
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
contents    %s:
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
<lrm>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
<lrm_resources/>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
</lrm>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
found in    %s:
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
<node_state id="21510af9-f998-420d-8fe8-7258329ccb4a"
 uname="persian" in_ccm="true" join="member" origin="do_lrm_query"
crmd="online" ha="active" expected="member">
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
<lrm>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
<lrm_resources/>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
</lrm>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
</node_state>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
contents    %s:
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
<lrm_resources/>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
found in    %s:
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
<lrm>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
<lrm_resources/>
pengine[32492]: 2005/07/09_16:16:55 debug: mask(find_xml_node [5]):
</lrm>
pengine[32492]: 2005/07/09_16:16:55 debug:
mask(unpack.c:determine_online_status [2]): Node persian is online
pengine[32492]: 2005/07/09_16:16:55 debug:
mask(unpack.c:unpack_constraints): Begining unpack... constraints
pengine[32492]: 2005/07/09_16:16:55 debug:
mask(stages.c:choose_node_from_list [2]): Could not allocate a node for
color 0
pengine[32492]: 2005/07/09_16:16:55 debug: mask(stages.c:stage4 [2]): No
node available for color 0
pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:custom_action
[2]): Creating action 1: start for WebServerIP on persian
pengine[32492]: 2005/07/09_16:16:55 debug:
mask(utils.c:find_rsc_op_entry [2]): No matching for WebServerIP_start_0
pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:unpack_operation
[2]):  Action start requires: quorum (default)
pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:unpack_operation
[2]):  start failure results in: resource stop (default)
pengine[32492]: 2005/07/09_16:16:55 info:
mask(native.c:native_create_actions): Start resource WebServerIP
(persian)
pengine[32492]: 2005/07/09_16:16:55 debug:
mask(native.c:create_recurring_actions [2]): Creating recurring actions
for WebServerIP
pengine[32492]: 2005/07/09_16:16:55 debug:
mask(graph.c:update_action_states [2]): Updating 1 actions
pengine[32492]: 2005/07/09_16:16:55 debug:
mask(pengine.c:do_calculations [2]): =#=#=#=#= Summary =#=#=#=#=
pengine[32492]: 2005/07/09_16:16:55 debug:
mask(pengine.c:do_calculations [2]): ========= All Actions =========
pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:log_action
[2]):        : (Provisional) Action 1: start WebServerIP @ persian
pengine[32492]: 2005/07/09_16:16:55 debug: mask(utils.c:log_action [3]):
====== Preceeding Actions
pengine[32492]: 2005/07/09_16:16:56 debug: mask(utils.c:log_action [3]):
====== Subsequent Actions
pengine[32492]: 2005/07/09_16:16:56 debug: mask(utils.c:log_action [3]):
====== End
pengine[32492]: 2005/07/09_16:16:56 debug:
mask(pengine.c:do_calculations [2]):         ========= Set -1
(Un-runnable) =========
pengine[32492]: 2005/07/09_16:16:56 info: mask(stages.c:stage8):
Creating transition graph 1.
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(unpack.c:determine_online_status [2]): Node persian is online
pengine[32492]: 2005/07/09_16:17:39 debug: mask(unpack.c:unpack_rsc_op
[2]): Unpacking task WebServerIP/start (-1) on persian
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:custom_action
[2]): Creating action 1: start for WebServerIP on <NULL>
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(utils.c:find_rsc_op_entry [2]): No matching for WebServerIP_start_0
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
[2]):  Action start requires: quorum (default)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
[2]):  start failure results in: resource stop (default)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:custom_action
[2]): Action 1 marked manditory
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(unpack.c:unpack_constraints): Begining unpack... constraints
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
rsc action: : !!Non-Startable!! Action 1:
start WebServerIP @ (null)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
(seen=0, before=0, after=0)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
rsc action: : !!Non-Startable!! Action 1:
start WebServerIP @ (null)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
(seen=0, before=0, after=0)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
rsc action: : !!Non-Startable!! Action 1:
start WebServerIP @ (null)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
(seen=0, before=0, after=0)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
rsc action: : !!Non-Startable!! Action 1:
start WebServerIP @ (null)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
(seen=0, before=0, after=0)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
rsc action: : !!Non-Startable!! Action 1:
start WebServerIP @ (null)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
(seen=0, before=0, after=0)
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(stages.c:choose_node_from_list [2]): Could not allocate a node for
color 0
pengine[32492]: 2005/07/09_16:17:39 debug: mask(stages.c:stage4 [2]): No
node available for color 0
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(native.c:native_create_actions [2]): Stop and restart of
WebServerIP
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:custom_action
[2]): Creating action 2: stop for WebServerIP on persian
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(utils.c:find_rsc_op_entry [2]): No matching for WebServerIP_stop_0
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
[2]):  Action stop requires: nothing (default)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:unpack_operation
[2]):  stop failure results in: nothing
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:find_actions
[2]): While looking for WebServerIP_start_0 action on
 persian, found an unallocated one.  Assigning it to the requested
node...
pengine[32492]: 2005/07/09_16:17:39 info:
mask(native.c:native_create_actions): Leave resource WebServerIP
(persian)
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(native.c:create_recurring_actions [2]): Creating recurring actions
for WebServerIP
pengine[32492]: 2005/07/09_16:17:39 debug: mask(complex.c:order_actions
[2]): Ordering 1: Action 2 before 1
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
LH (order_actions): Optional Action 2: stop WebServerIP @ persian
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
(seen=0, before=0, after=0)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
RH (order_actions): (Provisional) Action 1: start
 WebServerIP @ persian
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [4]):
(seen=0, before=0, after=0)
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(graph.c:update_action_states [2]): Updating 2 actions
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(pengine.c:do_calculations [2]): =#=#=#=#= Summary =#=#=#=#=
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(pengine.c:do_calculations [2]): ========= All Actions =========
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
[2]):        : (Provisional) Action 1: start WebServerIP @ persian
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
====== Preceeding Actions
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
[3]):                : Optional Action 2: stop WebServerIP @ persian
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
(seen=0, before=0, after=1)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
====== Subsequent Actions
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
====== End
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
[2]):        : Optional Action 2: stop WebServerIP @ persian
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
====== Preceeding Actions
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
====== Subsequent Actions
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action
[3]):                : (Provisional) Action 1: start WebServerIP @ persian
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
(seen=0, before=1, after=0)
pengine[32492]: 2005/07/09_16:17:39 debug: mask(utils.c:log_action [3]):
====== End
pengine[32492]: 2005/07/09_16:17:39 debug:
mask(pengine.c:do_calculations [2]):         ========= Set -1
(Un-runnable) =========
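
Since the log is long, a small filter helps narrow it down to the lines that matter here.  The following is a minimal sketch, not a heartbeat tool: it rejoins records that were wrapped across lines and keeps only those mentioning a node allocation failure or an un-runnable action.  The function names and the keyword list are my own choices, taken from phrases that appear in the log above, and the record-start pattern assumes the "daemon[pid]: " prefix seen in this ha-debug output.

```python
import re

# A record starts with something like "pengine[32492]: " (assumed from the log above).
RECORD_START = re.compile(r"^\w+\[\d+\]: ")

# Phrases copied from the log excerpt; extend as needed.
KEYWORDS = (
    "Could not allocate",
    "No node available",
    "Non-Startable",
    "Un-runnable",
)


def join_records(lines):
    """Merge wrapped continuation lines into the record they belong to."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuation line: joined with a single space, so words that
            # were split mid-word by the wrapping will not be repaired.
            records[-1] += " " + line.strip()
    return records


def allocation_failures(lines):
    """Return the joined records that contain any of the KEYWORDS."""
    return [r for r in join_records(lines) if any(k in r for k in KEYWORDS)]


SAMPLE = [
    "pengine[32492]: 2005/07/09_16:16:55 debug:",
    "mask(stages.c:choose_node_from_list [2]): Could not allocate a node for",
    "color 0",
    "pengine[32492]: 2005/07/09_16:16:55 debug: mask(stages.c:stage4 [2]): No",
    "node available for color 0",
    "pengine[32492]: 2005/07/09_16:16:55 info:",
    "mask(native.c:native_create_actions): Start resource WebServerIP",
    "(persian)",
]

if __name__ == "__main__":
    for record in allocation_failures(SAMPLE):
        print(record)
```

To run it over a real log, pass an open file instead of SAMPLE, for example allocation_failures(open(path)) with path pointing at wherever ha-debug is written on your system.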


On Mon, 2005-06-27 at 10:45 +0200, Andrew Beekhof wrote:
> Hi Jack,
> 
> I've just finished testing and fixing delete (/me makes a note to add
> regression tests for cibadmin one day).  If you're confident in using
> CVS, I would recommend giving it a try.  This page will explain how to
> obtain it: http://wwnew.linux-ha.org/CVS
> 
> Thanks for pointing out the problem!



