[Linux-HA] Resource not monitored?

Jim Wong jwong at sharpcast.com
Wed Feb 7 04:47:07 MST 2007


Folks,

We may have done something goofy in our configuration here, but we're
stumped.  We've got a resource, managed by a custom OCF resource agent, that
is listed as running in the output of crm_mon:

    xeroxd      (sharpcast::ocf:xeroxd):        Started abg01

However, the associated process is definitely not running (due to a
configuration error), and running the monitor operation by hand confirms this:

bash-3.00# /usr/lib/ocf/resource.d/sharpcast/xeroxd monitor ; echo "Result: $?"
xeroxd dead but pid file exists
Result: 1
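
For reference, here is a minimal sketch of how a pid-file based monitor
action can map process state to OCF exit codes. It is not the actual xeroxd
agent: the function name monitor_by_pidfile and its pid-file argument are
hypothetical, and only the three exit-code values come from the OCF
convention (0 = running, 1 = generic error, 7 = not running).

```shell
#!/bin/sh
# Hypothetical sketch only; the real xeroxd agent is not shown in this post.

# Exit codes defined by the OCF resource agent convention.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# monitor_by_pidfile PIDFILE
# Reports the state of a daemon that records its pid in PIDFILE.
monitor_by_pidfile() {
    pidfile="$1"

    # No pid file at all: treat the resource as cleanly stopped.
    if [ ! -f "$pidfile" ]; then
        return $OCF_NOT_RUNNING
    fi

    # Pid file present and the process answers signal 0: running.
    pid=$(cat "$pidfile" 2>/dev/null)
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return $OCF_SUCCESS
    fi

    # Pid file present but no such process: report a failure.
    echo "dead but pid file exists"
    return $OCF_ERR_GENERIC
}
```

Under this convention a result of 1, as in the transcript above, is a
generic error rather than a clean "not running" (7); either value is
nonzero, so both tell the caller the resource is not healthy.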

It's been in this state for an hour or so, but heartbeat doesn't seem to
have figured it out.  Is there a particular reason heartbeat wouldn't be
monitoring this resource?  There doesn't seem to be anything relevant in the
logs during that time...

Feb  7 02:18:48 abg01 crmd: [16643]: info: do_lrm_rsc_op:lrm.c Performing op start on xeroxd (interval=0ms, key=112:3c6c0960-29d1-4f5b-9497-da5324195891)
Feb  7 02:18:48 abg01 tengine: [18920]: info: match_graph_event:events.c Action xeroxd_monitor_0 (3) confirmed
Feb  7 02:18:48 abg01 tengine: [18920]: info: send_rsc_command:actions.c Initiating action 15: xeroxd_start_0 on abg01
Feb  7 02:18:48 abg01 cibmon: [16645]: info: cib_update: + <lrm_resource id="xeroxd" type="xeroxd" class="ocf" provider="sharpcast" __crm_diff_marker__="added:top">
Feb  7 02:18:48 abg01 cibmon: [16645]: info: cib_update: + <lrm_rsc_op id="xeroxd_monitor_0" operation="monitor" crm-debug-origin="do_update_resource" transition_key="112:3c6c0960-29d1-4f5b-9497-da5324195891" transition_magic="4:7;112:3c6c0960-29d1-4f5b-9497-da5324195891" call_id="42" crm_feature_set="1.0.6" rc_code="7" op_status="4" interval="0" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
Feb  7 02:18:48 abg01 lrmd: [16640]: info: RA output: (xeroxd:start:stdout) Starting xeroxd:
Feb  7 02:18:48 abg01 lrmd: [16640]: info: RA output: (xeroxd:start:stdout) [
Feb  7 02:18:48 abg01 lrmd: [16640]: info: RA output: (xeroxd:start:stdout) OK  ]
Feb  7 02:18:48 abg01 lrmd: [16640]: info: RA output: (xeroxd:start:stdout)
Feb  7 02:18:48 abg01 crmd: [16643]: info: process_lrm_event:lrm.c LRM operation (40) start_0 on xeroxd complete
Feb  7 02:18:48 abg01 tengine: [18920]: info: match_graph_event:events.c Action xeroxd_start_0 (15) confirmed
Feb  7 02:18:48 abg01 cibmon: [16645]: info: cib_update: + <lrm_resource id="xeroxd">
Feb  7 02:18:48 abg01 cibmon: [16645]: info: cib_update: + <lrm_rsc_op id="xeroxd_start_0" operation="start" crm-debug-origin="do_update_resource" transition_key="112:3c6c0960-29d1-4f5b-9497-da5324195891" transition_magic="0:0;112:3c6c0960-29d1-4f5b-9497-da5324195891" call_id="40" crm_feature_set="1.0.6" rc_code="0" op_status="0" interval="0" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8" __crm_diff_marker__="added:top"/>
Feb  7 02:35:09 abg01 pengine: [18921]: info:     xeroxd (sharpcast::ocf:xeroxd):        Started abg01
Feb  7 02:35:09 abg01 pengine: [18921]: notice: NoRoleChange:native.c Leave resource xeroxd     (abg01)
Feb  7 02:40:29 abg01 pengine: [18921]: info:     xeroxd (sharpcast::ocf:xeroxd):        Started abg01
Feb  7 02:40:29 abg01 pengine: [18921]: notice: NoRoleChange:native.c Leave resource xeroxd     (abg01)
Feb  7 02:40:30 abg01 pengine: [18921]: info:     xeroxd (sharpcast::ocf:xeroxd):        Started abg01
Feb  7 02:40:31 abg01 pengine: [18921]: notice: NoRoleChange:native.c Leave resource xeroxd     (abg01)

-- 
Jim Wong (jwong at sharpcast.com)