[Linux-HA] crm_abort: get_lrm_resource: Triggered non-fatal assert at lrm.c:864
Sebastian Reitenbach
sebastia at l00-bugdead-prods.de
Thu Mar 1 07:07:14 MST 2007
Hi list,
I had a cluster running with 5 nodes and 5 stonith resources, which were
working great so far. Then I added 21 ocfs clone resources; these stayed
"unused" (white lamp in the GUI). I then stopped the stonith resources and
started them again, and they stayed white too.
After restarting heartbeat on all nodes, with all but one in standby, the one
live node took over the stonith resources and the ocfs resources too.
This is what I saw in the log file after adding the ocfs2 resources via
cibadmin:
Mar 1 14:22:27 ppsdb102 sudo: ppsadmin : TTY=pts/6 ; PWD=/tmp/from_dol ; USER=root ; COMMAND=/bin/bash
Mar 1 14:24:48 ppsdb102 cib: [7259]: info: cib_diff_notify: Update (client: 10549, call:2): 0.5.88 -> 0.5.89 (ok)
Mar 1 14:24:48 ppsdb102 haclient: on_event:evt:cib_changed
Mar 1 14:24:48 ppsdb102 cib: [10550]: info: write_cib_contents: Wrote version 0.5.89 of the CIB to disk (digest: 3025b3d515c63168a77c5240516c57db)
Mar 1 14:25:18 ppsdb102 mgmtd: [7264]: info: on_set_target_role:<clone id="Clone_PPS_CACHE_1"><primitive id="PPS_CACHE_1"><instance_attributes id="PPS_CACHE_1_instance_attrs"><attributes><nvpair id="PPS_CACHE_1:0_target_role" name="target_role" value="started"/></attributes></instance_attributes></primitive></clone>
Mar 1 14:25:18 ppsdb102 cib: [7259]: info: cib_diff_notify: Update (client: 7264, call:94): 0.5.89 -> 0.5.90 (ok)
Mar 1 14:25:18 ppsdb102 cib: [10635]: info: write_cib_contents: Wrote version 0.5.90 of the CIB to disk (digest: edbe9b27df26aa27b3b3cbd6918c53f5)
Mar 1 14:25:18 ppsdb102 haclient: on_event: from message queue: evt:cib_changed
Mar 1 14:25:46 ppsdb102 crmd: [7263]: info: verify_stopped: Checking for active resources before exit
Mar 1 14:25:46 ppsdb102 crmd: [7263]: ERROR: crm_abort: get_lrm_resource: Triggered non-fatal assert at lrm.c:864 : class != NULL
Mar 1 14:25:46 ppsdb102 crmd: [7263]: ERROR: do_lrm_invoke: Invalid resource definition
Mar 1 14:25:46 ppsdb102 crmd: [7263]: WARN: log_data_element: do_lrm_invoke: Bad command <rsc_op transition_key="mgmtd-7264">
Mar 1 14:25:46 ppsdb102 crmd: [7263]: WARN: log_data_element: do_lrm_invoke: Bad command <primitive id="PPS_CACHE_1:0"/>
Mar 1 14:25:46 ppsdb102 crmd: [7263]: WARN: log_data_element: do_lrm_invoke: Bad command <attributes crm_feature_set="1.0.8"/>
Mar 1 14:25:46 ppsdb102 crmd: [7263]: WARN: log_data_element: do_lrm_invoke: Bad command </rsc_op>
Mar 1 14:25:46 ppsdb102 crmd: [7263]: info: verify_stopped: Checking for active resources before exit
Mar 1 14:25:46 ppsdb102 crmd: [7263]: info: append_restart_list: Resource Stonith:0 does not support reloads
Mar 1 14:25:46 ppsdb102 crmd: [7263]: info: do_lrm_invoke: Forcing a local LRM refresh
Mar 1 14:25:46 ppsdb102 crmd: [7263]: info: verify_stopped: Checking for active resources before exit
Mar 1 14:25:46 ppsdb102 cib: [7259]: info: cib_diff_notify: Update (client: 7263, call:94): 0.5.90 -> 0.5.91 (ok)
Mar 1 14:25:46 ppsdb102 cib: [10717]: info: write_cib_contents: Wrote version 0.5.91 of the CIB to disk (digest: fc4cfb29da8b0fd01a3979e9a07d9ec3)
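For illustration only: the rsc_op fragment that do_lrm_invoke rejects in the
log above carries a primitive with an id but no class attribute, which seems
to match the "class != NULL" assert. The sketch below is not heartbeat code;
it just reassembles the logged fragment and checks it with Python's standard
XML parser. The function name missing_primitive_attrs and the choice to check
only "class" are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# The rsc_op fragment from the "Bad command" log lines, joined into one string.
RSC_OP = """
<rsc_op transition_key="mgmtd-7264">
  <primitive id="PPS_CACHE_1:0"/>
  <attributes crm_feature_set="1.0.8"/>
</rsc_op>
"""


def missing_primitive_attrs(xml_text, required=("class",)):
    """Return {primitive id: [missing attribute names]} for an XML fragment.

    Hypothetical helper for this example; "required" lists the attributes
    we expect every <primitive> element to carry.
    """
    root = ET.fromstring(xml_text)
    problems = {}
    for prim in root.iter("primitive"):
        missing = [name for name in required if prim.get(name) is None]
        if missing:
            problems[prim.get("id")] = missing
    return problems


if __name__ == "__main__":
    # Reports the primitive from the log as lacking a class attribute.
    print(missing_primitive_attrs(RSC_OP))
```

Whether the missing class is the cause or only a symptom of the failed
operation is not something this check can tell.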
I found this thread, which in my eyes describes the same problem:
http://lists.community.tummy.com/pipermail/linux-ha/2006-October/021987.html
It later links to this patch:
http://hg.linux-ha.org/dev?cs=d0f8d4c45eab
but the assert in lrm.c is on a different line.
Is this related to my problem? Is the patch included in heartbeat 2.0.8?
I use heartbeat 2.0.8 on SLES 10, x86_64.
Kind regards,
Sebastian