[Linux-HA] STONITH ERROR

fabiomm at br.ibm.com fabiomm at br.ibm.com
Tue Jan 2 12:56:02 MST 2007


Hi there,

This is a follow-up on the same problem. It happened again, as follows:

Dec 29 21:23:48 lxt010001 : lic_vps_status: Preparing access for 
10.209.2.11(type lpar)
Dec 29 21:23:48 lxt010001 lrmd: [28131]: WARN: mapped the invalid return 
code 14.
Dec 29 21:23:48 lxt010001 crmd: [5216]: WARN: process_lrm_event:lrm.c LRM 
operation (14) monitor_40000 on STONITH:0 Error: (1) unknown error
Dec 29 21:23:50 lxt010001 lrmd: [28150]: info: Try to stop STONITH 
resource <rsc_id=STONITH:0> : Device=lic_vps
Dec 29 21:23:50 lxt010001 crmd: [5216]: info: do_lrm_rsc_op:lrm.c 
Performing op stop on STONITH:0 (interval=0ms, 
key=7:cccc8d48-5f0f-4fa1-8b05-ff9d77eb9ccf)
Dec 29 21:23:50 lxt010001 crmd: [5216]: WARN: process_lrm_event:lrm.c LRM 
operation (14) monitor_40000 on STONITH:0 Cancelled
Dec 29 21:23:50 lxt010001 crmd: [5216]: info: process_lrm_event:lrm.c LRM 
operation (24) stop_0 on STONITH:0 complete
Dec 29 21:23:50 lxt010001 cib: [5212]: info: activateCibXml:io.c CIB size 
is 489616 bytes (was 489856)
Dec 29 21:23:50 lxt010001 cib: [5212]: info: cib_diff_notify:notify.c 
Update (client: 5216, call:35): 0.24.730 -> 0.24.731 (ok)
Dec 29 21:23:50 lxt010001 cib: [28152]: info: write_cib_contents:io.c 
Wrote version 0.24.731 of the CIB to disk (digest: 
000bb0e9a0793743a1cf6c3462d74241)
Dec 29 21:23:52 lxt010001 crmd: [5216]: info: do_lrm_rsc_op:lrm.c 
Performing op start on STONITH:0 (interval=0ms, 
key=7:cccc8d48-5f0f-4fa1-8b05-ff9d77eb9ccf)
Dec 29 21:23:52 lxt010001 lrmd: [28153]: info: Try to start STONITH 
resource <rsc_id=STONITH:0> : Device=lic_vps
Dec 29 21:23:52 lxt010001 : lic_vps_status: Preparing access for 
10.209.2.11(type lpar)
Dec 29 21:23:52 lxt010001 crmd: [5216]: WARN: process_lrm_event:lrm.c LRM 
operation (25) start_0 on STONITH:0 Error: (1) unknown error
Dec 29 21:23:52 lxt010001 cib: [5212]: info: activateCibXml:io.c CIB size 
is 492716 bytes (was 489616)
Dec 29 21:23:52 lxt010001 cib: [5212]: info: cib_diff_notify:notify.c 
Update (client: 5216, call:36): 0.24.731 -> 0.24.732 (ok)
Dec 29 21:23:52 lxt010001 cib: [28155]: info: write_cib_contents:io.c 
Wrote version 0.24.732 of the CIB to disk (digest: 
e22089f51856e529b67127f3f93cb2f1)
Dec 29 21:23:54 lxt010001 crmd: [5216]: info: do_lrm_rsc_op:lrm.c 
Performing op stop on STONITH:0 (interval=0ms, 
key=8:cccc8d48-5f0f-4fa1-8b05-ff9d77eb9ccf)
Dec 29 21:23:54 lxt010001 lrmd: [28195]: info: Try to stop STONITH 
resource <rsc_id=STONITH:0> : Device=lic_vps
Dec 29 21:23:54 lxt010001 stonithd: [5214]: notice: try to stop a resource 
STONITH:0 who is not in started resource queue.
Dec 29 21:23:54 lxt010001 crmd: [5216]: info: process_lrm_event:lrm.c LRM 
operation (26) stop_0 on STONITH:0 complete
Dec 29 21:23:54 lxt010001 cib: [5212]: info: activateCibXml:io.c CIB size 
is 492596 bytes (was 492716)
Dec 29 21:23:54 lxt010001 cib: [5212]: info: cib_diff_notify:notify.c 
Update (client: 5216, call:37): 0.24.732 -> 0.24.733 (ok)
Dec 29 21:23:54 lxt010001 cib: [28197]: info: write_cib_contents:io.c 
Wrote version 0.24.733 of the CIB to disk (digest: 
675a4a1885f65dcc44f46201a91c9c41)
Dec 29 21:23:57 lxt010001 cib: [5212]: info: activateCibXml:io.c CIB size 
is 492476 bytes (was 492596)
Dec 29 21:23:57 lxt010001 cib: [5212]: info: cib_diff_notify:notify.c 
Update (client: 5216, call:38): 0.24.733 -> 0.24.734 (ok)
Dec 29 21:23:57 lxt010001 cib: [28198]: info: write_cib_contents:io.c 
Wrote version 0.24.734 of the CIB to disk (digest: 
4de1cd51869df91701ad1f3a9660cbe2)
Dec 29 21:30:01 lxt010001 /USR/SBIN/CRON[30255]: (root) CMD ( 
/usr/lib/sa/sa1) 
Dec 29 21:30:07 lxt010001 cib: [5212]: info: cib_stats:main.c Processed 13 
operations (0.00us average, 0% utilization) in the last 10min
Dec 29 21:40:02 lxt010001 /USR/SBIN/CRON[1212]: (root) CMD ( 
/usr/lib/sa/sa1) 
Dec 29 21:50:01 lxt010001 /USR/SBIN/CRON[4600]: (root) CMD ( 
/usr/lib/sa/sa1) 
Dec 29 21:59:01 lxt010001 /USR/SBIN/CRON[7710]: (root) CMD ( rm -f 
/var/spool/cron/lastrun/cron.hourly) 
Dec 29 22:00:01 lxt010001 /USR/SBIN/CRON[8053]: (root) CMD ( 
/usr/lib/sa/sa1) 
Dec 29 22:10:01 lxt010001 /USR/SBIN/CRON[11453]: (root) CMD ( 
/usr/lib/sa/sa1) 
Dec 29 22:20:01 lxt010001 /USR/SBIN/CRON[14869]: (root) CMD ( 
/usr/lib/sa/sa1) 
Dec 29 22:30:01 lxt010001 /USR/SBIN/CRON[18245]: (root) CMD ( 
/usr/lib/sa/sa1) 
Dec 29 22:40:01 lxt010001 /USR/SBIN/CRON[21659]: (root) CMD ( 
/usr/lib/sa/sa1) 
Dec 29 22:49:39 lxt010001 rcd[31222]: Running heartbeat at Fri Dec 29 
22:49:39 2006 
Dec 29 22:49:40 lxt010001 rcd[31222]: Memory limit reached, restarting
Dec 29 22:49:40 lxt010001 rcd[31222]: Shutting down daemon...
Dec 29 22:49:40 lxt010001 rcd[31222]: Shutting down local server
Dec 29 22:49:40 lxt010001 rcd[31222]: Shutting down remote server
Dec 29 22:49:40 lxt010001 rcd[24945]: Red Carpet Daemon 2.4.9
Dec 29 22:49:40 lxt010001 rcd[24945]: Copyright (C) 2000-2003 Ximian Inc.
Dec 29 22:49:40 lxt010001 rcd[24945]: Start time: Fri Dec 29 22:49:40 2006
Dec 29 22:49:40 lxt010001 rcd[24945]: Initializing RPC system
Dec 29 22:49:40 lxt010001 rcd[24945]: Initializing modules
Dec 29 22:49:40 lxt010001 rcd[24945]: [rcd.serverpoll] Starting 
server-poll
Dec 29 22:49:40 lxt010001 rcd[24945]: Starting local server
Dec 29 22:49:40 lxt010001 rcd[24945]: Starting remote server
Dec 29 22:49:42 lxt010001 rcd[24945]: Loading system packages
Dec 29 22:49:43 lxt010001 rcd[24945]: Done loading system packages
Dec 29 22:49:50 lxt010001 rcd[24945]: id=1 COMPLETE 'Downloading 
https://update.novell.com/data/serviceinfo.xml' time=6s (failed)
Dec 29 22:49:50 lxt010001 rcd[24945]: Unable to download service info: 
Can't connect - Cannot connect to destination 
(https://update.novell.com/data/serviceinfo.xml)
Dec 29 22:49:50 lxt010001 rcd[24945]: Unable to load service for default 
host URL 'https://update.novell.com/data': Unable to download service 
info: Can't connect - Cannot connect to destination 
(https://update.novell.com/data/serviceinfo.xml)
Dec 29 22:49:50 lxt010001 rcd[24945]: Can't find rcd 1.x subscription file 
'/var/lib/redcarpet/subscriptions.xml'
Dec 29 22:49:50 lxt010001 rcd[24945]: Starting heartbeat
Dec 29 22:50:01 lxt010001 /USR/SBIN/CRON[25077]: (root) CMD ( 
/usr/lib/sa/sa1) 
Dec 29 22:59:01 lxt010001 /USR/SBIN/CRON[28109]: (root) CMD ( rm -f 
/var/spool/cron/lastrun/cron.hourly) 
Dec 29 23:00:01 lxt010001 /USR/SBIN/CRON[28460]: (root) CMD ( 
/usr/lib/sa/sa1) 
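
For anyone who wants to pull only the failed operations out of a log like the one above, here is a minimal Python sketch. It assumes the crmd message format seen in this log ("LRM operation (N) <op> on <resource> <result>") and that the mail client wrapped long records onto continuation lines. The names unwrap, stonith_failures, RECORD_START and LRM_EVENT are made up for this example and are not part of Heartbeat.

```python
import re

# A syslog record starts with a timestamp such as "Dec 29 21:23:48".
RECORD_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")
# Assumed crmd format: "LRM operation (N) <op> on <resource> <result>".
LRM_EVENT = re.compile(r"LRM operation \((\d+)\) (\S+) on (\S+) (.+)$")


def unwrap(lines):
    """Join mail-wrapped continuation lines back onto their record."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # No timestamp: treat it as the tail of the previous record.
            records[-1] += " " + line
    return records


def stonith_failures(lines):
    """Return (timestamp, op, resource, result) for LRM operations that report an error."""
    failures = []
    for record in unwrap(lines):
        match = LRM_EVENT.search(record)
        if match and "Error" in match.group(4):
            # The first 15 characters are the syslog timestamp.
            failures.append(
                (record[:15], match.group(2), match.group(3), match.group(4))
            )
    return failures


# Three records copied from the log above, still wrapped as in the mail.
sample = """\
Dec 29 21:23:48 lxt010001 crmd: [5216]: WARN: process_lrm_event:lrm.c LRM
operation (14) monitor_40000 on STONITH:0 Error: (1) unknown error
Dec 29 21:23:50 lxt010001 crmd: [5216]: info: process_lrm_event:lrm.c LRM
operation (24) stop_0 on STONITH:0 complete
Dec 29 21:23:52 lxt010001 crmd: [5216]: WARN: process_lrm_event:lrm.c LRM
operation (25) start_0 on STONITH:0 Error: (1) unknown error
""".splitlines()

for timestamp, op, resource, result in stonith_failures(sample):
    print(timestamp, op, resource, result)
```

Run against the full /var/log/messages, this should leave only the monitor and start failures on the STONITH clone, which makes it easier to line them up with the lic_vps_status messages logged just before each one.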

Any ideas?

Best Regards,



Fábio Martins
SW Support Specialist
IBM Brazil - CAC/SW Support
Mobile: +55-11-8564-0862
Phone: +55-11-2132-2547
Mailto: fabiomm at br.ibm.com 
Visit us: www.ibm.com.br



linux-ha-bounces at lists.linux-ha.org wrote on 02/01/2007 14:24:48:

> Hi There,
> 
> I have an environment that runs SLES 9 + Linux-HA 2.0.7 + Tivoli 
> Applications + STONITH with the lic_vps plugin, and we are running into 
> some problems with STONITH.
> 
> We can start the resource groups on the correct machines and they work
> fine. STONITH is configured as a clone resource, and it starts fine on
> both nodes too, but after some time (with nothing in particular
> happening), it simply stops and error messages are logged to
> /var/log/messages, as follows:
> 
> Dec 27 21:43:57 lxt010001 : lic_vps_status: Preparing access for 
> 10.209.2.11(type lpar)
> Dec 27 21:45:18 lxt010001 last message repeated 2 times
> Dec 27 21:45:58 lxt010001 : lic_vps_status: Preparing access for 
> 10.209.2.11(type lpar)
> Dec 27 21:46:38 lxt010001 : lic_vps_status: Preparing access for 
> 10.209.2.11(type lpar)
> Dec 27 21:46:38 lxt010001 lrmd: [23692]: WARN: mapped the invalid return
> code 14.
> Dec 27 21:46:38 lxt010001 crmd: [3462]: WARN: process_lrm_event:lrm.c LRM
> operation (17) monitor_40000 on STONITH:1 Error: (1) unknown error
> Dec 27 21:46:40 lxt010001 crmd: [3462]: info: do_lrm_rsc_op:lrm.c 
> Performing op stop on STONITH:1 (interval=0ms, 
> key=3:46b74166-ad3d-4330-8308-d6beb14a6f31)
> Dec 27 21:46:40 lxt010001 crmd: [3462]: WARN: process_lrm_event:lrm.c LRM
> operation (17) monitor_40000 on STONITH:1 Cancelled
> Dec 27 21:46:40 lxt010001 lrmd: [23702]: info: Try to stop STONITH 
> resource <rsc_id=STONITH:1> : Device=lic_vps
> Dec 27 21:46:40 lxt010001 crmd: [3462]: info: process_lrm_event:lrm.c LRM
> operation (24) stop_0 on STONITH:1 complete
> Dec 27 21:46:40 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size
> is 447656 bytes (was 447896)
> Dec 27 21:46:40 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
> Update (client: 3462, call:37): 0.21.639 -> 0.21.640 (ok)
> Dec 27 21:46:41 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size
> is 449796 bytes (was 447656)
> Dec 27 21:46:41 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
> Update (client: 26608, call:2): 0.21.640 -> 0.21.641 (ok)
> Dec 27 21:46:41 lxt010001 cib: [23704]: info: write_cib_contents:io.c 
> Wrote version 0.21.641 of the CIB to disk (digest: 
> 0796bb2ff05cfe03fb78ca6b06b2fb7f)
> Dec 27 21:46:42 lxt010001 crmd: [3462]: info: do_lrm_rsc_op:lrm.c 
> Performing op start on STONITH:1 (interval=0ms, 
> key=3:46b74166-ad3d-4330-8308-d6beb14a6f31)
> Dec 27 21:46:42 lxt010001 lrmd: [23714]: info: Try to start STONITH 
> resource <rsc_id=STONITH:1> : Device=lic_vps
> Dec 27 21:46:42 lxt010001 : lic_vps_status: Preparing access for 
> 10.209.2.11(type lpar)
> Dec 27 21:46:42 lxt010001 crmd: [3462]: WARN: process_lrm_event:lrm.c LRM
> operation (25) start_0 on STONITH:1 Error: (1) unknown error
> Dec 27 21:46:42 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size
> is 452896 bytes (was 449796)
> Dec 27 21:46:42 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
> Update (client: 3462, call:38): 0.21.641 -> 0.21.642 (ok)
> Dec 27 21:46:42 lxt010001 cib: [23716]: info: write_cib_contents:io.c 
> Wrote version 0.21.642 of the CIB to disk (digest: 
> d714b7d18609ce751718b1ecbf4971fe)
> Dec 27 21:46:44 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size
> is 452656 bytes (was 452896)
> Dec 27 21:46:44 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
> Update (client: 26604, call:52): 0.21.642 -> 0.21.643 (ok)
> Dec 27 21:46:45 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size
> is 452536 bytes (was 452656)
> Dec 27 21:46:45 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
> Update (client: 3462, call:39): 0.21.643 -> 0.21.644 (ok)
> Dec 27 21:46:45 lxt010001 crmd: [3462]: info: do_lrm_rsc_op:lrm.c 
> Performing op stop on STONITH:1 (interval=0ms, 
> key=4:46b74166-ad3d-4330-8308-d6beb14a6f31)
> Dec 27 21:46:45 lxt010001 lrmd: [23740]: info: Try to stop STONITH 
> resource <rsc_id=STONITH:1> : Device=lic_vps
> Dec 27 21:46:45 lxt010001 stonithd: [3460]: notice: try to stop a resource
> STONITH:1 who is not in started resource queue.
> Dec 27 21:46:45 lxt010001 crmd: [3462]: info: process_lrm_event:lrm.c LRM
> operation (26) stop_0 on STONITH:1 complete
> Dec 27 21:46:45 lxt010001 cib: [23739]: info: write_cib_contents:io.c 
> Wrote version 0.21.644 of the CIB to disk (digest: 
> 62b3c354980bae9dce012f59f5c21fcc)
> Dec 27 21:46:45 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size
> is 455636 bytes (was 452536)
> Dec 27 21:46:45 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
> Update (client: 26604, call:54): 0.21.644 -> 0.21.645 (ok)
> Dec 27 21:46:45 lxt010001 cib: [3458]: WARN: G_SIG_dispatch: Dispatch 
> function for SIGCHLD was delayed 120 ms (> 100 ms) before being called 
> (GSource: 0x800248e0)
> Dec 27 21:46:45 lxt010001 cib: [3458]: info: G_SIG_dispatch: started at 
> 116859506 should have started at 116859494
> Dec 27 21:46:45 lxt010001 cib: [23742]: info: write_cib_contents:io.c 
> Wrote version 0.21.645 of the CIB to disk (digest: 
> 73d9e1bfb526ba99af482f073e3a8d94)
> Dec 27 21:46:47 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size
> is 455516 bytes (was 455636)
> Dec 27 21:46:47 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
> Update (client: 3462, call:40): 0.21.645 -> 0.21.646 (ok)
> Dec 27 21:46:47 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size
> is 455396 bytes (was 455516)
> Dec 27 21:46:47 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
> Update (client: 26604, call:55): 0.21.646 -> 0.21.647 (ok)
> Dec 27 21:46:47 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size
> is 455276 bytes (was 455396)
> Dec 27 21:46:47 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
> Update (client: 26604, call:57): 0.21.647 -> 0.21.648 (ok)
> Dec 27 21:46:47 lxt010001 cib: [23760]: info: write_cib_contents:io.c 
> Wrote version 0.21.648 of the CIB to disk (digest: 
> 64ca859ff5e6aac4ce298b62c9ea3acf)
> 
> After that, STONITH simply stays stopped on both nodes.
> 
> Does anyone have any idea about this problem?
> 
> Best Regards,
> Fabio Martins
> 
> 
> 
> Fábio Martins
> SW Support Specialist
> IBM Brazil - CAC/SW Support
> Mailto: fabiomm at br.ibm.com 
> Visit us: www.ibm.com.br
> 
> _______________________________________________
> Linux-HA mailing list
> Linux-HA at lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems