[Linux-HA] STONITH ERROR

fabiomm at br.ibm.com
Tue Jan 2 09:24:48 MST 2007


Hi There,

I have an environment that runs SLES 9 + Linux-HA 2.0.7 + Tivoli 
applications + STONITH with the lic_vps plugin, and we are running into 
some problems with STONITH.

We can start the resource groups on the correct machines and they work 
fine. STONITH is configured as a clone resource, and it starts fine on 
both nodes too, but after some time (with nothing special happening) it 
simply stops, and error messages like the following are logged to 
/var/log/messages:

Dec 27 21:43:57 lxt010001 : lic_vps_status: Preparing access for 
10.209.2.11(type lpar)
Dec 27 21:45:18 lxt010001 last message repeated 2 times
Dec 27 21:45:58 lxt010001 : lic_vps_status: Preparing access for 
10.209.2.11(type lpar)
Dec 27 21:46:38 lxt010001 : lic_vps_status: Preparing access for 
10.209.2.11(type lpar)
Dec 27 21:46:38 lxt010001 lrmd: [23692]: WARN: mapped the invalid return 
code 14.
Dec 27 21:46:38 lxt010001 crmd: [3462]: WARN: process_lrm_event:lrm.c LRM 
operation (17) monitor_40000 on STONITH:1 Error: (1) unknown error
Dec 27 21:46:40 lxt010001 crmd: [3462]: info: do_lrm_rsc_op:lrm.c 
Performing op stop on STONITH:1 (interval=0ms, 
key=3:46b74166-ad3d-4330-8308-d6beb14a6f31)
Dec 27 21:46:40 lxt010001 crmd: [3462]: WARN: process_lrm_event:lrm.c LRM 
operation (17) monitor_40000 on STONITH:1 Cancelled
Dec 27 21:46:40 lxt010001 lrmd: [23702]: info: Try to stop STONITH 
resource <rsc_id=STONITH:1> : Device=lic_vps
Dec 27 21:46:40 lxt010001 crmd: [3462]: info: process_lrm_event:lrm.c LRM 
operation (24) stop_0 on STONITH:1 complete
Dec 27 21:46:40 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size 
is 447656 bytes (was 447896)
Dec 27 21:46:40 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
Update (client: 3462, call:37): 0.21.639 -> 0.21.640 (ok)
Dec 27 21:46:41 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size 
is 449796 bytes (was 447656)
Dec 27 21:46:41 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
Update (client: 26608, call:2): 0.21.640 -> 0.21.641 (ok)
Dec 27 21:46:41 lxt010001 cib: [23704]: info: write_cib_contents:io.c 
Wrote version 0.21.641 of the CIB to disk (digest: 
0796bb2ff05cfe03fb78ca6b06b2fb7f)
Dec 27 21:46:42 lxt010001 crmd: [3462]: info: do_lrm_rsc_op:lrm.c 
Performing op start on STONITH:1 (interval=0ms, 
key=3:46b74166-ad3d-4330-8308-d6beb14a6f31)
Dec 27 21:46:42 lxt010001 lrmd: [23714]: info: Try to start STONITH 
resource <rsc_id=STONITH:1> : Device=lic_vps
Dec 27 21:46:42 lxt010001 : lic_vps_status: Preparing access for 
10.209.2.11(type lpar)
Dec 27 21:46:42 lxt010001 crmd: [3462]: WARN: process_lrm_event:lrm.c LRM 
operation (25) start_0 on STONITH:1 Error: (1) unknown error
Dec 27 21:46:42 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size 
is 452896 bytes (was 449796)
Dec 27 21:46:42 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
Update (client: 3462, call:38): 0.21.641 -> 0.21.642 (ok)
Dec 27 21:46:42 lxt010001 cib: [23716]: info: write_cib_contents:io.c 
Wrote version 0.21.642 of the CIB to disk (digest: 
d714b7d18609ce751718b1ecbf4971fe)
Dec 27 21:46:44 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size 
is 452656 bytes (was 452896)
Dec 27 21:46:44 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
Update (client: 26604, call:52): 0.21.642 -> 0.21.643 (ok)
Dec 27 21:46:45 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size 
is 452536 bytes (was 452656)
Dec 27 21:46:45 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
Update (client: 3462, call:39): 0.21.643 -> 0.21.644 (ok)
Dec 27 21:46:45 lxt010001 crmd: [3462]: info: do_lrm_rsc_op:lrm.c 
Performing op stop on STONITH:1 (interval=0ms, 
key=4:46b74166-ad3d-4330-8308-d6beb14a6f31)
Dec 27 21:46:45 lxt010001 lrmd: [23740]: info: Try to stop STONITH 
resource <rsc_id=STONITH:1> : Device=lic_vps
Dec 27 21:46:45 lxt010001 stonithd: [3460]: notice: try to stop a resource 
STONITH:1 who is not in started resource queue.
Dec 27 21:46:45 lxt010001 crmd: [3462]: info: process_lrm_event:lrm.c LRM 
operation (26) stop_0 on STONITH:1 complete
Dec 27 21:46:45 lxt010001 cib: [23739]: info: write_cib_contents:io.c 
Wrote version 0.21.644 of the CIB to disk (digest: 
62b3c354980bae9dce012f59f5c21fcc)
Dec 27 21:46:45 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size 
is 455636 bytes (was 452536)
Dec 27 21:46:45 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
Update (client: 26604, call:54): 0.21.644 -> 0.21.645 (ok)
Dec 27 21:46:45 lxt010001 cib: [3458]: WARN: G_SIG_dispatch: Dispatch 
function for SIGCHLD was delayed 120 ms (> 100 ms) before being called 
(GSource: 0x800248e0)
Dec 27 21:46:45 lxt010001 cib: [3458]: info: G_SIG_dispatch: started at 
116859506 should have started at 116859494
Dec 27 21:46:45 lxt010001 cib: [23742]: info: write_cib_contents:io.c 
Wrote version 0.21.645 of the CIB to disk (digest: 
73d9e1bfb526ba99af482f073e3a8d94)
Dec 27 21:46:47 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size 
is 455516 bytes (was 455636)
Dec 27 21:46:47 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
Update (client: 3462, call:40): 0.21.645 -> 0.21.646 (ok)
Dec 27 21:46:47 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size 
is 455396 bytes (was 455516)
Dec 27 21:46:47 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
Update (client: 26604, call:55): 0.21.646 -> 0.21.647 (ok)
Dec 27 21:46:47 lxt010001 cib: [3458]: info: activateCibXml:io.c CIB size 
is 455276 bytes (was 455396)
Dec 27 21:46:47 lxt010001 cib: [3458]: info: cib_diff_notify:notify.c 
Update (client: 26604, call:57): 0.21.647 -> 0.21.648 (ok)
Dec 27 21:46:47 lxt010001 cib: [23760]: info: write_cib_contents:io.c 
Wrote version 0.21.648 of the CIB to disk (digest: 
64ca859ff5e6aac4ce298b62c9ea3acf)

After that, STONITH simply ends up stopped on both nodes.
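
The WARN entries in the log above are easy to lose among the cib 
updates. A minimal sketch for pulling them out of /var/log/messages 
follows. It assumes the usual syslog timestamp at the start of each 
entry and treats any line without one as a wrapped continuation of the 
previous entry. The function names join_wrapped and warnings_only are 
hypothetical helpers for illustration, not part of Heartbeat.

```python
import re

# Assumption: each syslog entry starts with a timestamp like "Dec 27 21:46:38".
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s")


def join_wrapped(lines):
    """Merge continuation lines (no leading timestamp) into the previous entry."""
    entries = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        if TIMESTAMP.match(line) or not entries:
            entries.append(text)
        else:
            entries[-1] = entries[-1] + " " + text
    return entries


def warnings_only(lines):
    """Return only the entries logged at WARN level."""
    return [entry for entry in join_wrapped(lines) if " WARN: " in entry]
```

Passing an open file object for /var/log/messages to warnings_only() 
and printing each returned entry should leave only the lrmd, crmd and 
cib warnings, which makes the failing monitor and start operations on 
STONITH:1 easier to spot.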

Does anyone have any idea what could be causing this problem?

Best Regards,
Fabio Martins



Fábio Martins
SW Support Specialist
IBM Brazil - CAC/SW Support
Mailto: fabiomm at br.ibm.com 
Visit us: www.ibm.com.br
