[Linux-HA] STONITH device failure and then what?

Wojciech Turek wjt27 at cam.ac.uk
Thu Sep 27 06:56:04 MDT 2007


Dear All,

I have a two-node HA cluster. Both nodes are connected to shared
storage, and it is critical that they never mount the same LUN at the
same time. That is why I am using STONITH for node fencing. My STONITH
setup is based on an IPMI device.
I am considering the scenario in which the IPMI device on the node that
needs to be powered down fails:

Sep 27 13:10:12 storage09 heartbeat: [17161]: ERROR: STONITH device external/ipmi not operational!
Sep 27 13:10:12 storage09 heartbeat: [7438]: WARN: Exiting STONITH-stat process 17161 returned rc 1.
Sep 27 13:10:12 storage09 heartbeat: [7438]: ERROR: STONITH status operation failed.
Sep 27 13:10:12 storage09 heartbeat: [7438]: info: This may mean that the STONITH device has failed!

Is there a way to configure STONITH so that, if the status check of
the device exits with code 1, it will, for example, stop heartbeat
and wait for administrator intervention?
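
To make the behaviour I am asking about concrete, here is a minimal
Python sketch of an external watchdog. It is not an existing heartbeat
or STONITH feature. The function names and the idea of running it
outside heartbeat are my own assumptions, and the two commands are
passed in by the caller rather than being fixed.

```python
import subprocess


def fence_device_ok(status_cmd, run=subprocess.call):
    """Return True if the status command exits with code 0.

    status_cmd is an argument list supplied by the caller (hypothetical
    here); run is injectable so the logic can be tried without touching
    a real IPMI device.
    """
    return run(status_cmd) == 0


def check_and_react(status_cmd, stop_cmd, run=subprocess.call):
    """Check the fencing device once and stop the cluster software if
    the check fails.

    Returns "ok" when the status command succeeded, "stopped" when it
    failed and the stop command was run. Nothing is restarted
    automatically, so the node stays down until an administrator steps in.
    """
    if fence_device_ok(status_cmd, run):
        return "ok"
    run(stop_cmd)
    return "stopped"


# Example call (assumed commands, adjust to the local setup):
# check_and_react(
#     ["ipmitool", "-I", "lan", "-H", "<bmc-address>", "-U", "<user>",
#      "-P", "<password>", "power", "status"],
#     ["/etc/init.d/heartbeat", "stop"],
# )
```

Something like this could be run periodically from cron, but I would
prefer a way to get the same effect from the STONITH configuration
itself.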

Best Regards

Wojciech