[Linux-HA] Question regarding the stonith resources
Andrew Beekhof
beekhof at gmail.com
Mon Oct 31 02:53:24 MST 2005
On 10/31/05, peinkofe at fhm.edu <peinkofe at fhm.edu> wrote:
> Hello Andrew,
> On Mon, Oct 31, 2005 at 09:37:18AM +0100, Andrew Beekhof wrote:
> > On 10/30/05, peinkofe at fhm.edu <peinkofe at fhm.edu> wrote:
> > > Hello everybody,
> > >
> > > I use two wti_nps stonith devices, kill_sarek and kill_spock. So in my cib.xml I have two stonith resources. Additionally I have two constraints which say that kill_sarek can only run on spock and kill_spock can only run on sarek.
> > >
> > > In a recent mail on the list Peter Kruse said he uses one apc powerswitch and has the stonith resources configured as clones.
> > >
> > > I noticed recently that nodes sometimes want to stonith themselves, which of course doesn't work in my configuration.
> >
> > if both nodes are up then the stonithd is supposed to relay the
> > request to the other node - so even in your configuration, spock can
> > ask for itself to be shot.
> >
> > if spock is all alone - then you are correct - there is no-one to shoot it.
> >
> I think I remember that there is a stonith device which calls for human intervention. Would it be possible to somehow define: try to kill a node with the wti_nps stonith resource, and if wti_nps is not started (or wti_nps failed), use the human intervention resource?
sounds like a plausible thing to want to do - no idea if it's possible though.
be aware that the value of transition_timeout will be relevant -
because this is how long you have for human intervention before we'll
try again.
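
A rough sketch of the fallback being asked for, written in Python rather than as anything stonithd actually does: try each fencing method in order and stop at the first one that succeeds. The function name and the device callables are hypothetical, and whether the cluster can be configured to behave like this is exactly the open question above.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical type: a fencing method is a name plus a callable that
# returns True if the target node was confirmed shot.
FenceMethod = Tuple[str, Callable[[str], bool]]


def fence_with_fallback(node: str, methods: Sequence[FenceMethod]) -> Optional[str]:
    """Try each method in order; return the name of the first that succeeds."""
    for name, shoot in methods:
        try:
            if shoot(node):
                return name
        except Exception:
            # Treat a device that is not started or not reachable as a
            # failure and move on to the next method.
            continue
    # Every method failed or raised.
    return None
```

With a wti_nps method listed first and a human-intervention method second, the second is only reached when the first returns False or raises.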
> > >
> > > So I wondered if such a clone configuration would work with my wti_nps stonith devices. I think it cannot work because the wti's allow only one simultaneous telnet connection. If a second connection is established, it returns immediately with an error. So running more than one instance of the stonith resources may not work, because if the nodes try to perform an operation at the same time it fails on one node.
> >
> > I can't really comment on this - I don't know how the plugins work
> >
> > > After thinking about it for a while, I had the idea of weakening the location constraints by giving them a finite negative score instead of -INFINITY
> >
> > Any node with a score less than 0 won't be able to run the resource.
> > We use -INFINITY so that it is impossible for the node to end up with
> > a positive score.
> > So I don't think it will have any effect here.
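
To illustrate the point about scores quoted above, here is a minimal sketch. It assumes the scores from all constraints on a node are simply added and that -INFINITY absorbs whatever it is added to. The names are made up for the example and this is not the CRM's code.

```python
NEG_INFINITY = float("-inf")


def combine(scores):
    """Add up a node's scores; -INFINITY wins over everything else."""
    total = 0.0
    for score in scores:
        if score == NEG_INFINITY:
            # Once -INFINITY is present, no other score can lift the total.
            return NEG_INFINITY
        total += score
    return total


def can_run(scores):
    """A node with a combined score below 0 cannot run the resource."""
    return combine(scores) >= 0
```

A finite negative score still blocks the node on its own, for example can_run([-100]) is False, but unlike -INFINITY it can be outweighed by a larger positive score from another constraint, for example can_run([-100, 500]) is True.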
> >
> > > and set the start_prereq to fencing.
> >
> > I'm not quite sure what you're getting at here... for which resources
> > are you doing this?
> >
> I wanted to set start_prereq to fencing on the fencing resources themselves, to make sure that only one host can run the resource at one time. But in some cases this is a chicken-and-egg problem.
start_prereq=fencing means don't start the resource until you've
completed (successfully) all required fencing operations (which will
fail unless the fencing resource is active).
so unless both nodes are up... this is _always_ a chicken-egg problem
and therefore unlikely to be what you're after ;-)
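
The circular dependency can be sketched as a toy model. It assumes pending fencing can only complete while the fencing resource is already running somewhere, and that a resource with start_prereq=fencing may only start once nothing is pending. The function and argument names are hypothetical, not part of the CRM.

```python
def prereq_can_be_met(resource, fencing_resource, active, pending_fencing):
    """Return True if the fencing prerequisite for `resource` can be satisfied.

    active          -- set of resource names currently running
    pending_fencing -- list of nodes that still have to be shot
    """
    if not pending_fencing:
        # Nothing to fence, so the prerequisite is trivially met.
        return True
    # The resource we are trying to start is by definition not running yet.
    running = set(active) - {resource}
    # Pending fencing can only complete if the fencing resource is running.
    return fencing_resource in running
```

When the resource being started is the fencing resource itself and a node still has to be shot, the function returns False: the fencing needs the resource, and the resource needs the fencing.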
it might be a situation to use the external stonith plugin to call
halt/reboot/shutdown/...
because if a node is healthy enough to be DC and ordering itself shot,
then these calls are also likely to be functional.
just make sure stonithd knows each node can only shoot itself this way.