[Linux-HA] Question regarding the stonith resources
peinkofe at fhm.edu
Mon Oct 31 08:42:26 MST 2005
On Mon, Oct 31, 2005 at 08:16:15AM -0700, Alan Robertson wrote:
> Peter Kruse wrote:
> > Hello,
> > Andrew Beekhof wrote:
> >> On 10/30/05, peinkofe at fhm.edu <peinkofe at fhm.edu> wrote:
> >>> I use two wti_nps stonith devices, kill_sarek and kill_spock. So in
> >>> my cib.xml I have two stonith resources. Additionally I have two
> >>> constraints which say that kill_sarek can only run on spock and
> >>> kill_spock can only run on sarek.
> >>> In a recent mail on the list Peter Kruse said he uses one apc
> >>> powerswitch and have the stonith resources configured as clones.
> >>> I noticed recently that sometimes, nodes want to stonith themselves,
> >>> which of course doesn't work in my configuration.
> > It would be very helpful if there were supported examples
> > on the webpage for this situation. Even the apc powerswitches
> > don't have to be reachable on the whole network; it sometimes
> > is desirable to connect them with a crossover cable directly.
> > This would result in the same situation:
> > node A can _only_ shoot node B and vice versa.
> > Meaning a node _cannot_ shoot itself, and asking to do so
> > will result in a failure, so it would be better not to ask
> > it to do it in the first place.
> >> if both nodes are up then the stonithd is supposed to relay the
> >> request to the other node - so even in your configuration, spock can
> >> ask for itself to be shot.
> >> if spock is all alone - then you are correct - there is no-one to
> >> shoot it.
> >>> So I wondered if such a clone configuration would work with my
> >>> wti_nps stonith devices. I think it cannot work because the wti's
> >>> allow only one simultaneous telnet connection. If a second
> >>> connection is established, it returns immediately with an error. So
> >>> running more than one instance of the stonith resource may not
> >>> work, because if the nodes try to perform an operation at the same
> >>> time, it fails on one node.
> > That will probably not work, I will probably also change
> > my configuration to not use clones but normal primitives
> > with an infinite constraint. That should work, shouldn't it?
> > But I haven't figured out yet how to tell the resource
> > that it only can shoot the _other_ node...
> That depends on how you configure the wti_nps device. Some devices know
> what names go with the outlets (in the device firmware), and some have
> to be told through configuring the plugin.
> In the first case, you need to configure the outlet names correctly in
> the device. In the second case, you need to configure the plugin itself.
> The WTI NPS switch appears to be the first type.
> /* The status command output contains mapping of hosts to outlets */
> SEND(nps->wrfd, "/s\r");
> So, it sends a /s (status?) command to the device, and the device tells
> it what outlets are hooked to what hosts.
From the configuration point of view, the clone configuration would work.
The problem is that simultaneous connections to the wti_nps device are not supported. That would be no problem if the wti_nps device behaved like a synchronisation object (blocking until the first connection is finished), but what it actually does is immediately return a connection failure to the second connection.
So since operations are not (and are not supposed to be) synchronised, a simultaneous monitor operation, for example, would probably fail on one node,
because the wti_nps plugin would only see a connection error.
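For reference, the non-clone alternative could look roughly like this in a heartbeat 2 style cib.xml. This is only a sketch: the IDs, the ipaddr parameter, and its value are made up for illustration, not taken from my real configuration. The rsc_location rule pins kill_spock to sarek by giving it a -INFINITY score wherever the node name is not "sarek":

```xml
<!-- Sketch only: IDs and parameter names/values are illustrative,
     following the heartbeat 2 (CRM) cib.xml schema. -->
<primitive id="kill_spock" class="stonith" type="wti_nps" provider="heartbeat">
  <instance_attributes id="kill_spock_attrs">
    <attributes>
      <nvpair id="kill_spock_ip" name="ipaddr" value="192.168.1.10"/>
    </attributes>
  </instance_attributes>
</primitive>

<rsc_location id="loc_kill_spock" rsc="kill_spock">
  <rule id="loc_kill_spock_rule" score="-INFINITY">
    <expression id="loc_kill_spock_expr"
                attribute="#uname" operation="ne" value="sarek"/>
  </rule>
</rsc_location>
```

A matching kill_sarek primitive would get the mirror-image constraint pinning it to spock.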
But I like Andrew's idea of using an additional external stonith plugin to commit suicide, maybe by triggering a kernel panic or something.
So I would have four stonith devices: kill_spock, kill_sarek, suicide_spock, and suicide_sarek.
kill_spock and suicide_sarek can only run on sarek;
kill_sarek and suicide_spock can only run on spock.
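The panic step of such a suicide plugin could be a tiny shell script. The sketch below is purely illustrative and is not the full external stonith plugin interface (which dispatches on commands like reset/on/off passed as arguments); it only shows the "shoot myself" action via the magic SysRq interface, assuming a kernel built with CONFIG_MAGIC_SYSRQ. The DRY_RUN guard is my addition so the sketch can be exercised without actually crashing the box:

```shell
#!/bin/sh
# Hypothetical sketch of the "shoot myself" step for a suicide stonith
# plugin: force an immediate kernel panic via magic SysRq.
# With DRY_RUN=1 (the default here, for safety) the actions are only
# printed instead of performed.

DRY_RUN=${DRY_RUN:-1}

suicide() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would: echo 1 > /proc/sys/kernel/sysrq"
        echo "would: echo c > /proc/sysrq-trigger"
    else
        echo 1 > /proc/sys/kernel/sysrq   # make sure SysRq is enabled
        echo c > /proc/sysrq-trigger      # 'c' crashes the kernel at once
    fi
}

suicide
```

Because the node never returns from a real panic, the cluster would still need the peer's kill_* device to confirm the node is down before taking over resources.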
> Alan Robertson <alanr at unix.sh>
> "Openness is the foundation and preservative of friendship... Let me
> claim from you at all times your undisguised opinions." - William
> Linux-HA mailing list
> Linux-HA at lists.linux-ha.org
> See also: http://linux-ha.org/ReportingProblems