[Linux-HA] CRM and STONITH questions

Andrew Beekhof beekhof at gmail.com
Fri Oct 19 04:52:07 MDT 2007


On 10/19/07, matilda matilda <matilda at grandel.de> wrote:
> >>> "Spam Filter" <Spam at citadelcomputer.com.au> 19.10.2007 04:36 >>>
> > Hi,
>
> Also hi, hi all,
>
> > Is the nvpair for clone_max and clone_node_max a HA parameter or meant
> >for my script? If HA, how do I know if I need the example settings or
> >changed for a 2 node fail over system?
>
> The stonith plugin for HAv2 has to be configured like a normal resource
> (single resource or clone resource). The configuration example in the
> wiki article uses a clone configuration. In the example you have a
> two node cluster, therefore the example states
> '<nvpair name="clone_max" value="2"/>'
> because a maximum of 2 clones is requested. Without any other configuration
> these 2 clones can run on one node if requested. But this doesn't make
> much sense if exactly this node gets crazy (has to be stonithed).
> Because of this there is the config snippet
> <nvpair name="clone_node_max" value="1"/>
> saying that on every node at most 1 clone may run.
> These two config snippets together lead to a situation (in normal
> circumstances) where exactly one stonith clone runs on every node.
> One node can shoot the other node or itself. That is NOT specified
> by this configuration.
> Short answer to your question: clone_max and clone_node_max are,
> in the end, config parameters for stonithd.
>
> > What exactly does the "monitor" do, is it just a status check as my
> > device is a webpage and passing a 'status' returns a success if it can
> > reach the website to stonith the nodes?
> The monitor action does the same as with a normal resource, checking if
> this resource is operational. If you have configured the monitor action,
> stonithd calls the external monitor plugin with the argument 'status'.
> If the external stonith plugin returns with return code 0, everything
> is fine; if it returns with something different, stonithd is assuming
> a failure of the plugin (stonith channel) and is propagating this failure
> to the deciding instance of HA (lrm->crm->pengine).

A return code of 7 means stopped, which, depending on the context, may
or may not be treated as a failure by the crm.
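
To make the return-code handling described above concrete, here is a
minimal sketch of how the 'status' action of an external stonith plugin
for a web-managed device could look. It assumes the plugin receives the
action as its first argument. DEVICE_URL, PROBE_CMD, probe_device and
plugin_main are made-up names for illustration only, and the on/off/reset
actions a real plugin needs are left out.

```shell
#!/bin/sh
# Sketch of the 'status' action of an external stonith plugin.
# DEVICE_URL and PROBE_CMD are hypothetical names, not part of heartbeat.

DEVICE_URL="${DEVICE_URL:-http://powerswitch.example.com/}"

# Probe the device. PROBE_CMD can override the check so the logic
# can be tried out without the real hardware.
probe_device() {
    if [ -n "${PROBE_CMD:-}" ]; then
        $PROBE_CMD
    else
        curl --silent --fail --max-time 10 --output /dev/null "$DEVICE_URL"
    fi
}

# Map the probe result onto the plugin's exit code:
# 0 = device reachable, non-zero = failure reported back to stonithd.
plugin_main() {
    case "$1" in
        status)
            if probe_device; then
                return 0
            fi
            return 1
            ;;
        *)
            # on/off/reset/gethosts etc. are omitted in this sketch
            return 1
            ;;
    esac
}

# Run only when invoked with an action, e.g. "./plugin status".
if [ "$#" -gt 0 ]; then
    plugin_main "$@"
    exit $?
fi
```

Running it as "PROBE_CMD=false ./plugin status; echo $?" should show the
non-zero exit path without touching the network.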

> In an error case the failcount of this stonith resource is incremented.
> Failover behaviour is the same as for normal resources.

yep - the crm (mostly) doesn't actually know it's a stonith resource


