[Linux-HA] Preventing STONITH deathmatch
Andreas Kurz
andreas.kurz at gmail.com
Tue Sep 4 06:34:06 MDT 2007
On 9/4/07, Daniel X Moore <dxm at sgi.com> wrote:
> Is there any way to rate-limit STONITH attempts? We occasionally cause a
> problem in one of our plugins that causes the status & stop actions to
> always fail. This causes both nodes to continually kill the other node
> (and themselves, I suspect).
Depending on the application and the availability you expect, you could
set on_fail="block" on the stop operation of your resource instead of
the default "fence" (which applies when STONITH is enabled), so that
heartbeat waits for manual intervention rather than fencing the node.
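
As a rough sketch, the stop operation in the CIB might look something
like this. The ids and the Dummy resource agent are placeholders I made
up for illustration, and the exact syntax may differ with your heartbeat
version, so check it against your own configuration:

```xml
<!-- Hypothetical resource; ids and agent type are placeholders -->
<primitive id="my_resource" class="ocf" provider="heartbeat" type="Dummy">
  <operations>
    <!-- A failed stop blocks the resource instead of fencing the node -->
    <op id="my_resource_stop" name="stop" timeout="60s" on_fail="block"/>
  </operations>
</primitive>
```

With this in place a failed stop should leave the resource unmanaged
until someone cleans it up by hand, so you trade availability of that
service for nodes that stay up long enough to be fixed.
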
Regards,
Andreas
>
> This "deathmatch" behaviour makes it pretty difficult to get in and
> reconfigure/fix things since the nodes are being killed almost
> immediately they come back up.
>
> Is there any way to force a delay (with associated lack of availability)
> between STONITH attempts?
>
> Constantly rebooting machines are actually less available than machines
> not running a specific service :)
>
> --
> -------------------------------------------------------------------
> Daniel Moore dxm at sgi.com
> Engineering Manager: AppMan + HA Phone: +61-3-9963-1957
> SGI Australian Software Group Mobile: +61-4-1360-4720
> -------------------------------------------------------------------