[Linux-HA] Stonith agent with IPMI

Cavailles, Philippe philippe.cavailles at egis.fr
Thu Oct 11 01:25:51 MDT 2007



Dear all,
I am using HAv2 (2.1.2-4) with Red Hat Enterprise Linux ES 4 on Dell PowerEdge servers, which implement an
IPMI interface. I want to configure an IPMI STONITH agent that is able to reboot a failed node, but I am facing
some configuration problems.

In CIB "crm_config" part I have set "stonith-enabled" and "always-stonith-failed-nodes" parameter to "true" as
specified in http://www.linux-ha.org/NodeFencingImplementation, chapter "STONITHing failed nodes in general".

----------------------------------------------------------------------
<crm_config>
 <cluster_property_set id="cluster_property_set">
  <attributes>
  <nvpair id="transition-idle-timer" name="transition-idle-timer" value="60s"/>
  <nvpair id="symetric-cluster" name="symetric-cluster" value="true"/>
  <nvpair id="stonith-enabled" name="stonith-enabled" value="true"/>
  <nvpair id="no-quorum-policy" name="no-quorum-policy" value="stop"/>
  <nvpair id="default-resource-stickiness" name="default-resource-stickiness" value="INFINITY"/>
  <nvpair id="is-managed-default" name="is-managed-default" value="true"/>
  <nvpair id="stop-orphan-resources" name="stop-orphan-resources" value="true"/>
  <nvpair id="stop-orphan-actions" name="stop-orphan-actions" value="true"/>
  <nvpair id="always-stonith-failed-nodes" name="always-stonith-failed-nodes" value="true"/>
  </attributes>
  </cluster_property_set>
</crm_config>
----------------------------------------------------------------------
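
For what it is worth, these properties can be read back from the live CIB with
crm_attribute (a sketch, assuming the standard heartbeat 2.1 tooling; option
names may differ slightly between releases):

----------------------------------------------------------------------
# Read the stonith-enabled property back from the running cluster.
crm_attribute --type crm_config --attr-name stonith-enabled --get-value
----------------------------------------------------------------------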

I am using the external stonith plugin provided by Martin (http://lists.community.tummy.com/pipermail/linux-ha-dev/2007-February/013989.html)
and I have created two resources with location constraints, so that each stonith resource runs on the node
opposite the one it fences.
The plugin seems to work correctly: I am able to check the power status of each node (the monitor action is
executed periodically). However, when a node becomes "offline", heartbeat does not ask the stonith plugin to
stop and restart the failed node.
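
Since the monitor action succeeds, the BMCs are clearly reachable; for
reference, they can also be checked outside the cluster, along with the plugin
itself (a sketch: the "lan" interface and the stonith(8) name=value parameter
passing are assumptions, credentials as in the configuration below):

----------------------------------------------------------------------
# Query the BMC of cluster1 directly over IPMI-over-LAN.
ipmitool -I lan -H 192.168.31.220 -U root -P xxxxx chassis power status

# Exercise the stonith plugin itself with heartbeat's stonith(8) test
# tool, passing the plugin parameters as name=value pairs.
stonith -t external/ipmi_stonith hostname=cluster1 \
        ipaddr=192.168.31.220 userid=root passwd=xxxxx -S
----------------------------------------------------------------------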

----------------------------------------------------------------------
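<!-- Fences node cluster1 (BMC at 192.168.31.220); the constraint below pins this resource to cluster2 -->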
<primitive id="cluster1-stonith" class="stonith" provider="heartbeat" type="external/ipmi_stonith" resource_stickiness="INFINITY">
 <operations>
  <op id="op-cluster1-stonith-1" name="stop" timeout="60s"/>
  <op id="op-cluster1-stonith-2" name="start" timeout="30s"/>
  <op id="op-cluster1-stonith-3" name="monitor" timeout="5s" interval="10s"/>
 </operations>
 <instance_attributes id="c4f5a286-5dca-459e-90d6-ac649068b97f">
  <attributes>
    <nvpair id="ia-cluster1-stonith-0" name="target_role" value="started"/>
    <nvpair id="ia-cluster1-stonith-1" name="hostname" value="cluster1"/>
    <nvpair id="ia-cluster1-stonith-2" name="ipaddr" value="192.168.31.220"/>
    <nvpair id="ia-cluster1-stonith-3" name="userid" value="root"/>
    <nvpair id="ia-cluster1-stonith-4" name="passwd" value="xxxxx"/>
  </attributes>
 </instance_attributes>
</primitive>

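<!-- Fences node cluster2 (BMC at 192.168.31.221); the constraint below pins this resource to cluster1 -->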
<primitive id="cluster2-stonith" class="stonith" provider="heartbeat" type="external/ipmi_stonith" resource_stickiness="INFINITY">
 <operations>
  <op id="op-cluster2-stonith-1" name="stop" timeout="60s"/>
  <op id="op-cluster2-stonith-2" name="start" timeout="30s"/>
  <op id="op-cluster2-stonith-3" name="monitor" timeout="5s" interval="10s"/>
 </operations>
 <instance_attributes id="6a60de8a-9e90-4a9e-bd83-a4d5e1174619">
  <attributes>
    <nvpair id="ia-cluster2-stonith-0" name="target_role" value="started"/>
    <nvpair id="ia-cluster2-stonith-1" name="hostname" value="cluster2"/>
    <nvpair id="ia-cluster2-stonith-2" name="ipaddr" value="192.168.31.221"/>
    <nvpair id="ia-cluster2-stonith-3" name="userid" value="root"/>
    <nvpair id="ia-cluster2-stonith-4" name="passwd" value="xxxxx"/>
  </attributes>
 </instance_attributes>
</primitive>
----------------------------------------------------------------------
----------------------------------------------------------------------
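<!-- Each rule pair forces a stonith resource onto the node opposite the one it fences -->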
<rsc_location id="cluster1-stonith-placement" rsc="cluster1-stonith">
  <rule id="ri-cluster1-stonith-placement-1" score="INFINITY">
   <expression id="ri-ex-cluster1-stonith-placement-1" value="cluster2" attribute="#uname" operation="eq"/>
  </rule>
  <rule id="ri-cluster1-stonith-placement-2" score="-INFINITY">
   <expression id="ri-ex-cluster1-stonith-placement-2" value="cluster2" attribute="#uname" operation="ne"/>
  </rule>
</rsc_location>

<rsc_location id="cluster2-stonith-placement" rsc="cluster2-stonith">
 <rule id="ri-cluster2-stonith-placement-1" score="INFINITY">
  <expression id="ri-ex-cluster2-stonith-placement-1" value="cluster1" attribute="#uname" operation="eq"/>
 </rule>
 <rule id="ri-cluster2-stonith-placement-2" score="-INFINITY">
  <expression id="ri-ex-cluster2-stonith-placement-2" value="cluster1" attribute="#uname" operation="ne"/>
 </rule>
</rsc_location>
----------------------------------------------------------------------
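
In case it matters, this is how the live configuration and the resource state
can be inspected (a sketch, using the standard heartbeat 2 command-line tools):

----------------------------------------------------------------------
# Validate the live CIB and print any warnings or errors.
crm_verify -L -V

# One-shot snapshot of node and resource status.
crm_mon -1
----------------------------------------------------------------------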

Could you advise?

Thank you,
Best regards,
Philippe