[Linux-HA] Stonith Shutdown & Failback OFF
Cristina Bulfon
cristina.bulfon at roma1.infn.it
Thu Jul 2 06:30:51 MDT 2009
Hi,
- Regarding keeping heartbeat OFF at boot, I have decided to follow
your advice.
In any case, if I set up the HP iLO (riloe) and configure STONITH, can
I choose between the two values of the
"cib-bootstrap-options-stonith-action" nvpair, value="poweroff" or
value="reboot"?
I am asking just to know whether it is worth spending another two cents
on this issue.
- The reset of the fail count works great!
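
For anyone hitting the same problem, here is a minimal sketch (not part of the original exchange) of how the fail-count entries could be listed from a saved `cibadmin -Q` dump. It assumes the status section stores them as nvpair elements whose name starts with "fail-count-", as in the fragment Dejan quoted, and that each node_state element carries a uname attribute. The function name find_fail_counts and the simplified XML wrapper are only illustrative.

```python
import xml.etree.ElementTree as ET

# Sample shaped like the nvpair quoted in this thread. The surrounding
# elements are a simplified assumption, not a complete CIB.
SAMPLE_CIB = """
<cib>
  <status>
    <node_state uname="afsitfs3.roma1.infn.it">
      <transient_attributes>
        <instance_attributes>
          <nvpair id="status-586817af-703a-4eff-ac9b-b96de063493a-fail-count-Filesystem_2"
                  name="fail-count-Filesystem_2" value="INFINITY"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>
"""

PREFIX = "fail-count-"


def find_fail_counts(cib_xml):
    """Return a list of (node, resource, value) for every fail-count nvpair."""
    root = ET.fromstring(cib_xml)
    found = []
    for node_state in root.iter("node_state"):
        node = node_state.get("uname", "unknown")
        for nvpair in node_state.iter("nvpair"):
            name = nvpair.get("name", "")
            if name.startswith(PREFIX):
                # Strip the prefix to recover the resource name.
                found.append((node, name[len(PREFIX):], nvpair.get("value")))
    return found


if __name__ == "__main__":
    for node, resource, value in find_fail_counts(SAMPLE_CIB):
        print(f"{node}: {resource} fail count = {value}")
```

An entry whose value is INFINITY means the resource cannot start on that node until the fail count is reset.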
Thanks
cristina
On Jun 23, 2009, at 5:51 PM, Cristina Bulfon wrote:
> Hi Dejan,
>
> Sorry for the delay; I am currently out of the office.
>
> As soon as I am back at work I will reset the fail count and
> repeat the test.
>
> thanks
>
> cristina
>
> Dejan Muhamedagic wrote:
>> On Thu, Jun 18, 2009 at 10:33:32AM +0200, Cristina Bulfon wrote:
>>
>>> Hi,
>>>
>>> I am still setting up the HA configuration; we have a 2-node cluster
>>> in active/passive mode.
>>>
>>> - To check the SAN storage device we use an external stonith device.
>>> In case of a SAN failure on the active machine, it should shut the
>>> node down, but instead it reboots it.
>>>
>>> In the cluster property section of the cib.xml I have
>>>
>>> <nvpair id="cib-bootstrap-options-stonith-enabled"
>>> name="stonith-enabled" value="true"/>
>>> <nvpair name="stonith-action"
>>> id="cib-bootstrap-options-stonith-action" value="poweroff"/>
>>>
>>>
>>> - regarding the "failback OFF"
>>>
>>> In the cluster property section of the cib.xml file I have
>>>
>>> <nvpair name="default-resource-stickiness"
>>> id="cib-bootstrap-options-default-resource-stickiness"
>>> value="INFINITY"/>
>>> <nvpair
>>> id="cib-bootstrap-options-default-resource-failure-stickiness"
>>> name="default-resource-failure-stickiness" value="0"/>
>>>
>>> Failback prevention is working: when the active node comes back, the
>>> resource stays on the passive node. To migrate the resource back
>>> I have to execute the following commands
>>>
>>> crm_resource -M -H <active_node> -r <group> -f   (it doesn't migrate without -f)
>>> crm_resource -U -H <passive_node>
>>>
>>> If I repeat the exercise and simulate a failure on the active node a
>>> couple of times, it happens that the passive node doesn't take over
>>> the resource.
>>>
>>
>> Just found one infinitely high fail count for node
>> afsitfs3.roma1.infn.it.
>>
>> <nvpair
>> id="status-586817af-703a-4eff-ac9b-b96de063493a-fail-count-Filesystem_2"
>> name="fail-count-Filesystem_2" value="INFINITY"/>
>>
>> That resource can't start on that node until you reset the
>> failcount.
>>
>> Thanks,
>>
>> Dejan
>>
>>
>>> Attached you will find the output of cibadmin -Q
>>>
>>> Thanks in advance for any help
>>>
>>>
>>> cristina
>>>
>>> _______________________________________________
>>> Linux-HA mailing list
>>> Linux-HA at lists.linux-ha.org
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>>>