[Linux-HA] Stonith Shutdown & Failback OFF
Dejan Muhamedagic
dejanmm at fastmail.fm
Thu Jul 2 07:07:00 MDT 2009
Ciao,
On Thu, Jul 02, 2009 at 02:30:51PM +0200, Cristina Bulfon wrote:
> Ciao,
>
> - Regarding turning heartbeat off at boot, I have decided to follow
> your advice.
> In any case, if I set up the HP iLO (riloe) and configure STONITH, can I
> choose either value for
> "cib-bootstrap-options-stonith-action": value="poweroff" or "reboot"?
You should be OK with reboot, since heartbeat won't start
automatically on boot: the fenced node comes back up but stays out
of the cluster until you start heartbeat by hand.
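
For reference, a minimal sketch of switching the action from the
shell. It assumes crm_attribute is installed on the node; the
set_stonith_action helper and the DRY_RUN variable are hypothetical
names used only for illustration. By default the helper just prints
the command so it can be reviewed first.

```shell
#!/bin/sh
# Sketch: set the cluster-wide stonith-action property.
# Assumption: crm_attribute (shipped with the CRM tools) is on PATH.
# set_stonith_action and DRY_RUN are illustrative names, not part of heartbeat.

set_stonith_action() {
    action="$1"
    # Only the two values discussed in this thread are accepted.
    case "$action" in
        reboot|poweroff) ;;
        *) echo "unsupported stonith-action: $action" >&2; return 1 ;;
    esac
    cmd="crm_attribute -t crm_config -n stonith-action -v $action"
    # DRY_RUN defaults to 1: print the command instead of running it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}
```

Calling `set_stonith_action reboot` prints the crm_attribute command;
`DRY_RUN=0 set_stonith_action reboot` would actually apply it.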
Thanks,
Dejan
> I just want to know whether it is worth spending more effort on this issue.
>
> - Resetting the fail count works great!
>
> Thanks
>
> cristina
>
>
> On Jun 23, 2009, at 5:51 PM, Cristina Bulfon wrote:
>
>> Ciao Dejan,
>>
>> Sorry for the delay; I am currently out of the office.
>>
>> As soon as I am back at work I will try to reset the fail count and
>> repeat the test.
>>
>> thanks
>>
>> cristina
>>
>> Dejan Muhamedagic wrote:
>>> On Thu, Jun 18, 2009 at 10:33:32AM +0200, Cristina Bulfon wrote:
>>>
>>>> Ciao,
>>>>
>>>> I am still setting up the HA configuration; we have a 2-node cluster in
>>>> active/passive mode.
>>>>
>>>> - To check the SAN storage device we use an external stonith device.
>>>> In case of a SAN failure on the active machine, the node should be
>>>> shut down, but instead it is rebooted.
>>>>
>>>> In the cluster property section of the cib.xml I have
>>>>
>>>> <nvpair id="cib-bootstrap-options-stonith-enabled"
>>>> name="stonith-enabled" value="true"/>
>>>> <nvpair name="stonith-action"
>>>> id="cib-bootstrap-options-stonith-action" value="poweroff"/>
>>>>
>>>>
>>>> - regarding the "failback OFF"
>>>>
>>>> In the cluster property section of the cib.xml file I have
>>>>
>>>> <nvpair name="default-resource-stickiness"
>>>> id="cib-bootstrap-options-default-resource-stickiness"
>>>> value="INFINITY"/>
>>>> <nvpair
>>>> id="cib-bootstrap-options-default-resource-failure-stickiness"
>>>> name="default-resource-failure-stickiness" value="0"/>
>>>>
>>>> Disabling failback works: when the active node comes back, the
>>>> resource remains on the passive node. To migrate the resource back
>>>> I have to execute the following commands:
>>>>
>>>> crm_resource -M -H <active_node> -r <group> -f (it doesn't migrate
>>>> without -f )
>>>> crm_resource -U -H <passive_node>
>>>>
>>>> If I repeat the exercise (simulating a failure on the active node) a
>>>> couple of times, the passive node sometimes doesn't take over the
>>>> resource.
>>>>
>>>
>>> Just found one infinitely high fail count for node
>>> afsitfs3.roma1.infn.it.
>>>
>>> <nvpair
>>> id="status-586817af-703a-4eff-ac9b-b96de063493a-fail-count-Filesystem_2"
>>> name="fail-count-Filesystem_2" value="INFINITY"/>
>>>
>>> That resource can't start on that node until you reset the
>>> failcount.
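
The check and reset described above can be sketched in shell as
follows. This is an illustration under stated assumptions, not the
exact procedure used in the thread: it assumes crm_failcount is
installed and accepts -D (delete), -U (node name) and -r (resource),
and that grep supports -o. The helper names infinite_failcounts and
reset_failcount and the DRY_RUN variable are hypothetical.

```shell
#!/bin/sh
# Sketch: find resources with an INFINITY fail count and reset one.
# Assumptions: crm_failcount is on PATH; grep supports -o.
# Function names and DRY_RUN are illustrative, not part of heartbeat.

# Read `cibadmin -Q` output on stdin and print the name of every
# resource whose fail-count nvpair has value="INFINITY".
infinite_failcounts() {
    # Join lines first, because an nvpair element may span several lines.
    tr '\n' ' ' |
        grep -o '<nvpair[^>]*>' |
        grep 'name="fail-count-' |
        grep 'value="INFINITY"' |
        sed 's/.*name="fail-count-\([^"]*\)".*/\1/'
}

# Delete the fail count of one resource on one node.
reset_failcount() {
    node="$1"
    resource="$2"
    cmd="crm_failcount -D -U $node -r $resource"
    # DRY_RUN defaults to 1: print the command instead of running it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}
```

For example, `cibadmin -Q | infinite_failcounts` lists the affected
resources, and `DRY_RUN=0 reset_failcount afsitfs3.roma1.infn.it
Filesystem_2` would clear the entry quoted above.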
>>>
>>> Thanks,
>>>
>>> Dejan
>>>
>>>
>>>> Attached you will find the output of cibadmin -Q.
>>>>
>>>> Thanks in advance for any help
>>>>
>>>>
>>>> cristina
>>>>
>>>> _______________________________________________
>>>> Linux-HA mailing list
>>>> Linux-HA at lists.linux-ha.org
>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>> See also: http://linux-ha.org/ReportingProblems
>>>>