[Linux-HA] Stonith Shutdown & Failback OFF

Dejan Muhamedagic dejanmm at fastmail.fm
Thu Jul 2 07:07:00 MDT 2009


Ciao,

On Thu, Jul 02, 2009 at 02:30:51PM +0200, Cristina Bulfon wrote:
> Ciao,
>
> - Regarding keeping heartbeat off at boot, I have decided to follow
> your advice.
>   In any case, if I set up the HP iLO (riloe) and configure STONITH, can I
> use either value for
>     "cib-bootstrap-options-stonith-action" value="poweroff | reboot" ?

You should be OK with reboot: since heartbeat won't start
automatically on boot, the rebooted node stays out of the cluster
until you start heartbeat by hand.
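
As a sketch only, reusing the id and name from the nvpair you
quoted earlier and assuming nothing else in your crm_config
changes, the property would then read:

```xml
<!-- sketch: same id and name as in the quoted CIB; only the value differs -->
<nvpair id="cib-bootstrap-options-stonith-action"
        name="stonith-action" value="reboot"/>
```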

Thanks,

Dejan

> I ask just to know whether I should spend another two cents on this issue.
>
> - The reset of the fail count works great!
>
> Thanks
>
> cristina
>
>
> On Jun 23, 2009, at 5:51 PM, Cristina Bulfon wrote:
>
>> Ciao Dejan,
>>
>> Sorry for the delay; I am currently out of the office.
>>
>> As soon as I am back at work I will reset the fail count and
>> repeat the test.
>>
>> thanks
>>
>> cristina
>>
>> Dejan Muhamedagic wrote:
>>> On Thu, Jun 18, 2009 at 10:33:32AM +0200, Cristina Bulfon wrote:
>>>
>>>> Ciao,
>>>>
>>>> I am still setting up the HA configuration; we have a 2-node cluster in
>>>> active/passive mode.
>>>>
>>>> - To check the SAN storage device we use an external stonith device.
>>>> In case of a SAN failure on the active machine, it should shut the
>>>> machine down; instead it reboots it.
>>>>
>>>> In the cluster property section of the cib.xml I have
>>>>
>>>>           <nvpair id="cib-bootstrap-options-stonith-enabled"
>>>>                   name="stonith-enabled" value="true"/>
>>>>           <nvpair name="stonith-action"
>>>>                   id="cib-bootstrap-options-stonith-action" value="poweroff"/>
>>>>
>>>>
>>>> - regarding the "failback OFF"
>>>>
>>>>   In the cluster property section of the cib.xml file I have
>>>>
>>>>           <nvpair name="default-resource-stickiness"
>>>>                   id="cib-bootstrap-options-default-resource-stickiness"
>>>>                   value="INFINITY"/>
>>>>           <nvpair id="cib-bootstrap-options-default-resource-failure-stickiness"
>>>>                   name="default-resource-failure-stickiness" value="0"/>
>>>>
>>>> Failback OFF is working: when the active node comes back, the
>>>> resource remains on the passive node. To migrate the resource
>>>> I have to execute the following commands
>>>>
>>>>         crm_resource -M -H <active_node> -r <group> -f
>>>>         crm_resource -U  -H <passive_node>
>>>>
>>>> (the first one doesn't migrate without -f).
>>>>
>>>> If I repeat the exercise and simulate a failure on the active node a
>>>> couple of times, it happens that the passive node doesn't take over
>>>> the resource.
>>>>
>>>
>>> Just found one infinitely high fail count for node
>>> afsitfs3.roma1.infn.it.
>>>
>>> 	 <nvpair
>>> 	 id="status-586817af-703a-4eff-ac9b-b96de063493a-fail-count-Filesystem_2"
>>> 	 name="fail-count-Filesystem_2" value="INFINITY"/>
>>>
>>> That resource can't start on that node until you reset the
>>> failcount.
>>>
>>> Thanks,
>>>
>>> Dejan
>>>
>>>
>>>> Attached you will find the output of cibadmin -Q
>>>>
>>>> Thanks in advance for any help
>>>>
>>>>
>>>> cristina
>>>>
>>>> _______________________________________________
>>>> Linux-HA mailing list
>>>> Linux-HA at lists.linux-ha.org
>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>> See also: http://linux-ha.org/ReportingProblems