[Linux-HA] Failover of Group on Monitor event

Bernd Broermann bernd at broermann.com
Mon Feb 25 12:45:46 MST 2008


Dejan Muhamedagic wrote:
> Hi,
>
> On Sun, Feb 24, 2008 at 08:01:02PM +0100, Bernd Broermann wrote:
>   
>> Hello
>> I want a failover when a resource monitor fails.
>>
>> Version: heartbeat-2    2.1.3-2~bpo40+
>> crm = yes
>>
>> Configuration in short:
>> 2-node active/passive, derived from
>>
>> /usr/lib/heartbeat/haresources2cib.py --stdout -c /etc/ha.d/ha.cf \
>> /root/haresources > /var/lib/heartbeat/crm/cib.xml
>> <nodes>
>> - node1 DC
>> - node2
>> </nodes>
>> Resources:
>> <group>
>> - IPADDR (OCF)
>> - Application (LSB init Script)
>> </group>
>>
>> If Application is not runnable, it should migrate together with IPADDR to node2.
>> As I understand it, this should work with a monitor operation entry in the CIB.
>>
>> cibadmin -U -o resources -X '<op id="Application_mon" interval="10s"
>> name="monitor" timeout="20s"/>'
>>     
>
> In the attached CIB, there's a monitor operation defined. If you
> want to change it, you'd have to use the same id. Also, try to
> extract the whole resource, then change whatever you want in it
> (but retain the same ids), then do cibadmin -U (or -R).
>
>   
>> It does not work! Resources stay unrunnable on node1.
>>
>>
>> As a workaround I put the following in the Application init script:
>>   status)
>>         echo -n "Status of $DESC: "
>>
>>         if  myApp_runnable  >/dev/null ; then
>>         echo -n "OK  Application runnable"
>>         crm_standby -U node1   -v false
>>         else
>>         echo "ERROR "
>>         crm_standby -U node1   -v true
>>         exit 3
>>         fi
>>     
>
> This is no good. You shouldn't put nodes in standby from the RA.
> Just returning proper exit codes should suffice.
>
>   
>> Question:
>> Is it possible to initiate a failover of the group (IPADDR and
>> Application) when the resource monitor reports an error? How do I modify
>> the cib.xml to make it work?
>>     
>
> Your CIB looks OK to me. The monitor op runs every 120 secs. Did
> you wait long enough?
>
>   

Thank you for your answer.
Even when I apply something like

cibadmin -U -o resources -X '<op id="MyApplication_mon" interval="1s"
name="monitor" timeout="2s"/>'

to the CIB, no failover happens.

Do you know the right exit code for the failover event?

Regards, Bernd
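
For reference, here is a minimal sketch of a status action that follows
the advice quoted above: it only reports an exit code and does not call
crm_standby. It assumes the LSB init script convention, where status
returns 0 when the program is running and 3 when it is not running. The
myApp_runnable body and the MYAPP_PIDFILE variable are hypothetical
stand-ins for the real application check, and app_status is an
illustrative name, not part of heartbeat.

```shell
#!/bin/sh
# Hypothetical stand-in for the real application check: here the
# application counts as runnable when its pid file exists.
myApp_runnable() {
    [ -e "${MYAPP_PIDFILE:-/var/run/myapp.pid}" ]
}

# Body for the "status)" branch of the init script. It reports the
# state through the exit code only and leaves any recovery decision
# to the cluster manager.
app_status() {
    if myApp_runnable >/dev/null 2>&1; then
        echo "OK: Application runnable"
        return 0    # LSB status: program is running
    else
        echo "ERROR: Application not runnable"
        return 3    # LSB status: program is not running
    fi
}
```

In the init script this would be called as "app_status; exit $?" from the
status) branch. Whether the cluster then restarts the resource in place
or moves the group to node2 depends on the CIB settings, which this
sketch does not cover.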

