[Linux-HA] colocating resources on failed restart :CRM-Stable-4a0d4e40eeb0

Alex and Gill Strachan asgks at yahoo.com
Sun Nov 5 20:53:02 MST 2006


I converted the cib.xml back to using groups, which fixes the colocation problem, but how do I now perform system upgrades?

e.g.
I run the command
  crm_resource -r resource_sinfids3A_aims -p target_role -v stopped
My hope here is that hb will stop the resource and stop monitoring it, allowing the aims software to be manually updated/started/checked/stopped. I then run
  crm_resource -r resource_sinfids3A_aims -p target_role -v started
expecting hb to start the resource if necessary and then continue to monitor it.

What actually happens is that hb stops ALL resources within the group_sinfids3A  -  aarrrgh.


I realise that I am very confused about how to configure restart_type on the primitives and on_fail on the monitor operations.  Any help on this would be wonderful.


The scenarios are:

  crm_resource -r resource_sinfids3A_aims -p target_role -v stopped      [as above]
    stop aims - allow manual updates...

  crm_resource -r resource_sinfids3A_oralsnr -p target_role -v stopped
    stop oracle listener - allow manual updates

  crm_resource -r resource_sinfids3A_oracle -p target_role -v stopped
    stop aims, oracle listener, then oracle

  crm_resource -r resource_sinfids3A_smb -p target_role -v stopped
    stop samba - allow manual updates
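
As a rough sketch of the behaviour these scenarios ask for (not of what hb actually does), the stop sequence can be derived by walking the start-order chain backwards. The Python below is purely illustrative: the STARTS_AFTER table is hand-copied from the rsc_order constraints in this message, and stop_order is a hypothetical helper, not part of Heartbeat.

```python
# Hypothetical model of the *desired* stop behaviour; this is not Heartbeat code.
# Each entry reads "resource: the resources it starts after", copied by hand
# from the rsc_order constraints for the 3A resources.
STARTS_AFTER = {
    "resource_sinfids3A_drbd": ["resource_sinfids3A_vip"],
    "resource_sinfids3A_fs": ["resource_sinfids3A_drbd"],
    "resource_sinfids3A_smb": ["resource_sinfids3A_fs"],
    "resource_sinfids3A_oracle": ["resource_sinfids3A_fs"],
    "resource_sinfids3A_oralsnr": ["resource_sinfids3A_oracle"],
    "resource_sinfids3A_aims": ["resource_sinfids3A_oralsnr"],
}


def stop_order(target, starts_after=STARTS_AFTER):
    """Return the resources to stop, dependents first, ending with target."""
    # Invert the table: resource -> resources that start after it.
    dependents = {}
    for rsc, prereqs in starts_after.items():
        for prereq in prereqs:
            dependents.setdefault(prereq, []).append(rsc)

    order = []
    seen = set()

    def visit(rsc):
        # Post-order walk: everything that depends on rsc is listed before rsc.
        if rsc in seen:
            return
        seen.add(rsc)
        for dep in dependents.get(rsc, []):
            visit(dep)
        order.append(rsc)

    visit(target)
    return order


if __name__ == "__main__":
    for name in ("aims", "oralsnr", "oracle", "smb"):
        print(name, "->", stop_order("resource_sinfids3A_" + name))
```

Under these assumptions, stopping resource_sinfids3A_oracle yields aims, then oralsnr, then oracle, while stopping resource_sinfids3A_aims or resource_sinfids3A_smb touches only that one resource.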


Node: sinfids3b1 (338afa76-8997-4d66-8381-fc36ec4b456b): online
Node: sinfids3a2 (ec74bd17-2016-4d32-a694-0f6983121cd9): online
Node: sinfids3a1 (b757aece-0e47-41e5-92b7-6a80b4f3eea7): online

Resource Group: group_sinfids3
    resource_sinfids3_vip       (heartbeat::ocf:IPaddr):        Started sinfids3a1
Resource Group: group_sinfids3A
    resource_sinfids3A_vip      (heartbeat::ocf:IPaddr):        Started sinfids3a1
    resource_sinfids3A_drbd     (heartbeat:drbddisk):   Started sinfids3a1
    resource_sinfids3A_fs       (heartbeat::ocf:Filesystem):    Started sinfids3a1
    resource_sinfids3A_smb      (lsb:smb):      Started sinfids3a1
    resource_sinfids3A_oracle   (heartbeat::ocf:oracle):        Started sinfids3a1
    resource_sinfids3A_oralsnr  (heartbeat::ocf:oralsnr):       Started sinfids3a1
    resource_sinfids3A_aims     (lsb:aims):     Started sinfids3a1
Resource Group: group_sinfids3B
    resource_sinfids3B_vip      (heartbeat::ocf:IPaddr):        Started sinfids3b1


My current settings for the groups, primitives, monitor operations and rsc_order constraints:

[root@sinfids3b1 ~]# egrep "primitive |monitor|rsc_ord|group " saved-cib.xml
       <group id="group_sinfids3" ordered="true" collocated="true" is_managed="true" restart_type="restart">
         <primitive id="resource_sinfids3_vip" class="ocf" type="IPaddr" provider="heartbeat" is_managed="true" restart_type="restart">
             <op id="IPaddr_sinfids3_vip_mon" interval="60s" name="monitor" timeout="15s" on_fail="restart"/>
       <group id="group_sinfids3A" ordered="true" collocated="true" is_managed="true" restart_type="ignore">
         <primitive id="resource_sinfids3A_vip" class="ocf" type="IPaddr" provider="heartbeat" is_managed="true" restart_type="restart">
             <op id="IPaddr_sinfids3A_vip_mon" interval="60s" name="monitor" timeout="15s" on_fail="restart"/>
         <primitive id="resource_sinfids3A_drbd" class="heartbeat" type="drbddisk" provider="heartbeat" is_managed="true" restart_type="restart">
             <op id="drbddisk_sinfids3A_drbd_mon" name="monitor" interval="60s" timeout="60s" on_fail="restart"/>
         <primitive class="ocf" type="Filesystem" provider="heartbeat" id="resource_sinfids3A_fs" is_managed="true" restart_type="restart">
             <op name="monitor" timeout="60s" id="Filesystem_sinfids3A_fs_mon" interval="60s" on_fail="restart"/>
         <primitive class="lsb" type="smb" id="resource_sinfids3A_smb" is_managed="true" restart_type="restart">
             <op name="monitor" timeout="60s" id="smb_sinfids3A_smb_mon" interval="30s" on_fail="restart"/>
         <primitive class="ocf" type="oracle" provider="heartbeat" id="resource_sinfids3A_oracle" is_managed="true" restart_type="restart">
             <op name="monitor" timeout="60s" id="oracle_sinfids3A_oracle_mon" interval="300s" on_fail="restart"/>
         <primitive class="ocf" type="oralsnr" provider="heartbeat" id="resource_sinfids3A_oralsnr" is_managed="true" restart_type="restart">
             <op name="monitor" timeout="60s" id="oralsnr_sinfids3A_oralsnr_mon" interval="300s" on_fail="restart"/>
         <primitive class="lsb" type="aims" id="resource_sinfids3A_aims" is_managed="true" restart_type="ignore">
             <op name="monitor" timeout="240s" id="aims_sinfids3A_aims_mon" interval="180s" on_fail="restart"/>
       <group id="group_sinfids3B" ordered="true" collocated="true" is_managed="true" restart_type="restart">
         <primitive id="resource_sinfids3B_vip" class="ocf" type="IPaddr" provider="heartbeat" is_managed="true" restart_type="restart">
             <op id="IPaddr_sinfids3B_vip_mon" interval="60s" name="monitor" timeout="15s" on_fail="restart"/>
       <rsc_order id="order_sinfids3_sinfids3A" from="group_sinfids3" action="start" type="after" to="group_sinfids3A" symmetrical="true"/>
       <rsc_order id="order_sinfids3_sinfids3B" from="group_sinfids3" action="start" type="after" to="group_sinfids3B" symmetrical="true"/>
       <rsc_order id="order_sinfids3A_vip" from="resource_sinfids3A_vip" action="start" type="before" to="resource_sinfids3A_drbd" symmetrical="true"/>
       <rsc_order id="order_sinfids3A_drbd" from="resource_sinfids3A_drbd" action="start" type="after" to="resource_sinfids3A_vip" symmetrical="true"/>
       <rsc_order id="order_sinfids3A_fs" from="resource_sinfids3A_fs" action="start" type="after" to="resource_sinfids3A_drbd" symmetrical="true"/>
       <rsc_order id="order_sinfids3A_smb" from="resource_sinfids3A_smb" action="start" type="after" to="resource_sinfids3A_fs" symmetrical="true"/>
       <rsc_order id="order_sinfids3A_oracle" from="resource_sinfids3A_oracle" action="start" type="after" to="resource_sinfids3A_fs" symmetrical="true"/>
       <rsc_order id="order_sinfids3A_oralsnr" from="resource_sinfids3A_oralsnr" action="start" type="after" to="resource_sinfids3A_oracle" symmetrical="true"/>
       <rsc_order id="order_sinfids3A_aims" from="resource_sinfids3A_aims" action="start" type="after" to="resource_sinfids3A_oralsnr" symmetrical="true"/>
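
To double-check the ordering constraints above, the rsc_order elements can be reduced to plain "X starts before Y" pairs. The sketch below is illustrative Python, not a Heartbeat tool. It assumes that from/type/to reads as "from starts <type> to"; the start_pairs helper is hypothetical, and the <constraints> wrapper around the excerpt is only there so the snippet parses on its own.

```python
import xml.etree.ElementTree as ET

# Three of the rsc_order lines from the egrep output, wrapped in a single
# root element so the excerpt is well-formed XML.
CONSTRAINTS_XML = """
<constraints>
  <rsc_order id="order_sinfids3A_vip" from="resource_sinfids3A_vip" action="start" type="before" to="resource_sinfids3A_drbd" symmetrical="true"/>
  <rsc_order id="order_sinfids3A_drbd" from="resource_sinfids3A_drbd" action="start" type="after" to="resource_sinfids3A_vip" symmetrical="true"/>
  <rsc_order id="order_sinfids3A_fs" from="resource_sinfids3A_fs" action="start" type="after" to="resource_sinfids3A_drbd" symmetrical="true"/>
</constraints>
"""


def start_pairs(xml_text):
    """Return a sorted list of (first, then) start-order pairs.

    Assumption: 'from X type=before to Y' means X starts before Y, and
    'from X type=after to Y' means X starts after Y.
    """
    pairs = set()
    for order in ET.fromstring(xml_text).iter("rsc_order"):
        frm = order.get("from")
        to = order.get("to")
        kind = order.get("type")
        if kind == "before":
            pairs.add((frm, to))
        elif kind == "after":
            pairs.add((to, frm))
    return sorted(pairs)


if __name__ == "__main__":
    for first, then in start_pairs(CONSTRAINTS_XML):
        print(first, "starts before", then)
```

With that reading, order_sinfids3A_vip and order_sinfids3A_drbd reduce to the same pair, so the three constraints in the excerpt yield only two distinct pairs.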




----- Original Message ----
From: Serge Dubrouski <sergeyfd at gmail.com>
To: General Linux-HA mailing list <linux-ha at lists.linux-ha.org>
Sent: Sunday, 5 November, 2006 11:47:05 PM
Subject: Re: [Linux-HA] colocating resources on failed restart :CRM-Stable-4a0d4e40eeb0

Why not combine your resources into a group with collocated="true"? In
that case they'll always stick together for all operations:
start/stop/move etc.

On 11/4/06, Alex and Gill Strachan <asgks at yahoo.com> wrote:
> I have a group of resources linked by the name 3A, these resources must always run together so I allocated large co-location scores.
>
> When resource_sinfids3A_aims fails and is moved to another node, I need all of the 3A resources to move with it and to start before it.
>
> e.g.
> resource_sinfids3A_aims  fails on node 3a2
> hb restarts and reduces node weight for that node..
> resource_sinfids3A_aims  fails on node 3a2
> hb is unable to restart on node 3a2 so decides to relocate to 3a1
>
> ...How do I inform hb to stop all the other 3A resources on 3a2 and move
> ...everything to 3a1, also starting in a particular order.
>
> Why didn't the colocation scores help in keeping the 3A resources together?
>
>
> I originally had colocation scores of INFINITY for the 3A group, but this then makes it impossible to specify that resource smb can fail 3 times while resource aims can only fail once.
>
>
> I originally had this working by using groups and on_fail="fence" but it doesn't offer enough flexibility.
>
> e.g.
> I would like heartbeat to restart smb on failure 3 times before moving it to another node, using resource_stickiness.  When using groups, a restart of smb would trigger a stop of all higher resources, then a start of smb, followed by a start of the higher resources.  This behaviour was not wanted.
>
>
>
>
>
>
> ============
> Last updated: Sun Nov  5 14:02:46 2006
> Current DC: sinfids3a2 (ec74bd17-2016-4d32-a694-0f6983121cd9)
> 3 Nodes configured.
> 9 Resources configured.
> ============
>
> Node: sinfids3b1 (338afa76-8997-4d66-8381-fc36ec4b456b): online
>        resource_sinfids3B_vip  (heartbeat::ocf:IPaddr)
> Node: sinfids3a2 (ec74bd17-2016-4d32-a694-0f6983121cd9): online
>        resource_sinfids3A_drbd (heartbeat:drbddisk)
>        resource_sinfids3A_fs   (heartbeat::ocf:Filesystem)
>        resource_sinfids3A_smb  (lsb:smb)
>        resource_sinfids3A_vip  (heartbeat::ocf:IPaddr)
>        resource_sinfids3A_oralsnr      (heartbeat::ocf:oralsnr)
>        resource_sinfids3_vip   (heartbeat::ocf:IPaddr)
>        resource_sinfids3A_oracle       (heartbeat::ocf:oracle)
>        resource_sinfids3A_aims (lsb:aims)
> Node: sinfids3a1 (b757aece-0e47-41e5-92b7-6a80b4f3eea7): online
>
>
>
>       <rsc_order id="order_sinfids3_sinfids3A" from="resource_sinfids3_vip" type="after" to="resource_sinfids3A_vip"/>
>       <rsc_order id="order_sinfids3_sinfids3B" from="resource_sinfids3_vip" type="after" to="resource_sinfids3B_vip"/>
>       <rsc_order id="order_sinfids3A_drbd" from="resource_sinfids3A_drbd" type="after" to="resource_sinfids3A_vip"/>
>       <rsc_order id="order_sinfids3A_fs" from="resource_sinfids3A_fs" type="after" to="resource_sinfids3A_drbd"/>
>       <rsc_order id="order_sinfids3A_smb" from="resource_sinfids3A_smb" type="after" to="resource_sinfids3A_fs"/>
>       <rsc_order id="order_sinfids3A_oracle" from="resource_sinfids3A_oracle" type="after" to="resource_sinfids3A_fs"/>
>       <rsc_order id="order_sinfids3A_oralsnr" from="resource_sinfids3A_oralsnr" type="after" to="resource_sinfids3A_oracle"/>
>       <rsc_order id="order_sinfids3A_aims" from="resource_sinfids3A_aims" type="after" to="resource_sinfids3A_oralsnr"/>
>
>       <rsc_colocation id="colocation_sinfids3_sinfids3A" from="resource_sinfids3_vip" to="resource_sinfids3A_vip" score="9000"/>
>       <rsc_colocation id="colocation_sinfids3_sinfids3B" from="resource_sinfids3_vip" to="resource_sinfids3B_vip" score="3000"/>
>
>       <rsc_colocation id="colocation_sinfids3A_drbd" from="resource_sinfids3A_drbd" to="resource_sinfids3A_vip" score="100000"/>
>       <rsc_colocation id="colocation_sinfids3A_fs" from="resource_sinfids3A_fs" to="resource_sinfids3A_drbd" score="100000"/>
>       <rsc_colocation id="colocation_sinfids3A_smb" from="resource_sinfids3A_smb" to="resource_sinfids3A_fs" score="100000"/>
>       <rsc_colocation id="colocation_sinfids3A_oracle" from="resource_sinfids3A_oracle" to="resource_sinfids3A_fs" score="100000"/>
>       <rsc_colocation id="colocation_sinfids3A_oralsnr" from="resource_sinfids3A_oralsnr" to="resource_sinfids3A_oracle" score="100000"/>
>       <rsc_colocation id="colocation_sinfids3A_aims" from="resource_sinfids3A_aims" to="resource_sinfids3A_oralsnr" score="100000"/>
>
>
>       <primitive class="lsb" type="aims" id="resource_sinfids3A_aims" restart_type="restart">
>         <operations>
>           <op name="monitor" timeout="240s" id="aims_sinfids3A_aims_mon" interval="180s"/>
>         </operations>
>         <instance_attributes id="resource_sinfids3A_aims">
>           <attributes>
>             <nvpair id="resource_sinfids3A_aims-target_role" name="target_role" value="started"/>
>           </attributes>
>         </instance_attributes>
>       </primitive>
>
>
>
>
>
>
> Send instant messages to your online friends http://au.messenger.yahoo.com
> _______________________________________________
> Linux-HA mailing list
> Linux-HA at lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>