[Linux-HA] colocating resources on failed restart :CRM-Stable-4a0d4e40eeb0
Alex and Gill Strachan
asgks at yahoo.com
Mon Nov 6 05:29:13 MST 2006
Some output from messages
Nov 6 22:18:06 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 4853, call:43): 0.530.24978 -> 0.530.24979 (ok)
Nov 6 22:18:06 sinfids3a1 cib: [7558]: info: write_cib_contents: Wrote version 0.530.24979 of the CIB to disk (digest: 90f3210a8798fc63b18a57cb469a5974)
Nov 6 22:18:14 sinfids3a1 crmd: [4853]: info: process_lrm_event: LRM operation (30) stop_0 on resource_sinfids3A_aims complete
Nov 6 22:18:16 sinfids3a1 crmd: [4853]: info: do_lrm_rsc_op: Performing op start on resource_sinfids3A_aims (interval=0ms, key=283:0693ed17-e3a5-4343-9b76-a57d11589f15)
Nov 6 22:18:16 sinfids3a1 lrmd: [7569]: WARN: For LSB init script, no additional parameters are needed.
Nov 6 22:18:16 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 4853, call:44): 0.530.24979 -> 0.530.24980 (ok)
Nov 6 22:18:17 sinfids3a1 cib: [7584]: info: write_cib_contents: Wrote version 0.530.24980 of the CIB to disk (digest: 65b101aca7a9a1eb28127cd70e1e9d92)
Nov 6 22:19:16 sinfids3a1 crmd: [4853]: info: do_lrm_rsc_op: Performing op start on resource_sinfids3A_aims (interval=0ms, key=285:0693ed17-e3a5-4343-9b76-a57d11589f15)
Nov 6 22:19:16 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 21698, call:140): 0.530.24980 -> 0.530.24981 (ok)
Nov 6 22:19:17 sinfids3a1 cib: [7865]: info: write_cib_contents: Wrote version 0.530.24981 of the CIB to disk (digest: 237ac0e781ff59e1911656955309cf9d)
Nov 6 22:19:22 sinfids3a1 cib: [4849]: info: cib_stats: Processed 109 operations (275.00us average, 0% utilization) in the last 10min
Nov 6 22:19:41 sinfids3a1 lrmd: [4850]: WARN: perform_ra_op: the operation operation start[32] on lsb::aims::resource_sinfids3A_aims for client 4853, its parameters: is_managed=[true] CRM_meta_op_target_rc=[7] CRM_meta_timeout=[240000] crm_feature_set=[1.0.7] stayed in operation list for 25330 ms (longer than 5000 ms)
Nov 6 22:19:41 sinfids3a1 crmd: [4853]: info: process_lrm_event: LRM operation (31) start_0 on resource_sinfids3A_aims complete
Nov 6 22:19:41 sinfids3a1 lrmd: [7979]: WARN: For LSB init script, no additional parameters are needed.
Nov 6 22:19:44 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 4853, call:45): 0.530.24981 -> 0.530.24982 (ok)
Nov 6 22:19:44 sinfids3a1 cib: [8055]: info: write_cib_contents: Wrote version 0.530.24982 of the CIB to disk (digest: 9230c016d1ed1db85ad59e863019c53c)
Nov 6 22:19:45 sinfids3a1 crmd: [4853]: info: process_lrm_event: LRM operation (32) start_0 on resource_sinfids3A_aims complete
Nov 6 22:19:46 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 4853, call:46): 0.530.24982 -> 0.530.24983 (ok)
Nov 6 22:19:47 sinfids3a1 crmd: [4853]: info: do_lrm_rsc_op: Performing op stop on resource_sinfids3A_aims (interval=0ms, key=286:0693ed17-e3a5-4343-9b76-a57d11589f15)
Nov 6 22:19:47 sinfids3a1 lrmd: [8071]: WARN: For LSB init script, no additional parameters are needed.
Nov 6 22:19:47 sinfids3a1 cib: [8070]: info: write_cib_contents: Wrote version 0.530.24983 of the CIB to disk (digest: ae1b4656b04ebe437e1dafd65db3caca)
Nov 6 22:20:47 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 21698, call:141): 0.530.24983 -> 0.530.24984 (ok)
Nov 6 22:20:47 sinfids3a1 cib: [8560]: info: write_cib_contents: Wrote version 0.530.24984 of the CIB to disk (digest: 2ad279a54f39c06b5b8678d626ed5f06)
Nov 6 22:20:48 sinfids3a1 crmd: [4853]: info: do_lrm_rsc_op: Performing op stop on resource_sinfids3A_aims (interval=0ms, key=288:0693ed17-e3a5-4343-9b76-a57d11589f15)
Nov 6 22:20:51 sinfids3a1 lrmd: [8571]: WARN: For LSB init script, no additional parameters are needed.
Nov 6 22:20:51 sinfids3a1 crmd: [4853]: info: process_lrm_event: LRM operation (33) stop_0 on resource_sinfids3A_aims complete
Nov 6 22:20:52 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 4853, call:47): 0.530.24984 -> 0.530.24985 (ok)
Nov 6 22:20:53 sinfids3a1 cib: [8632]: info: write_cib_contents: Wrote version 0.530.24985 of the CIB to disk (digest: 5cabecb4cf510babdf3311a7717194f3)
Nov 6 22:21:01 sinfids3a1 crmd: [4853]: info: process_lrm_event: LRM operation (34) stop_0 on resource_sinfids3A_aims complete
Nov 6 22:21:02 sinfids3a1 crmd: [4853]: info: do_lrm_rsc_op: Performing op start on resource_sinfids3A_aims (interval=0ms, key=289:0693ed17-e3a5-4343-9b76-a57d11589f15)
Nov 6 22:21:02 sinfids3a1 lrmd: [8664]: WARN: For LSB init script, no additional parameters are needed.
Nov 6 22:21:02 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 4853, call:48): 0.530.24985 -> 0.530.24986 (ok)
Nov 6 22:21:03 sinfids3a1 cib: [8669]: info: write_cib_contents: Wrote version 0.530.24986 of the CIB to disk (digest: 1a8999f2a6b94766aa8aa77b905c1e6c)
Nov 6 22:22:03 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 21698, call:142): 0.530.24986 -> 0.530.24987 (ok)
Nov 6 22:22:03 sinfids3a1 crmd: [4853]: info: do_lrm_rsc_op: Performing op start on resource_sinfids3A_aims (interval=0ms, key=291:0693ed17-e3a5-4343-9b76-a57d11589f15)
Nov 6 22:22:04 sinfids3a1 cib: [8961]: info: write_cib_contents: Wrote version 0.530.24987 of the CIB to disk (digest: c0fe3550b817e1150746ff498ad62be7)
Nov 6 22:22:28 sinfids3a1 crmd: [4853]: info: process_lrm_event: LRM operation (35) start_0 on resource_sinfids3A_aims complete
Nov 6 22:22:28 sinfids3a1 lrmd: [4850]: WARN: perform_ra_op: the operation operation start[36] on lsb::aims::resource_sinfids3A_aims for client 4853, its parameters: is_managed=[true] CRM_meta_op_target_rc=[7] CRM_meta_timeout=[240000] crm_feature_set=[1.0.7] stayed in operation list for 24530 ms (longer than 5000 ms)
Nov 6 22:22:28 sinfids3a1 lrmd: [9075]: WARN: For LSB init script, no additional parameters are needed.
Nov 6 22:22:28 sinfids3a1 lrmd: [4850]: WARN: G_SIG_dispatch: Dispatch function for SIGCHLD took too long to execute: 20 ms (> 10 ms) (GSource: 0x8dac2a8)
Nov 6 22:22:29 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 4853, call:49): 0.530.24987 -> 0.530.24988 (ok)
Nov 6 22:22:30 sinfids3a1 cib: [9151]: info: write_cib_contents: Wrote version 0.530.24988 of the CIB to disk (digest: ebf0ceae26a140f5bf6a811589f8d552)
Nov 6 22:22:31 sinfids3a1 crmd: [4853]: info: process_lrm_event: LRM operation (36) start_0 on resource_sinfids3A_aims complete
Nov 6 22:22:33 sinfids3a1 cib: [4849]: info: cib_diff_notify: Update (client: 4853, call:50): 0.530.24988 -> 0.530.24989 (ok)
Nov 6 22:22:33 sinfids3a1 crmd: [4853]: info: do_lrm_rsc_op: Performing op stop on resource_sinfids3A_aims (interval=0ms, key=292:0693ed17-e3a5-4343-9b76-a57d11589f15)
Nov 6 22:22:33 sinfids3a1 lrmd: [9184]: WARN: For LSB init script, no additional parameters are needed.
Nov 6 22:22:33 sinfids3a1 cib: [9182]: info: write_cib_contents: Wrote version 0.530.24989 of the CIB to disk (digest: d3099e8490ef355f3c8949fbd1cd1f9c)
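To make the start/stop cycle easier to see, the crmd lines in the excerpt above can be reduced to a timeline for one resource. This is only a rough sketch: it assumes the log lines look exactly like the ones pasted here, and the names lrm_timeline, REQUEST and COMPLETE are made up for illustration, not part of heartbeat.

```python
import re

# "do_lrm_rsc_op" lines: crmd asks the LRM to run an operation.
REQUEST = re.compile(r"do_lrm_rsc_op: Performing op (\w+) on (\S+)")
# "process_lrm_event" lines: the LRM reports the operation as complete.
COMPLETE = re.compile(
    r"process_lrm_event: LRM operation \((\d+)\) (\w+)_\d+ on (\S+) complete"
)


def lrm_timeline(lines, resource):
    """Return (time, event, op) tuples for one resource, in log order."""
    timeline = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        timestamp = fields[2]  # "Nov 6 22:18:14 host ..." -> "22:18:14"
        m = REQUEST.search(line)
        if m and m.group(2) == resource:
            timeline.append((timestamp, "requested", m.group(1)))
            continue
        m = COMPLETE.search(line)
        if m and m.group(3) == resource:
            timeline.append((timestamp, "complete", m.group(2)))
    return timeline
```

Feeding it the excerpt with resource "resource_sinfids3A_aims" gives the alternating start and stop requests without the cib and lrmd noise in between.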
----- Original Message ----
From: Alex and Gill Strachan <asgks at yahoo.com>
To: General Linux-HA mailing list <linux-ha at lists.linux-ha.org>
Sent: Monday, 6 November, 2006 9:59:04 PM
Subject: Re: [Linux-HA] colocating resources on failed restart :CRM-Stable-4a0d4e40eeb0
Spoke too soon. Now resource_sinfids3A_aims:
starts
stops
starts
stops
starts ... (very repetitive!)
It looks like it's going to be a late night; I have already spent all day on this (AUS timezone) and have run out of time to get a working hb2 solution. This is very disappointing :-(
There are no reported failcounts, and crm_verify shows nothing wrong.
Hopefully in the next few hours you guys can help me.
Node: sinfids3b1 (338afa76-8997-4d66-8381-fc36ec4b456b): online
resource_sinfids3B_vip (heartbeat::ocf:IPaddr)
Node: sinfids3a2 (ec74bd17-2016-4d32-a694-0f6983121cd9): online
resource_sinfids3A_aims (lsb:aims)
resource_sinfids3_vip (heartbeat::ocf:IPaddr)
resource_sinfids3A_drbd (heartbeat:drbddisk)
resource_sinfids3A_fs (heartbeat::ocf:Filesystem)
resource_sinfids3A_vip (heartbeat::ocf:IPaddr)
resource_sinfids3A_oracle (heartbeat::ocf:oracle)
resource_sinfids3A_smb (lsb:smb)
resource_sinfids3A_oralsnr (heartbeat::ocf:oralsnr)
Node: sinfids3a1 (b757aece-0e47-41e5-92b7-6a80b4f3eea7): online
[root@sinfids3b1 ~]# cibadmin -Q |egrep "managed"
<nvpair id="cib-bootstrap-options-is-managed-default" name="is-managed-default" value="true"/>
<group id="group_sinfids3" ordered="true" collocated="true" is_managed="true" restart_type="restart">
<group id="group_sinfids3A" ordered="true" collocated="true" is_managed="true" restart_type="restart">
<nvpair id="resource_sinfids3A_aims-is_managed" name="is_managed" value="true"/>
<group id="group_sinfids3B" ordered="true" collocated="true" is_managed="true" restart_type="restart">
I removed the nvpair attribute for the resource, hoping that maybe it was causing some confusion.
[root@sinfids3b1 ~]# cibadmin -Q |egrep "managed"
<nvpair id="cib-bootstrap-options-is-managed-default" name="is-managed-default" value="true"/>
<group id="group_sinfids3" ordered="true" collocated="true" is_managed="true" restart_type="restart">
<group id="group_sinfids3A" ordered="true" collocated="true" is_managed="true" restart_type="restart">
<group id="group_sinfids3B" ordered="true" collocated="true" is_managed="true" restart_type="restart">
[root@sinfids3b1 ~]# crm_verify -L -VVV
crm_verify[27339]: 2006/11/06_21:50:40 info: main: =#=#=#=#= Getting XML =#=#=#=#=
crm_verify[27339]: 2006/11/06_21:50:40 info: main: Reading XML from: live cluster
crm_verify[27339]: 2006/11/06_21:50:40 notice: main: Required feature set: 1.1
crm_verify[27339]: 2006/11/06_21:50:40 notice: unpack_config: On loss of CCM Quorum: Ignore
crm_verify[27339]: 2006/11/06_21:50:40 info: determine_online_status: Node sinfids3b1 is online
crm_verify[27339]: 2006/11/06_21:50:40 info: determine_online_status: Node sinfids3a1 is online
crm_verify[27339]: 2006/11/06_21:50:40 info: determine_online_status: Node sinfids3a2 is online
[root@sinfids3b1 ~]# /usr/lib/heartbeat/ptest -L -VVVVVVVVVVVVVVV 2>&1 | egrep assign
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3B_vip, Node[0] sinfids3b1: 10100
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3B_vip, Node[1] sinfids3a1: -1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3B_vip, Node[2] sinfids3a2: -1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Assigning sinfids3b1 to resource_sinfids3B_vip
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_aims, Node[0] sinfids3a2: 60700
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_aims, Node[1] sinfids3a1: 700
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_aims, Node[2] sinfids3b1: -1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Assigning sinfids3a2 to resource_sinfids3A_aims
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_oralsnr, Node[0] sinfids3a2: 1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_oralsnr, Node[1] sinfids3a1: 600
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_oralsnr, Node[2] sinfids3b1: -1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Assigning sinfids3a2 to resource_sinfids3A_oralsnr
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_oracle, Node[0] sinfids3a2: 1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_oracle, Node[1] sinfids3a1: 500
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_oracle, Node[2] sinfids3b1: -1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Assigning sinfids3a2 to resource_sinfids3A_oracle
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_smb, Node[0] sinfids3a2: 1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_smb, Node[1] sinfids3a1: 400
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_smb, Node[2] sinfids3b1: -1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Assigning sinfids3a2 to resource_sinfids3A_smb
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_fs, Node[0] sinfids3a2: 1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_fs, Node[1] sinfids3a1: 300
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_fs, Node[2] sinfids3b1: -1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Assigning sinfids3a2 to resource_sinfids3A_fs
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_drbd, Node[0] sinfids3a2: 1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_drbd, Node[1] sinfids3a1: 200
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_drbd, Node[2] sinfids3b1: -1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Assigning sinfids3a2 to resource_sinfids3A_drbd
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_vip, Node[0] sinfids3a2: 1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_vip, Node[1] sinfids3a1: 100
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3A_vip, Node[2] sinfids3b1: -1000000
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Assigning sinfids3a2 to resource_sinfids3A_vip
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3_vip, Node[0] sinfids3a2: 90100
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3_vip, Node[1] sinfids3b1: 3100
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Color resource_sinfids3_vip, Node[2] sinfids3a1: 100
ptest[27348]: 2006/11/06_21:50:51 debug: native_assign_node: Assigning sinfids3a2 to resource_sinfids3_vip
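The ptest output above lists a score per node for each resource, followed by the node that was assigned. A small sketch like the following can tabulate that and check whether the assigned node is the highest-scoring one, which is the case for every resource in this output. It assumes the debug lines have exactly the format shown; parse_ptest and best_node are hypothetical helper names.

```python
import re

# "Color <resource>, Node[n] <node>: <score>" lines from native_assign_node.
SCORE = re.compile(
    r"native_assign_node: Color (\S+), Node\[\d+\] (\S+): (-?\d+)"
)
# "Assigning <node> to <resource>" lines from native_assign_node.
ASSIGN = re.compile(r"native_assign_node: Assigning (\S+) to (\S+)")


def parse_ptest(lines):
    """Return ({resource: {node: score}}, {resource: assigned_node})."""
    scores, assigned = {}, {}
    for line in lines:
        m = SCORE.search(line)
        if m:
            rsc, node, score = m.group(1), m.group(2), int(m.group(3))
            scores.setdefault(rsc, {})[node] = score
            continue
        m = ASSIGN.search(line)
        if m:
            assigned[m.group(2)] = m.group(1)
    return scores, assigned


def best_node(node_scores):
    """Return the node with the highest score for one resource."""
    return max(node_scores, key=node_scores.get)
```

Comparing best_node(scores[rsc]) with assigned[rsc] for each resource is a quick way to spot a placement that does not follow the scores.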
----- Original Message ----
From: Andreas Kurz <akurz at sms.at>
To: General Linux-HA mailing list <linux-ha at lists.linux-ha.org>
Sent: Monday, 6 November, 2006 6:38:56 PM
Subject: Re: [Linux-HA] colocating resources on failed restart :CRM-Stable-4a0d4e40eeb0
Hello!
For system upgrades I prefer disabling the management (and so the
monitoring) of the affected resource. Works fine for me ... in your case
e.g:
crm_resource -p is_managed -r resource_sinfids3A_aims -t primitive -v off
... do your maintenance ....
crm_resource -p is_managed -r resource_sinfids3A_aims -t primitive -v on
Regards,
Andi
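The off / maintenance / on sequence above could be wrapped so that the resource is always handed back to the cluster, even if the maintenance step fails. This is only a sketch: it uses exactly the crm_resource options quoted in this message and assumes crm_resource is on the PATH; is_managed_cmd, unmanaged and the run parameter are hypothetical names.

```python
import subprocess
from contextlib import contextmanager


def is_managed_cmd(resource, value):
    # Same options as the two commands quoted above; nothing else is assumed.
    return ["crm_resource", "-p", "is_managed", "-r", resource,
            "-t", "primitive", "-v", value]


@contextmanager
def unmanaged(resource, run=subprocess.check_call):
    """Set is_managed off for the duration of the with-block."""
    run(is_managed_cmd(resource, "off"))
    try:
        yield
    finally:
        # Re-enable management even if the maintenance step raised.
        run(is_managed_cmd(resource, "on"))
```

Usage would look like "with unmanaged('resource_sinfids3A_aims'): ..." around the manual upgrade steps. The run parameter is there only so the command lists can be inspected without touching a live cluster.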
> I converted the cib.xml back to using groups, which fixes the colocation problem, but how do I perform system upgrades?
>
> e.g.
> I run the command
> crm_resource -r resource_sinfids3A_aims -p target_role -v stopped
> my hope here is that hb will stop the resource and stop monitoring, allowing the aims software to be manually updated/started/checked/stopped; then run
> crm_resource -r resource_sinfids3A_aims -p target_role -v started
> hb will start the resource if necessary then continue to monitor.
>
> What actually happens is that hb stops ALL resources within the group_sinfids3A - aarrrgh.
>
>
> I realise that I am very confused on how to configure the primitive restart_type, and the on_fail for the monitor operation. Any help on this would be wonderful.
>
>
> The scenarios are
>
> crm_resource -r resource_sinfids3A_aims -p target_role -v stopped [as above]
> stop aims - allow manual updates...
>
> crm_resource -r resource_sinfids3A_oralsnr -p target_role -v stopped
> stop oracle listener - allow manual updates
>
>
>
> crm_resource -r resource_sinfids3A_oracle -p target_role -v stopped
>
> stop aims, oracle listener, then oracle
>
>
> crm_resource -r resource_sinfids3A_smb -p target_role -v stopped
> stop samba - allow manual updates
>
>
> Node: sinfids3b1 (338afa76-8997-4d66-8381-fc36ec4b456b): online
> Node: sinfids3a2 (ec74bd17-2016-4d32-a694-0f6983121cd9): online
> Node: sinfids3a1 (b757aece-0e47-41e5-92b7-6a80b4f3eea7): online
>
> Resource Group: group_sinfids3
> resource_sinfids3_vip (heartbeat::ocf:IPaddr): Started sinfids3a1
> Resource Group: group_sinfids3A
> resource_sinfids3A_vip (heartbeat::ocf:IPaddr): Started sinfids3a1
> resource_sinfids3A_drbd (heartbeat:drbddisk): Started sinfids3a1
> resource_sinfids3A_fs (heartbeat::ocf:Filesystem): Started sinfids3a1
> resource_sinfids3A_smb (lsb:smb): Started sinfids3a1
> resource_sinfids3A_oracle (heartbeat::ocf:oracle): Started sinfids3a1
> resource_sinfids3A_oralsnr (heartbeat::ocf:oralsnr): Started sinfids3a1
> resource_sinfids3A_aims (lsb:aims): Started sinfids3a1
> Resource Group: group_sinfids3B
> resource_sinfids3B_vip (heartbeat::ocf:IPaddr): Started sinfids3b1
>
>
> My current settings for the primitives, res_order and monitor:
>
> [root@sinfids3b1 ~]# egrep "primitive |monitor|rsc_ord|group " saved-cib.xml
> <group id="group_sinfids3" ordered="true" collocated="true" is_managed="true" restart_type="restart">
> <primitive id="resource_sinfids3_vip" class="ocf" type="IPaddr" provider="heartbeat" is_managed="true" restart_type="restart">
> <op id="IPaddr_sinfids3_vip_mon" interval="60s" name="monitor" timeout="15s" on_fail="restart"/>
> <group id="group_sinfids3A" ordered="true" collocated="true" is_managed="true" restart_type="ignore">
> <primitive id="resource_sinfids3A_vip" class="ocf" type="IPaddr" provider="heartbeat" is_managed="true" restart_type="restart">
> <op id="IPaddr_sinfids3A_vip_mon" interval="60s" name="monitor" timeout="15s" on_fail="restart"/>
> <primitive id="resource_sinfids3A_drbd" class="heartbeat" type="drbddisk" provider="heartbeat" is_managed="true" restart_type="restart">
> <op id="drbddisk_sinfids3A_drbd_mon" name="monitor" interval="60s" timeout="60s" on_fail="restart"/>
> <primitive class="ocf" type="Filesystem" provider="heartbeat" id="resource_sinfids3A_fs" is_managed="true" restart_type="restart">
> <op name="monitor" timeout="60s" id="Filesystem_sinfids3A_fs_mon" interval="60s" on_fail="restart"/>
> <primitive class="lsb" type="smb" id="resource_sinfids3A_smb" is_managed="true" restart_type="restart">
> <op name="monitor" timeout="60s" id="smb_sinfids3A_smb_mon" interval="30s" on_fail="restart"/>
> <primitive class="ocf" type="oracle" provider="heartbeat" id="resource_sinfids3A_oracle" is_managed="true" restart_type="restart">
> <op name="monitor" timeout="60s" id="oracle_sinfids3A_oracle_mon" interval="300s" on_fail="restart"/>
> <primitive class="ocf" type="oralsnr" provider="heartbeat" id="resource_sinfids3A_oralsnr" is_managed="true" restart_type="restart">
> <op name="monitor" timeout="60s" id="oralsnr_sinfids3A_oralsnr_mon" interval="300s" on_fail="restart"/>
> <primitive class="lsb" type="aims" id="resource_sinfids3A_aims" is_managed="true" restart_type="ignore">
> <op name="monitor" timeout="240s" id="aims_sinfids3A_aims_mon" interval="180s" on_fail="restart"/>
> <group id="group_sinfids3B" ordered="true" collocated="true" is_managed="true" restart_type="restart">
> <primitive id="resource_sinfids3B_vip" class="ocf" type="IPaddr" provider="heartbeat" is_managed="true" restart_type="restart">
> <op id="IPaddr_sinfids3B_vip_mon" interval="60s" name="monitor" timeout="15s" on_fail="restart"/>
> <rsc_order id="order_sinfids3_sinfids3A" from="group_sinfids3" action="start" type="after" to="group_sinfids3A" symmetrical="true"/>
> <rsc_order id="order_sinfids3_sinfids3B" from="group_sinfids3" action="start" type="after" to="group_sinfids3B" symmetrical="true"/>
> <rsc_order id="order_sinfids3A_vip" from="resource_sinfids3A_vip" action="start" type="before" to="resource_sinfids3A_drbd" symmetrical="true"/>
> <rsc_order id="order_sinfids3A_drbd" from="resource_sinfids3A_drbd" action="start" type="after" to="resource_sinfids3A_vip" symmetrical="true"/>
> <rsc_order id="order_sinfids3A_fs" from="resource_sinfids3A_fs" action="start" type="after" to="resource_sinfids3A_drbd" symmetrical="true"/>
> <rsc_order id="order_sinfids3A_smb" from="resource_sinfids3A_smb" action="start" type="after" to="resource_sinfids3A_fs" symmetrical="true"/>
> <rsc_order id="order_sinfids3A_oracle" from="resource_sinfids3A_oracle" action="start" type="after" to="resource_sinfids3A_fs" symmetrical="true"/>
> <rsc_order id="order_sinfids3A_oralsnr" from="resource_sinfids3A_oralsnr" action="start" type="after" to="resource_sinfids3A_oracle" symmetrical="true"/>
> <rsc_order id="order_sinfids3A_aims" from="resource_sinfids3A_aims" action="start" type="after" to="resource_sinfids3A_oralsnr" symmetrical="true"/>
>
>
>
>
> ----- Original Message ----
> From: Serge Dubrouski <sergeyfd at gmail.com>
> To: General Linux-HA mailing list <linux-ha at lists.linux-ha.org>
> Sent: Sunday, 5 November, 2006 11:47:05 PM
> Subject: Re: [Linux-HA] colocating resources on failed restart :CRM-Stable-4a0d4e40eeb0
>
> Why not combine your resources into a group with collocated=true? In
> that case they'll always stick together for all operations:
> start/stop/move etc...
>
> On 11/4/06, Alex and Gill Strachan <asgks at yahoo.com> wrote:
>> I have a group of resources linked by the name 3A, these resources must always run together so I allocated large co-location scores.
>>
>> When resource_sinfids3A_aims fails and is moved to another node, I need all of the 3A resources to move with it and to start before it.
>>
>> e.g.
>> resource_sinfids3A_aims fails on node 3a2
>> hb restarts and reduces node weight for that node..
>> resource_sinfids3A_aims fails on node 3a2
>> hb is unable to restart on node 3a2 so decides to relocate to 3a1
>>
>> ...How do I inform hb to stop all the other 3A resources on 3a2 and move
>> ...everything to 3a1, also starting in a particular order.
>>
>> Why didn't the colocation scores help in keeping the 3A resources together?
>>
>>
>> I originally had colocation scores of INFINITY for the 3A group, but this then prevents specifying that resource smb can fail 3 times while resource aims can only fail once.
>>
>>
>> I originally had this working by using groups and on_fail="fence" but it doesn't offer enough flexibility.
>>
>> e.g.
>> I would like heartbeat to restart smb on failure 3 times before moving to another node, using resource_stickiness. When using groups, the restart of smb would trigger a stop of all higher resources, then a start of smb followed by a start of the higher resources. This behaviour was not wanted.
>>
>> ============
>> Last updated: Sun Nov 5 14:02:46 2006
>> Current DC: sinfids3a2 (ec74bd17-2016-4d32-a694-0f6983121cd9)
>> 3 Nodes configured.
>> 9 Resources configured.
>> ============
>>
>> Node: sinfids3b1 (338afa76-8997-4d66-8381-fc36ec4b456b): online
>> resource_sinfids3B_vip (heartbeat::ocf:IPaddr)
>> Node: sinfids3a2 (ec74bd17-2016-4d32-a694-0f6983121cd9): online
>> resource_sinfids3A_drbd (heartbeat:drbddisk)
>> resource_sinfids3A_fs (heartbeat::ocf:Filesystem)
>> resource_sinfids3A_smb (lsb:smb)
>> resource_sinfids3A_vip (heartbeat::ocf:IPaddr)
>> resource_sinfids3A_oralsnr (heartbeat::ocf:oralsnr)
>> resource_sinfids3_vip (heartbeat::ocf:IPaddr)
>> resource_sinfids3A_oracle (heartbeat::ocf:oracle)
>> resource_sinfids3A_aims (lsb:aims)
>> Node: sinfids3a1 (b757aece-0e47-41e5-92b7-6a80b4f3eea7): online
>>
>>
>>
>> <rsc_order id="order_sinfids3_sinfids3A" from="resource_sinfids3_vip" type="after" to="resource_sinfids3A_vip"/>
>> <rsc_order id="order_sinfids3_sinfids3B" from="resource_sinfids3_vip" type="after" to="resource_sinfids3B_vip"/>
>> <rsc_order id="order_sinfids3A_drbd" from="resource_sinfids3A_drbd" type="after" to="resource_sinfids3A_vip"/>
>> <rsc_order id="order_sinfids3A_fs" from="resource_sinfids3A_fs" type="after" to="resource_sinfids3A_drbd"/>
>> <rsc_order id="order_sinfids3A_smb" from="resource_sinfids3A_smb" type="after" to="resource_sinfids3A_fs"/>
>> <rsc_order id="order_sinfids3A_oracle" from="resource_sinfids3A_oracle" type="after" to="resource_sinfids3A_fs"/>
>> <rsc_order id="order_sinfids3A_oralsnr" from="resource_sinfids3A_oralsnr" type="after" to="resource_sinfids3A_oracle"/>
>> <rsc_order id="order_sinfids3A_aims" from="resource_sinfids3A_aims" type="after" to="resource_sinfids3A_oralsnr"/>
>>
>> <rsc_colocation id="colocation_sinfids3_sinfids3A" from="resource_sinfids3_vip" to="resource_sinfids3A_vip" score="9000"/>
>> <rsc_colocation id="colocation_sinfids3_sinfids3B" from="resource_sinfids3_vip" to="resource_sinfids3B_vip" score="3000"/>
>>
>> <rsc_colocation id="colocation_sinfids3A_drbd" from="resource_sinfids3A_drbd" to="resource_sinfids3A_vip" score="100000"/>
>> <rsc_colocation id="colocation_sinfids3A_fs" from="resource_sinfids3A_fs" to="resource_sinfids3A_drbd" score="100000"/>
>> <rsc_colocation id="colocation_sinfids3A_smb" from="resource_sinfids3A_smb" to="resource_sinfids3A_fs" score="100000"/>
>> <rsc_colocation id="colocation_sinfids3A_oracle" from="resource_sinfids3A_oracle" to="resource_sinfids3A_fs" score="100000"/>
>> <rsc_colocation id="colocation_sinfids3A_oralsnr" from="resource_sinfids3A_oralsnr" to="resource_sinfids3A_oracle" score="100000"/>
>> <rsc_colocation id="colocation_sinfids3A_aims" from="resource_sinfids3A_aims" to="resource_sinfids3A_oralsnr" score="100000"/>
>>
>>
>> <primitive class="lsb" type="aims" id="resource_sinfids3A_aims" restart_type="restart">
>> <operations>
>> <op name="monitor" timeout="240s" id="aims_sinfids3A_aims_mon" interval="180s"/>
>> </operations>
>> <instance_attributes id="resource_sinfids3A_aims">
>> <attributes>
>> <nvpair id="resource_sinfids3A_aims-target_role" name="target_role" value="started"/>
>> </attributes>
>> </instance_attributes>
>> </primitive>
>>
>> Send instant messages to your online friends http://au.messenger.yahoo.com
>> _______________________________________________
>> Linux-HA mailing list
>> Linux-HA at lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>