[Linux-HA] Groups vs colocations.... etc
Andrew Beekhof
beekhof at gmail.com
Thu Dec 7 12:39:36 MST 2006
On 12/7/06, Andreas Kurz <akurz at sms.at> wrote:
> Andrew Beekhof wrote:
> > On 12/7/06, Andrew Beekhof <beekhof at gmail.com> wrote:
> > let me get back to you...
> >
> > as of this version it should work:
> > http://hg.beekhof.net/lha/crm-stable/rev/1045cec0d37d
>
> Thanks! I have done some tests with ptest, and it works .... but only
> for primitives and not for a group. Is there a special reason for that?
i wish i'd never agreed to support groups :)
groups are a non-atomic resource and can't be considered to be running
anywhere themselves (groups aren't always collocated although this is
the most common configuration). so any rules that use node attributes
will only make sense for the contents of the group and not the group
itself.
> These were my test instance_attributes:
>
> <instance_attributes id="higher_failure_stickiness_inst" score="100">
> <rule id="higher_failure_stickiness_rule" boolean_op="and">
> <expression attribute="#uname" operation="eq"
> value="sms-nfs-02" id="test"/>
> </rule>
> <attributes>
> <nvpair id="higher_failure_stickiness_id"
> name="resource_failure_stickiness" value="-5"/>
> </attributes>
> </instance_attributes>
> <instance_attributes id="lower_failure_stickiness_inst" score="10">
> <attributes>
> <nvpair id="lower_failure_stickiness_id"
> name="resource_failure_stickiness" value="-10"/>
> </attributes>
> </instance_attributes>
>
>
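A minimal sketch of how the two instance_attributes sets above are presumably meant to interact. It assumes the sets are tried from highest score to lowest, and that a set whose rule does not match the node is skipped. The function and variable names are hypothetical and this is not the CRM's own code.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (score, rule, attributes); rule is None when the set carries no <rule>.
AttrSet = Tuple[int, Optional[Callable[[str], bool]], Dict[str, str]]

SETS: List[AttrSet] = [
    # lower_failure_stickiness_inst: no rule, so it applies on every node
    (10, None, {"resource_failure_stickiness": "-10"}),
    # higher_failure_stickiness_inst: only applies where #uname eq sms-nfs-02
    (100, lambda uname: uname == "sms-nfs-02",
     {"resource_failure_stickiness": "-5"}),
]


def lookup(name: str, node: str, sets: List[AttrSet]) -> Optional[str]:
    """Return the value from the highest-scored set that applies to `node`."""
    for _score, rule, attrs in sorted(sets, key=lambda s: s[0], reverse=True):
        if rule is not None and not rule(node):
            continue  # rule did not match this node, try the next set
        if name in attrs:
            return attrs[name]
    return None


on_02 = lookup("resource_failure_stickiness", "sms-nfs-02", SETS)
on_01 = lookup("resource_failure_stickiness", "sms-nfs-01", SETS)
print(on_02, on_01)
```

Under these assumptions sms-nfs-02 gets the rule-matched value and every other node falls back to the unconditional set.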
> I added these instance_attributes for node-based failure_stickiness to
> the primitives in my group and noticed the new way the weight is
> calculated. So the score for a group is no longer the sum of all resources.
right - it was decided that the multiplication was too confusing, so
now we only apply it once.
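To make the difference concrete, here is a rough sketch. It assumes the older behaviour amounted to counting a group-level score once per member, and the newer behaviour counts it once for the whole group. The member count and rule score are example values taken from the group and rule quoted later in this message; the function name is hypothetical.

```python
# Illustrative only; this is not how the CRM is implemented.
GROUP_MEMBERS = 8  # example: an eight-primitive group


def effective_location_score(rule_score: int, members: int,
                             multiplied: bool) -> int:
    """Weight a group-level location rule contributes to one node.

    multiplied=True  models the assumed older behaviour (once per member).
    multiplied=False models applying the rule once for the whole group.
    """
    return rule_score * members if multiplied else rule_score


old_pref = effective_location_score(10, GROUP_MEMBERS, multiplied=True)
new_pref = effective_location_score(10, GROUP_MEMBERS, multiplied=False)
print(old_pref, new_pref)
```

With the score applied once, the value written in the constraint is the value that competes against stickiness, independent of how many resources the group contains.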
>
> But something still looks strange to me. I have a group and these
> location constraints:
>
> <rsc_location rsc="gr_HANFS_01" id="run_HANFS_01">
> <rule id="pref_run_HANFS_01" score="10">
> <expression attribute="#uname" operation="eq"
> value="sms-nfs-01" id="bf7e040f-041e-4971-aaf7-76cc56048be8"/>
> </rule>
> <rule id="pref_failover_HANFS_01" score="1">
> <expression attribute="#uname" operation="eq"
> value="sms-nfs-02" id="f3ecf8a0-596b-4aaa-9092-35b6393f8d2b"/>
> </rule>
> </rsc_location>
>
>
> The group is running on sms-nfs-02 and
> default-resource-stickiness=15. I set fail-count=4 on one resource (NOT the
> first in the group) and ran ptest:
>
> ptest[22366]: 2006/12/07_17:51:29 info: group_print: Resource Group:
> gr_HANFS_01
> ptest[22366]: 2006/12/07_17:51:29 info: native_print:
> rsc_DRBD_drbd1_uploads_sms (heartbeat:drbddisk): Started sms-nfs-02
> ptest[22366]: 2006/12/07_17:51:29 info: native_print:
> rsc_DRBD_drbd2_uploads_portal (heartbeat:drbddisk): Started sms-nfs-02
> ptest[22366]: 2006/12/07_17:51:29 info: native_print:
> rsc_FS_HANFS01-sms_at (heartbeat::ocf:Filesystem): Started sms-nfs-02
> ptest[22366]: 2006/12/07_17:51:29 info: native_print:
> rsc_FS_HANFS01-portal (heartbeat::ocf:Filesystem): Started sms-nfs-02
> ptest[22366]: 2006/12/07_17:51:29 info: native_print: rsc_nfslock_01
> (lsb:HA_nfslock_01): Started sms-nfs-02
> ptest[22366]: 2006/12/07_17:51:29 info: native_print: rsc_nfs_01
> (heartbeat:HA_nfs): Started sms-nfs-02
> ptest[22366]: 2006/12/07_17:51:29 info: native_print: rsc_IP_HA1
> (heartbeat::ocf:IPaddr): Started sms-nfs-02
> ptest[22366]: 2006/12/07_17:51:29 info: native_print:
> rsc_MailAlarm_HANFS_01 (heartbeat::ocf:MailTo): Started
> sms-nfs-02
> ptest[22366]: 2006/12/07_17:51:29 debug: group_rsc_location: Processing
> rsc_location pref_run_HANFS_01 for gr_HANFS_01
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_run_HANFS_01 (Unknown) to rsc_DRBD_drbd1_uploads_sms
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_DRBD_drbd1_uploads_sms + sms-nfs-01 : 10
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_DRBD_drbd1_uploads_sms + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_run_HANFS_01 (Unknown) to rsc_DRBD_drbd2_uploads_portal
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_DRBD_drbd2_uploads_portal + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_DRBD_drbd2_uploads_portal + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_run_HANFS_01 (Unknown) to rsc_FS_HANFS01-sms_at
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_FS_HANFS01-sms_at + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_FS_HANFS01-sms_at + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_run_HANFS_01 (Unknown) to rsc_FS_HANFS01-portal
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_FS_HANFS01-portal + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_FS_HANFS01-portal + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_run_HANFS_01 (Unknown) to rsc_nfslock_01
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_nfslock_01 + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_nfslock_01 + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_run_HANFS_01 (Unknown) to rsc_nfs_01
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_nfs_01 + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_nfs_01 + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_run_HANFS_01 (Unknown) to rsc_IP_HA1
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_IP_HA1 + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_IP_HA1 + sms-nfs-02 : -5
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_run_HANFS_01 (Unknown) to rsc_MailAlarm_HANFS_01
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_MailAlarm_HANFS_01 + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_MailAlarm_HANFS_01 + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: group_rsc_location: Processing
> rsc_location pref_failover_HANFS_01 for gr_HANFS_01
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_failover_HANFS_01 (Unknown) to rsc_DRBD_drbd1_uploads_sms
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_DRBD_drbd1_uploads_sms + sms-nfs-01 : 10
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_DRBD_drbd1_uploads_sms + sms-nfs-02 : 16
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_failover_HANFS_01 (Unknown) to rsc_DRBD_drbd2_uploads_portal
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_DRBD_drbd2_uploads_portal + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_DRBD_drbd2_uploads_portal + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_failover_HANFS_01 (Unknown) to rsc_FS_HANFS01-sms_at
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_FS_HANFS01-sms_at + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_FS_HANFS01-sms_at + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_failover_HANFS_01 (Unknown) to rsc_FS_HANFS01-portal
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_FS_HANFS01-portal + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_FS_HANFS01-portal + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_failover_HANFS_01 (Unknown) to rsc_nfslock_01
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_nfslock_01 + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_nfslock_01 + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_failover_HANFS_01 (Unknown) to rsc_nfs_01
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_nfs_01 + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_nfs_01 + sms-nfs-02 : 15
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_failover_HANFS_01 (Unknown) to rsc_IP_HA1
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_IP_HA1 + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_IP_HA1 + sms-nfs-02 : -5
> ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
> Applying pref_failover_HANFS_01 (Unknown) to rsc_MailAlarm_HANFS_01
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_MailAlarm_HANFS_01 + sms-nfs-01 : 0
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
> rsc_MailAlarm_HANFS_01 + sms-nfs-02 : 15
> ....
> ptest[22366]: 2006/12/07_17:51:29 debug: debug3: sort_node_weight:
> sms-nfs-01 (10) < sms-nfs-02 (16) : weight
> ....
>
> Hmmm ... so the score from the location constraint is only added to the
> first resource in the group.
it should pass it on to the rest of them but its effect is not multiplied
> The default-resource-stickiness is added to
> every resource in the group.
correct
> The node-specific failure stickiness is
> deducted from the resources with fail-counts.
correct
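The per-resource numbers in the ptest output are consistent with a simple sum. The sketch below assumes the formula is location score + stickiness + fail-count * failure stickiness, and that the fail-count of 4 was set on rsc_IP_HA1 (the only resource showing -5). The function name is hypothetical.

```python
def member_score(stickiness: int, fail_count: int,
                 failure_stickiness: int, location: int = 0) -> int:
    """Assumed per-node weight of one group member on its current node."""
    return location + stickiness + fail_count * failure_stickiness


# Values from the thread: default-resource-stickiness=15, fail-count=4,
# resource_failure_stickiness=-5 on sms-nfs-02, failover rule score=1.
failed_member = member_score(15, 4, -5)              # rsc_IP_HA1 on sms-nfs-02
healthy_member = member_score(15, 0, -5)             # members without failures
first_member = member_score(15, 0, -5, location=1)   # first member also gets the rule
print(failed_member, healthy_member, first_member)
```

These reproduce the -5, 15 and 16 reported for sms-nfs-02 in the log.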
> But the result of the
> group score
the group itself doesn't have a score - i'm not sure what you mean here
> only includes the values from the first resource in the
> group, and resources with a negative score do not initiate a failover of
> the group as they did in 2.0.7. So at the moment the failure-stickiness
> feature works only for the first resource in a group, is that correct?
if you send me the config i might be able to comment more - it helps
to have all the pieces