[Linux-HA] Default-resource-stickiness of infinity with DRBD not keeping Primary stuck

Daniel Stickney dstickney at pronto.com
Sun Feb 3 22:34:06 MST 2008


Thanks for the feedback so far on my question. I have not yet found a 
way to configure the master DRBD resource to stay where it is unless a 
failure forces it to move. We have it working well with automatic 
failback, but we don't want that behavior. I want to make sure I am not 
trying to configure something that is impossible at this time, so I 
should ask:

Is setting a stickiness on the master DRBD resource even possible? If 
so, how? A master_slave resource does not behave like a single-instance 
resource such as an IP address, so I am not sure the stickiness logic 
applies the way I expect. If anyone on this list runs DRBD under 
Heartbeat V2 in CRM mode with the master DRBD resource successfully 
configured to stay master on the node it fails over onto, even after the 
previous master node comes back online, could you please reply with 
your configuration? I would be very appreciative. My cib.xml and ha.cf 
are below in my original email. I have tried dozens of permutations of 
this config: many default-resource-stickiness values, and 
resource-stickiness placed in the master_slave resource's 
instance_attributes, in its meta_attributes, and in the opening 
<master_slave> tag. Unfortunately, none of these has worked. If the 
current DRBD/Heartbeat V2 versions simply do not support the behavior I 
want, it would be great to know now.
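
As a rough sketch of one of the placements described above 
(resource-stickiness as a meta attribute on the master_slave resource), 
the fragment below shows the general shape. The nvpair id 
"ma-ms-drbd0-8" is hypothetical, and the underscore spelling 
"resource_stickiness" is an assumption chosen to match the other meta 
attribute names in the cib.xml below; it is not a confirmed fix, and 
placements along these lines did not prevent the failback in the tests 
described here.

```xml
<master_slave id="ms-drbd0">
  <meta_attributes id="ma-ms-drbd0">
    <attributes>
      <!-- existing clone_max, master_max, notify, ... entries stay unchanged -->
      <!-- hypothetical id; the attribute name spelling is an assumption -->
      <nvpair id="ma-ms-drbd0-8" name="resource_stickiness" value="INFINITY"/>
    </attributes>
  </meta_attributes>
  <!-- the DRBD primitive from the original configuration goes here, unchanged -->
</master_slave>
```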

Thanks for your time!
Daniel

Daniel Stickney wrote:
> Hello everyone,
>
> Our setup: CentOS 5 (kernel 2.6.18-53), Heartbeat 
> heartbeat-2.1.2-3.el5.centos, DRBD drbd-8.0.6-1.el5.centos
>
> We are running into a problem with getting the master DRBD resource to 
> stick on a node it has failed onto. We have a simple 2 node cluster 
> for demonstration of the issue, halinux1 and halinux2, with a single 
> DRBD resource. What we are seeing is halinux2 selected as the Master 
> node for DRBD on heartbeat startup, halinux1 as the slave. When 
> halinux2 is placed into standby, halinux1 is promoted to DRBD 
> master as expected. When halinux2 is taken out of standby mode, 
> halinux1 is demoted to secondary and halinux2 is promoted to master. 
> We don't want this failback action. We want the DRBD master to stay on 
> whatever node it is on unless there is a failure requiring it to move. 
> We have default-resource-stickiness set to "infinity" in our cib.xml 
> file. I repeated this experiment with a single IP address resource (no 
> DRBD), and the stickiness of infinity worked exactly as expected: the 
> IP stayed on whatever node it was on unless there was a failure (or 
> standby mode) on the local node requiring the IP to move, so that was 
> a positive confirmation that outside of our testing with DRBD, the 
> stickiness of infinity works. We would very much appreciate 
> suggestions on how we might go about resolving this issue.
>
> Here is the cib.xml file:
> ----------------------------------
> <cib generated="true" admin_epoch="0" have_quorum="true" 
> ignore_dtd="false" num_peers="2" cib_feature_revision="1.3" epoch="35" 
> num_updates="1" cib-last-written="Tue Jan 29 12:36:17 2008" 
> ccm_transition="2" dc_uuid="d2c440e4-9668-4a70-b7e2-de7f52834325">
>  <configuration>
>    <crm_config>
>      <cluster_property_set id="cluster_defaults">
>        <attributes>
>          <nvpair name="default-resource-stickiness" id="stickiness" 
> value="INFINITY"/>
>        </attributes>
>      </cluster_property_set>
>    </crm_config>
>    <nodes>
>      <node uname="halinux2" type="normal" 
> id="216a5f87-c472-4ce6-a3f1-7ce4f6dc1bae">
>        <instance_attributes 
> id="nodes-216a5f87-c472-4ce6-a3f1-7ce4f6dc1bae">
>          <attributes>
>            <nvpair name="standby" 
> id="standby-216a5f87-c472-4ce6-a3f1-7ce4f6dc1bae" value="false"/>
>          </attributes>
>        </instance_attributes>
>      </node>
>      <node uname="halinux1" type="normal" 
> id="d2c440e4-9668-4a70-b7e2-de7f52834325">
>        <instance_attributes 
> id="nodes-d2c440e4-9668-4a70-b7e2-de7f52834325">
>          <attributes>
>            <nvpair name="standby" 
> id="standby-d2c440e4-9668-4a70-b7e2-de7f52834325" value="false"/>
>          </attributes>
>        </instance_attributes>
>      </node>
>    </nodes>
>    <resources>
>      <master_slave id="ms-drbd0">
>        <meta_attributes id="ma-ms-drbd0">
>          <attributes>
>            <nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
>            <nvpair id="ma-ms-drbd0-2" name="clone_node_max" value="1"/>
>            <nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
>            <nvpair id="ma-ms-drbd0-4" name="master_node_max" value="1"/>
>            <nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
>            <nvpair id="ma-ms-drbd0-6" name="globally_unique" 
> value="false"/>
>            <nvpair id="ma-ms-drbd0-7" name="target_role" 
> value="started"/>
>          </attributes>
>        </meta_attributes>
>        <primitive id="DRBD" class="ocf" provider="heartbeat" type="drbd">
>          <instance_attributes id="ia-DRBD">
>            <attributes>
>              <nvpair id="ia-DRBD-1" name="drbd_resource" value="mysql"/>
>            </attributes>
>          </instance_attributes>
>        </primitive>
>      </master_slave>
>    </resources>
>    <constraints/>
>  </configuration>
> </cib>
> ----------------------------------
> =========================================================================
>
> Here is our ha.cf file:
> ----------------------------------
> use_logd yes
> udpport 695
> bcast eth0
> node    halinux1
> node    halinux2
> crm on
> ----------------------------------
> =========================================================================
>
> Here is a link to the /var/log/messages output on halinux1 starting 
> from the time when halinux2 comes out of standby mode and the unwanted 
> failback occurs: http://pastebin.com/m6e55f6b3
>
> Thank you in advance for your time,
> -Daniel
>
-- 

Daniel Stickney - Linux Systems Administrator
Email: dstickney at pronto.com


