[Linux-HA] Simple Active-Standby failover configuration.

Andrew Beekhof beekhof at gmail.com
Thu Oct 19 09:20:52 MDT 2006


On 10/19/06, Peter Wong <peter.wong at mobidia.com> wrote:
> Greetings:
>
> Here is a description of my problem:
> ---
> I have an application identified to Heartbeat as a resource
> called "mag_resource". To start off the scenario,
> "mag_resource" runs on Node "pwhaone". This "mag_resource"
> needs to be monitored by Heartbeat.
>
> I have written an OCF Resource Agent that implements the
> "monitor" subcommand. The monitor works by checking the
> timestamp of a predetermined file.
>
> After Heartbeat has started on Node "pwhaone", I also start
> Heartbeat on Node "pwhatwo" (the standby node). This gives the
> steady state: Heartbeat and "mag_resource" both run on Node
> "pwhaone", while Node "pwhatwo" runs only Heartbeat.
>
> Then I intentionally generate a failure of the "mag_resource"
> on Node "pwhaone" to force a switch-over to the standby Node
> "pwhatwo".
>
> But "mag_resource" keeps restarting on the original active
> Node "pwhaone", no matter how many times I kill it there.
>
> I have attached my cib.xml file as well as listing it here:
> ---
>  <cib admin_epoch="0" have_quorum="true" generated="true" num_peers="2"
> cib_feature_revision="1.3" epoch="56" num_updates="1721"
> cib-last-written="Wed Oct 18 14:51:26 2006" ccm_transition="2"
> dc_uuid="38bc5ad2-37a9-44dc-a7bc-47d8a66b336a">
>    <configuration>
>      <crm_config>
>        <cluster_property_set id="cib-bootstrap-options">
>          <attributes>
>            <nvpair id="cib-bootstrap-options-symmetric_cluster"
> name="symmetric_cluster" value="true"/>
>            <nvpair id="cib-bootstrap-options-no_quorum_policy"
> name="no_quorum_policy" value="stop"/>
>            <nvpair id="cib-bootstrap-options-default_resource_stickiness"
> name="default_resource_stickiness" value="0"/>
>            <nvpair
> id="cib-bootstrap-options-default_resource_failure_stickiness"
> name="default_resource_failure_stickiness" value="-INFINITY"/>
>            <nvpair id="cib-bootstrap-options-stonith_enabled"
> name="stonith_enabled" value="false"/>
>            <nvpair id="cib-bootstrap-options-stonith_action"
> name="stonith_action" value="reboot"/>
>            <nvpair id="cib-bootstrap-options-stop_orphan_resources"
> name="stop_orphan_resources" value="true"/>
>            <nvpair id="cib-bootstrap-options-stop_orphan_actions"
> name="stop_orphan_actions" value="true"/>
>            <nvpair id="cib-bootstrap-options-remove_after_stop"
> name="remove_after_stop" value="false"/>
>            <nvpair id="cib-bootstrap-options-short_resource_names"
> name="short_resource_names" value="true"/>
>            <nvpair id="cib-bootstrap-options-transition_idle_timeout"
> name="transition_idle_timeout" value="5min"/>
>            <nvpair id="cib-bootstrap-options-default_action_timeout"
> name="default_action_timeout" value="5s"/>
>            <nvpair id="cib-bootstrap-options-is_managed_default"
> name="is_managed_default" value="true"/>
>            <nvpair id="cib-bootstrap-options-last-lrm-refresh"
> name="last-lrm-refresh" value="1160796889"/>
>          </attributes>
>        </cluster_property_set>
>      </crm_config>
>      <nodes>
>        <node id="43a8e76e-40b8-4a1d-adc5-45bbdcc4a23d" uname="pwhaone"
> type="normal"/>
>        <node id="38bc5ad2-37a9-44dc-a7bc-47d8a66b336a" uname="pwhatwo"
> type="normal"/>
>      </nodes>
>      <resources>
>        <primitive class="ocf" id="mag_resource" provider="mag" type="mag"
> resource_stickiness="0">
>          <operations>
>            <op id="mag_resource_mon" interval="20s" name="monitor"
> timeout="10s"/>
>          </operations>
>          <instance_attributes id="mag_resource">
>            <attributes>
>              <nvpair id="mag_resource-target_role" name="target_role"
> value="started"/>
>              <nvpair id="mag_resource-watcher_ts_period"
> name="watcher_ts_period" value="10"/>
>              <nvpair id="mag_resource-max_consecutive_missed_watcher_ts"
> name="max_consecutive_missed_watcher_ts" value="2"/>
>              <nvpair id="mag_resource-max_consecutive_watcher_restart"
> name="max_consecutive_watcher_restart" value="3"/>
>              <nvpair id="mag_resource-consecutive_missed_watcher_ts_count"
> name="consecutive_missed_watcher_ts_count" value="0"/>
>              <nvpair id="mag_resource-consecutive_watcher_restart_count"
> name="consecutive_watcher_restart_count" value="0"/>
>            </attributes>
>          </instance_attributes>
>        </primitive>
>      </resources>
>      <constraints>
>        <rsc_location id="rsc_location_mag_resource" rsc="mag_resource">
>          <rule id="prefered_location_mag_resource" score="100">
>            <expression attribute="#uname"
> id="prefered_location_mag_resource_expr" operation="eq" value="pwhaone"/>
>          </rule>
>          <rule id="other_location_mag_resource" score="100">
>            <expression attribute="#uname"
> id="other_location_mag_resource_expr" operation="eq" value="pwhatwo"/>
>          </rule>
>        </rsc_location>
>      </constraints>
>    </configuration>
>  </cib>
> ---
>
> I have tried various "stickiness" settings but am getting nowhere.

You need the failure_stickiness attribute.
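
As a rough illustration of how a failure stickiness value is generally understood to interact with location scores, here is a small sketch. It is a simplified model, not Heartbeat's actual placement code. It assumes each recorded failure adds the failure stickiness to the node's score and that ties keep the resource where it is. The function names and the -30 value are made up for the example; the node names and the score of 100 come from the cib.xml above.

```python
# Simplified model of node scoring; NOT the real Heartbeat policy engine.
# Function names and the -30 failure stickiness are hypothetical.

def node_score(location_score, stickiness, failure_stickiness,
               failcount, is_current):
    """Score one node for one resource under the simplified model."""
    score = location_score
    if is_current:
        score += stickiness  # bonus for staying where it already runs
    if failcount > 0:
        score += failcount * failure_stickiness  # penalty per failure
    return score


def pick_node(location_scores, current, failcounts,
              stickiness, failure_stickiness):
    """Return (best node, all scores); ties keep the current node."""
    scores = {
        node: node_score(loc, stickiness, failure_stickiness,
                         failcounts.get(node, 0), node == current)
        for node, loc in location_scores.items()
    }
    best = current
    for node, score in scores.items():
        if score > scores[best]:  # strictly greater, so ties do not move
            best = node
    return best, scores


# Both rules in the posted constraints give each node a score of 100.
location = {"pwhaone": 100, "pwhatwo": 100}

# No failures yet: scores tie, so the resource stays on pwhaone.
print(pick_node(location, "pwhaone", {}, 0, -30))

# One failure on pwhaone: 100 + 1 * (-30) = 70, so pwhatwo wins.
print(pick_node(location, "pwhaone", {"pwhaone": 1}, 0, -30))
```

Under this model, with equal location scores and a stickiness of 0, a single failure is enough to make the standby node the better choice, as long as a negative failure stickiness actually takes effect for the resource.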

