[Linux-HA] Simple Active-Standby failover configuration.
Andrew Beekhof
beekhof at gmail.com
Thu Oct 19 09:20:52 MDT 2006
On 10/19/06, Peter Wong <peter.wong at mobidia.com> wrote:
> Greetings:
>
> Here is a description of my problem:
> ---
> I have an application identified to Heartbeat as a resource
> called "mag_resource". To start off the scenario,
> "mag_resource" runs on Node "pwhaone". This "mag_resource"
> needs to be monitored by Heartbeat.
>
> I have written an OCF Resource Agent to provide the
> implementation of the "monitor" subcommand. The monitor
> basically works by checking the timestamp of a predetermined
> file.
>
> After Heartbeat started on Node "pwhaone", I also started
> Heartbeat on Node "pwhatwo" (the standby node). So I have
> entered the steady state in which Heartbeat and "mag_resource"
> both run on Node "pwhaone", while on Node "pwhatwo" I just
> have Heartbeat running.
>
> Then I intentionally generate a failure of the "mag_resource"
> on Node "pwhaone" to force a switch-over to the standby Node
> "pwhatwo".
>
> But I'm seeing "mag_resource" keep restarting on the original
> active Node "pwhaone" regardless of how many times I kill
> the "mag_resource" on Node "pwhaone".
>
> I have attached my cib.xml file as well as listing it here:
> ---
> <cib admin_epoch="0" have_quorum="true" generated="true" num_peers="2"
> cib_feature_revision="1.3" epoch="56" num_updates="1721"
> cib-last-written="Wed Oct 18 14:51:26 2006" ccm_transition="2"
> dc_uuid="38bc5ad2-37a9-44dc-a7bc-47d8a66b336a">
> <configuration>
> <crm_config>
> <cluster_property_set id="cib-bootstrap-options">
> <attributes>
> <nvpair id="cib-bootstrap-options-symmetric_cluster"
> name="symmetric_cluster" value="true"/>
> <nvpair id="cib-bootstrap-options-no_quorum_policy"
> name="no_quorum_policy" value="stop"/>
> <nvpair id="cib-bootstrap-options-default_resource_stickiness"
> name="default_resource_stickiness" value="0"/>
> <nvpair
> id="cib-bootstrap-options-default_resource_failure_stickiness"
> name="default_resource_failure_stickiness" value="-INFINITY"/>
> <nvpair id="cib-bootstrap-options-stonith_enabled"
> name="stonith_enabled" value="false"/>
> <nvpair id="cib-bootstrap-options-stonith_action"
> name="stonith_action" value="reboot"/>
> <nvpair id="cib-bootstrap-options-stop_orphan_resources"
> name="stop_orphan_resources" value="true"/>
> <nvpair id="cib-bootstrap-options-stop_orphan_actions"
> name="stop_orphan_actions" value="true"/>
> <nvpair id="cib-bootstrap-options-remove_after_stop"
> name="remove_after_stop" value="false"/>
> <nvpair id="cib-bootstrap-options-short_resource_names"
> name="short_resource_names" value="true"/>
> <nvpair id="cib-bootstrap-options-transition_idle_timeout"
> name="transition_idle_timeout" value="5min"/>
> <nvpair id="cib-bootstrap-options-default_action_timeout"
> name="default_action_timeout" value="5s"/>
> <nvpair id="cib-bootstrap-options-is_managed_default"
> name="is_managed_default" value="true"/>
> <nvpair id="cib-bootstrap-options-last-lrm-refresh"
> name="last-lrm-refresh" value="1160796889"/>
> </attributes>
> </cluster_property_set>
> </crm_config>
> <nodes>
> <node id="43a8e76e-40b8-4a1d-adc5-45bbdcc4a23d" uname="pwhaone"
> type="normal"/>
> <node id="38bc5ad2-37a9-44dc-a7bc-47d8a66b336a" uname="pwhatwo"
> type="normal"/>
> </nodes>
> <resources>
> <primitive class="ocf" id="mag_resource" provider="mag" type="mag"
> resource_stickiness="0">
> <operations>
> <op id="mag_resource_mon" interval="20s" name="monitor"
> timeout="10s"/>
> </operations>
> <instance_attributes id="mag_resource">
> <attributes>
> <nvpair id="mag_resource-target_role" name="target_role"
> value="started"/>
> <nvpair id="mag_resource-watcher_ts_period"
> name="watcher_ts_period" value="10"/>
> <nvpair id="mag_resource-max_consecutive_missed_watcher_ts"
> name="max_consecutive_missed_watcher_ts" value="2"/>
> <nvpair id="mag_resource-max_consecutive_watcher_restart"
> name="max_consecutive_watcher_restart" value="3"/>
> <nvpair id="mag_resource-consecutive_missed_watcher_ts_count"
> name="consecutive_missed_watcher_ts_count" value="0"/>
> <nvpair id="mag_resource-consecutive_watcher_restart_count"
> name="consecutive_watcher_restart_count" value="0"/>
> </attributes>
> </instance_attributes>
> </primitive>
> </resources>
> <constraints>
> <rsc_location id="rsc_location_mag_resource" rsc="mag_resource">
> <rule id="prefered_location_mag_resource" score="100">
> <expression attribute="#uname"
> id="prefered_location_mag_resource_expr" operation="eq" value="pwhaone"/>
> </rule>
> <rule id="other_location_mag_resource" score="100">
> <expression attribute="#uname"
> id="other_location_mag_resource_expr" operation="eq" value="pwhatwo"/>
> </rule>
> </rsc_location>
> </constraints>
> </configuration>
> </cib>
> ---
>
> I have tried various "stickiness" settings but I am getting nowhere.
You need the failure_stickiness attribute.
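
For illustration, here is a rough sketch of where such a setting could go
in the posted cib.xml. It assumes the per-resource name is
"resource_failure_stickiness" (by analogy with the
"default_resource_failure_stickiness" option already present in
crm_config) and that it is accepted as an nvpair in the resource's
instance_attributes. The nvpair id, the placement, and the value are all
assumptions, so check them against the documentation for your Heartbeat
version before using them.

```xml
<primitive class="ocf" id="mag_resource" provider="mag" type="mag">
  <!-- existing <operations> block stays unchanged -->
  <instance_attributes id="mag_resource">
    <attributes>
      <!-- existing nvpairs (target_role, watcher_ts_period, ...) stay unchanged -->
      <!-- Assumed attribute name and illustrative value: a negative score
           meant to count against the node where the resource has failed -->
      <nvpair id="mag_resource-resource_failure_stickiness"
              name="resource_failure_stickiness" value="-INFINITY"/>
    </attributes>
  </instance_attributes>
</primitive>
```

Since both rsc_location rules in the posted configuration give the two
nodes the same score of 100, any negative value here should in principle
be enough to tip the balance toward the other node once a failure has
been recorded, but that depends on the failure actually being reported
to the cluster by the monitor operation.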