[Linux-HA] RE: Re: Failover of resource

Andrew Beekhof beekhof at gmail.com
Mon Jul 9 09:20:43 MDT 2007


On 7/9/07, Taldevkar, Chetan <chetan.taldevkar at patni.com> wrote:
>
> Thanks Andrew,
>
> I have modified the configuration as per the details given at
> http://www.linux-ha.org/v2/faq/forced_failover.
>
> I am using a resource of class "heartbeat". The script that checks
> the database is copied into the /etc/ha.d/resource.d folder. I have
> defined two operations: one is start, and for the other I have tried
> both 'status' and 'monitor'.


monitor is the correct name
(the LRM will magically change the action to status for heartbeat and lsb
scripts)
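
For illustration only, here is a minimal sketch of what a heartbeat-class script in /etc/ha.d/resource.d might look like. It is not the poster's ttmgr.sh: the function name ttmgr_example, the pid-file path and the check_db helper are all hypothetical, and the pid-file test merely stands in for a real database check. It assumes the cluster judges a heartbeat-class resource by the text printed for "status" ("running" or "stopped").

```shell
#!/bin/sh
# Hypothetical sketch of a heartbeat-class resource script.
# PIDFILE and check_db are placeholders, not part of the original setup.
PIDFILE="${PIDFILE:-/tmp/ttmgr-example.pid}"

check_db() {
    # Stand-in health check: the service counts as up while the pid file exists.
    [ -f "$PIDFILE" ]
}

ttmgr_example() {
    case "$1" in
        start)
            # A real script would start the service here.
            echo $$ > "$PIDFILE"
            ;;
        stop)
            # A real script would stop the service here.
            rm -f "$PIDFILE"
            ;;
        status)
            # The cluster's "monitor" operation arrives here as "status".
            if check_db; then
                echo "running"
            else
                echo "stopped"
            fi
            ;;
        *)
            echo "usage: $0 {start|stop|status}" >&2
            return 1
            ;;
    esac
}

# Dispatch only when an action was given on the command line.
if [ "$#" -gt 0 ]; then
    ttmgr_example "$@"
fi
```

Running the script by hand with start, status and stop in turn is a quick way to confirm that it prints what you expect before the cluster calls it.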

> Both options are not working. They continue to execute the
> script even though it returned "stopped".



Can you give more information?

At startup, we will call your script to check that it is not already running
in the cluster.
Is this what you are talking about, or something else?

> Am I wrong in choosing the resource type?
>
> What should I set on_fail to? (I tried stop, restart, block.)



It depends on what you're trying to achieve.


> I am not using fencing because, as I understand it, it will reboot the
> failed machine, which I don't want. Or is there an option not to reboot?
>
> What option should I use with on_fail to stop the monitor/status
> operation in case it fails in first instance?


on_fail is irrelevant here. As the page I referred you to indicates, you
need to set default_resource_failure_stickiness.
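
As a rough illustration of why the value matters, the sketch below counts how many monitor failures it would take before the other node wins. It assumes a simplified rule: the resource moves once (location score on the current node + stickiness + failures x failure stickiness) drops below the other node's location score. That rule, the function name failures_until_failover, and the decision to ignore any group effects are assumptions made for this example, not the policy engine's exact algorithm. The numbers are taken from the configuration quoted below.

```shell
#!/bin/sh
# Hypothetical helper: counts failures needed before failover under the
# simplified scoring rule described above (an assumption, not the real
# policy engine).
# Usage: failures_until_failover CUR_SCORE STICKINESS FAILURE_STICKINESS OTHER_SCORE
failures_until_failover() {
    cur=$1; stick=$2; fail_stick=$3; other=$4
    # A non-negative failure stickiness never lowers the score.
    if [ "$fail_stick" -ge 0 ]; then
        echo "never"
        return 0
    fi
    n=0
    # Keep counting failures while the current node still scores at least
    # as high as the other node.
    while [ $(( cur + stick + n * fail_stick )) -ge "$other" ]; do
        n=$(( n + 1 ))
    done
    echo "$n"
}

# node1 score 1500, stickiness 500, node2 score 1000
failures_until_failover 1500 500 -100 1000    # prints 11
failures_until_failover 1500 500 -1500 1000   # prints 1
```

Under these assumptions, a failure stickiness of -100 would need eleven failures before the group moves, while -1500 would move it after the first one. The quoted configuration sets both values under two differently spelled names, so it is worth checking which one the cluster actually uses.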

> Thanks again,
> Chetan
>
> --
>
> <cib admin_epoch="0" have_quorum="true" ignore_dtd="false" num_peers="2"
> cib_feature_revision="1.3" generated="true" epoch="9" num_updates="439"
> cib-last-written="Mon Jul  9 19:43:26 2007" ccm_transition="2"
> dc_uuid="5426e37c-9469-40a3-813c-eebeb0b7c6a0">
>    <configuration>
>      <crm_config>
>        <cluster_property_set id="cib-bootstrap-options">
>          <attributes>
>            <nvpair id="symmetric_cluster" name="symmetric_cluster"
> value="true"/>
>            <nvpair id="no_quorum_policy" name="no_quorum_policy"
> value="stop"/>
>            <nvpair id="default_resource_stickiness"
> name="default_resource_stickiness" value="500"/>
>            <nvpair id="default_resource_failure_stickiness"
> name="default_resource_failure_stickiness" value="-100"/>
>            <nvpair
> id="cib-bootstrap-options-default-resource-failure-stickiness"
> name="default-resource-failure-stickiness" value="-1500"/>
>            <nvpair name="last-lrm-refresh"
> id="cib-bootstrap-options-last-lrm-refresh" value="1183985435"/>
>          </attributes>
>        </cluster_property_set>
>      </crm_config>
>      <nodes>
>        <node id="5426e37c-9469-40a3-813c-eebeb0b7c6a0" uname="node1"
> type="normal"/>
>        <node id="1c3fdfbd-ee55-47e3-a8c2-52f34a5c5553" uname="node2"
> type="normal"/>
>      </nodes>
>      <resources>
>        <group id="group_org" collocated="true" ordered="true"
> multiple_active="stop_start">
>          <primitive class="ocf" id="IPaddr_1" provider="heartbeat"
> type="IPaddr">
>            <operations>
>              <op id="1" interval="1s" name="monitor" timeout="2s"/>
>            </operations>
>            <instance_attributes id="i1">
>              <attributes>
>                <nvpair id="id1" name="ip" value="172.20.1.94"/>
>                <nvpair id="mask1" name="netmask" value="24"/>
>                <nvpair id="nic1" name="nic" value="eth0"/>
>              </attributes>
>            </instance_attributes>
>          </primitive>
>          <primitive class="heartbeat" type="ttmgr.sh"
> provider="heartbeat" id="resource_tt">
>            <instance_attributes id="resource_tt_instance_attrs">
>              <attributes/>
>            </instance_attributes>
>            <operations>
>              <op id="tt_start_1" name="start" description="begin op"
> timeout="5" start_delay="0" disabled="false" role="Started"
> prereq="nothing" on_fail="stop"/>
>              <op description="check state" interval="2s" timeout="3s"
> start_delay="0" disabled="false" role="Started" prereq="nothing"
> on_fail="restart" id="tt_status_1" name="monitor"/>
>            </operations>
>          </primitive>
>        </group>
>      </resources>
>      <constraints>
>        <rsc_location id="place_testconfig" rsc="group_org">
>          <rule id="prefered_place_testconfig" score="1500">
>            <expression attribute="#uname"
> id="0480539b-f4a5-4380-b573-86ab4fc2c0c6" operation="eq" value="node1"/>
>          </rule>
>        </rsc_location>
>        <rsc_location id="place_wl1" rsc="group_org">
>          <rule id="prefered_place_wl1" score="1000">
>            <expression attribute="#uname"
> id="df9404c3-ac5d-4968-b1c9-e4cb8b7ef566" operation="eq" value="node2"/>
>          </rule>
>        </rsc_location>
>      </constraints>
>    </configuration>
> </cib>
>
>
> ---
>
> Hello,
> >
> >
> >
> > I am new to cluster HA and I am trying to failover a service between
> > two nodes. I am using heartbeat 2.0.8 on redhat linux (64 bits).
> >
> >
> >
> > The monitor operation on the resource does not trigger a failover
> > after it encounters an error. The cluster keeps running the script
> > on the same node and does not fail over to the other node.
>
>
> http://www.linux-ha.org/v2/faq/forced_failover