[Linux-HA] Resource Agent script passing info back to Heartbeat question.

Andrew Beekhof beekhof at gmail.com
Wed Oct 11 01:40:46 MDT 2006


On 10/11/06, Max Hofer <max.hofer at apus.co.at> wrote:
> I'm assuming you use Heartbeat V2 with the CRM (if not, you are out of
> luck doing the things you want).
>
> The RAs return their well-defined return codes to the monitoring daemon
> (CRM). It is the configuration of the CRM, i.e. the CIB file, which has to
> define what the CRM should do when something special happens (e.g. resource
> failure, some attributes changed, etc.).
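
To illustrate what "well-defined return codes" means for a monitor action, here is a minimal sketch in Python. The three numeric values are the standard OCF exit codes; the function and its two arguments are hypothetical and exist only for this example. Note that the agent merely reports state: there is no exit code meaning "restart me here" or "fail me over".

```python
# Standard OCF exit codes relevant to a monitor action.
OCF_SUCCESS = 0       # resource is running and healthy
OCF_ERR_GENERIC = 1   # resource is running but failed its check
OCF_NOT_RUNNING = 7   # resource is cleanly stopped


def monitor(is_running, is_healthy):
    """Map a (hypothetical) health check onto OCF monitor exit codes."""
    if not is_running:
        return OCF_NOT_RUNNING
    if not is_healthy:
        return OCF_ERR_GENERIC
    return OCF_SUCCESS
```

What happens after a non-zero result is decided by the CIB (on_fail, stickiness values, constraints), not by the agent.
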
>
> This is the solution I found for the problem you described (maybe there is a
> better way; please correct me).

A very good explanation - thank you :-)

A couple of minor points...
If resource_failure_stickiness == 0, then the value of fail_count is ignored.
In such cases, if resource_stickiness is > 0, then the most likely
action after a failed monitor action is a restart on the same node.
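
A rough sketch of the arithmetic behind this, in Python. It assumes the simplified rule that a node's score for a resource is its location score, plus resource_stickiness if the resource is currently running there, plus fail_count multiplied by resource_failure_stickiness. The function name and the sample numbers are made up for illustration, and INFINITY handling and groups are ignored.

```python
def node_score(location_score, resource_stickiness,
               resource_failure_stickiness, fail_count, running_here):
    """Simplified per-node score for one resource (illustrative only)."""
    score = location_score
    if running_here:
        # Stickiness rewards staying on the current node.
        score += resource_stickiness
    # Each recorded failure on this node shifts the score by the failure
    # stickiness, so a value of 0 cancels the effect of fail_count.
    score += fail_count * resource_failure_stickiness
    return score


# Current node after 3 failures, failure stickiness 0: still preferred.
current = node_score(100, 10, 0, 3, running_here=True)
# Same situation with a failure stickiness of -50: the node loses out.
penalised = node_score(100, 10, -50, 3, running_here=True)
# A standby node with no failures and the same location score.
standby = node_score(100, 10, -50, 0, running_here=False)
```

With these sample numbers the current node scores 110 against 100 for the standby when the failure stickiness is 0, so the resource restarts in place; with -50 it drops to -40 and the standby wins.
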

> When you want the resource to restart after a failure, configure the CIB in the
> following way:
> * make sure the resource runs only on this node and never on another node
> * make sure the resource does not get a negative stickiness when it fails
> (resource_failure_stickiness)
> * make sure the resource is restarted after a monitoring failure
> (on_fail="restart" for the monitoring operation).
>
> Example: a resource called "dummy" should run on node "paul" (excerpt from
> CIB.xml)
>
> <resources>
>   <primitive class="ocf" id="dummy_resource" provider="heartbeat"
>       type="Dummy" resource_failure_stickiness="0">
>     <instance_attributes>
>     <!-- this attribute is set because the Dummy resource would otherwise use
>            a default value of 10 seconds, which is annoying for tests
>     -->
>       <attributes>
>         <nvpair id="startup_time" name="start_delay" value="2"/>
>       </attributes>
>     </instance_attributes>
>     <operations>
>        <!-- Attention: a restart usually places the resource on a node
>               where the failcount of this resource is near 0. Thus make sure
>              the resource runs only on this node. See constraints.
>       -->
>        <op id="dummy_monitor" interval="8s" name="monitor" timeout="15s"
>            on_fail="restart"/>
>      </operations>
>   </primitive>
> </resources>
>
> <constraints>
>   <!-- first make sure dummy runs only on paul so a restart does not move the
>     resource somewhere else -->
>   <rsc_location id="dummy_only_on_paul" rsc="dummy_resource">
>     <rule score="-INFINITY">
>       <expression attribute="#uname" operation="ne" value="paul"/>
>     </rule>
>     <rule score="INFINITY">
>         <expression attribute="#uname" operation="eq" value="paul"/>
>     </rule>
>   </rsc_location>
> </constraints>
>
> A second way I could think of to restart a resource is:
> * run a cron job that periodically runs a script which checks whether the
> resource is running and, if it is not, resets the failcount to 0 (which
> should trigger a resource start).
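
A minimal sketch of such a cron-driven check, in Python. It assumes the crm_failcount tool accepts -D (delete the failcount), -U (node name) and -r (resource id); please verify those flags against your installed version. The is_running callback is hypothetical and stands in for whatever check you use to decide whether the resource is up.

```python
import subprocess


def reset_failcount_cmd(resource, node):
    """Build the command line that clears the failcount (flags are assumed)."""
    return ["crm_failcount", "-D", "-U", node, "-r", resource]


def reset_if_stopped(resource, node, is_running, run=subprocess.call):
    """Clear the failcount only when the resource is not running.

    is_running is a caller-supplied check (hypothetical); run is injectable
    so the logic can be exercised without a cluster.
    Returns True if a reset was issued, False otherwise.
    """
    if is_running(resource):
        return False
    run(reset_failcount_cmd(resource, node))
    return True
```

From cron this would be called as reset_if_stopped("dummy_resource", "paul", my_check), with my_check being your own function.
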
>
> kind regards
> Max
>
>
> On Tuesday 10 October 2006 21:20, Peter Wong wrote:
> > Greetings:
> >
> > Is there a way for the Resource Agent to return some
> > parameters/exit-code back to the Heartbeat monitoring
> > daemon during the "status" subcommand to tell Heartbeat
> > to either restart the Resource on the local node or
> > do a failover from the local node to a standby node?
> >
> > The reason I need to do this is that in some case I
> > just want to restart the resource on the local node
> > because the situation is not severe enough to go to
> > a standby node.
> >
> > Thanks in advance for any help!
> >
> > Peter.
> >
> > _______________________________________________
> > Linux-HA mailing list
> > Linux-HA at lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
>
> --
> Max Hofer
> APUS Software G.m.b.H.
> A-8074 Raaba, Bahnhofstraße 1/1
> T| +43 316 401629 11
> F| +43 316 401629 9
> W| www.apus.co.at
> E| max.hofer at apus.co.at

