[Linux-HA] Reasonable values for timeouts
Andrew Beekhof
beekhof at gmail.com
Fri Jul 13 01:47:53 MDT 2007
On 7/12/07, Eddie C <edlinuxguru at gmail.com> wrote:
>
> I have found a few things:
>
> 1) A status or monitor function: I would set a timeout of more than 30
> seconds.
> Why? Sometimes developers/administrators do not understand the heartbeat
> capability. They only want to stop/restart a service quickly. If you set the
> status/monitor too low, it detects little restarts and may cause a fail
> over.
> Also, if the service is broken somehow, heartbeat may try to restart it very
> often, filling up logs quickly.
that's not a good reason :-)
this is more likely affected by the interval, not the timeout
you're better off making the resource unmanaged instead
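
To make the interval/timeout distinction concrete, here is a minimal Python sketch. It is a toy loop, not how heartbeat is implemented, and the `monitor` function and its parameters are made up for illustration. The interval decides how often a check runs (and so how quickly a brief restart is noticed and how often a failure gets logged), while the timeout only limits how long a single check may take.

```python
import subprocess
import sys
import time


def monitor(cmd, interval, timeout, rounds):
    """Run `cmd` `rounds` times, `interval` seconds apart.

    Each single run may take at most `timeout` seconds.
    Hypothetical helper for illustration only.
    """
    results = []
    for i in range(rounds):
        try:
            proc = subprocess.run(cmd, timeout=timeout)
            results.append("ok" if proc.returncode == 0 else "failed")
        except subprocess.TimeoutExpired:
            # The check itself ran too long; this is what the timeout governs.
            results.append("timed out")
        if i < rounds - 1:
            # How often we look at the service is governed by the interval.
            time.sleep(interval)
    return results


# A stand-in "status" command that succeeds immediately.
healthy = [sys.executable, "-c", "pass"]
print(monitor(healthy, interval=0.1, timeout=30, rounds=2))
```

Raising `timeout` here does not change how often the check runs; only `interval` does.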
> 2) As for the timeouts: setting them high, 30 sec+, might be better as well. I
> had a piece of code that started in a split second in the lab with a testing
> configuration. In the real world it took over 20 seconds to start. I had the
> timeout set at 5. This drove the system crazy because things were starting
> after heartbeat gave up and attempted to fail them over to another node.
>
> Remember heartbeat is called HA (High Availability), not CA (Continuous
> Availability). I personally found that a fail over of ~60 seconds is good. If
> you go too low, the state machine mechanics can start getting tricky.
the timeout is also the _maximum_ time the action is allowed to take.
if you set the timeout to 5 hours and the action only takes 1 minute then
we'll not spend the remaining 4h59m sitting there doing nothing :-)
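
The same point as a small Python sketch, using a plain subprocess as a stand-in for a resource action (the `timed_action` helper is hypothetical, not part of heartbeat). A generous timeout costs nothing when the action finishes early, and the measured time gives a rough idea of how long the action really takes.

```python
import subprocess
import sys
import time


def timed_action(cmd, timeout):
    """Run `cmd`, allowing at most `timeout` seconds.

    Returns (timed_out, elapsed_seconds). Hypothetical helper for illustration.
    """
    start = time.monotonic()
    try:
        subprocess.run(cmd, timeout=timeout)
        timed_out = False
    except subprocess.TimeoutExpired:
        timed_out = True
    return timed_out, time.monotonic() - start


# A stand-in action that finishes almost immediately, with a 5 hour allowance.
timed_out, elapsed = timed_action([sys.executable, "-c", "pass"], timeout=5 * 3600)
print(f"timed out: {timed_out}, took {elapsed:.2f}s")
```

The call returns as soon as the command exits; the timeout only matters when the action hangs.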
> On 7/12/07, matilda matilda <matilda at grandel.de> wrote:
> >
> > >>> "Andrew Beekhof" <beekhof at gmail.com> 12.07.2007 15:40 >>>
> > > >>> "Andrew Beekhof" <beekhof at gmail.com> 12.07.2007 13:53 >>>
> > > On 7/12/07, matilda matilda <matilda at grandel.de> wrote:
> > > > Hi all,
> > > >
> > > > how do I get reasonable values for timeout attributes for certain
> > > > operations?
> > > > How can I tune them?
> > > > Or shall I use the values provided in the RA metadata?
> > >
> > > the default-action-timeout option determines what is used by default.
> > > to use a different value for a particular operation, e.g. 300s for a
> > > start operation, go to the resource you wish to modify and add:
> > >
> > > <operations>
> > > <op id="somevalue" name="start" timeout="300s"/>
> > > </operations>
> > >
> > > or for a recurring monitor operation such as:
> > > <op id="DoFencing-1" name="monitor" interval="60s" prereq="nothing"/>
> > > just change that to something like:
> > > <op id="DoFencing-1" name="monitor" interval="60s"
> > >     prereq="nothing" timeout="300s"/>
> > >
> > > does that help?
> >
> >
> > Thank you, but what I really wanted to know is:
> > How do I get a feeling for how long a certain action could take before
> > it is assumed that this action doesn't work? So, how could I get a timeout
> > value which is as short as possible but not too short?
> > Is there a way to test an RA in different load situations?
> >
> > Best regards
> > Andreas Mock
> >
> >
> >
> >
> > _______________________________________________
> > Linux-HA mailing list
> > Linux-HA at lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
> >