[Linux-HA] timeouts revisited

Dejan Muhamedagic dejanmm at fastmail.fm
Wed Aug 9 08:15:44 MDT 2006


Hi,

On Wed, Aug 09, 2006 at 09:00:51AM +0200, Andrew Beekhof wrote:
> On 8/9/06, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> >Hi,
> >
> >CTS has been running for about half an hour and so far no
> >serious problems encountered.
> >
> >There's only this timeouts issue which has been bothering me
> >recently, as some of you may know.
> >
> >Though it happens almost exclusively with network
> >interfaces,* I now think that it's some kind of lrm problem,
> >or perhaps an lrm/crm interaction. It looks as if crm expected
> >an operation to have been started by lrm and it hasn't been,
> >or as if lrm wanted to start it but then somehow forgot
> >to do it, or something in between. I hope that somebody can
> >unravel this from the stuff I'll attach.
> 
> Part of the problem is the way timeouts are handled.
> If an action is supposed to take T seconds, we tell the LRM this and
> wait for 2*T seconds ourselves.
> 
> The problem here is two-fold for low values of T:
> * If the network is loaded, it may take more than T seconds to make
> the round-trip
> * If the CRM or LRM is loaded, it may take more than T seconds to get
> the action scheduled.

I looked a bit further through the logs. All timeouts happened on
sapcl02, which is the slowest computer. And it seems that all were
due to the maximum number of parallel operations being exceeded
(there was always such a notice in the logs).

> What I am in the process of testing is a new option called
> "network-delay" which is specifically intended to account for system
> load.

Is this to be defined by the admin? Wouldn't it be better to
update it dynamically depending on the actual load?

> Now the TE will tell the LRM it has T seconds and then wait 2*T _plus_
> the value of network-delay.  I think this better reflects what is
> actually happening and allows admins to tune their configuration better
> on loaded systems without unduly increasing the action timeouts.
> 
> If you're interested in trying this, let me know and I'll send you the
> series of patches.

Yes, I would.
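
To make the arithmetic in the quoted proposal concrete, here is a minimal
sketch in Python. The function names and the example values are made up for
illustration; only the "2*T" and "2*T plus network-delay" rules come from
the thread.

```python
def te_wait_old(action_timeout):
    """Current scheme: the LRM is given T, the TE waits 2*T."""
    return 2 * action_timeout


def te_wait_new(action_timeout, network_delay):
    """Proposed scheme: the LRM is still given T, the TE waits 2*T + network-delay."""
    return 2 * action_timeout + network_delay


# Hypothetical numbers: a 5s action timeout and a 60s network-delay.
T = 5
print(te_wait_old(T))      # 10
print(te_wait_new(T, 60))  # 70
```

With a low T the old scheme leaves only T seconds of slack for the round-trip
and for scheduling. The new option adds slack that does not depend on T.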

==========

I'm still not sure how the crm-lrm communication works. From
what I gathered, it seems to be (sort of) synchronous: crm issues
a command and waits for the response (how? does lrm send crm a
message about the operation's outcome?) and, in case it doesn't
hear from lrm for long enough (some timer going off?), enquires
about the status of the operation (is this the point where lrm says
something which makes crm think that the operation timed out?). Right?

How about something like this:

crm says to lrm:

    crm: id rsc operation priority timeout
    crm: id cancel [this id]
    crm: 0 how are you?

lrm says to crm:

    lrm: id outcome time [when it started] duration [how long did it take]
    lrm: id rescheduled delay [how far from request]
    lrm: 0 load
    lrm: 0 hey man it's been really busy (how busy? load?) lately,
         can you back up a bit

crm thinks:

    lrm on this node has nn outstanding operations. And it keeps
    growing. It must be really busy over there. Let's space out
    operations over there and log that this node can't keep up
    with the demands (hoping that somebody's reading the logs).

lrm thinks:

    lrm doesn't think.
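
The exchange above could be sketched roughly as follows, in Python. This is
only an illustration of the idea: the class names, the fields and the
busy_threshold value are all hypothetical and do not correspond to any
existing crm or lrm code.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """crm -> lrm: id rsc operation priority timeout"""
    op_id: int
    rsc: str
    operation: str
    priority: int
    timeout: float


@dataclass
class Outcome:
    """lrm -> crm: id outcome time duration"""
    op_id: int
    outcome: str
    started: float
    duration: float


class NodeView:
    """What crm remembers about the lrm on one node (hypothetical)."""

    def __init__(self, busy_threshold=4):
        self.outstanding = {}              # op_id -> Request
        self.busy_threshold = busy_threshold

    def sent(self, req):
        self.outstanding[req.op_id] = req

    def received(self, out):
        # An operation is finished only once lrm reports its outcome.
        return self.outstanding.pop(out.op_id, None)

    def should_back_off(self):
        # Too many outstanding operations: space out new ones and log it.
        return len(self.outstanding) > self.busy_threshold
```

crm would call sent() for every request, received() for every outcome
message, and check should_back_off() before issuing more operations to
that node.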

crm will have to keep a table (a fifo thing) of all outstanding
operations on a per member basis. lrm should keep the table as well.

The table could be rearranged in case a higher priority operation
gets in (some things are more urgent).

Perhaps there could be a priority like "please, run this now no
matter what".
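
One possible shape for such a table, sketched in Python with the standard
heapq module. The OpTable name and the convention that a lower number means
a more urgent operation are assumptions made for this example.

```python
import heapq
import itertools


class OpTable:
    """Outstanding operations for one node, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()

    def add(self, op_id, priority=0):
        # Lower number = more urgent; the sequence number keeps
        # first-in-first-out order among equal priorities.
        heapq.heappush(self._heap, (priority, next(self._seq), op_id))

    def next_op(self):
        # Return the most urgent operation, or None if the table is empty.
        return heapq.heappop(self._heap)[2] if self._heap else None

    def __len__(self):
        return len(self._heap)
```

A "run this now no matter what" request could then simply be added with a
priority lower than anything else in use.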

At any rate, crm should not time out an operation until lrm tells
it that the operation actually timed out.

Does this (or thereabouts) make sense? Is it doable?

Cheers,

Dejan

> 
> >
> >Setup: HB CVS as of Aug  8 16:33. SLES9: two nodes i386 and
> >one node (sapcl03) x86_64.
> >
> >BTW, the logging is taken care of by syslog-ng now, but I
> >see some strange output (such as some chars missing at the
> >end of the line). There's still ha_logd used in between.
> >Should ha_logd be removed from the equation?
> 
> No.  My understanding is that syslog-ng is still synchronous.
> Using ha_logd means that we can do logging asynchronously because it
> will make the synchronous call to syslog(-ng) for us.
> 
> >
> >Cheers,
> >
> >Dejan
> >
> >*) Network interfaces are monitored far more often in
> >this configuration (default: 5s), hence the higher probability
> >of timeouts.

