[Linux-HA] ipfail in V2
Alan Robertson
alanr at unix.sh
Fri Oct 21 10:50:57 MDT 2005
Andrew Beekhof wrote:
> On 10/21/05, Alan Robertson <alanr at unix.sh> wrote:
>
> [snip]
[more-snip]
>>Is this a one-shot timer? What happens when you get conflicting
>>"repokes"? Does the last one win? That might be OK - at least for
>>events with similar hysteresis intervals.
>
>
> the repoke is controlled by the DC.
> it is started when the DC enters the idle state and cancelled if it
> ever moves out of it.
> so there is never a conflict - because there is only 1 timer and only
> 1 node running it.
>
> what you're thinking of is a timer running in the CIB.
Yes.
> you'd need to indicate somehow that this change should start/extend a timer.
Almost. Start or shorten. Never extend (AFAIK).
The algorithm is this:
    Is there a delayed notification timer running?
    If yes, then see how much time is left on it.
        If the current update is accompanied by a timer and it is
        shorter than the remaining time on that timer, then
            Cancel the current delayed notification
            timeout and start a new timer with
            the timeout which came with the update.
    If no timer running, then
        start a new timer with the value which came with the
        update (if any)
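
The start-or-shorten rule above could be sketched roughly like this in Python. This is only an illustration, not CIB or CRM code: the class name, the method names and the explicit `now` argument are all made up for the example. It also assumes two things the description leaves open: that pending updates are held and sent out together when the timer expires, and that an update with no delay, arriving while no timer is running, is sent immediately.

```python
class DelayedNotifier:
    """Hypothetical sketch of a start-or-shorten notification timer."""

    def __init__(self, notify):
        self.notify = notify   # callback that receives a dict of batched updates
        self.deadline = None   # absolute expiry time; None means no timer running
        self.pending = {}      # attribute name -> most recent value

    def update(self, now, name, value, delay=None):
        """Record an update, optionally accompanied by a delay (settling time)."""
        self.pending[name] = value
        if self.deadline is not None:
            # A timer is running: only ever shorten it, never extend it.
            remaining = self.deadline - now
            if delay is not None and delay < remaining:
                self.deadline = now + delay
        elif delay is not None:
            # No timer running: start one with the value from the update.
            self.deadline = now + delay
        else:
            # Assumption: no timer and no delay means notify right away.
            self._flush()

    def tick(self, now):
        """Called periodically; sends the batch once the timer has expired."""
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            self._flush()

    def _flush(self):
        batch, self.pending = self.pending, {}
        self.notify(batch)


sent = []
notifier = DelayedNotifier(sent.append)
notifier.update(0, "foo", "bar", delay=10)   # starts a timer expiring at t=10
notifier.update(2, "baz", "qux", delay=3)    # 3 < 8 remaining: now expires at t=5
notifier.update(3, "foo", "bar2", delay=30)  # longer than remaining: timer unchanged
notifier.tick(4)                             # not yet expired, nothing sent
notifier.tick(5)                             # expired: both attributes go out together
```

Passing the time in explicitly keeps the sketch free of real timers; a real implementation would presumably hang this off whatever mainloop timeout mechanism the CIB already uses.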
The idea would be that the command for updating the attribute value
would have a flag for a delayed notification value.
crm_attribute -d -n node --set --attribute foo --value bar
(or whatever it is)
> you'd also need to keep track of which updates have been sent out, or
> you'll start confusing clients because the order will be all messed up
I don't see this as an issue. The order in which they were updated is
irrelevant. They would all appear to the clients to be updated
simultaneously - which would be perfect. Or possibly, I misunderstood
you here.
The idea is that you set the delayed notification value to the "settling
time" for the events which change the value of this attribute. It's
kind of like debouncing a switch in software (except in this case it's
at an even higher level - for the whole cluster).
> you could even have a situation where the update doesn't even exist
> anymore because the whole CIB was replaced in the meantime.
Help me with this one please. I don't follow this.
[more snip]
>>By the way, I'm not 100% sure that having this interval be the shortest
>>is always the best choice. Having it be the longest might be a better
>>choice in some circumstances.
>>
>>If this is true, there are circumstances when the optimal choice is
>>undecidable. But, since this is rare - we probably shouldn't worry
>>about it _that_ much :-).
>
>
> sorry, you lost me here.
No worries. Not that important. Just getting carried away in
possibilities.
--
Alan Robertson <alanr at unix.sh>
"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions." - William
Wilberforce