[Linux-HA] ipfail in V2

Alan Robertson alanr at unix.sh
Thu Oct 20 10:00:10 MDT 2005


Andrew Beekhof wrote:
> On 10/20/05, Alan Robertson <alanr at unix.sh> wrote:
>>Andrew Beekhof wrote:
>>>On 10/20/05, Alan Robertson <alanr at unix.sh> wrote:
>>>>Andrew Beekhof wrote:
>>>>>On 10/20/05, Lars Marowsky-Bree <lmb at suse.de> wrote:
>>>>>>On 2005-10-20T13:22:30, Simon Rowe <srowe at cambridgebroadband.com> wrote:
>>>>>>
>>>>>>>>(OK, or wait for someone on the linux-ha team to write it.)
>>>>>>>I don't have the time for either. I'll have to dump V2 and see if V1
>>>>>>>provides sufficient working functionality.
>>>>>>I suppose this being Open Source, you could find a developer who'd take
>>>>>>a bribe and implement this feature for you for less than you'd have paid
>>>>>>for a single node license of a commercial clustering product...
>>>>>While in principle I personally would never take bribes, in practice I
>>>>>have no principles :-)
>>>>>
>>>>>With what we have now in CVS, we can get a reasonably close
>>>>>approximation of ipfail.
>>>>>You won't get true damping - so some ping-pong'ing of resources may still occur.
>>>>>It may also be not as efficient as we'd like it to be.
>>>>>
>>>>>lmb: random thought... if the "ipfail RA" checked the values on the
>>>>>other nodes before it updated the CIB, then we could avoid dipping
>>>>>into the PE.  the check would be trivial, just specify a different
>>>>Writing the code to do this is not trivial...
>>>>
>>>>
>>>>Let's see ... you need to write a join protocol, and then you need to
>>>>exchange votes with everyone else, and...
>>>>
>>>>Right now, they have NO communication with other nodes at all.
>>>you don't need any of that.  query the CIB:
>>>
>>>crm_attribute -G -U some_host -n the_attribute -t nodes
>>How does this help with hysteresis?
>>
>>You need to know the value they have of the attribute _before they
>>update it_.  Because, once they update it, boom, you're off on a
>>reconfigure-the-cluster mission.  It's too late then.
>>
> 
> this was the quick sketch that i outlined to lmb.
> 
> you'd do it as part of a monitor-like repeating action.
> 
> so first it would wait until its value had stabilized (the RA could
> store a rolling history window - not trivial but also not rocket
> science).
> 
> then once it's stabilized, you check if your value exceeds the existing
> "winner"'s by some threshold.

This doesn't help.  You only know the old winner's value, which is 
irrelevant to the current situation.

> 
> if you are the current winner, you should always update the CIB with
> your changed stable value (so that the previous step is always correct)

This is of no help.  It does not provide hysteresis at all.  Averaging 
is NOT delay - and it only works for integers anyway...

You MUST know what everyone else's values are - before they update the 
CRM.  Knowing the old value is of no help at all.

You can't know the value in another node unless you communicate outside 
of the CRM.  Which gets back to my previous note about join protocol, etc.

If this can't be done in the CIB (which is what it's beginning to sound 
like), then we'll have to create an update hysteresis daemon which deals 
with this.  Basically, a pre-update-CIB for certain attribute values.

What the ipfail code does, and this proposal does NOT do, is:

	observe updates
	delay a while
		collect ALL current updated values
		act on them
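
The four steps above can be sketched roughly as follows.  This is only an 
illustration, not the ipfail code: the class name DampedCollector, its 
methods, and the node names are all hypothetical, and time is passed in 
explicitly so the example needs no real timers.

```python
from typing import Callable, Dict, Optional


class DampedCollector:
    """Buffers per-node attribute updates and acts once per delay window."""

    def __init__(self, delay: float, act: Callable[[Dict[str, int]], None]):
        self.delay = delay          # how long to wait after the first update
        self.act = act              # called once with every collected value
        self.pending: Dict[str, int] = {}
        self.deadline: Optional[float] = None

    def observe(self, now: float, node: str, value: int) -> None:
        # The first update of a batch starts the delay; later ones only add data.
        if self.deadline is None:
            self.deadline = now + self.delay
        self.pending[node] = value  # keep only the newest value per node

    def tick(self, now: float) -> bool:
        # Call periodically; acts on the whole batch once the delay has expired.
        if self.deadline is None or now < self.deadline:
            return False
        batch, self.pending, self.deadline = self.pending, {}, None
        self.act(batch)
        return True


# Hypothetical run: two nodes lose a ping node one second apart.
decisions = []
collector = DampedCollector(delay=5.0, act=decisions.append)
collector.observe(1.0, "node1", 2)
collector.observe(2.0, "node2", 2)
collector.tick(3.0)   # still inside the delay window, nothing happens
collector.tick(6.0)   # delay expired, both values are acted on together
```

The point of the sketch is that both updates are seen before anything is 
acted on, so there is one decision instead of two back-to-back ones.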

Averages actually make this worse.  If you take a value like 0/1, average 
it, and present the result as an integer, then all you've done is delay 
the exact same events by the exact same amounts.

So...  If the events were observed in this way:
	t0: everyone sees 3 ping nodes
	t1: node 1 sees 2 ping nodes
	t2: node 2 sees 2 ping nodes

You've changed that to:
	t0: everyone sees 3 ping nodes
	t4: node 1 sees 2 ping nodes
	t5: node 2 sees 2 ping nodes

This is of no help at all.
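
A small sketch of the timeline above, assuming each node applies the same 
local "wait until stable" filter to its own samples.  The function names 
and the hold parameter are made up for illustration; a hold of 3 samples 
happens to reproduce the t1 -> t4 and t2 -> t5 shift.

```python
from typing import List


def first_change(samples: List[int]) -> int:
    """Index of the first sample that differs from samples[0], or -1."""
    for t, v in enumerate(samples):
        if v != samples[0]:
            return t
    return -1


def stabilized(samples: List[int], hold: int) -> List[int]:
    """Report a new value only after it has persisted for `hold` more samples."""
    reported, out = samples[0], []
    for t, v in enumerate(samples):
        recent = samples[max(0, t - hold): t + 1]
        if len(recent) == hold + 1 and all(x == v for x in recent):
            reported = v
        out.append(reported)
    return out


# Number of ping nodes each node sees at t0, t1, t2, ...
raw = {
    "node1": [3, 2, 2, 2, 2, 2, 2],   # drops to 2 at t1
    "node2": [3, 3, 2, 2, 2, 2, 2],   # drops to 2 at t2
}
for name, samples in raw.items():
    print(name, first_change(samples), first_change(stabilized(samples, hold=3)))
```

This prints "node1 1 4" and "node2 2 5".  Each node reports later, but the 
two updates are still one step apart, so the cluster sees the same two 
separate events.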


-- 
     Alan Robertson <alanr at unix.sh>

"Openness is the foundation and preservative of friendship...  Let me 
claim from you at all times your undisguised opinions." - William 
Wilberforce


