[Linux-ha-dev] Re: [Linux-ha-cvs] Linux-HA CVS: membership by gshi from

Andrew Beekhof (GMail) beekhof at gmail.com
Wed Feb 16 00:11:38 MST 2005


On Feb 15, 2005, at 8:05 PM, Alan Robertson wrote:

> Andrew Beekhof (GMail) wrote:
>> On Feb 15, 2005, at 7:21 PM, Alan Robertson wrote:
>>> Andrew Beekhof (GMail) wrote:
>>>
>>>>> Not at all true.
>>>>>
>>>>> I see the confusion.
>>>>>
>>>>> What I meant was crash the machine locally -- without using 
>>>>> STONITH.  This is easy, and it's *like* STONITH - but doesn't use 
>>>>> the OfficialSTONITHCode(tm).
>>>>
>>>> Oh that's so evil.  Why would we prefer this over calling STONITH?  I 
>>>> mean we're going to have to anyway since the node is now unclean 
>>>> (left without permission)
>>>
>>>
>>>
>>> Crashing is crashing.  This is a built-in, documented API for 
>>> crashing on machines which maintain basic communications sanity.  
>>> Evil seems a little strong an adjective, IMHO...    ;-)
>>>
>>> It will ALWAYS work - even when STONITH isn't configured, or the 
>>> STONITH hardware is flaky - and it's probably faster than STONITH.
>>>
>>> I'm having trouble following all the levels of message nesting 
>>> above, and which questions you were asking me to respond to.  So, 
>>> I'll start over...
>>>
>>> Right now, you don't pay attention to quorum at all.
>>>
>>> For a 2-node cluster - you can do as you please - and you needn't 
>>> have quorum.  OR you can require it.  I would see this as an option.
>>>
>>> For quorum we can add a ping-node-vote option to the membership 
>>> layer.
>>>
>>> The "best" way is to enable both options - at least for 2-node 
>>> clusters.
>>>
>>> But, it's not so obvious that both should always be enabled in all 
>>> cases.
>>>
>>> Some would claim that quorum shouldn't be declared without fencing - 
>>> but that's a separate subject.
>>>
>>>
>> Missed a couple, so I'll re-ask :)
>> You pointed out the loss of consciousness situation:
>>>> If you cannot contact any node when you first come up, then EVERY 
>>>> node you cannot contact must be STONITHed.  ("everybody must get 
>>>> stoned" - as Dylan might say).  No doubt some delay would be in 
>>>> order for this case.
>> Then I said:
>>> I believe that if we amend #2 above to also include nodes it has 
>>> never seen (code for this is in CVS) _and_ amend the CIB startup to 
>>> erase the status section after reading the config from disk (so that 
>>> we're saying _no_ nodes have been seen)... then we have the bases 
>>> covered right?
>> Would you agree? Or am I still missing a case?
>
> I believe that's right.  I'd have to know more about your code to know 
> for absolute certain - but it certainly sounds right to me.  [I didn't 
> answer this one because we'd talked about it on IRC and/or the phone, 
> and I agreed ;-)].

Well the code is of course perfect :)
I just thought I'd make sure I hadn't forgotten anything we discussed.
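
For the archives, here is a rough sketch of the startup rule as I read it 
from the exchange above. It is not the code in CVS; the names (load_cib, 
nodes_to_fence), the dict layout and the "unclean" marker are all made up 
for illustration. The assumption is that the status section is erased 
after the config is read from disk, so every configured node starts out 
as "never seen", and any node we cannot contact that is either never seen 
or unclean becomes a candidate for STONITH.

```python
def load_cib(on_disk):
    """Hypothetical loader: keep the configuration, erase the status section."""
    return {"configuration": dict(on_disk["configuration"]), "status": {}}


def nodes_to_fence(cib, local_node, reachable):
    """Configured nodes we cannot contact that are never-seen or unclean."""
    candidates = []
    for node in sorted(cib["configuration"]["nodes"]):
        if node == local_node or node in reachable:
            continue
        # Assumption: a missing status entry means "never seen".
        state = cib["status"].get(node)
        if state is None or state == "unclean":
            candidates.append(node)
    return candidates


on_disk = {
    "configuration": {"nodes": ["node1", "node2", "node3"]},
    # Stale state left over from a previous run; it must not be trusted.
    "status": {"node2": "clean"},
}
cib = load_cib(on_disk)
print(nodes_to_fence(cib, "node1", reachable={"node3"}))
```

Because the stale status is dropped at load time, node2 ends up on the 
list even though the on-disk copy claimed it was fine.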

>
>> The other part was that you appeared to be saying that:
>>  - we can't allocate resources until we shoot the unknowns, and
>>  - we can't/shouldn't shoot the unknowns until we have quorum
>> To me it follows that the cluster can't do resource management until 
>> we have quorum.
>> So my question was: Is that what you meant and if so, is that really 
>> what we want?
>
>
> I think I answered this.  But, perhaps it wasn't completely clear, or 
> perhaps I didn't answer the question...
>
> >> For a 2-node cluster - you can do as you please - and you needn't
> >> have quorum.  OR you can require it.  I would see this as an option.
>
> The "option" part isn't for 2-node only - at least not in my mind...
>
> It might be a good default all the time to have it on, but if you have 
> the option, then I'd still have it available in all configurations.  
> Is this clearer?  [or did I miss the point?]

Thanks for the clarification.
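
To check I have it straight, the chain of reasoning with the option added 
might look something like the sketch below. Nothing here is existing 
code: next_action and require_quorum are hypothetical names, and the 
sketch only models the two constraints listed above (unknowns must be 
shot before resources are allocated, and shooting needs quorum unless the 
option is turned off).

```python
def next_action(unknown_nodes, has_quorum, require_quorum=True):
    """Decide what the cluster may do next under the two constraints."""
    if not unknown_nodes:
        # Nobody left to shoot, so resources can be allocated.
        return "allocate"
    if has_quorum or not require_quorum:
        # Allowed to shoot the unknowns; allocation follows once they are gone.
        return "fence"
    # Unknowns remain and we may not shoot them yet.
    return "wait"


print(next_action({"node2"}, has_quorum=False))
print(next_action({"node2"}, has_quorum=False, require_quorum=False))
print(next_action(set(), has_quorum=True))
```

With the option on, the first call has to wait for quorum; with it off, 
the same situation goes straight to fencing.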

>
>
> -- 
>     Alan Robertson <alanr at unix.sh>
>
> "Openness is the foundation and preservative of friendship...  Let me 
> claim from you at all times your undisguised opinions." - William 
> Wilberforce
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev at lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
>
>
--
Andrew Beekhof

"No means no, and no means yes, and everything in between and all the 
rest" - TISM


