[Linux-HA] General problems understanding split brain (quorum)

Ryan Thomson ryan at pet.ubc.ca
Wed Jul 8 09:44:57 MDT 2009


STONITH is used to ensure a "dead" node is no longer running any cluster resources so that the remaining "live" nodes can start those resources cleanly and safely knowing the "dead" node isn't also running them.

Your example makes an error in logic, though, if I'm reading it correctly. The live nodes will still be able to STONITH the "dead" node because their 192.168.0.x interfaces are still up and can connect to the STONITH device. Nodes don't STONITH themselves; they STONITH other nodes.

In your case with only a VIP, it probably isn't a big deal. If you're willing to live with untested failure cases without STONITH, I guess that's fine. However, STONITH makes for a good reliable way of *knowing* what's going to happen when one of your nodes stops sending heartbeats.

For those of us mounting file systems on our cluster nodes as resources, it's very important to know the "dead" node is no longer touching the file system if we want automatic failover to occur. Suppose a node mounts an ext3 file system as a resource and goes "dead" (no heartbeats) but, because it's left running (no STONITH), still has the file system mounted and is writing to it. The node taking over that file system resource will now also mount it, and the two nodes in question will be mounting and potentially writing to the same ext3 file system at the same time. That could (would) quickly corrupt that ext3 file system, as ext3 is not cluster aware.
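
To make the double-mount scenario concrete, here is a minimal toy simulation in Python. It is only a sketch: the Node class, the fence() and take_over() helpers, and the device name /dev/sdb1 are all made up for illustration and are not part of Heartbeat or any STONITH plugin.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    powered_on: bool = True
    mounts: set = field(default_factory=set)


def fence(target: Node) -> bool:
    """Hypothetical STONITH action: power the target off and report success."""
    target.powered_on = False
    target.mounts.clear()  # a powered-off node holds no mounts
    return True


def take_over(survivor: Node, silent: Node, filesystem: str, use_stonith: bool) -> None:
    """Mount `filesystem` on `survivor` after `silent` stopped sending heartbeats."""
    if use_stonith and not fence(silent):
        # Without a confirmed fence we cannot know the silent node is really gone.
        raise RuntimeError("fencing failed; refusing to mount " + filesystem)
    survivor.mounts.add(filesystem)


def writers(nodes, filesystem):
    """Names of powered-on nodes that currently have `filesystem` mounted."""
    return sorted(n.name for n in nodes if n.powered_on and filesystem in n.mounts)


def simulate(use_stonith: bool):
    # node1 has lost its heartbeat links but is still running with the
    # (hypothetical) ext3 volume /dev/sdb1 mounted.
    node1 = Node("node1", mounts={"/dev/sdb1"})
    node2 = Node("node2")
    take_over(node2, node1, "/dev/sdb1", use_stonith)
    return writers([node1, node2], "/dev/sdb1")


print(simulate(use_stonith=False))  # ['node1', 'node2']: two writers on one ext3 volume
print(simulate(use_stonith=True))   # ['node2']: the silent node was fenced first
```

The only point of the sketch is the ordering: the takeover mounts the file system only after the fence has been confirmed.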

--Ryan

Ehlers, Kolja wrote:
> Hello again,
> 
> I was trying to set up stonith using our APC Smart UPS 1000 with Network Management Card (see the other mail) but now I realized
> that those devices are not accessible using the apcsmart plugin (not over LAN) and the apcmaster is not compatible with those APC
> UPS. Using the serial cable I can only connect 1 node to each APC so this is not helping either. But right now I am again very
> confused about stonith and I hope maybe someone could shed some light on it. Let me try to explain my problem in understanding it.
> 
> If I configure a stonith device which sends shutdown commands to node(s) in split-brain situations, how does this make my
> environment any safer than it is without it? All my nodes have 2 NICs (192.168.0.x and 10.0.0.x). My heartbeat communication runs through
> both networks. Now if I had such power management hardware we would hook it up to the 192.168.0.x LAN. Now in a situation where 1
> node is not seen by the others anymore, it's not on the LAN anymore either, so the power management hardware cannot shut it down.
> But it cannot cause any damage even if it grabs my only critical resource (a virtual IP).
> 
> The point of what I am saying is that if my heartbeats are sent over the LAN, then even if a split brain happens it cannot cause anything
> dangerous since that node is no longer available on the LAN. I could configure the ssh stonith device to shut the other node down
> through the 10.0.0.x interface and everything is covered.
> 
> I guess I am missing something here. Please help me out.
> 
> Thanks
> 
> Kolja
> 
> 
> -----Original Message-----
> From: linux-ha-bounces at lists.linux-ha.org [mailto:linux-ha-bounces at lists.linux-ha.org] On Behalf Of Dejan Muhamedagic
> Sent: Wednesday, 17 June 2009 15:34
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] General problems understanding split brain (quorum)
> 
> Hi,
> 
> On Wed, Jun 17, 2009 at 02:39:54PM +0200, Ehlers, Kolja wrote:
>> Thanks for the fast reply. 
>>
>> If I set the no-quorum-policy to ignore, my 2 critical resources, 2 
>> virtual IP addresses, are bound on each node. Of course I don't want 
>> this to happen. You say I must use a fencing technique/device to do 
>> what? To stonith the other node(s)?
>> Exactly how do I protect my virtual IPs from being bound by more than 
>> one node if I do not have a configurable power supply?
> 
> The only way is to use stonith. That is going to protect your resources, because in case of split-brain one node won't start
> resources before it fences (i.e. kills) the other node. Besides, and that has been discussed often, it is highly recommended to
> configure stonith. And there is really no excuse for not doing that, unless it is a stretch cluster.
> 
>> And again all this theory only works if a majority count subcluster 
>> has quorum, right?
> 
> See above.
> 
>> So there is nothing one can do in a two node cluster about a split 
>> brain? I don't see how this is an advantage over just defining one node 
>> as the master node to take over all resources during a split brain, while 
>> all other nodes stop running critical resources.
> 
> What if your master node is really down? How do you know it is a split brain?
> 
> Thanks,
> 
> Dejan
> 
>> Thanks
>>
>>  
>> -----Original Message-----
>> From: linux-ha-bounces at lists.linux-ha.org 
>> [mailto:linux-ha-bounces at lists.linux-ha.org] On Behalf Of Dejan 
>> Muhamedagic
>> Sent: Wednesday, 17 June 2009 14:01
>> To: General Linux-HA mailing list
>> Subject: Re: [Linux-HA] General problems understanding split brain 
>> (quorum)
>>
>> Hi,
>>
>> On Wed, Jun 17, 2009 at 12:55:41PM +0200, Ehlers, Kolja wrote:
>>> Hello everybody,
>>>
>>> I am having problems understanding split brain situations. If I 
>>> understand correctly, when a split brain situation happens the larger 
>>> cluster fragment has quorum and these cluster members can decide to 
>>> fence off resources or to stonith the cluster members which are not seen.
>>> I have read that it is not sane to use a 2 node cluster, because in 
>>> split brain situations no one has quorum and the no-quorum-policy 
>>> decides how to deal with this.
>> There is no quorum in two-node clusters, so you set the policy to "ignore". You can use fencing to effectively replace it.
>>
>>> But what if I have a 3 node cluster and the switch delivering the 
>>> heartbeat between those members dies? Then again I will have three 
>>> separate clusters each consisting of only 1 node and none having quorum.
>> Right. Make sure that your switch doesn't die. Or use more than one switch.
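
The quorum arithmetic behind the two cases above (a two-node split, and three nodes each isolated by a dead switch) can be sketched in a few lines of Python. This assumes a plain strict-majority rule; has_quorum() and partitions_with_quorum() are hypothetical helpers, and the cluster software's actual quorum calculation may differ.

```python
def has_quorum(visible_nodes: int, total_nodes: int) -> bool:
    """A partition has quorum only if it sees a strict majority of all nodes."""
    return 2 * visible_nodes > total_nodes


def partitions_with_quorum(partition_sizes):
    """Return the sizes of the partitions that hold quorum after a split."""
    total = sum(partition_sizes)
    return [size for size in partition_sizes if has_quorum(size, total)]


print(partitions_with_quorum([1, 1]))     # []  two-node cluster, split brain
print(partitions_with_quorum([1, 1, 1]))  # []  three nodes, heartbeat switch dead
print(partitions_with_quorum([2, 1]))     # [2] three nodes, one node isolated
```

Under this rule at most one partition can ever hold quorum, and in the first two cases none does, so the no-quorum-policy (plus fencing) decides what happens.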
>>
>>> If I think of a split brain, would it not be the best action to move 
>>> all resources to the DC and have the other node(s) shut down? Is 
>>> this not possible to configure?
>> No.
>>
>> Thanks,
>>
>> Dejan
>>
>>> Thanks for your help
>>>
>>> Kolja
>>>
>>> Management: Dr. Michael Fischer, Reinhard Eisebitt
>>> Amtsgericht Köln HRB 32356
>>> Tax no.: 217/5717/0536
>>> VAT ID no.: DE 204051920
>>> --
>>> This email transmission and any documents, files or previous email 
>>> messages attached to it may contain information that is confidential 
>>> or legally privileged. If you are not the intended recipient or a 
>>> person responsible for delivering this transmission to the intended 
>>> recipient, you are hereby notified that any disclosure, copying, 
>>> printing, distribution or use of this transmission is strictly 
>>> prohibited. If you have received this transmission in error, please 
>>> immediately notify the sender by telephone or return email and 
>>> delete the original transmission and its attachments without reading or saving in any manner.
>>>
>>> _______________________________________________
>>> Linux-HA mailing list
>>> Linux-HA at lists.linux-ha.org
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems


