[Linux-HA] Is the following logic by design for the Split-Brain Case ? If yes - can i disable it ?

Harakiri harakiri_23 at yahoo.com
Thu Sep 6 05:47:46 MDT 2007


Sorry, I haven't been able to reply until now.

We are using heartbeat-2 v. 2.0.7-2, which is the
stable version from Debian etch.

As far as I can see, your current release is 2.1.2,
which differs only in the minor and patch numbers.

Did you fix any issues between 2.0.7-2 and 2.1.2 regarding
the split-brain case?

Given the information you have now, would you say
that it is definitely a bug in heartbeat if the logs, as
you said, look fine?
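
To make "the logs look fine" concrete, here is a minimal sketch (Python,
my own illustration, not part of heartbeat) of the check on the DC
election: it scans crmd log text for the "Set DC to" lines quoted below
and reports whether every node last named the same DC. The function
names and the regular expression are assumptions based only on the two
quoted log lines; real logs may be formatted differently.

```python
import re

# Assumed pattern, derived only from the two crmd lines quoted in this thread.
# \s+ tolerates the line wrapping that mail clients introduce.
DC_LINE = re.compile(
    r"(?P<host>\S+)\s+crmd:\s+\[\d+\]:\s+info:\s+"
    r"update_dc:utils\.c\s+Set DC to\s+(?P<dc>\S+)"
)


def last_dc_per_node(log_text):
    """Map each reporting host to the last DC it logged (hypothetical helper)."""
    result = {}
    for match in DC_LINE.finditer(log_text):
        result[match.group("host")] = match.group("dc")
    return result


def nodes_agree_on_dc(log_text):
    """True if at least one node reported a DC and all nodes named the same one."""
    dcs = set(last_dc_per_node(log_text).values())
    return len(dcs) == 1


sample = """\
Aug 25 04:35:53 server1 crmd: [24516]: info: update_dc:utils.c Set DC to server1 (1.0.6)
Aug 25 04:35:55 server2 crmd: [19821]: info: update_dc:utils.c Set DC to server1 (1.0.6)
"""

if __name__ == "__main__":
    print(last_dc_per_node(sample))
    print("nodes agree on DC:", nodes_agree_on_dc(sample))
```

Agreement on the DC only says that the crmd on each node logged the same
election result; it does not explain why crm_mon on server2 still shows
both nodes as offline.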

Thanks

--- Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:

> Hi,
> 
> On Fri, Aug 31, 2007 at 09:05:54AM -0700, Harakiri wrote:
> > Hello,
> > 
> > > It is most probably a bug. The cluster should be
> > > able to recover
> > > from split brain. Please post the logs.
> > > 
> > > Dejan
> > 
> > Attached to this message are the log files.
> > 
> > server1_network_down.txt - the log of server1 when the
> > network went down
> > 
> > server2_network_down.txt - the log of server2 when the
> > network went down
> > 
> > server1_network_restored.txt - the log of server1 when
> > the network has been restored
> > 
> > server2_network_restored.txt - the log of server2 when
> > the network has been restored
> > 
> > resource_my_service = the service which has been
> > configured for heartbeat
> 
> I read the logs and everything there looks fine. I don't
> know why crm_mon shows the nodes as offline on that one
> node. In the logs, the nodes claimed to have elected a DC:
> 
> Aug 25 04:35:53 server1 crmd: [24516]: info: update_dc:utils.c Set DC to server1 (1.0.6)
> Aug 25 04:35:55 server2 crmd: [19821]: info: update_dc:utils.c Set DC to server1 (1.0.6)
> 
> which they wouldn't do unless they established the
> current membership.
> 
> BTW, this seems to be a somewhat older version of Heartbeat.
> Perhaps you can upgrade to the latest stable (2.1.2).
> 
> Thanks,
> 
> Dejan
> 
> > Thanks for your help
> > 
> > 
> > --- Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> > 
> > > Hi,
> > > 
> > > On Fri, Aug 31, 2007 at 07:36:34AM -0700, Harakiri wrote:
> > > > Hello List,
> > > > 
> > > > Suppose I have a cluster with 2 members, let's call
> > > > them server1 and server2.
> > > > 
> > > > Before a network outage, a service configured for
> > > > heartbeat is running on server2 only, and crm_mon
> > > > shows both nodes as online on both servers.
> > > > 
> > > > After the network outage, server1 starts up the
> > > > same service as server2. This makes sense, as it is
> > > > the expected behaviour of a failover.
> > > > 
> > > > During this time the service is running on both
> > > > servers because they do not "see each other".
> > > > 
> > > > After a few hours, the network is restored. server1
> > > > sees that server2 is already (or still) running the
> > > > service in question, so it disables the service.
> > > > 
> > > > server1 shows both nodes online and that the service
> > > > is running on server2.
> > > > 
> > > > However, on server2 both nodes show as offline, but the
> > > > service in question is still running and managed by
> > > > heartbeat.
> > > > 
> > > > Is this the expected behaviour for a split-brain
> > > > situation? I.e. is the node (server2) deliberately not
> > > > activated after a split brain, to be sure there is no
> > > > inconsistency?
> > > > 
> > > > For the record, after restarting heartbeat on server2,
> > > > crm_mon shows both nodes as online.
> > > > 
> > > > If this is the expected behaviour, can I disable it?
> > > > The service in question can run after a split brain:
> > > > no harm will be done and no inconsistency can exist.
> > > > 
> > > > Or is it a bug that, after a network outage and
> > > > restoration of the network, server2 shows both nodes
> > > > (itself and server1) as offline, and only a restart
> > > > restores them to online status?
> > > 
> > > It is most probably a bug. The cluster should be
> > > able to recover
> > > from split brain. Please post the logs.
> > > 
> > > Dejan
> > > 
> > > > Thank you for your input
> > > > 
> > > > _______________________________________________
> > > > Linux-HA mailing list
> > > > Linux-HA at lists.linux-ha.org
> > > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > > > See also: http://linux-ha.org/ReportingProblems



       


