[Linux-HA] reconfiguring network interfaces causes split brain

Raoul Bhatia [IPAX] r.bhatia at ipax.at
Wed Sep 26 02:51:19 MDT 2007


hello,

i'll try to keep things short, so please do not consider it rude:

2 nodes (debian 4.0): eth0 = external; eth1 = heartbeat (hb) channel

the cluster was in this state:

> Current DC: webcluster02 (917954cd-0285-4fcd-9cd2-671736c4de66)
> 2 Nodes configured.
> ...
> Node: webcluster01 (49e81295-8e2f-4aeb-98f3-a14de6f62298): online
> Node: webcluster02 (917954cd-0285-4fcd-9cd2-671736c4de66): online

on webcluster01 i ran /etc/init.d/networking restart, which caused:
> Sep 26 10:33:59 webcluster01 kernel: [78706.462355] tg3: eth1: Link is down.
> Sep 26 10:34:02 webcluster01 kernel: [78708.758368] tg3: eth1: Link is up at 1000 Mbps, full duplex.
> Sep 26 10:34:02 webcluster01 kernel: [78708.764028] tg3: eth1: Flow control is on for TX and on for RX.
> Sep 26 10:34:29 webcluster01 heartbeat: [31919]: WARN: node webcluster02: is dead
> Sep 26 10:34:29 webcluster01 heartbeat: [31919]: info: Link webcluster02:eth1 dead.
> Sep 26 10:34:29 webcluster01 crmd: [31937]: notice: crmd_ha_status_callback: Status update: Node webcluster02 now has status [dead]
> Sep 26 10:34:29 webcluster01 ccm: [31932]: info: Break tie for 2 nodes cluster
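
to make the timing in the log easier to read, here is a small sketch that computes the gaps between the three events. the timestamps are copied from the lines above; the event names are made up for illustration, and the year is taken from the date of this mail. i do not claim anything about the configured deadtime here, the numbers only show how long each phase lasted.

```python
from datetime import datetime

# timestamps copied from the syslog lines above; the labels are illustrative only
EVENTS = {
    "link_down": "Sep 26 10:33:59",
    "link_up": "Sep 26 10:34:02",
    "declared_dead": "Sep 26 10:34:29",
}

def parse(stamp, year=2007):
    """Parse a syslog-style timestamp; syslog omits the year, so it is supplied."""
    return datetime.strptime(f"{stamp} {year}", "%b %d %H:%M:%S %Y")

times = {name: parse(stamp) for name, stamp in EVENTS.items()}

# how long eth1 was actually without link
outage = (times["link_up"] - times["link_down"]).total_seconds()
# how long after the link came back the peer was declared dead
dead_after_link_up = (times["declared_dead"] - times["link_up"]).total_seconds()
# time from link loss to the "is dead" warning
dead_after_link_down = (times["declared_dead"] - times["link_down"]).total_seconds()

print(f"link outage:             {outage:.0f}s")
print(f"dead after link came up: {dead_after_link_up:.0f}s")
print(f"dead after link loss:    {dead_after_link_down:.0f}s")
```

so the link itself was only gone for a few seconds, yet webcluster02 was declared dead well after eth1 was up again.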

now crm_mon shows a split-brain situation:

webcluster01:
> Current DC: webcluster01 (49e81295-8e2f-4aeb-98f3-a14de6f62298)
> 2 Nodes configured.
> ...
> Node: webcluster01 (49e81295-8e2f-4aeb-98f3-a14de6f62298): online
> Node: webcluster02 (917954cd-0285-4fcd-9cd2-671736c4de66): OFFLINE
                                                              ^^^^^^^

webcluster02:
> Current DC: webcluster02 (917954cd-0285-4fcd-9cd2-671736c4de66)
> 2 Nodes configured.
> ...
> Node: webcluster01 (49e81295-8e2f-4aeb-98f3-a14de6f62298): online
> Node: webcluster02 (917954cd-0285-4fcd-9cd2-671736c4de66): online
                                                              ^^^^^^
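
the disagreement between the two views can also be spotted mechanically. below is a rough sketch, not part of heartbeat or crm_mon: it only parses text in the format pasted above, and the function names (parse_view, diff_views) are hypothetical helpers made up for this example. it assumes the crm_mon output of each node has already been collected somehow.

```python
import re

# matches "Node: <name> (<uuid>): <status>", with or without a leading "> "
NODE_RE = re.compile(r"Node:\s+(\S+)\s+\([0-9a-f-]+\):\s+(\S+)")
DC_RE = re.compile(r"Current DC:\s+(\S+)")

def parse_view(text):
    """Return (dc, {node: status}) from crm_mon-style text."""
    dc_match = DC_RE.search(text)
    dc = dc_match.group(1) if dc_match else None
    nodes = {name: status.lower() for name, status in NODE_RE.findall(text)}
    return dc, nodes

def diff_views(views):
    """views maps hostname -> crm_mon text; return a list of disagreements."""
    parsed = {host: parse_view(text) for host, text in views.items()}
    problems = []
    # a healthy cluster has exactly one DC, so two different DCs is a warning sign
    dcs = {dc for dc, _ in parsed.values() if dc}
    if len(dcs) > 1:
        problems.append("more than one DC: " + ", ".join(sorted(dcs)))
    # every node should be reported with the same status from every viewpoint
    all_nodes = sorted({n for _, nodes in parsed.values() for n in nodes})
    for node in all_nodes:
        seen = {host: nodes.get(node) for host, (_, nodes) in parsed.items()}
        if len(set(seen.values())) > 1:
            detail = ", ".join(f"{h} sees {s}" for h, s in sorted(seen.items()))
            problems.append(f"{node}: {detail}")
    return problems

# the two views from this mail
VIEW_01 = """\
> Current DC: webcluster01 (49e81295-8e2f-4aeb-98f3-a14de6f62298)
> 2 Nodes configured.
> Node: webcluster01 (49e81295-8e2f-4aeb-98f3-a14de6f62298): online
> Node: webcluster02 (917954cd-0285-4fcd-9cd2-671736c4de66): OFFLINE
"""
VIEW_02 = """\
> Current DC: webcluster02 (917954cd-0285-4fcd-9cd2-671736c4de66)
> 2 Nodes configured.
> Node: webcluster01 (49e81295-8e2f-4aeb-98f3-a14de6f62298): online
> Node: webcluster02 (917954cd-0285-4fcd-9cd2-671736c4de66): online
"""

problems = diff_views({"webcluster01": VIEW_01, "webcluster02": VIEW_02})
for line in problems:
    print(line)
```

for the two views above this reports two DCs and the conflicting status of webcluster02, which is exactly the split-brain symptom.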

Q: how do i resolve this issue without restarting heartbeat? shouldn't
there be a check to avoid this kind of split-brain situation?

do you need any further information?

cheers,
raoul bhatia

ps: i do not use stonith yet, as i do not want it to shoot nodes while i am
still working out configuration errors :)
-- 
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc.          email.          r.bhatia at ipax.at
Technischer Leiter

IPAX - Aloy Bhatia Hava OEG         web.          http://www.ipax.at
Barawitzkagasse 10/2/2/11           email.            office at ipax.at
1190 Wien                           tel.               +43 1 3670030
FN 277995t HG Wien                  fax.            +43 1 3670030 15
____________________________________________________________________


