[Linux-HA] Bad node on a 2 node cluster

vini.bill at gmail.com
Fri Dec 22 15:10:12 MST 2006


Hi guys. I'm new to high availability, so I might be misunderstanding some
concepts.

Anyway, I've got a 2-node cluster (namely DB01 and DB02). On the main node
I'm able to run heartbeat perfectly, but it's constantly complaining about
the second node being bad. Take a look at my /var/log/messages:

Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[0] : [t=reqnodes]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[1] : [dest=db01]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[2] : [(1)destuuid=0x8122650(37 28)]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[3] : [src=db02]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[4] : [(1)srcuuid=0x8123618(36 27)]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[5] : [seq=1cf9]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[6] : [hg=18]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[7] : [ts=458c544e]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[8] : [ld=0.00 0.00 0.001/138 7707]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[9] : [ttl=4]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG[10] : [auth=2 69c27482]
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: process_status_message: bad node [db02] in message
Dec 22 19:51:41 DB01 heartbeat: [6160]: ERROR: MSG: Dumping message with 12 fields

What does that mean? I've tried every configuration setup I could here and
I still get the same message. How can I fix it?

Each machine I'm running heartbeat on has 2 Ethernet devices:

On the primary node:
eth0 = 10.0.0.1
eth1 = 192.168.0.96

And on the secondary:
eth0 = 10.0.0.2
eth1 = 192.168.0.234

my ha.cf:

keepalive 1
deadtime 5
warntime 3
initdead 120
udpport 694
ping 192.168.0.1
bcast eth1
auto_failback on
node DB01
node BD02
respawn hacluster /usr/lib/heartbeat/ipfail
use_logd yes
logfacility local7
logfile /var/log/ha-log
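
One cross-check that might help narrow this down is to compare the node
directives in ha.cf with the name heartbeat reports in the "bad node" line.
The following is only a minimal sketch in Python and is not part of
heartbeat: the function names are made up for illustration, and the
case-insensitive comparison is an assumption based on the log printing db01
and db02 in lowercase while the hosts are written DB01 and DB02.

```python
import re


def node_names(ha_cf_text):
    """Return every name listed on a 'node' directive, in file order."""
    names = []
    for line in ha_cf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "node":
            names.extend(parts[1:])
    return names


def bad_node(log_text):
    """Extract the node name from a 'bad node [name]' log message, if any."""
    match = re.search(r"bad\s+node\s+\[([^\]]+)\]", log_text)
    return match.group(1) if match else None


def is_configured(name, ha_cf_text):
    """Assumption: node names are compared without regard to case."""
    return name.lower() in {n.lower() for n in node_names(ha_cf_text)}


# The two node directives and the error line exactly as pasted in this post.
HA_CF = """\
node DB01
node BD02
"""
LOG = "ERROR: process_status_message: bad node [db02] in message"

if __name__ == "__main__":
    name = bad_node(LOG)
    print(name, "configured:", is_configured(name, HA_CF))
```

With the directives as pasted, db02 comes back as not configured, because
the second directive reads BD02 rather than DB02.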

and my haresources:
#DB01  192.168.0.143
DB01 192.168.0.143/16/eth1
#DB02 192.168.0.143/16/eth1 drbd
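
To make the resource entry easier to read, here is a small Python sketch
that splits it apart and checks it against the eth1 addresses listed
earlier. It assumes the three slash-separated fields are address, prefix
length, and interface; the helper name parse_ip_resource is hypothetical,
and the real netmask configured on eth1 is not stated in this post.

```python
import ipaddress


def parse_ip_resource(spec):
    """Split 'address/prefix/interface' into address, network, interface."""
    address, prefix, interface = spec.split("/")
    # strict=False lets us derive the network from a host address.
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    return ipaddress.ip_address(address), network, interface


# Values copied from the haresources line and the eth1 addresses above.
vip, network, iface = parse_ip_resource("192.168.0.143/16/eth1")
node_addrs = {"DB01": "192.168.0.96", "DB02": "192.168.0.234"}

if __name__ == "__main__":
    print("virtual IP", vip, "on", iface, "in", network)
    for node, addr in node_addrs.items():
        print(node, addr, ipaddress.ip_address(addr) in network)
```

Both eth1 addresses fall inside the /16 network derived from the entry, so
under that assumption the virtual IP is at least on the same network as the
interface it is assigned to.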

On both nodes eth0 is used for drbd (version 0.7.18) sync, and eth1 for
internet and the heartbeat connection. I'm also considering using the new
Cluster Resource Manager XML configuration, but I'm still struggling to
learn how to use it; any tips are appreciated. I'm using Heartbeat version
2.0.5 on SLES version 10 (no updates, since I'm behind a proxy that blocks
them).

Thanks for the great software!

... Vinicius Menezes ...

