[Linux-HA] Three Node Cluster Problem

Mc Linux mrmclinux at gmail.com
Fri Nov 3 03:22:40 MST 2006


Hi!

I've got the following configuration:

3 home PCs (hb1, hb2, hb3), each with one Ethernet interface. (This
cluster is for testing purposes.)
Installed OS: SuSE Linux Enterprise Server 9.0
Heartbeat: 2.0.7 rpm package made with "ConfigureMe package"

The same ha.cf on each machine:

keepalive 2
deadtime 30
warntime 10
initdead 120

udpport 694
mcast eth0 225.0.0.1 694 1 0
auto_failback off

node hb1
node hb2
node hb3
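
To double-check that every machine really has the same settings, the file can be read programmatically. The following is only a minimal sketch: `parse_ha_cf` is a hypothetical helper (not part of Heartbeat), and it assumes the simple "directive value" line layout shown above, with `#` starting a comment.

```python
# The ha.cf content from this post, embedded so the sketch is self-contained.
HA_CF = """\
keepalive 2
deadtime 30
warntime 10
initdead 120

udpport 694
mcast eth0 225.0.0.1 694 1 0
auto_failback off

node hb1
node hb2
node hb3
"""


def parse_ha_cf(text):
    """Return (directives, nodes) parsed from ha.cf-style text.

    Hypothetical helper: it assumes one directive per line, followed by
    its value, and treats everything after '#' as a comment.
    """
    directives = {}
    nodes = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # skip blank lines and comment-only lines
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "node":
            nodes.extend(value.split())  # collect every node name
        else:
            directives[key] = value
    return directives, nodes


directives, nodes = parse_ha_cf(HA_CF)
print("nodes:", nodes)
print("warntime < deadtime:",
      int(directives["warntime"]) < int(directives["deadtime"]))
```

Running the same function over the ha.cf from each machine and comparing the results would show whether the three copies actually match.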

The same haresources on each machine:

hb1 192.168.70.155 saslauthd

I'm just using saslauthd for testing.

The same authkeys on each machine:

auth 3

3 md5 hello

When I start the cluster software on the machines, every machine can
see the other two. Logs from hb1:

heartbeat[10390]: 2006/10/30_16:41:21 info: Local status now set to: 'up'
heartbeat[10390]: 2006/10/30_16:41:22 info: Link hb2:eth0 up.
heartbeat[10390]: 2006/10/30_16:41:22 info: Status update for node hb2: status up
harc[10397]:    2006/10/30_16:41:22 info: Running /etc/ha.d/rc.d/status status
heartbeat[10390]: 2006/10/30_16:41:23 info: Link hb3:eth0 up.
heartbeat[10390]: 2006/10/30_16:41:23 info: Status update for node hb3: status up

If I turn off hb1, the other two machines both take over the service at the same time!

hb2 logs:

info: Link hb1:eth0 dead.
info: Running /etc/ha.d/rc.d/status status
info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys hb2] to acquire.
info: Taking over resource group 192.168.70.155
info: Acquiring resource group: hb1 192.168.70.155 saslauthd
INFO: IPaddr Resource is stopped
info: Running /etc/ha.d/resource.d/IPaddr 192.168.70.155 start
INFO: eval /sbin/ifconfig eth0:0 192.168.70.155 netmask 255.255.255.0 broadcast 192.168.70.255
INFO: Sending Gratuitous Arp for 192.168.70.155 on eth0:0 [eth0]


hb3 logs:

WARN: node hb1: is dead
WARN: No STONITH device configured.
WARN: Shared disks are not protected.
info: Resources being acquired from hb1.
info: Link hb1:eth0 dead.
info: Running /etc/ha.d/rc.d/status status
info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys hb3] to acquire.
info: Taking over resource group 192.168.70.155
info: Acquiring resource group: hb1 192.168.70.155 saslauthd
INFO: IPaddr Resource is stopped
info: Running /etc/ha.d/resource.d/IPaddr 192.168.70.155 start
INFO: eval /sbin/ifconfig eth0:0 192.168.70.155 netmask 255.255.255.0 broadcast 192.168.70.255
INFO: Sending Gratuitous Arp for 192.168.70.155 on eth0:0 [eth0]

And after a few seconds they realize that both of them got the resources:

ERROR: Both machines own our resources!
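
The double takeover can also be spotted mechanically by comparing the logs of the surviving nodes. This is just a rough sketch: `find_double_takeovers` is a hypothetical helper, and the regular expression assumes the "Acquiring resource group" message format quoted above.

```python
import re
from collections import defaultdict

# Matches e.g. "info: Acquiring resource group: hb1 192.168.70.155 saslauthd"
# group(1) = configured owner from haresources, group(2) = the resources.
ACQUIRE = re.compile(r"Acquiring resource group: (\S+) (.+)$")


def find_double_takeovers(logs_by_node):
    """Return {resource group: [nodes]} for groups acquired by more than one node.

    Hypothetical helper: logs_by_node maps a node name to its log lines.
    """
    owners = defaultdict(set)
    for node, lines in logs_by_node.items():
        for line in lines:
            match = ACQUIRE.search(line)
            if match:
                owners[match.group(2).strip()].add(node)
    return {res: sorted(nodes) for res, nodes in owners.items() if len(nodes) > 1}


# Log lines taken from the hb2 and hb3 excerpts in this post.
logs = {
    "hb2": ["info: Acquiring resource group: hb1 192.168.70.155 saslauthd"],
    "hb3": ["info: Acquiring resource group: hb1 192.168.70.155 saslauthd"],
}
conflicts = find_double_takeovers(logs)
print(conflicts)  # {'192.168.70.155 saslauthd': ['hb2', 'hb3']}
```

With the excerpts above, both hb2 and hb3 are reported as having acquired the same resource group, which matches the error message.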

What did I do wrong?
I think I missed something in ha.cf, but I don't know what. I've
got dozens of working two-node clusters.

Please help me!

Thanks,

Mr McLinux

