[Linux-HA] Three Node Cluster Problem
Mc Linux
mrmclinux at gmail.com
Fri Nov 3 03:22:40 MST 2006
Hi!
I've got the following configuration:
Three home PCs (hb1, hb2, hb3), each with one Ethernet interface. (This
cluster is for testing purposes.)
Installed OS: SuSE Linux Enterprise Server 9.0
Heartbeat: 2.0.7 rpm package made with "ConfigureMe package"
The same ha.cf on each machine:
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
mcast eth0 225.0.0.1 694 1 0
auto_failback off
node hb1
node hb2
node hb3
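As a minimal sketch, the ha.cf above can be read into a small structure to confirm which nodes and timing values the cluster is configured with. The function name parse_ha_cf is hypothetical, and the parser assumes only the simple "directive value ..." line format shown here (plus "#" comments), not the full ha.cf syntax.

```python
# Hypothetical helper: reads ha.cf-style text into directives and a node list.
HA_CF = """\
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
mcast eth0 225.0.0.1 694 1 0
auto_failback off
node hb1
node hb2
node hb3
"""


def parse_ha_cf(text):
    """Return (directives, nodes) parsed from ha.cf-style text."""
    directives = {}
    nodes = []
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        key, *values = line.split()
        if key == "node":
            # A node line may name one or more nodes.
            nodes.extend(values)
        else:
            directives[key] = values
    return directives, nodes


directives, nodes = parse_ha_cf(HA_CF)
print("nodes:", nodes)
print("deadtime (s):", directives["deadtime"][0])
```

Run against the configuration above, this lists three nodes, which is the one difference from a two-node setup.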
The same haresources on each machine:
hb1 192.168.70.155 saslauthd
I'm just using saslauthd for testing.
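To make the haresources line concrete, here is a small sketch that splits it into the preferred node and its resources. The function name parse_haresources_line is hypothetical; treating a bare IP address as an IPaddr resource matches what the logs in this post show (/etc/ha.d/resource.d/IPaddr is run for 192.168.70.155), but the sketch ignores the "::" argument syntax and line continuations.

```python
import ipaddress


def parse_haresources_line(line):
    """Split one haresources line into (preferred_node, resources).

    Each resource is returned as a (script, argument) pair. A bare IP
    address is treated as an IPaddr resource; anything else is treated
    as a plain resource script with no argument.
    """
    preferred_node, *tokens = line.split()
    resources = []
    for token in tokens:
        try:
            ipaddress.ip_address(token)
        except ValueError:
            resources.append((token, None))
        else:
            resources.append(("IPaddr", token))
    return preferred_node, resources


node, resources = parse_haresources_line("hb1 192.168.70.155 saslauthd")
print(node, resources)
```

Only hb1 is named as the owner; the line itself says nothing about which of the remaining nodes should take over.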
The same authkeys on each machine:
auth 3
3 md5 hello
When I start the cluster software on the machines, every machine can
see the other two. Logs from hb1:
heartbeat[10390]: 2006/10/30_16:41:21 info: Local status now set to: 'up'
heartbeat[10390]: 2006/10/30_16:41:22 info: Link hb2:eth0 up.
heartbeat[10390]: 2006/10/30_16:41:22 info: Status update for node hb2: status up
harc[10397]: 2006/10/30_16:41:22 info: Running /etc/ha.d/rc.d/status status
heartbeat[10390]: 2006/10/30_16:41:23 info: Link hb3:eth0 up.
heartbeat[10390]: 2006/10/30_16:41:23 info: Status update for node hb3: status up
If I turn off hb1, the other two machines both take over the service at the same time!
hb2 logs:
info: Link hb1:eth0 dead.
info: Running /etc/ha.d/rc.d/status status
info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys hb2] to acquire.
info: Taking over resource group 192.168.70.155
info: Acquiring resource group: hb1 192.168.70.155 saslauthd
INFO: IPaddr Resource is stopped
info: Running /etc/ha.d/resource.d/IPaddr 192.168.70.155 start
INFO: eval /sbin/ifconfig eth0:0 192.168.70.155 netmask 255.255.255.0 broadcast 192.168.70.255
INFO: Sending Gratuitous Arp for 192.168.70.155 on eth0:0 [eth0]
hb3 logs:
WARN: node hb1: is dead
WARN: No STONITH device configured.
WARN: Shared disks are not protected.
info: Resources being acquired from hb1.
info: Link hb1:eth0 dead.
info: Running /etc/ha.d/rc.d/status status
info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys hb3] to acquire.
info: Taking over resource group 192.168.70.155
info: Acquiring resource group: hb1 192.168.70.155 saslauthd
INFO: IPaddr Resource is stopped
info: Running /etc/ha.d/resource.d/IPaddr 192.168.70.155 start
INFO: eval /sbin/ifconfig eth0:0 192.168.70.155 netmask 255.255.255.0 broadcast 192.168.70.255
INFO: Sending Gratuitous Arp for 192.168.70.155 on eth0:0 [eth0]
After a few seconds they realize that both of them have acquired the resources:
ERROR: Both machines own our resources!
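The double takeover can also be spotted mechanically from the per-node logs. The following is a rough sketch, assuming the "Acquiring resource group:" message format seen in the excerpts above; the LOGS dictionary and the function name find_double_owners are made up for illustration, and the log lines are copied from this post.

```python
import re
from collections import defaultdict

# Log excerpts from the two surviving nodes, as quoted in this post.
LOGS = {
    "hb2": [
        "info: Link hb1:eth0 dead.",
        "info: Taking over resource group 192.168.70.155",
        "info: Acquiring resource group: hb1 192.168.70.155 saslauthd",
    ],
    "hb3": [
        "WARN: node hb1: is dead",
        "info: Taking over resource group 192.168.70.155",
        "info: Acquiring resource group: hb1 192.168.70.155 saslauthd",
    ],
}

# Captures the preferred node and the resource group that follows it.
ACQUIRE = re.compile(r"Acquiring resource group: (\S+) (.+)$")


def find_double_owners(logs):
    """Return {resource_group: [hosts]} for groups acquired by 2+ hosts."""
    owners = defaultdict(list)
    for host, lines in logs.items():
        for line in lines:
            match = ACQUIRE.search(line)
            if match:
                owners[match.group(2)].append(host)
    return {group: hosts for group, hosts in owners.items() if len(hosts) > 1}


conflicts = find_double_owners(LOGS)
for group, hosts in conflicts.items():
    print(f"{group!r} acquired by: {', '.join(hosts)}")
```

This only reports the symptom (hb2 and hb3 each acquiring hb1's resource group); it does not explain why both nodes decided to take over.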
What did I do wrong?
I think I missed something in ha.cf, but I don't know what. I have
dozens of working two-node clusters.
Please help me!
Thanks,
Mr McLinux