[Linux-HA] heartbeatlink dead after IP recovery test

Andreas Kurz akurz at sms.at
Tue Dec 12 06:07:38 MST 2006


Hello!

I am testing a two-node heartbeat cluster (2.0.7) on CentOS 4.2.

I ran "ifdown" on node clno01, and the configured IP resource recovers
the interface as expected. But after the specified deadtime, the
heartbeat link on the already recovered interface is declared dead
on that node:

heartbeat[16441]: 2006/12/12_12:05:45 info: Link clno02.intern.sms.at:bond0 dead

[clno01 ~]# cl_status hblinkstatus clno02.intern.sms.at bond0
dead

All other hblinks are up:

[clno01 ~]# cl_status hblinkstatus clno02.intern.sms.at bond1
up
[clno02 tmp]# cl_status hblinkstatus clno01.intern.sms.at bond0
up
[clno02 tmp]# cl_status hblinkstatus clno01.intern.sms.at bond1
up
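
The per-link checks above can also be scripted. The following is a
minimal Python sketch, not part of heartbeat: it only wraps the
`cl_status hblinkstatus NODE LINK` call shown above. The helper names
(run_cl_status, link_states, dead_links) are hypothetical, and it
assumes that cl_status is on the PATH and prints a single word such as
"up" or "dead".

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def run_cl_status(node: str, link: str) -> str:
    """Run `cl_status hblinkstatus NODE LINK` and return its stripped output."""
    result = subprocess.run(
        ["cl_status", "hblinkstatus", node, link],
        capture_output=True,
        text=True,
        check=False,  # do not raise on a non-zero exit code
    )
    return result.stdout.strip()


def link_states(
    node: str,
    links: Iterable[str],
    runner: Callable[[str, str], str] = run_cl_status,
) -> Dict[str, str]:
    """Map each link name to the status word reported for it."""
    return {link: runner(node, link) for link in links}


def dead_links(states: Dict[str, str]) -> List[str]:
    """Return the sorted names of all links whose status is not "up"."""
    return sorted(link for link, state in states.items() if state != "up")
```

Run on clno01 in the situation described here,
link_states("clno02.intern.sms.at", ["bond0", "bond1"]) would be
expected to report bond0 as dead and bond1 as up. The runner argument
exists only so the logic can be exercised without a live cluster.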


The "heartbeat: read: mcast bond0" process hangs while waiting to
receive data on a UDP socket from clno02:

[clno01 ~]# strace -f -s1024 -p 16446
Process 16446 attached - interrupt to quit
recvfrom(26,
[clno01 ~]# lsof -np 16446 |grep 26u
heartbeat 16446 nobody   26u  IPv4   18617292              UDP 224.0.0.160:ha-cluster

Is this a configuration problem, and is there a way to recover the dead
hblink without restarting heartbeat?

Regards,
Andreas



-------------- next part --------------

use_logd on
udpport 694

auto_failback off
autojoin any
compression zlib
conn_logd_time 30
coredumps false
debug 0
mcast bond0 224.0.0.160 694 1 0
mcast bond1 224.0.0.162 694 1 0

crm on

deadtime 60
initdead 120
keepalive 1
warntime 10

ping 192.168.20.5 192.168.20.10

traditional_compression false
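
For readers less familiar with ha.cf, the sketch below spells out how
the mcast and timing lines above are read. It is a hedged illustration
in Python, not heartbeat's own parser: the field order for mcast
(device, group, port, ttl, loop) follows the ha.cf documentation, the
names McastLink and parse_ha_cf are hypothetical, and it assumes that
all timing values are plain integers in seconds, as they are in this
configuration.

```python
from typing import Dict, List, NamedTuple, Tuple


class McastLink(NamedTuple):
    device: str   # interface the heartbeat is sent and received on
    group: str    # multicast group address
    port: int     # UDP port
    ttl: int      # multicast TTL
    loop: int     # 1 to loop packets back to the sender, 0 otherwise


# Excerpt of the configuration shown above.
HA_CF = """\
udpport 694
mcast bond0 224.0.0.160 694 1 0
mcast bond1 224.0.0.162 694 1 0
deadtime 60
initdead 120
keepalive 1
warntime 10
"""

TIMING_KEYS = ("keepalive", "deadtime", "warntime", "initdead")


def parse_ha_cf(text: str) -> Tuple[List[McastLink], Dict[str, int]]:
    """Extract mcast links and timing directives (assumed to be in seconds)."""
    links: List[McastLink] = []
    timings: Dict[str, int] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if fields[0] == "mcast" and len(fields) == 6:
            dev, group, port, ttl, loop = fields[1:]
            links.append(McastLink(dev, group, int(port), int(ttl), int(loop)))
        elif fields[0] in TIMING_KEYS and len(fields) == 2:
            timings[fields[0]] = int(fields[1])
    return links, timings


links, timings = parse_ha_cf(HA_CF)
# Number of consecutive keepalive intervals that fit into deadtime.
missed_before_dead = timings["deadtime"] // timings["keepalive"]
```

With keepalive 1 and deadtime 60, roughly 60 consecutive heartbeats
have to go missing on a link before it is reported dead, which matches
the delay between the interface recovery and the "dead" log message
described above.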

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cib.xml.gz
Type: application/x-gunzip
Size: 4985 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20061212/dc25ad87/cib.xml-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ha-log_clno01.gz
Type: application/x-gunzip
Size: 11804 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20061212/dc25ad87/ha-log_clno01-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ha-log_clno02.gz
Type: application/x-gunzip
Size: 18017 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20061212/dc25ad87/ha-log_clno02-0001.bin

