[Linux-HA] crm_mon confused?
Ian Turner
vectro at vectro.org
Thu Nov 1 12:44:08 MDT 2007
List,
I seem to have found the issue behind the mysteriously curt output of
cibadmin -Ql and crm_mon -r1 on one of my two nodes. For some reason the
problematic machine isn't able to retrieve an updated version of the cib.
From Anthony (the good node):
Nov 1 11:34:23 anthony heartbeat: [2785]: info: Retransmitting pkt 98802
Nov 1 11:34:23 anthony heartbeat: [2785]: info: msg size =3773, type=cib
Nov 1 11:34:23 anthony heartbeat: [2785]: debug: rexmit request from node brutus for msg(98802-98802)
Nov 1 11:34:24 anthony last message repeated 4 times
Nov 1 11:34:24 anthony heartbeat: [2785]: info: Retransmitting pkt 98802
Nov 1 11:34:24 anthony heartbeat: [2785]: info: msg size =3773, type=cib
Nov 1 11:34:24 anthony heartbeat: [2785]: debug: rexmit request from node brutus for msg(98802-98802)
This goes on, and on, and on.
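To put a number on the loop, the rexmit requests in the log can be tallied per node and sequence range. The following is only a rough sketch: the function name count_rexmit_requests and the sample text are made up for illustration, and the regular expression assumes the debug lines look exactly like the ones quoted above. It does not account for syslog's "last message repeated N times" lines, so the real count will be higher.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt above:
#   "... debug: rexmit request from node brutus for msg(98802-98802)"
REXMIT_RE = re.compile(
    r"rexmit request from node\s+(\S+)\s+for msg\((\d+)-(\d+)\)"
)


def count_rexmit_requests(log_text):
    """Count rexmit requests per (node, first_seq, last_seq)."""
    # Log lines pasted into mail are often wrapped, so collapse all
    # whitespace first; a request split across two lines still matches.
    flat = " ".join(log_text.split())
    counts = Counter()
    for node, first, last in REXMIT_RE.findall(flat):
        counts[(node, int(first), int(last))] += 1
    return counts


# Hypothetical sample: two of the wrapped debug lines from the log above.
SAMPLE = """\
Nov 1 11:34:23 anthony heartbeat: [2785]: debug: rexmit request from node
brutus for msg(98802-98802)
Nov 1 11:34:24 anthony heartbeat: [2785]: debug: rexmit request from node
brutus for msg(98802-98802)
"""

if __name__ == "__main__":
    for (node, first, last), n in count_rexmit_requests(SAMPLE).most_common():
        print(f"{node} msg({first}-{last}): {n} requests")
```

A single sequence range that dominates the tally would be consistent with one packet never arriving intact, rather than with general packet loss.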
Running cibadmin -Ss on brutus (the problematic node) sometimes hangs and
sometimes does not. Killing the cib process on brutus seems to have no effect.
There are no firewalls installed (the iptables module is not even present),
the machines are on a common Ethernet, and ifconfig reports no receive or
transmit errors. The network has zero packet loss and excellent bandwidth and
latency.
Any other thoughts about this?
--Ian Turner