[Linux-HA] crm_mon confused?
Dejan Muhamedagic
dejanmm at fastmail.fm
Thu Nov 1 13:38:24 MDT 2007
Hi,
On Thu, Nov 01, 2007 at 02:44:08PM -0400, Ian Turner wrote:
> List,
>
> I seem to have found the issue behind the mysteriously curt output of
> cibadmin -Ql and crm_mon -r1 on one of my two nodes. For some reason the
> problematic machine isn't able to retrieve an updated version of the cib.
>
> From Anthony (the good node):
> Nov 1 11:34:23 anthony heartbeat: [2785]: info: Retransmitting pkt 98802
> Nov 1 11:34:23 anthony heartbeat: [2785]: info: msg size =3773, type=cib
> Nov 1 11:34:23 anthony heartbeat: [2785]: debug: rexmit request from node brutus for msg(98802-98802)
> Nov 1 11:34:24 anthony last message repeated 4 times
> Nov 1 11:34:24 anthony heartbeat: [2785]: info: Retransmitting pkt 98802
> Nov 1 11:34:24 anthony heartbeat: [2785]: info: msg size =3773, type=cib
> Nov 1 11:34:24 anthony heartbeat: [2785]: debug: rexmit request from node brutus for msg(98802-98802)
>
> This goes on, and on, and on.
Strange. In that case the other node (brutus) should be logging
complaints about garbled messages, or about not receiving them at all.
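To get a feel for how persistent the retransmit loop is, a rough sketch
like the one below could tally the rexmit requests per node and sequence
range. It assumes the log lines have the same shape as the excerpt quoted
above; the function name and the sample text are purely illustrative, and
syslog's "last message repeated N times" lines are not counted.

```python
import re
from collections import Counter

# Pattern modelled on the "rexmit request" debug line in the quoted excerpt.
# \s+ also tolerates a line that was wrapped after "node".
REXMIT_RE = re.compile(
    r"rexmit request from node\s+(\S+)\s+for msg\((\d+)-(\d+)\)"
)

def count_rexmit_requests(log_text):
    """Return a Counter keyed by (node, first_seq, last_seq)."""
    counts = Counter()
    for match in REXMIT_RE.finditer(log_text):
        node = match.group(1)
        first_seq = int(match.group(2))
        last_seq = int(match.group(3))
        counts[(node, first_seq, last_seq)] += 1
    return counts

# Illustrative sample only; in practice the text would come from the
# syslog file on the node that is doing the retransmitting.
sample = """\
Nov 1 11:34:23 anthony heartbeat: [2785]: info: Retransmitting pkt 98802
Nov 1 11:34:23 anthony heartbeat: [2785]: debug: rexmit request from node brutus for msg(98802-98802)
Nov 1 11:34:24 anthony heartbeat: [2785]: info: Retransmitting pkt 98802
Nov 1 11:34:24 anthony heartbeat: [2785]: debug: rexmit request from node brutus for msg(98802-98802)
"""

counts = count_rexmit_requests(sample)
for (node, first_seq, last_seq), n in counts.most_common():
    print(f"{node}: {n} rexmit request(s) for msg({first_seq}-{last_seq})")
```

A single sequence number that keeps accumulating requests from one node
would match the pattern described above, where the same packet is asked
for again and again.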
> Running cibadmin -Ss on Brutus (the problematic node) sometimes hangs,
> sometimes not. Killing the cib process on brutus seems to have no effect.
>
> There are no firewalls installed (iptables module is not even present), the
> machines are on a common Ethernet, and ifconfig reports no receive or
> transmit errors. Network has zero packet loss and excellent bandwidth and
> latency.
>
> Any other thoughts about this?
Which version of heartbeat is this?
This looks like a bug to me. Could you file a bugzilla entry for
it? It would be best to attach a full report generated with
hb_report.
Thanks,
Dejan
> --Ian Turner