[Linux-HA] Message hist queue is filling up

Matt Wilder grewaru at gmail.com
Tue Dec 5 10:33:08 MST 2006


Greetings,

I applied the patch Oren pointed to (quoted below) with no issue.  I have
installed the patched version and restarted heartbeat on both nodes, and
the 99% CPU issue appears to be gone.  However, I am still getting the
following messages in syslog, and it seems as if resource handover isn't
working quite right.  Can anyone tell me what these messages mean?
I can provide more logs if necessary.

Thanks.

Primary Node (active):
Dec  5 12:25:49 glider1 lrmd: [886]: WARN: G_SIG_dispatch: Dispatch
function for SIGCHLD was delayed 1000 ms (> 100 ms) before being
called (GSource: 0x522418)
Dec  5 12:25:49 glider1 crmd: [888]: WARN:
do_dc_join_finalize:join_dc.c join-2: We are still in a transition.
Delaying until the TE completes.
Dec  5 12:25:49 glider1 crmd: [888]: WARN:
do_dc_join_finalize:join_dc.c join-2: We are still in a transition.
Delaying until the TE completes.
Dec  5 12:25:51 glider1 tengine: [899]: notice: run_graph:graph.c
Transition 1: (Complete=18, Pending=0, Fired=0, Skipped=2,
Incomplete=0)
Dec  5 12:29:52 glider1 heartbeat: [837]: ERROR: Message hist queue is
filling up (151 messages in queue)
Dec  5 12:29:54 glider1 heartbeat: [837]: ERROR: Message hist queue is
filling up (152 messages in queue)
Dec  5 12:29:56 glider1 heartbeat: [837]: ERROR: Message hist queue is
filling up (153 messages in queue)
Dec  5 12:29:58 glider1 heartbeat: [837]: ERROR: Message hist queue is
filling up (154 messages in queue)

Secondary node:
Dec  5 12:30:03 glider2 heartbeat: [559]: ERROR: Irretrievably lost
packet: node glider1.domainit.com seq 135
Dec  5 12:30:03 glider2 heartbeat: [559]: ERROR: Irretrievably lost
packet: node glider1.domainit.com seq 135
Dec  5 12:30:18 glider2 heartbeat: [559]: ERROR: Irretrievably lost
packet: node glider1.domainit.com seq 143
Dec  5 12:30:28 glider2 heartbeat: [559]: ERROR: Irretrievably lost
packet: node glider1.domainit.com seq 148
Dec  5 12:30:34 glider2 heartbeat: [559]: ERROR: Irretrievably lost
packet: node glider1.domainit.com seq 151
Dec  5 12:30:39 glider2 heartbeat: [559]: ERROR: Irretrievably lost
packet: node glider1.domainit.com seq 153
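
For anyone trying to line up the two logs, here is a minimal sketch (not
part of heartbeat) that pulls the queue depth and the lost sequence
numbers out of syslog lines.  It assumes each syslog entry is on a single
line (the excerpts above are wrapped by the mail client), and the regular
expressions are based only on the message text quoted in this post.  The
function name summarize() is made up for illustration.

```python
import re

# Patterns match the message text quoted in this thread; they are an
# assumption about the format, not taken from the heartbeat source.
QUEUE_RE = re.compile(r"Message hist queue is filling up \((\d+) messages in queue\)")
LOST_RE = re.compile(r"Irretrievably lost packet: node (\S+) seq (\d+)")


def summarize(lines):
    """Return (queue depths in log order, sorted unique lost seq numbers)."""
    depths = []
    lost = set()
    for line in lines:
        m = QUEUE_RE.search(line)
        if m:
            depths.append(int(m.group(1)))
            continue
        m = LOST_RE.search(line)
        if m:
            lost.add(int(m.group(2)))
    return depths, sorted(lost)


# Sample entries copied from the excerpts above, unwrapped to one line each.
sample = [
    "Dec  5 12:29:52 glider1 heartbeat: [837]: ERROR: Message hist queue is filling up (151 messages in queue)",
    "Dec  5 12:29:54 glider1 heartbeat: [837]: ERROR: Message hist queue is filling up (152 messages in queue)",
    "Dec  5 12:30:03 glider2 heartbeat: [559]: ERROR: Irretrievably lost packet: node glider1.domainit.com seq 135",
    "Dec  5 12:30:03 glider2 heartbeat: [559]: ERROR: Irretrievably lost packet: node glider1.domainit.com seq 135",
    "Dec  5 12:30:18 glider2 heartbeat: [559]: ERROR: Irretrievably lost packet: node glider1.domainit.com seq 143",
]

depths, lost = summarize(sample)
print("queue depths:", depths)
print("lost seqs:", lost)
```

Feeding it the full syslog from each node should show whether the queue
keeps growing and which sequence numbers the other node gave up on.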



On 11/30/06, Matt Wilder <grewaru at gmail.com> wrote:
> I will look into this, as I am also having the 99% cpu issue.
>
> Any idea whether this will make it into a release?
>
>
> On 11/30/06, Oren Nechushtan <oren at forescout.com> wrote:
> > Hi,
> > We've encountered something like that in the past.
> > Check out the messages titled "[Linux-HA] RE: 99% CPU heartbeat & rexmit (seqno too low)"
> > from September 2006. The (unofficial) patch there solved it for us thought it may require minor changes to date.
> >
> > Best,
> > Oren.
> >
> > > -----Original Message-----
> > > From: linux-ha-bounces at lists.linux-ha.org
> > > [mailto:linux-ha-bounces at lists.linux-ha.org]On Behalf Of Matt Wilder
> > > Sent: Thursday, November 30, 2006 8:03 PM
> > > To: General Linux-HA mailing list
> > > Subject: Re: [Linux-HA] Message hist queue is filling up
> > >
> > >
> > > What would cause this to happen?  There are no network connectivity
> > > issues between the two nodes.
> > >
> > > On 11/30/06, Serge Dubrouski <sergeyfd at gmail.com> wrote:
> > > > Lost packets between nodes in cluster.
> > > >
> > > > On 11/30/06, Matt Wilder <grewaru at gmail.com> wrote:
> > > > > > Can anyone tell me what is causing the following messages to show up
> > > > > > in syslog from heartbeat?  I have checked network connectivity between
> > > > > > the two machines in my cluster and everything looks fine.  These
> > > > > > messages are occurring on a semi-frequent basis and do not seem to be
> > > > > > stopping.
> > > > >
> > > > > Node1 syslog (currently serving all resources):
> > > > > > Nov 28 18:06:36 glider1 heartbeat: [80229]: ERROR: Message hist queue is filling up (196 messages in queue)
> > > > > > Nov 28 18:06:38 glider1 heartbeat: [80229]: ERROR: Message hist queue is filling up (197 messages in queue)
> > > > > > Nov 28 18:06:40 glider1 heartbeat: [80229]: ERROR: Message hist queue is filling up (198 messages in queue)
> > > > > > Nov 28 18:06:42 glider1 heartbeat: [80229]: ERROR: Message hist queue is filling up (199 messages in queue)
> > > > > > Nov 28 18:06:44 glider1 heartbeat: [80229]: ERROR: Message hist queue is filling up (200 messages in queue)
> > > > > > Nov 28 18:06:50 glider1 last message repeated 3 times
> > > > > > Nov 28 18:06:50 glider1 heartbeat: [80229]: ERROR: Cannot rexmit pkt 614508 for glider2.domainit.com: seqno too low
> > > > > > Nov 28 18:06:52 glider1 heartbeat: [80229]: ERROR: Message hist queue is filling up (200 messages in queue)
> > > > > > Nov 28 18:06:56 glider1 last message repeated 2 times
> > > > > > Nov 28 18:06:56 glider1 heartbeat: [80229]: ERROR: Cannot rexmit pkt 614511 for glider2.domainit.com: seqno too low
> > > > > > Nov 28 18:06:58 glider1 heartbeat: [80229]: ERROR: Message hist queue is filling up (200 messages in queue)
> > > > > > Nov 28 18:07:06 glider1 last message repeated 4 times
> > > > >
> > > > >
> > > > > Node2 syslog:
> > > > > > Nov 28 18:05:05 glider2 heartbeat: [568]: ERROR: Irretrievably lost packet: node glider1.domainit.com seq 614508
> > > > > > Nov 28 18:05:11 glider2 heartbeat: [568]: ERROR: Irretrievably lost packet: node glider1.domainit.com seq 614511
> > > > > _______________________________________________
> > > > > Linux-HA mailing list
> > > > > Linux-HA at lists.linux-ha.org
> > > > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > > > > See also: http://linux-ha.org/ReportingProblems
> > > > >

