[Linux-HA] crmd taking up 99% CPU on 1st machine to join cluster

Matt Wilder grewaru at gmail.com
Tue Dec 12 14:33:24 MST 2006


Greetings,

I am having a problem with heartbeat 2.0.7.  If I have more than 9
primitives in my resource group, crmd on the first node to join the
cluster takes up 99% (or more) of the CPU once the second node joins
the cluster.

I have 2 nodes in this cluster.  If I stop heartbeat on both and start
heartbeat on the 1st node, then the second node, crmd will go to 99%
on the first node once the second has fully joined.  If I start them
in reverse order (2 and then 1), the second node's crmd will go to
99% instead.  It does not matter which node is serving the services.
I set my services to not be managed to verify this.
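
For anyone trying to reproduce this, here is a minimal sketch of one way
to spot the symptom from a script.  It assumes the output of
`ps -axo pid,pcpu,comm`; the helper names `busy_processes` and `sample`
and the 90% threshold are illustrative assumptions, not part of
heartbeat.

```python
import os
import subprocess


def busy_processes(ps_output, name="crmd", threshold=90.0):
    """Return (pid, cpu) pairs for processes named `name` whose %CPU
    is at or above `threshold`.

    `ps_output` is expected to look like `ps -axo pid,pcpu,comm` output:
    a header row, then one "PID %CPU COMMAND" row per process.
    """
    hits = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid_text, cpu_text, command = fields
        try:
            pid = int(pid_text)
            cpu = float(cpu_text)
        except ValueError:
            continue  # ignore rows that do not parse
        # The command column may hold a full path on some systems.
        if os.path.basename(command.strip()) == name and cpu >= threshold:
            hits.append((pid, cpu))
    return hits


def sample():
    """Run ps once and report any crmd process at or above the threshold."""
    result = subprocess.run(
        ["ps", "-axo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return busy_processes(result.stdout)
```

Calling `sample()` on each node after the second node has fully joined
should show which node's crmd is the one spinning.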

Furthermore, this problem only presents itself once I have more than 9
primitives in my resource group.  All of my resources on this cluster
are part of the same group, as they all have to run on the same
node.  I have tried numerous combinations of primitives, with and
without monitoring, and the particular combination makes no difference.
Provided I have 9 or fewer primitives in the group, crmd runs fine.  If
I have 10 or more, crmd maxes out the CPU as described above.

This is a 64-bit FreeBSD system:

FreeBSD glider1.domainit.com 6.1-RELEASE-p3 FreeBSD 6.1-RELEASE-p3 #0:
Tue Jul 11 15:40:30 EDT 2006
root at gliderweb1.domainit.com:/usr/obj/usr/src/sys/GLIDERWEB1  amd64

I would appreciate any thoughts on this.  I can provide more details
regarding configuration if necessary.

Thanks.

