[Linux-HA] heartbeat waits for initdead even after all nodes have joined
Lars Ellenberg
lars.ellenberg at linbit.com
Fri Jan 15 09:53:30 MST 2010
On Fri, Jan 15, 2010 at 01:22:45AM -0500, David Sickmiller wrote:
> > > I don't have autojoin in my ha.cf, and I believe it defaults to
> > > "autojoin none", so that wouldn't explain why heartbeat keeps
> > > waiting after all nodes have joined.
> >
> > True. That should be fixed. Can you please open a bugzilla for
> > this issue,
>
> Thanks for your help! I've filed this as Bug 2311
> (http://developerbugs.linux-foundation.org/show_bug.cgi?id=2311)
Again:
maybe this is because dc-timeout defaults to initdead.
These are independent values,
so you can configure them independently.
Please try configuring an explicit dc-timeout
and see if that improves things.
The ha.cf initdead setting is for the heartbeat cluster communication
layer and ccm.
dc-timeout can be configured in the cib, using cibadmin, or the crm shell, or ...
You could have an initdead of 900
and a dc-timeout of 40 (just an example).
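
A rough sketch of that split, using the example values above. Two things
here are assumptions and not from this thread: the ha.cf path, and the
property name. The Pacemaker cluster option I know of for this timeout is
spelled dc-deadtime, so that is the name used below; check what your
version actually accepts before relying on it.

```
# /etc/ha.d/ha.cf (path assumed): heartbeat communication layer and ccm
initdead 900

# crm shell, run once on any cluster node: CRM/Pacemaker layer
# assumption: "dc-timeout" in this mail is the dc-deadtime cluster property
crm configure property dc-deadtime=40s
```

With values like these, heartbeat itself still gives slow nodes up to 900
seconds to show up after a cold start, while the crmd should only wait
about 40 seconds for other nodes before going ahead.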
For AIS based clusters, there seems to be no real "initdead";
I think that is why the dc-timeout was changed to default to
the ha.cf initdead setting.
IIRC there have been improvements to this startup behaviour in pacemaker
at some point, but I don't remember the details.
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.