[Linux-HA] weird node dependencies in 2.0.7...

Andrew Beekhof beekhof at gmail.com
Thu Sep 14 02:03:41 MDT 2006


On 9/13/06, Brian O'Neill <oneill at oinc.net> wrote:
>
>
> Dejan Muhamedagic wrote:
> > On Mon, Sep 11, 2006 at 03:45:01PM -0400, Brian O'Neill wrote:
> >> I'm running into an issue with heartbeat 2.0.7 where nodes won't join a
> >> cluster unless other nodes that are down are brought up.
> >>
> >> I'm setting up a cluster with three nodes (node0, node1, node2). There
> >> are two resource groups which include an IP for each (group0, group1).
> >>
> >> If I start just node0 (or any other, really doesn't matter), the
> >> resources will come up fine and both run on node0. This is even if
> >> "require_quorum" is set to true, which as I understood it means it
> >> shouldn't start any resources unless there is a quorum, but it's not well
> >> defined in the web pages.
> >
> > This should not happen, i.e. no resources should come up when only
> > one node (out of three) is started. Can you check /etc/ha.d/ha.cf
> > on all nodes and make sure that they have all nodes listed?
> > Please include the CIB and ha.cf in the next post.
> >
>
> Interestingly, one of my colleagues changed some stuff randomly and it
> seems to work now...at least the problem about not being able to start
> the second node. Here is the ha.cf as it is right now:
>
> ===============
> debugfile /var/log/ha-debug
> logfile /var/log/ha-log
> logfacility     local0
> keepalive 500ms
> deadtime 20
> #initdead 120
> udpport 694
> bcast eth1
> msgfmt netstring
> realtime on
> auto_failback off
> use_logd yes
>
> node node0 node1 node2
> autojoin none
>
> crm yes
> apiauth cibmon   uid=hacluster
> respawn hacluster /usr/lib/heartbeat/cibmon -d
> ==============
>
> He's not sure what made the difference, but he added the "autojoin none"
> (which apparently alone did not make a difference), removed the
> "deadtime" directive (which is now back but at 20 instead of 120), and
> put all the nodes on one line (they had separate "node" entries before).
> The old config worked with 2.0.2.
>
> Now, as for the quorum issues...there are two property settings:
>
> no_quorum_policy
> require_quorum
>
> However, require_quorum, although appearing in several examples, doesn't
> appear to be documented. I thought it to mean that quorum is required
> at startup, but this isn't having an effect - is this property no longer
> valid?


require_quorum (a boolean) was superseded by no_quorum_policy and is no
longer valid.

>
> I had also assumed that no_quorum_policy was in case nodes became
> unavailable and what to do if a quorum no longer existed...am I also
> incorrect here?

No, your understanding is correct.

> I had been using "ignore" as I didn't want resources to
> stop because two of the three nodes were not available - is "freeze"
> what I really want?

"freeze" is generally safer. It won't start anything that wasn't already
running, but it will continue to manage anything that is running.

> What I want is that if heartbeat is started, a quorum must exist in
> order to start resources (no split-braining because of a cluster restart
> but no connectivity between), but if two out of three nodes fail,
> resources will continue running on the last node.
>
> Here is the crm_config section of the cib.xml file:
> <crm_config>
>    <cluster_property_set id="cib-bootstrap-options">
>      <attributes>
>        <nvpair id="cib-bootstrap-options-is_managed_default"
> name="is_managed_default" value="true"/>
>        <nvpair id="cib-bootstrap-options-symmetric_cluster"
> name="symmetric_cluster" value="true"/>
>        <nvpair id="cib-bootstrap-options-default_resource_stickiness"
> name="default_resource_stickiness" value="INFINITY"/>
>        <nvpair id="cib-bootstrap-options-no_quorum_policy"
> name="no_quorum_policy" value="ignore"/>
>        <nvpair id="cib-bootstrap-options-require_quorum"
> name="require_quorum" value="true"/>
>      </attributes>
>    </cluster_property_set>
> </crm_config>
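
As a rough sketch of the change discussed above (an untested fragment
based on the crm_config section quoted here, not a verified 2.0.7
configuration), switching to "freeze" and dropping the superseded
require_quorum entry might look like this. The other nvpairs are assumed
to stay exactly as they are.

```xml
<crm_config>
   <cluster_property_set id="cib-bootstrap-options">
     <attributes>
       <!-- is_managed_default, symmetric_cluster and
            default_resource_stickiness left unchanged (omitted here) -->
       <!-- "freeze": nothing new is started without quorum, but resources
            that are already running continue to be managed -->
       <nvpair id="cib-bootstrap-options-no_quorum_policy"
               name="no_quorum_policy" value="freeze"/>
       <!-- require_quorum removed: superseded by no_quorum_policy -->
     </attributes>
   </cluster_property_set>
</crm_config>
```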