[Linux-HA] R1 to R2 testing: cib.xml & ldirectord questions for 2-node cluster
Peter Farrell
peter.d.farrell at gmail.com
Tue Sep 11 06:34:51 MDT 2007
Hi all.
I'm looking for clarification and/or direction. I've been reading
through the linux-ha.org site, watching Andrew's presentation in
Australia and experimenting with R2 in anticipation of upgrading our
R1 setup.
Versions:
heartbeat-stonith-2.1.2-3.el4.centos
heartbeat-pils-2.1.2-3.el4.centos
heartbeat-ldirectord-2.1.2-3.el4.centos
heartbeat-2.1.2-3.el4.centos
I'm using 2 node clusters that monitor IPaddr among web servers.
Specifically - ldirectord is my only resource.
The R1 setup I'm trying to test is:
ha.cf:
crm yes
use_logd on
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast eth1
ucast eth1 10.0.0.1
auto_failback off
node dmz1.example.com
node dmz2.example.com
haresources:
dmz1.example.com IPaddr::212.140.130.37 IPaddr::212.140.130.38 ldirectord::ldirectord.cf
My main 'issue' is creating / updating the cib.xml.
I tried various iterations with the python converter, but the result is
always either blown away across reboots or missing parameters.
I managed to use 'cibadmin' to add strings (from the XML file I
wanted to use) to the 'live' cib.xml. It works, after a fashion, and is
being replicated between the two nodes.
crm_mon shows me:
============
Last updated: Tue Sep 11 13:01:34 2007
Current DC: dmz2.example.com (4ffd8d6f-adaa-4fdb-888e-dcf520cf2189)
2 Nodes configured.
1 Resources configured.
============
Node: dmz2.example.com (4ffd8d6f-adaa-4fdb-888e-dcf520cf2189): online
Node: dmz1.example.com (d9ffeb49-3151-48fc-a976-0edfb39494f9): online
Resource Group: group_1
IPaddr_212_140_130_37 (heartbeat::ocf:IPaddr): Started dmz1.scarceskills.com
IPaddr_212_140_130_38 (heartbeat::ocf:IPaddr): Started dmz1.scarceskills.com
ldirectord_3 (heartbeat:ldirectord): Started dmz1.example.com FAILED
Failed actions:
ldirectord_3_monitor_120000 (node=dmz1.example.com, call=11, rc=7): complete
----------------
Questions:
----------------
1. Is there a way or 'procedure' for maintaining a standard cib.xml file?
I mean, 'cibadmin' seems really clumsy to me, copying in strings,
etc. Is this the "normal" way to manage it? For example, if I wanted
to create a new environment at another site, do I `cat
/var/lib/heartbeat/crm/cib.xml > save_this.xml`, then create a new
'blank' cluster and feed the saved file in via the cibadmin
tool?
BTW: it took me a day and a half to figure out that I was meant
to be using the cibadmin tool at all... I tried copying the file
directly into the /var/lib path, to no avail.
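A minimal Python sketch of the "save it and reuse it elsewhere" idea from question 1: it takes a saved copy of the CIB and strips the parts that look specific to one running cluster, leaving a template. It uses only the standard library. The function name `make_template`, the trimmed `SAMPLE` document, and above all the choice of what to strip (the runtime attributes on `<cib>`, any `<status>` section, the `<node>` entries) are assumptions for illustration, not a documented procedure.

```python
import xml.etree.ElementTree as ET

# Attributes on <cib> that appear to describe the running cluster rather than
# its configuration (assumption: judged from the names in the cib.xml below).
RUNTIME_ATTRS = ("have_quorum", "num_peers", "ccm_transition", "dc_uuid",
                 "num_updates", "cib-last-written")

# Trimmed, hypothetical copy of a saved CIB, just enough to exercise the code.
SAMPLE = """<cib generated="true" admin_epoch="0" have_quorum="true" num_peers="2"
     dc_uuid="4ffd8d6f-adaa-4fdb-888e-dcf520cf2189" epoch="5" num_updates="130">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <attributes>
          <nvpair id="cib-bootstrap-options-no-quorum-policy"
                  name="no-quorum-policy" value="stop"/>
        </attributes>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="4ffd8d6f-adaa-4fdb-888e-dcf520cf2189"
            uname="dmz2.example.com" type="normal"/>
    </nodes>
    <resources/>
    <constraints/>
  </configuration>
  <status/>
</cib>"""


def make_template(cib_text):
    """Return the CIB XML with cluster-specific runtime details removed."""
    root = ET.fromstring(cib_text)
    # Remove the runtime attributes from the top-level <cib> element.
    for attr in RUNTIME_ATTRS:
        root.attrib.pop(attr, None)
    # Drop the live status section, if the saved copy contains one.
    for status in root.findall("status"):
        root.remove(status)
    # Empty the <nodes> list: node UUIDs differ from cluster to cluster.
    for nodes in root.findall(".//nodes"):
        for node in list(nodes):
            nodes.remove(node)
    return ET.tostring(root, encoding="unicode")


print(make_template(SAMPLE))
```

Whether a new cluster would accept such a template unchanged, and which cibadmin invocation should load it, is exactly what question 1 is asking.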
2. The logs show ldirectord being restarted multiple times, ps shows
it running, and ifconfig -a shows me the correct virtual interfaces
have been configured. When I reboot either node - the other node picks
up as it's meant to. So I trust it's working (although I've not yet
tested sending requests to the test page at the end of that address).
What then do I make of the output from crm_mon?
ldirectord_3 (heartbeat:ldirectord): Started dmz1.example.com FAILED
Failed actions:
ldirectord_3_monitor_120000 (node=dmz1.example.com, call=11, rc=7): complete
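For reading that output: the operation key looks like `<resource>_<action>_<interval in ms>` (120000 ms matches the 120s monitor interval in the cib.xml), and 7 is the OCF return code for "not running". The Python sketch below splits one "Failed actions" line into its fields. The regular expression and the name `parse_failed_action` are made up for illustration, and it assumes crm_mon uses the OCF numbering for a heartbeat-class resource as well.

```python
import re

# Standard OCF resource agent return codes.
OCF_RC = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}

# Pattern inferred from the single line shown by crm_mon above (assumption).
FAILED_ACTION = re.compile(
    r"^(?P<rsc>.+)_(?P<action>[a-z]+)_(?P<interval>\d+) "
    r"\(node=(?P<node>[^,]+), call=(?P<call>\d+), rc=(?P<rc>\d+)\): "
    r"(?P<status>\w+)$"
)

LINE = ("ldirectord_3_monitor_120000 "
        "(node=dmz1.example.com, call=11, rc=7): complete")


def parse_failed_action(line):
    """Split a crm_mon 'Failed actions' line into a dict of named fields."""
    match = FAILED_ACTION.match(line.strip())
    if match is None:
        raise ValueError("unrecognised failed-action line: %r" % line)
    fields = match.groupdict()
    rc = int(fields["rc"])
    fields["rc"] = rc
    fields["interval_s"] = int(fields["interval"]) / 1000
    fields["rc_name"] = OCF_RC.get(rc, "unknown")
    return fields


info = parse_failed_action(LINE)
print(info["rsc"], info["action"], info["interval_s"], info["rc_name"])
```

Read this way, the line says that the recurring monitor of ldirectord_3 on dmz1 reported the resource as not running at least once, even though ps shows the process.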
3. Do I need any additional statements in my ha.cf for this type of setup?
4. (Last one!) Although I can feed in the XML strings to construct the
file I want, I can't shake off the default have_quorum="true" bit in
the first line of the cib.xml:
<cib generated="true" admin_epoch="0" have_quorum="true"
ignore_dtd="false" num_peers="2" ccm_transition="10"
cib_feature_revision="1.3"
dc_uuid="4ffd8d6f-adaa-4fdb-888e-dcf520cf2189" epoch="5"
num_updates="130" cib-last-written="Tue Sep 11 13:00:58 2007">
I want:
name="symmetric-cluster" value="true"/>
<nvpair id="cib-bootstrap-options-no-quorum-policy"
Does this matter? Can I safely ignore that first line? Or is there a
way to remove it?
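To pin down the two quorum-related places in the file: `have_quorum` is an attribute on the top-level `<cib>` element, whereas `no-quorum-policy` is an nvpair under `<crm_config>`. The Python sketch below reads both from a trimmed copy of the cib.xml. The function name `quorum_settings` and the shortened `CIB` string are for illustration only, and the reading that the attribute is runtime state while the nvpair is configuration is an assumption.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cib.xml from this post (most attributes and options omitted).
CIB = """<cib generated="true" admin_epoch="0" have_quorum="true" epoch="5" num_updates="130">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <attributes>
          <nvpair id="cib-bootstrap-options-symmetric-cluster"
                  name="symmetric-cluster" value="true"/>
          <nvpair id="cib-bootstrap-options-no-quorum-policy"
                  name="no-quorum-policy" value="stop"/>
        </attributes>
      </cluster_property_set>
    </crm_config>
  </configuration>
</cib>"""


def quorum_settings(cib_text):
    """Return (have_quorum attribute on <cib>, no-quorum-policy option value)."""
    root = ET.fromstring(cib_text)
    # Only look at cluster options, not at resource instance attributes.
    options = {
        nv.get("name"): nv.get("value")
        for nv in root.findall("./configuration/crm_config//nvpair")
    }
    return root.get("have_quorum"), options.get("no-quorum-policy")


print(quorum_settings(CIB))  # prints ('true', 'stop')
```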
Any pointers would be greatly appreciated.
Best regards;
-Peter Farrell
Cardiff, Wales - UK
cib.xml:
-----------
[root@dmz1 ha.d]# cat /var/lib/heartbeat/crm/cib.xml
<cib generated="true" admin_epoch="0" have_quorum="true"
ignore_dtd="false" num_peers="2" ccm_transition="10"
cib_feature_revision="1.3"
dc_uuid="4ffd8d6f-adaa-4fdb-888e-dcf520cf2189" epoch="5"
num_updates="130" cib-last-written="Tue Sep 11 13:00:58 2007">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<attributes>
<nvpair id="cib-bootstrap-options-symmetric-cluster"
name="symmetric-cluster" value="true"/>
<nvpair id="cib-bootstrap-options-no-quorum-policy"
name="no-quorum-policy" value="stop"/>
<nvpair
id="cib-bootstrap-options-default-resource-stickiness"
name="default-resource-stickiness" value="0"/>
<nvpair
id="cib-bootstrap-options-default-resource-failure-stickiness"
name="default-resource-failure-stickiness" value="0"/>
<nvpair id="cib-bootstrap-options-stonith-enabled"
name="stonith-enabled" value="false"/>
<nvpair id="cib-bootstrap-options-stonith-action"
name="stonith-action" value="reboot"/>
<nvpair id="cib-bootstrap-options-stop-orphan-resources"
name="stop-orphan-resources" value="true"/>
<nvpair id="cib-bootstrap-options-stop-orphan-actions"
name="stop-orphan-actions" value="true"/>
<nvpair id="cib-bootstrap-options-remove-after-stop"
name="remove-after-stop" value="false"/>
<nvpair id="cib-bootstrap-options-short-resource-names"
name="short-resource-names" value="true"/>
<nvpair id="cib-bootstrap-options-transition-idle-timeout"
name="transition-idle-timeout" value="5min"/>
<nvpair id="cib-bootstrap-options-default-action-timeout"
name="default-action-timeout" value="15s"/>
<nvpair id="cib-bootstrap-options-is-managed-default"
name="is-managed-default" value="true"/>
</attributes>
</cluster_property_set>
</crm_config>
<nodes>
<node id="4ffd8d6f-adaa-4fdb-888e-dcf520cf2189"
uname="dmz2.example.com" type="normal"/>
<node id="d9ffeb49-3151-48fc-a976-0edfb39494f9"
uname="dmz1.example.com" type="normal"/>
</nodes>
<resources>
<group id="group_1">
<primitive class="ocf" id="IPaddr_212_140_130_37"
provider="heartbeat" type="IPaddr">
<operations>
<op id="IPaddr_212_140_130_37_mon" interval="5s"
name="monitor" timeout="5s"/>
</operations>
<instance_attributes id="IPaddr_212_140_130_37_inst_attr">
<attributes>
<nvpair id="IPaddr_212_140_130_37_attr_0" name="ip"
value="212.140.130.37"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="ocf" id="IPaddr_212_140_130_38"
provider="heartbeat" type="IPaddr">
<operations>
<op id="IPaddr_212_140_130_38_mon" interval="5s"
name="monitor" timeout="5s"/>
</operations>
<instance_attributes id="IPaddr_212_140_130_38_inst_attr">
<attributes>
<nvpair id="IPaddr_212_140_130_38_attr_0" name="ip"
value="212.140.130.38"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="heartbeat" id="ldirectord_3"
provider="heartbeat" type="ldirectord">
<operations>
<op id="ldirectord_3_mon" interval="120s" name="monitor"
timeout="60s"/>
</operations>
<instance_attributes id="ldirectord_3_inst_attr">
<attributes>
<nvpair id="ldirectord_3_attr_1" name="1" value="ldirectord.cf"/>
</attributes>
</instance_attributes>
</primitive>
</group>
</resources>
<constraints>
<rsc_location id="rsc_location_group_1" rsc="group_1">
<rule id="prefered_location_group_1" score="100">
<expression attribute="#uname"
id="prefered_location_group_1_expr" operation="eq"
value="dmz1.example.com"/>
</rule>
</rsc_location>
</constraints>
</configuration>
</cib>