[Linux-HA] R1 to R2 testing: cib.xml & ldirectord questions for 2 node cluster

Peter Farrell peter.d.farrell at gmail.com
Tue Sep 11 06:34:51 MDT 2007


Hi all.

I'm looking for clarification and/or direction. I've been reading
through the linux-ha.org site, watching Andrew's presentation in
Australia and experimenting with R2 in anticipation of upgrading our
R1 setup.

Versions:
heartbeat-stonith-2.1.2-3.el4.centos
heartbeat-pils-2.1.2-3.el4.centos
heartbeat-ldirectord-2.1.2-3.el4.centos
heartbeat-2.1.2-3.el4.centos

I'm using 2 node clusters that monitor IPaddr among web servers.
Specifically - ldirectord is my only resource.

The R1 setup I'm trying to test is:

ha.cf:
crm yes
use_logd on
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast   eth1
ucast eth1 10.0.0.1
auto_failback off
node    dmz1.example.com
node    dmz2.example.com

haresources:
dmz1.example.com IPaddr::212.140.130.37 IPaddr::212.140.130.38 ldirectord::ldirectord.cf


My main 'issue' is creating / updating the cib.xml.
I tried various iterations with the python converter - but the result
is always either blown away across reboots or missing parameters.

I managed to use 'cibadmin' to add strings (from the xml file I
wanted to use) to the 'live' cib.xml. It works, after a fashion, and is
being replicated between the two nodes.

crm_mon shows me:
============
Last updated: Tue Sep 11 13:01:34 2007
Current DC: dmz2.example.com (4ffd8d6f-adaa-4fdb-888e-dcf520cf2189)
2 Nodes configured.
1 Resources configured.
============

Node: dmz2.example.com (4ffd8d6f-adaa-4fdb-888e-dcf520cf2189): online
Node: dmz1.example.com (d9ffeb49-3151-48fc-a976-0edfb39494f9): online

Resource Group: group_1
    IPaddr_212_140_130_37       (heartbeat::ocf:IPaddr): Started dmz1.scarceskills.com
    IPaddr_212_140_130_38       (heartbeat::ocf:IPaddr): Started dmz1.scarceskills.com
    ldirectord_3        (heartbeat:ldirectord): Started dmz1.example.com FAILED

Failed actions:
    ldirectord_3_monitor_120000 (node=dmz1.example.com, call=11, rc=7): complete


----------------
Questions:
----------------
1. Is there a way or 'procedure' for maintaining a standard cib.xml file?
I mean, 'cibadmin' seems really clumsy to me, copying in strings,
etc. Is this the "normal" way to manage it? For example - if I wanted
to create a new environment at another site - do I `cat
/var/lib/heartbeat/crm/cib.xml > save_this.xml`, then create a new
'blank' cluster and sort of feed the saved file in via the cibadmin
tool?

BTW: it took me a day and a half to actually figure out that I was meant
to be using the cibadmin tool at all... I tried copying the file
directly to the /var/lib path, to no avail.
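
A minimal sketch of how the save-and-reload idea in question 1 might be
scripted. It only builds and prints cibadmin command lines; it does not
run them. Assumptions: the cibadmin options used here (-Q query, -R
replace, -o section, -x xml file) should be checked against
`cibadmin --help` for 2.1.2, and the helper names and file names are
made up for illustration. The nodes section is left out on the
assumption that node UUIDs differ from cluster to cluster.

```python
import shlex

# Sections assumed to be portable between clusters (hypothetical choice).
PORTABLE_SECTIONS = ("crm_config", "resources", "constraints")


def query_cmd(section):
    """Build the argv that dumps one CIB section to stdout."""
    return ["cibadmin", "-Q", "-o", section]


def replace_cmd(section, xml_path):
    """Build the argv that replaces one CIB section from a saved file."""
    return ["cibadmin", "-R", "-o", section, "-x", xml_path]


if __name__ == "__main__":
    # Dry run: print the commands rather than executing them.
    for section in PORTABLE_SECTIONS:
        path = f"{section}.xml"  # hypothetical file name
        print(shlex.join(query_cmd(section)), ">", path)
        print(shlex.join(replace_cmd(section, path)))
```

Going through cibadmin rather than copying the file means the running
cluster sees the change and replicates it itself.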

2. The logs show ldirectord being restarted multiple times, ps shows
it running, and ifconfig -a shows me the correct virtual interfaces
have been configured. When I reboot either node - the other node picks
up as it's meant to. So I trust it's working (although I've not yet
tested sending requests to the test page at the end of that address).

What then do I make of the output from crm_mon?
ldirectord_3        (heartbeat:ldirectord): Started dmz1.example.com FAILED

Failed actions:
ldirectord_3_monitor_120000 (node=dmz1.example.com, call=11, rc=7): complete
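
A small sketch for reading that "Failed actions" line. It assumes the
rc value follows the OCF return-code numbering (where 7 means "not
running"); ldirectord_3 is a class="heartbeat" resource, so whether the
code is mapped this way for it is an assumption to verify. The function
name and the regular expression are illustrative only.

```python
import re

# OCF resource agent return codes 0-7 (assumed to apply to crm_mon's rc=).
OCF_RC = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}

# Matches lines shaped like: <resource>_<op>_<interval ms> (node=..., call=N, rc=N): status
FAILED = re.compile(
    r"^\s*(?P<rsc>\S+)_(?P<op>[a-z]+)_(?P<interval>\d+)\s+"
    r"\(node=(?P<node>[^,]+), call=(?P<call>\d+), rc=(?P<rc>\d+)\): (?P<status>\w+)"
)


def parse_failed_action(line):
    """Split one crm_mon 'Failed actions' line into its fields."""
    m = FAILED.match(line)
    if m is None:
        return None
    rc = int(m.group("rc"))
    return {
        "resource": m.group("rsc"),
        "operation": m.group("op"),
        "interval_s": int(m.group("interval")) // 1000,  # crm_mon shows milliseconds
        "node": m.group("node"),
        "rc": rc,
        "rc_name": OCF_RC.get(rc, "unknown"),
    }


if __name__ == "__main__":
    line = "ldirectord_3_monitor_120000 (node=dmz1.example.com, call=11, rc=7): complete"
    print(parse_failed_action(line))
```

Read this way, the line says the 120-second monitor on ldirectord_3
reported "not running" on dmz1, even though ps shows the process.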

3. Do I need any additional statements in my ha.cf for this type of setup?

4. (Last one!) Although I can feed in the XML strings to construct the
file I want - I can't shake off the default have_quorum="true" bit in
the first line of the cib.xml:
<cib generated="true" admin_epoch="0" have_quorum="true"
ignore_dtd="false" num_peers="2" ccm_transition="10"
cib_feature_revision="1.3"
dc_uuid="4ffd8d6f-adaa-4fdb-888e-dcf520cf2189" epoch="5"
num_updates="130" cib-last-written="Tue Sep 11 13:00:58 2007">

I want:
name="symmetric-cluster" value="true"/>
           <nvpair id="cib-bootstrap-options-no-quorum-policy"

Does this matter? Can I safely ignore that first line? Or is there a
way to remove it?
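
A short sketch of the distinction behind question 4, on the assumption
that have_quorum on the <cib> element is status written by the cluster,
while no-quorum-policy is a separate configuration option stored as an
nvpair under crm_config. The XML below is a trimmed copy of the cib.xml
in this mail; the function name is made up.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted cib.xml, keeping only the quorum-related parts.
CIB = """<cib have_quorum="true" epoch="5" num_updates="130">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <attributes>
          <nvpair id="cib-bootstrap-options-no-quorum-policy"
                  name="no-quorum-policy" value="stop"/>
        </attributes>
      </cluster_property_set>
    </crm_config>
  </configuration>
</cib>"""


def quorum_view(cib_xml):
    """Return (have_quorum attribute, no-quorum-policy value) from a CIB."""
    root = ET.fromstring(cib_xml)
    policy = None
    for nv in root.iter("nvpair"):
        if nv.get("name") == "no-quorum-policy":
            policy = nv.get("value")
    return root.get("have_quorum"), policy


if __name__ == "__main__":
    print(quorum_view(CIB))
```

The two values live in different places, so changing the nvpair would
not be expected to alter the attribute on the first line.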

Any pointers would be greatly appreciated.

Best regards;

-Peter Farrell
Cardiff, Wales - UK


cib.xml:
-----------
[root@dmz1 ha.d]# cat /var/lib/heartbeat/crm/cib.xml
 <cib generated="true" admin_epoch="0" have_quorum="true"
ignore_dtd="false" num_peers="2" ccm_transition="10"
cib_feature_revision="1.3"
dc_uuid="4ffd8d6f-adaa-4fdb-888e-dcf520cf2189" epoch="5"
num_updates="130" cib-last-written="Tue Sep 11 13:00:58 2007">
   <configuration>
     <crm_config>
       <cluster_property_set id="cib-bootstrap-options">
         <attributes>
           <nvpair id="cib-bootstrap-options-symmetric-cluster"
name="symmetric-cluster" value="true"/>
           <nvpair id="cib-bootstrap-options-no-quorum-policy"
name="no-quorum-policy" value="stop"/>
           <nvpair
id="cib-bootstrap-options-default-resource-stickiness"
name="default-resource-stickiness" value="0"/>
           <nvpair
id="cib-bootstrap-options-default-resource-failure-stickiness"
name="default-resource-failure-stickiness" value="0"/>
           <nvpair id="cib-bootstrap-options-stonith-enabled"
name="stonith-enabled" value="false"/>
           <nvpair id="cib-bootstrap-options-stonith-action"
name="stonith-action" value="reboot"/>
           <nvpair id="cib-bootstrap-options-stop-orphan-resources"
name="stop-orphan-resources" value="true"/>
           <nvpair id="cib-bootstrap-options-stop-orphan-actions"
name="stop-orphan-actions" value="true"/>
           <nvpair id="cib-bootstrap-options-remove-after-stop"
name="remove-after-stop" value="false"/>
           <nvpair id="cib-bootstrap-options-short-resource-names"
name="short-resource-names" value="true"/>
           <nvpair id="cib-bootstrap-options-transition-idle-timeout"
name="transition-idle-timeout" value="5min"/>
           <nvpair id="cib-bootstrap-options-default-action-timeout"
name="default-action-timeout" value="15s"/>
           <nvpair id="cib-bootstrap-options-is-managed-default"
name="is-managed-default" value="true"/>
         </attributes>
       </cluster_property_set>
     </crm_config>
     <nodes>
       <node id="4ffd8d6f-adaa-4fdb-888e-dcf520cf2189"
uname="dmz2.example.com" type="normal"/>
       <node id="d9ffeb49-3151-48fc-a976-0edfb39494f9"
uname="dmz1.example.com" type="normal"/>
     </nodes>
     <resources>
       <group id="group_1">
         <primitive class="ocf" id="IPaddr_212_140_130_37"
provider="heartbeat" type="IPaddr">
           <operations>
             <op id="IPaddr_212_140_130_37_mon" interval="5s"
name="monitor" timeout="5s"/>
           </operations>
           <instance_attributes id="IPaddr_212_140_130_37_inst_attr">
             <attributes>
               <nvpair id="IPaddr_212_140_130_37_attr_0" name="ip"
value="212.140.130.37"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <primitive class="ocf" id="IPaddr_212_140_130_38"
provider="heartbeat" type="IPaddr">
           <operations>
             <op id="IPaddr_212_140_130_38_mon" interval="5s"
name="monitor" timeout="5s"/>
           </operations>
           <instance_attributes id="IPaddr_212_140_130_38_inst_attr">
             <attributes>
               <nvpair id="IPaddr_212_140_130_38_attr_0" name="ip"
value="212.140.130.38"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <primitive class="heartbeat" id="ldirectord_3"
provider="heartbeat" type="ldirectord">
           <operations>
             <op id="ldirectord_3_mon" interval="120s" name="monitor"
timeout="60s"/>
           </operations>
           <instance_attributes id="ldirectord_3_inst_attr">
             <attributes>
               <nvpair id="ldirectord_3_attr_1" name="1" value="ldirectord.cf"/>
             </attributes>
           </instance_attributes>
         </primitive>
       </group>
     </resources>
     <constraints>
       <rsc_location id="rsc_location_group_1" rsc="group_1">
         <rule id="prefered_location_group_1" score="100">
           <expression attribute="#uname"
id="prefered_location_group_1_expr" operation="eq"
value="dmz1.example.com"/>
         </rule>
       </rsc_location>
     </constraints>
   </configuration>
 </cib>
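
For comparing the file above against the crm_mon output, a minimal
sketch that lists each primitive with its class, type and monitor
settings. The sample is a trimmed copy of the resources section above
(instance attributes and the second IPaddr omitted); the function name
is illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <resources> section from the posted cib.xml.
SAMPLE = """<resources>
  <group id="group_1">
    <primitive class="ocf" id="IPaddr_212_140_130_37" provider="heartbeat" type="IPaddr">
      <operations>
        <op id="IPaddr_212_140_130_37_mon" interval="5s" name="monitor" timeout="5s"/>
      </operations>
    </primitive>
    <primitive class="heartbeat" id="ldirectord_3" provider="heartbeat" type="ldirectord">
      <operations>
        <op id="ldirectord_3_mon" interval="120s" name="monitor" timeout="60s"/>
      </operations>
    </primitive>
  </group>
</resources>"""


def summarize(xml_text):
    """Return one (id, class, type, op name, interval, timeout) row per operation."""
    root = ET.fromstring(xml_text)
    rows = []
    for prim in root.iter("primitive"):
        for op in prim.iter("op"):
            rows.append((
                prim.get("id"), prim.get("class"), prim.get("type"),
                op.get("name"), op.get("interval"), op.get("timeout"),
            ))
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE):
        print(*row)
```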

