[Linux-HA] R1 to R2 testing: cib.xml & ldirectord questions for 2 node cluster

Peter Farrell peter.d.farrell at gmail.com
Thu Sep 13 06:17:17 MDT 2007


On 13/09/2007, Andrew Beekhof <beekhof at gmail.com> wrote:
> On 9/13/07, Peter Farrell <peter.d.farrell at gmail.com> wrote:
> > Thanks Dejan - that was very helpful.
> >
> > Last question :-)
> >
> > I've modified this setup, taking into account your advice about
> > removing the ldirectord attributes (where are these things
> > documented!?).
>
> http://linux-ha.org/ResourceAgents

Ta.

>
> > Everything is running fine and I've got no errors in cluster.log -
> > only a few WARNs, and they seem harmless.
> >
> > So when I reboot either node the other takes over just fine.
> >
> > But - if I disconnect the network cable from either node - nothing
> > happens at all.
>
> how many communication paths?

2 nodes. 2 NIC's each.
eth0 = 212.140.130.0 --> both connected to switch
eth1 = 10.0.0.0 -----------> cross over cable between these 2 nodes

To test I just pulled the cable out of eth0 on one node - then waited,
like a tit.
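
A rough way to picture the question about communication paths, as a toy model only. It assumes (my assumption, nothing in this thread confirms it) that both eth0 and eth1 carry heartbeats and that a peer only counts as dead once every path to it is down. The function and dictionary names are made up for illustration and are not heartbeat's code:

```python
# Toy model only: illustrative names, not heartbeat's actual logic.
def peer_is_reachable(paths):
    """A peer counts as alive while at least one heartbeat path still works."""
    return any(paths.values())


# Normal operation: switch link (eth0) and crossover link (eth1) both up.
paths = {"eth0": True, "eth1": True}
print(peer_is_reachable(paths))

# Pull the eth0 cable: heartbeats can still travel over the crossover cable.
paths["eth0"] = False
print(peer_is_reachable(paths))

# Only with both links down does the peer stop being reachable.
paths["eth1"] = False
print(peer_is_reachable(paths))
```

If that assumption holds, pulling only eth0 would not look like a node failure to the other node, which would match what I saw.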

>
> > What do I need to do (or am I mis-understanding the concept here) in
> > order to get the node w/ no network to fail over?
> >
> > Here is my current 'cibadmin -Ql > local.xml' (with status removed)
>
> which is the important part :-(

Sorry - I was looking through the status bit and it just seemed...
indecipherable, that's why I left it out.

>
> that and supplying logs (as attachments)

Will do.

So - here it is again.
Each node is operating just fine. I zeroed out my cluster.log, restarted
heartbeat, then pulled the eth0 cable on one node, ssh'd over the
crossover cable to that node and ran 'cibadmin -Ql > local.xml',
which I'm attaching along with cluster.log.
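
For anyone who wants a quick summary of the configuration part of such a dump without reading the raw XML, here is a minimal sketch. It assumes the output has the same group/primitive layout as the configuration quoted further down; CIB_SNIPPET is a trimmed copy of that configuration and list_primitives is a helper name I made up:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the configuration quoted in this thread (status omitted).
CIB_SNIPPET = """
<configuration>
  <resources>
    <group id="WEB_IP_TEST">
      <primitive class="ocf" id="IPaddr_212_140_130_37" provider="heartbeat" type="IPaddr"/>
      <primitive class="ocf" id="IPaddr_212_140_130_38" provider="heartbeat" type="IPaddr"/>
      <primitive class="ocf" id="ldirectord_3" provider="heartbeat" type="ldirectord"/>
    </group>
  </resources>
</configuration>
"""


def list_primitives(xml_text):
    """Return (group id, primitive id, class, type) for each grouped primitive."""
    root = ET.fromstring(xml_text.strip())
    return [
        (group.get("id"), prim.get("id"), prim.get("class"), prim.get("type"))
        for group in root.iter("group")
        for prim in group.findall("primitive")
    ]


for row in list_primitives(CIB_SNIPPET):
    print(row)
```

To run it against a real dump, read local.xml into a string and pass that to list_primitives instead of CIB_SNIPPET.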

Thanks for everyone's time.
-Peter



> >
> >    <configuration>
> >      <crm_config/>
> >      <nodes>
> >        <node id="f8e04b68-2b8e-4eac-8b14-eb64ccc5379c"
> > uname="dmz2.example.com" type="normal"/>
> >        <node id="d5795fde-df25-451c-a0f2-a8337a8290e3"
> > uname="dmz1.example.com" type="normal"/>
> >      </nodes>
> >      <resources>
> >        <group id="WEB_IP_TEST">
> >          <primitive class="ocf" id="IPaddr_212_140_130_37"
> > provider="heartbeat" type="IPaddr">
> >            <operations>
> >              <op id="IPaddr_212_140_130_37_mon" interval="5s"
> > name="monitor" timeout="5s"/>
> >            </operations>
> >            <instance_attributes id="IPaddr_212_140_130_37_inst_attr">
> >              <attributes>
> >                <nvpair id="IPaddr_212_140_130_37_attr_0" name="ip"
> > value="212.140.130.37"/>
> >              </attributes>
> >            </instance_attributes>
> >          </primitive>
> >          <primitive class="ocf" id="IPaddr_212_140_130_38"
> > provider="heartbeat" type="IPaddr">
> >            <operations>
> >              <op id="IPaddr_212_140_130_38_mon" interval="5s"
> > name="monitor" timeout="5s"/>
> >            </operations>
> >            <instance_attributes id="IPaddr_212_140_130_38_inst_attr">
> >              <attributes>
> >                <nvpair id="IPaddr_212_140_130_38_attr_0" name="ip"
> > value="212.140.130.38"/>
> >              </attributes>
> >            </instance_attributes>
> >          </primitive>
> >          <primitive class="ocf" id="ldirectord_3" provider="heartbeat"
> > type="ldirectord">
> >            <operations>
> >              <op id="ldirectord_3_mon" interval="120s" name="monitor"
> > timeout="60s"/>
> >            </operations>
> >          </primitive>
> >        </group>
> >      </resources>
> >
> >
> >      <constraints>
> >        <rsc_location id="rsc_location_group_1" rsc="WEB_IP_TEST">
> >          <rule id="prefered_location_DMZ1" score="100">
> >            <expression attribute="#uname"
> > id="prefered_location_DMZ1_expr" operation="eq"
> > value="dmz1.example.com"/>
> >          </rule>
> >        </rsc_location>
> >      </constraints>
> >    </configuration>
> >
> >
> > -Peter
> >
> >
> >
> >
> >
> > On 12/09/2007, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> > > Hi,
> > >
> > > On Wed, Sep 12, 2007 at 04:24:47PM +0100, Peter Farrell wrote:
> > > > Andreas -
> > > >
> > > > A follow up if you will...
> > > >
> > > > 1) RE: finding that /usr/lib/ocf directory - very nice. (Was this in
> > > > the documentation or am I being thick?)
> > > >
> > > > 3) "The ldirectord wrapper is a ocf compliant resource agent which
> > > > starts, stops and monitors ldirectord."
> > > >
> > > > I'm not sure on the use here - I've deleted the symbolic link in
> > > > /etc/ha.d/haresources.d/ldirectord and copied over the wrapper from
> > > > /usr/lib/ocf/resource.d/heartbeat/ldirectord along with
> > > > > .ocf-shellfuncs.
> > > > I edited the OCF_ROOT path from:
> > > > ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
> > > > to
> > > > ${OCF_ROOT}/etc/ha.d/resource.d/.ocf-shellfuncs
> > > >
> > > > Is this the correct way to use the wrapper?
> > > > (because when it starts - the cluster goes straight to hell - so I'm
> > > > guessing 'no' here - although it could be totally related to the next
> > > > question!)
> > >
> > > No. There are three different kinds of RAs: LSB, heartbeat,
> > > and OCF. You can't use an OCF RA as a heartbeat one. In short,
> > > these three are expected to behave in a different manner.
> > >
> > > > 2a) you have to create another resource definition for that ldirectord resource.
> > > >
> > > > Would this be the "primitive" "type" from IPaddr to ...?  for the ip
> > > > address definitions? Or is it that you change ldirectord class from
> > > > 'heartbeat' to 'ocf'?
> > >
> > > Yes, the last one. It seems that the OCF ldirectord does not
> > > support any parameters, so you should remove the
> > > "ldirectord_3_attr_1" parameter too.
> > >
> > > > Sorry - After looking over the linux-ha site I'm still confused.
> > > > You create a definition for each resource (ip address 1 and 2)  then
> > > > one for the ldirectord itself right? I've got those - I'm not clear on
> > > > the additional definitions I need.
> > > >
> > > > 2b) I'm pretty sure you also have to define colocation constraints
> > > > between the IP resources you want to serve and the ldirectord
> > > > resource.
> > > >
> > > > What would that look like? Do you have an example?
> > > > I thought (from the website)
> > > > "This constraint is already implicit because there is a group over
> > > > these resources."
> > >
> > > Yes, you typically put the resources you want to run on the same
> > > node in a group.
> > >
> > > Thanks.
> > >
> > > Dejan
> > >
> > > > RESOURCES
> > > > ===========
> > > >
> > > > <resources>
> > > >   <group id="group_1">
> > > >     <primitive class="ocf" id="IPaddr_212_140_130_37"
> > > > provider="heartbeat" type="IPaddr">
> > > >       <operations>
> > > >         <op id="IPaddr_212_140_130_37_mon" interval="5s"
> > > > name="monitor" timeout="5s"/>
> > > >       </operations>
> > > >         <instance_attributes id="IPaddr_212_140_130_37_inst_attr">
> > > >           <attributes>
> > > >             <nvpair id="IPaddr_212_140_130_37_attr_0" name="ip"
> > > > value="212.140.130.37"/>
> > > >           </attributes>
> > > >         </instance_attributes>
> > > >     </primitive>
> > > >     <primitive class="ocf" id="IPaddr_212_140_130_38"
> > > > provider="heartbeat" type="IPaddr">
> > > >       <operations>
> > > >         <op id="IPaddr_212_140_130_38_mon" interval="5s"
> > > > name="monitor" timeout="5s"/>
> > > >       </operations>
> > > >         <instance_attributes id="IPaddr_212_140_130_38_inst_attr">
> > > >           <attributes>
> > > >             <nvpair id="IPaddr_212_140_130_38_attr_0" name="ip"
> > > > value="212.140.130.38"/>
> > > >           </attributes>
> > > >         </instance_attributes>
> > > >     </primitive>
> > > >     <primitive class="heartbeat" id="ldirectord_3"
> > > > provider="heartbeat" type="ldirectord">
> > > >       <operations>
> > > >         <op id="ldirectord_3_mon" interval="120s" name="monitor" timeout="60s"/>
> > > >       </operations>
> > > >         <instance_attributes id="ldirectord_3_inst_attr">
> > > >           <attributes>
> > > >             <nvpair id="ldirectord_3_attr_1" name="1" value="ldirectord.cf"/>
> > > >           </attributes>
> > > >         </instance_attributes>
> > > >     </primitive>
> > > >   </group>
> > > > </resources>
> > > >
> > > >
> > > >
> > > > CONSTRAINTS
> > > > ===========
> > > >
> > > > <constraints>
> > > >   <rsc_location id="rsc_location_group_1" rsc="group_1">
> > > >     <rule id="prefered_location_group_1" score="100">
> > > >       <expression attribute="#uname"
> > > > id="prefered_location_group_1_expr" operation="eq"
> > > > value="dmz1.example.com"/>
> > > >     </rule>
> > > >   </rsc_location>
> > > > </constraints>
> > > >
> > > >
> > > > Best Regards;
> > > > - Peter Farrell
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 11/09/2007, matilda matilda <matilda at grandel.de> wrote:
> > > > > >>> "Peter Farrell" <peter.d.farrell at gmail.com> 09/11/07 10:11 PM >>>
> > > > >
> > > > > Hi Peter,
> > > > >
> > > > > 1) In version 2.1.2 the mentioned script aka ocf-compliant RA should be part of the distribution.
> > > > > You can also use the one posted.
> > > > > 2) Yes, you have to create another resource definition for that ldirectord resource.
> > > > > I'm pretty sure you also have to define colocation constraints between the IP resources you want to serve
> > > > > and the ldirectord resource.
> > > > > 3) The ldirectord wrapper is a ocf compliant resource agent which starts, stops and monitors
> > > > > ldirectord.
> > > > > 4) For me it works.  ;-)
> > > > >
> > > > > Hope, it helps.
> > > > >
> > > > > Best regards
> > > > > Andreas Mock
> > > > >
> > > > > _______________________________________________
> > > > > Linux-HA mailing list
> > > > > Linux-HA at lists.linux-ha.org
> > > > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > > > > See also: http://linux-ha.org/ReportingProblems
> > > > >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: local.xml
Type: text/xml
Size: 10802 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20070913/fba9cd78/local-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.log
Type: text/x-log
Size: 45295 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20070913/fba9cd78/cluster-0001.bin

