[Linux-HA] Two node, two IP resource, config problem?

Andrew Beekhof beekhof at gmail.com
Wed Oct 4 06:40:39 MDT 2006


(sorry for the delay)

As far as heartbeat can tell, both links are dead:

heartbeat[6468]: 2006/09/12_08:30:35 info: Link ranger:eth2 dead.
heartbeat[6468]: 2006/09/12_08:31:47 info: Link ranger:eth0 dead.

But I'm no expert on the communication-level stuff.
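
A rough sketch for pulling the last reported state of each link out of ha-log. The "dead" pattern follows the two lines quoted above; the matching "up" message and the helper name `last_link_states` are assumptions for illustration, not something taken from heartbeat itself.

```python
import re

# Matches link-status lines such as
# "heartbeat[6468]: 2006/09/12_08:30:35 info: Link ranger:eth2 dead."
# The "up" variant is an assumption about the log format.
LINK_RE = re.compile(
    r"(?P<ts>\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2}) info: "
    r"Link (?P<node>[^:\s]+):(?P<iface>\S+) (?P<state>dead|up)\."
)


def last_link_states(lines):
    """Return {(node, iface): (state, timestamp)}, keeping the last entry seen."""
    states = {}
    for line in lines:
        m = LINK_RE.search(line)
        if m:
            states[(m["node"], m["iface"])] = (m["state"], m["ts"])
    return states


# The two log lines from this message, used as sample input.
sample = [
    "heartbeat[6468]: 2006/09/12_08:30:35 info: Link ranger:eth2 dead.",
    "heartbeat[6468]: 2006/09/12_08:31:47 info: Link ranger:eth0 dead.",
]

for (node, iface), (state, ts) in sorted(last_link_states(sample).items()):
    print(f"{node}:{iface} {state} since {ts}")
```

To run it against a real log, pass `open("/var/log/ha-log")` instead of `sample`. If a link never shows a later "up" entry, heartbeat still considers it dead.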

On 9/12/06, Robert Gravsjö <robert.gravsjo at tietoenator.com> wrote:
>
>
> Andrew Beekhof wrote:
> > On 9/11/06, Robert Gravsjö <robert.gravsjo at tietoenator.com> wrote:
> >>
> >>
> >> Andrew Beekhof wrote:
> >> > On 9/8/06, Robert Gravsjö <robert.gravsjo at tietoenator.com> wrote:
> >> >>
> >> >>
> >> >> Oren Nechushtan wrote:
> >> >> > Did you try the patch Andrew supplied? It worked for me.
> >> >> > See
> >> >> > http://lists.community.tummy.com/pipermail/linux-ha/2006-August/021391.html
> >> >> >
> >> >> > 13 days ago: Filter out updates that arent for cluster members (eg. ping nodes)
> >> >> >
> >> >> > changeset
> >> >> > Andrew Beekhof <beekhof at gmail.com> [Thu, 24 Aug 2006 17:42:40 +0200] rev 9564
> >> >> >
> >> >> > Filter out updates that arent for cluster members (eg. ping nodes)
> >> >>
> >> >> I tried this patch and it fixed the failover problem.
> >> >>
> >> >> The OFFLINE problem still occurs. A node with its network cables removed
> >> >> will stay OFFLINE in crm_mon despite the fact that it is back online.
> >> >
> >> > How can the other node(s) know the "failed" node is "back" if its
> >> > network cables are unplugged?
> >>
> >> There are additional networks attached to these servers. One of these
> >> other networks is dedicated to heartbeat communication.
> >> So, since heartbeat is able to communicate with the other node it should
> >> also be aware of it resuming normal operations. The log shows entries
> >> about it being back, but crm_mon shows it OFFLINE and crmadmin says
> >> "idle".
> >
> > Can you include some logs (as bzip attachments) and ha.cf from both nodes?
>
> Attached logs and conf. Do you want the cib.xml too?
>
> rouge is the primary node and ranger is the secondary node.
>
> /roppert
>
> >> >> The difference this time is that restarting heartbeat results in both
> >> >> nodes starting up group net_1 and neither running group_1.
> >> >
> >> > Is this a two node cluster?
> >>
> >> Yes, this is a two node cluster sharing two IP addresses (one for the
> >> primary network and one for the secondary).
> >
> > ok - that's good
> > _______________________________________________
> > Linux-HA mailing list
> > Linux-HA at lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
> >
>
> --
> RobertG
>
> Phone: +46 (0)480 44 58 35
>
>
> logfile /var/log/ha-log
> keepalive 1
> deadtime 15
> warntime 5
> initdead 60
> bcast   eth3            # Linux
> ucast eth0 192.168.60.11
> ucast eth2 192.168.80.11
> auto_failback off
> node rouge ranger
> ping_group ai 192.168.60.17 192.168.80.17 192.168.60.18 192.168.80.18
> ping_group bard 192.168.60.19 192.168.80.19
> crm yes
> respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s
>
>
> logfile /var/log/ha-log
> keepalive 1
> deadtime 15
> warntime 5
> initdead 60
> bcast   eth3            # Linux
> ucast eth0 192.168.60.10
> ucast eth2 192.168.80.10
> auto_failback off
> node rouge ranger
> ping_group ai 192.168.60.17 192.168.80.17 192.168.60.18 192.168.80.18
> ping_group bard 192.168.60.19 192.168.80.19
> crm yes
> respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s
>
>
>
>
>
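
For cross-checking the log against the configuration, here is a minimal sketch that lists the heartbeat communication paths and timing values from an ha.cf file. It only understands the directives that appear in the two files quoted above and assumes the timing values are plain seconds; the function name `parse_ha_cf` is made up for this example.

```python
def parse_ha_cf(text):
    """Collect communication-path and timing directives from ha.cf text."""
    media, timing = [], {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        key, *args = line.split()
        if key in ("bcast", "ucast"):
            media.append((key, *args))
        elif key in ("keepalive", "warntime", "deadtime", "initdead"):
            # Assumption: values are whole seconds, as in the quoted files.
            timing[key] = int(args[0])
    return media, timing


# First of the two ha.cf files quoted above (communication-related lines only).
sample_cf = """\
keepalive 1
deadtime 15
warntime 5
initdead 60
bcast   eth3            # Linux
ucast eth0 192.168.60.11
ucast eth2 192.168.80.11
"""

media, timing = parse_ha_cf(sample_cf)
for entry in media:
    print("path:", " ".join(entry))
print("timing:", timing)
```

With the quoted file this reports three paths (bcast on eth3, ucast on eth0 and eth2), so the two "Link ... dead" entries in the log account for the ucast paths only.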

