[Linux-HA] ldirector/LVSsync/IPaddr fail (2.99.2, 1.03)

Thomas Baumann tom at tiri.li
Fri Jul 3 08:50:13 MDT 2009


My sysctl.conf looks like:

kernel.printk = 4 4 1 7
net.ipv4.conf.default.rp_filter=1
net.ipv4.conf.all.rp_filter=1
net.ipv4.tcp_syncookies=1
net.ipv4.ip_forward=0
net.ipv6.conf.all.forwarding=0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv6.conf.all.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv6.conf.all.accept_source_route = 0
kernel.maps_protect = 1
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.vs.expire_quiescent_template = 1
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.eth1.arp_ignore = 1
net.ipv4.conf.eth1.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.core.wmem_max = 8388608
net.core.rmem_max = 8388608
net.ipv4.tcp_rmem = 4096 6289408 8388608
net.ipv4.tcp_wmem = 4096 65536 8388608
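
To double-check which value the file really assigns to a given key (for example the arp_ignore and arp_announce entries above), a small helper along these lines could be used. This is only a sketch: the function name sysctl_conf_value is made up for illustration, and it assumes plain "key = value" lines where the last assignment wins.

```shell
#!/bin/sh
# Hypothetical helper: print the value a sysctl.conf-style file assigns to a key.
# Assumes "key = value" lines; whitespace around '=' is ignored,
# comment lines (# or ;) are skipped, and the last assignment wins.
sysctl_conf_value() {
    # $1 = key to look up, $2 = path to the config file
    awk -F= -v key="$1" '
        /^[ \t]*[#;]/ { next }
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim the value
            if (k == key) found = v
        }
        END { if (found != "") print found }
    ' "$2"
}
```

Calling sysctl_conf_value net.ipv4.conf.eth1.arp_ignore /etc/sysctl.conf and comparing the result with the output of sysctl -n net.ipv4.conf.eth1.arp_ignore would show whether the running kernel actually has the setting from the file.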

But I get the following:


============
Last updated: Fri Jul  3 16:49:03 2009
Current DC: orad01 (c565bd14-8b2d-40d5-931b-40fefb1f89f3) - partition with quorum
Version: 1.0.3-b133b3f19797c00f9189f4b66b513963f9d25db9
2 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ orad01 orad02 ]

Resource Group: group_LVS
     ldirectord_1        (heartbeat:ldirectord): Started orad01 FAILED
     LVSSyncDaemonSwap_2 (heartbeat:LVSSyncDaemonSwap):  Started orad01
     IPaddr_192_168_178_100      (ocf::heartbeat:IPaddr):        Started orad01 (unmanaged) FAILED

Failed actions:
     IPaddr_192_168_178_100_stop_0 (node=orad01, call=11, rc=1, status=complete): unknown error
     ldirectord_1_monitor_120000 (node=orad01, call=6, rc=7, status=complete): not running
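
The rc values in the failed actions appear to follow the OCF resource agent return code convention, where 1 is a generic error and 7 means the resource is not running, which matches the text crm_mon prints next to them. A minimal lookup sketch follows; the function name ocf_rc_name is hypothetical and not part of heartbeat or pacemaker.

```shell
#!/bin/sh
# Hypothetical helper: map an OCF resource agent return code to its symbolic name.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;        # shown by crm_mon as "unknown error"
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;        # shown by crm_mon as "not running"
        *) echo "unrecognized rc $1" ;;
    esac
}
```

For the output above, ocf_rc_name 1 corresponds to the failed stop of IPaddr_192_168_178_100 and ocf_rc_name 7 to the failed monitor of ldirectord_1.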


Quoting Thomas Baumann <tom at tiri.li>:

> Hello list,
>
> I have the following configuration (2-node cluster, pacemaker 1.0.3,
> heartbeat 2.99.2). Log files are available upon request. The final
> cib.xml is attached.
>
> $ ip addr sh
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>     inet 192.168.178.100/0 scope global lo:0
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 00:0c:29:e4:00:7b brd ff:ff:ff:ff:ff:ff
>     inet 192.168.160.101/24 brd 192.168.160.255 scope global eth0
>     inet6 fe80::20c:29ff:fee4:7b/64 scope link
>        valid_lft forever preferred_lft forever
> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 00:0c:29:e4:00:85 brd ff:ff:ff:ff:ff:ff
>     inet 192.168.178.101/24 brd 192.168.178.255 scope global eth1
>     inet6 fe80::20c:29ff:fee4:85/64 scope link
>        valid_lft forever preferred_lft forever
>
> My haresources looks like:
> orad01 ldirectord::ldirectord.cf LVSSyncDaemonSwap::master IPaddr::192.168.178.100/32/eth1 #group_LVS
>
> My ha.cf looks like:
> keepalive 2
> deadtime 15
> warntime 5
> initdead 30
> mcast eth1 225.0.0.1 694 1 0
> auto_failback off
> node    orad01
> node    orad02
> ping 192.168.178.1
> use_logd yes
> conn_logd_time 60
> compression     zlib
> compression_threshold 2
> coredumps true
> crm respawn
>
> I converted the V1 configuration to the 0.6 format:
>
> $ /usr/lib/heartbeat/haresources2cib.py --stdout -c ha.cf haresources > cib.xml.initial
> $ crm_verify -VVV -x cib.xml.initial
> crm_verify[10794]: 2009/07/03_16:04:43 info: main: =#=#=#=#= Getting XML =#=#=#=#=
> crm_verify[10794]: 2009/07/03_16:04:43 notice: update_validation: Upgrading pacemaker-0.6-style configuration to transitional-0.6 with /usr/share/pacemaker/upgrade06.xsl
> crm_verify[10794]: 2009/07/03_16:04:43 info: update_validation: Transformation /usr/share/pacemaker/upgrade06.xsl successful
> crm_verify[10794]: 2009/07/03_16:04:43 notice: update_validation: Upgraded from <none> to pacemaker-1.0 validation
> crm_verify[10794]: 2009/07/03_16:04:43 WARN: cli_config_update: Your configuration was internally updated to the latest version (pacemaker-1.0)
> crm_verify[10794]: 2009/07/03_16:04:43 WARN: cluster_status: We do not have quorum - fencing and resource management disabled
> Warnings found during check: config may not be valid
>
> I then converted the 0.6 configuration to Pacemaker 1.0:
>
> $ xsltproc /usr/share/pacemaker/upgrade06.xsl cib.xml.initial > cib-pm.xml.initial
> $ crm_verify -VVV -x cib-pm.xml.initial
> crm_verify[10802]: 2009/07/03_16:06:19 info: main: =#=#=#=#= Getting XML =#=#=#=#=
> crm_verify[10802]: 2009/07/03_16:06:19 WARN: cluster_status: We do not have quorum - fencing and resource management disabled
>
> orad01:/etc/ha.d# cp cib-pm.xml.initial /var/lib/heartbeat/crm/cib.xml
> orad01:/etc/ha.d# chown hacluster:haclient /var/lib/heartbeat/crm/cib.xml
>
> But when I start my cluster with /etc/init.d/heartbeat start, the
> cluster resources fail.
>
> ipvsadm -L -n --stats ; echo ; crm_mon -1; echo ; ip addr sh
>
>      Fri Jul  3 16:22:21 2009
>
> IP Virtual Server version 1.2.1 (size=4096)
> Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
>   -> RemoteAddress:Port
> UDP  192.168.178.100:1812                0        0        0        0        0
>   -> 192.168.178.101:1812                0        0        0        0        0
>   -> 192.168.178.102:1812                0        0        0        0        0
>
>
>
> ============
> Last updated: Fri Jul  3 16:22:21 2009
> Current DC: orad01 (c565bd14-8b2d-40d5-931b-40fefb1f89f3) - partition with quorum
> Version: 1.0.3-b133b3f19797c00f9189f4b66b513963f9d25db9
> 1 Nodes configured, unknown expected votes
> 1 Resources configured.
> ============
>
> Online: [ orad01 ]
>
> Resource Group: group_LVS
>     ldirectord_1        (heartbeat:ldirectord): Started orad01 FAILED
>     LVSSyncDaemonSwap_2 (heartbeat:LVSSyncDaemonSwap):  Stopped
>     IPaddr_192_168_178_100      (ocf::heartbeat:IPaddr):        Stopped
>
> Failed actions:
>     IPaddr_192_168_178_100_monitor_0 (node=orad01, call=4, rc=1, status=complete): unknown error
>     ldirectord_1_monitor_120000 (node=orad01, call=92, rc=7, status=complete): not running
>
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 00:0c:29:e4:00:7b brd ff:ff:ff:ff:ff:ff
>     inet 192.168.160.101/24 brd 192.168.160.255 scope global eth0
>     inet6 fe80::20c:29ff:fee4:7b/64 scope link
>        valid_lft forever preferred_lft forever
> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 00:0c:29:e4:00:85 brd ff:ff:ff:ff:ff:ff
>     inet 192.168.178.101/24 brd 192.168.178.255 scope global eth1
>     inet6 fe80::20c:29ff:fee4:85/64 scope link
>        valid_lft forever preferred_lft forever
>
>
> What's wrong with this?
>
> Thanks for your reply in advance.
>
> Thomas.
>
>
> ----------------------------------------------------------------
> This message was sent using IMP, the Internet Messaging Program.



-- 
tiri GmbH
Lauenburger Str. 31a
21493 Schwarzenbek
Tel. 04151 8674995
Fax. 04151 8674996
Net. http://www.tiri.li

Managing Directors: Anja Baumann, Thomas Baumann
Registered office: Schwarzenbek, District Court (Amtsgericht) Lübeck, HRB 8837 HL
