[Linux-HA] heartbeat 2.0.8: pingd failover
Eddie C
edlinuxguru at gmail.com
Sat Feb 3 15:18:10 MST 2007
Let me make a silly suggestion. If it's doing the opposite of what you want,
maybe change -INFINITY to +INFINITY. It is so crazy it just might work :)
<rule id="ip_resource:not_connected:rule" score="-INFINITY">
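
To make the suggestion concrete, here is roughly what the quoted constraint would look like with the score flipped. This is only a sketch and untested: the ids and the pingd_score attribute are copied from the cib quoted below, and I am assuming that a plain INFINITY is read the same as +INFINITY.

```xml
<!-- Sketch only: the quoted not_connected constraint with the score flipped. -->
<!-- Untested; ids and attribute names are taken from the quoted cib.        -->
<rsc_location id="ip_resource:not_connected" rsc="ip_resource">
  <!-- INFINITY is assumed here to be equivalent to +INFINITY -->
  <rule id="ip_resource:not_connected:rule" score="INFINITY">
    <expression id="ip_resource:not_connected:expr"
                attribute="pingd_score" operation="not_defined"/>
  </rule>
</rsc_location>
```

One more guess, not something the logs show: if pingd_score drops to 0 rather than becoming undefined when the cable is pulled, a not_defined test would never match, whatever the score is. It may be worth checking the attribute's value on server1 after unplugging.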
On 2/3/07, greno at verizon.net <greno at verizon.net> wrote:
>
> I have heartbeat running in a 2-node active/passive setup and I've
> configured a resource 'ip_resource' to monitor in cib. Whenever I unplug
> the network cable from the active server, I see messages that indicate that
> the resource is stopped and then restarted again but on the active server
> rather than the failover server as I was expecting. Is there something
> wrong in the config files? Some details...
>
> Refresh in 4s...
>
> ============
> Last updated: Sat Feb 3 01:17:56 2007
> Current DC: server2 (29626f17-db1f-4139-aa33-5a6b4110da51)
> 2 Nodes configured.
> 1 Resources configured.
> ============
>
> ip_resource (heartbeat::ocf:IPaddr): Started server1
>
> =====================
>
> logfacility daemon
> keepalive 1
> deadtime 10
> warntime 5
> initdead 1208
> udpport 694
> ping 192.168.1.1
> bcast eth0 eth1
> auto_failback off
> respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score
> node server1
> node server2
> use_logd yes
> compression bz2
> compression_threshold 2
> crm yes
>
> =====================
> <cib admin_epoch="0" have_quorum="true" ignore_dtd="false" num_peers="2"
> cib_feature_revision="1.3" generated="true" epoch="14" num_updates="171"
> cib-last-written="Sat Feb 3 01:18:23 2007" ccm_transition="2"
> dc_uuid="29626f17-db1f-4139-aa33-5a6b4110da51">
> <configuration>
> <crm_config/>
> <nodes>
> <node id="29626f17-db1f-4139-aa33-5a6b4110da51" uname="server2"
> type="normal"/>
> <node id="67b0bfa7-0165-4a8c-9c0f-ec82e0ae2c91" uname="server1"
> type="normal"/>
> </nodes>
> <resources>
> <primitive id="ip_resource" class="ocf" type="IPaddr"
> provider="heartbeat">
> <instance_attributes id="ip_attributes">
> <attributes>
> <nvpair id="ip" name="ip" value="192.168.1.215"/>
> </attributes>
> </instance_attributes>
> </primitive>
> </resources>
> <constraints>
> <rsc_location id="run_ip_resource" rsc="ip_resource">
> <rule id="pref_run_ip_resource1" score="100">
> <expression id="expr1" attribute="#uname" operation="eq"
> value="server1"/>
> </rule>
> <rule id="pref_run_ip_resource2" score="000">
> <expression id="expr2" attribute="#uname" operation="eq"
> value="server2"/>
> </rule>
> </rsc_location>
> <rsc_location id="ip_resource:not_connected" rsc="ip_resource">
> <rule id="ip_resource:not_connected:rule" score="-INFINITY">
> <expression id="ip_resource:not_connected:expr"
> attribute="pingd_score" operation="not_defined"/>
> </rule>
> </rsc_location>
> </constraints>
> </configuration>
> </cib>
>
> =====================
> Feb 3 01:12:32 server1 heartbeat: [6149]: info: Enabling logging daemon
> Feb 3 01:12:32 server1 heartbeat: [6149]: info: logfile and debug file
> are those specified in logd config file (default /etc/logd.cf)
> Feb 3 01:12:32 server1 heartbeat: [6149]: WARN: logd is enabled but
> logfile/debugfile/logfacility is still configured in ha.cf
> Feb 3 01:12:32 server1 heartbeat: [6149]: info:
> **************************
> Feb 3 01:12:32 server1 heartbeat: [6149]: info: Configuration validated.
> Starting heartbeat 2.0.8
> Feb 3 01:12:32 server1 heartbeat: [6150]: info: heartbeat: version 2.0.8
> Feb 3 01:12:32 server1 heartbeat: [6150]: info: Heartbeat generation: 13
> Feb 3 01:12:32 server1 heartbeat: [6150]: info:
> G_main_add_TriggerHandler: Added signal manual handler
> Feb 3 01:12:32 server1 heartbeat: [6150]: info:
> G_main_add_TriggerHandler: Added signal manual handler
> Feb 3 01:12:32 server1 heartbeat: [6150]: info: Removing
> /var/run/heartbeat/rsctmp failed, recreating.
> Feb 3 01:12:32 server1 heartbeat: [6150]: info: glib: ping heartbeat
> started.
> Feb 3 01:12:32 server1 heartbeat: [6150]: info: glib: UDP Broadcast
> heartbeat started on port 694 (694) interface eth0
> Feb 3 01:12:32 server1 heartbeat: [6150]: info: glib: UDP Broadcast
> heartbeat closed on port 694 interface eth0 - Status: 1
> Feb 3 01:12:32 server1 heartbeat: [6150]: info: glib: UDP Broadcast
> heartbeat started on port 694 (694) interface eth1
> Feb 3 01:12:32 server1 heartbeat: [6150]: info: glib: UDP Broadcast
> heartbeat closed on port 694 interface eth1 - Status: 1
> Feb 3 01:12:32 server1 heartbeat: [6150]: info: G_main_add_SignalHandler:
> Added signal handler for signal 17
> Feb 3 01:12:32 server1 heartbeat: [6150]: info: Local status now set to:
> 'up'
> Feb 3 01:12:33 server1 heartbeat: [6150]: info: Link 192.168.1.1:192.168.1.1
> up.
> Feb 3 01:12:33 server1 heartbeat: [6150]: info: Status update for node
> 192.168.1.1: status ping
> Feb 3 01:12:34 server1 heartbeat: [6150]: info: Link server1:eth0 up.
> Feb 3 01:12:34 server1 heartbeat: [6150]: info: Link server1:eth1 up.
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Link server2:eth0 up.
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Link server2:eth1 up.
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Status update for node
> server2: status up
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Comm_now_up(): updating
> status to active
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Local status now set to:
> 'active'
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Starting child client
> "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" (0,0)
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Starting child client
> "/usr/lib/heartbeat/ccm" (100,101)
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Starting child client
> "/usr/lib/heartbeat/cib" (100,101)
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Starting child client
> "/usr/lib/heartbeat/lrmd -r" (0,0)
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Starting child client
> "/usr/lib/heartbeat/stonithd" (0,0)
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Starting child client
> "/usr/lib/heartbeat/attrd" (100,101)
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Starting child client
> "/usr/lib/heartbeat/crmd" (100,101)
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Starting child client
> "/usr/lib/heartbeat/mgmtd -v" (0,0)
> Feb 3 01:12:45 server1 heartbeat: [6150]: info: Status update for node
> server2: status active
> Feb 3 01:12:45 server1 heartbeat: [6165]: info: Starting
> "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" as uid 0 gid 0 (pid
> 6165)
> Feb 3 01:12:45 server1 heartbeat: [6166]: info: Starting
> "/usr/lib/heartbeat/ccm" as uid 100 gid 101 (pid 6166)
> Feb 3 01:12:45 server1 heartbeat: [6167]: info: Starting
> "/usr/lib/heartbeat/cib" as uid 100 gid 101 (pid 6167)
> Feb 3 01:12:45 server1 cib: [6167]: WARN: crm_is_writable:
> /var/lib/heartbeat/crm/cib.xml should be owned and r/w by group haclient
> Feb 3 01:12:45 server1 cib: [6167]: info: readCibXmlFile: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml
> Feb 3 01:12:45 server1 cib: [6167]: info: log_data_element:
> readCibXmlFile: [on-disk] <primitive id="ip_resource" class="ocf"
> type="IPaddr" provider="heartbeat">
> Feb 3 01:12:45 server1 heartbeat: [6168]: info: Starting
> "/usr/lib/heartbeat/lrmd -r" as uid 0 gid 0 (pid 6168)
> Feb 3 01:12:45 server1 heartbeat: [6169]: info: Starting
> "/usr/lib/heartbeat/stonithd" as uid 0 gid 0 (pid 6169)
> Feb 3 01:12:45 server1 heartbeat: [6170]: info: Starting
> "/usr/lib/heartbeat/attrd" as uid 100 gid 101 (pid 6170)
> Feb 3 01:12:45 server1 heartbeat: [6171]: info: Starting
> "/usr/lib/heartbeat/crmd" as uid 100 gid 101 (pid 6171)
> Feb 3 01:12:45 server1 heartbeat: [6172]: info: Starting
> "/usr/lib/heartbeat/mgmtd -v" as uid 0 gid 0 (pid 6172)
> Feb 3 01:12:45 server1 stonithd: [6169]: info: Signing in with heartbeat.
> Feb 3 01:12:46 server1 stonithd: [6169]: notice:
> /usr/lib/heartbeat/stonithd start up successfully.
> Feb 3 01:12:50 server1 heartbeat: [6150]: WARN: 1 lost packet(s) for
> [server2] [19:21]
> Feb 3 01:12:50 server1 heartbeat: [6150]: info: No pkts missing from
> server2!
> Feb 3 01:12:51 server1 heartbeat: [6150]: WARN: 1 lost packet(s) for
> [server2] [23:25]
> Feb 3 01:12:51 server1 heartbeat: [6150]: info: No pkts missing from
> server2!
> Feb 3 01:13:01 server1 heartbeat: [6150]: WARN: 1 lost packet(s) for
> [server2] [43:45]
> Feb 3 01:13:01 server1 heartbeat: [6150]: info: No pkts missing from
> server2!
> Feb 3 01:18:17 server1 heartbeat: [6150]: info: Link server2:eth0 dead.
> Feb 3 01:18:18 server1 heartbeat: [6150]: WARN: node 192.168.1.1: is dead
> Feb 3 01:18:18 server1 heartbeat: [6150]: info: Link 192.168.1.1:192.168.1.1
> dead.
> Feb 3 02:06:44 server1 heartbeat: [6150]: info: Link 192.168.1.1:192.168.1.1
> up.
> Feb 3 02:06:44 server1 heartbeat: [6150]: WARN: Late heartbeat: Node
> 192.168.1.1: interval 2917470 ms
> Feb 3 02:06:44 server1 heartbeat: [6150]: info: Status update for node
> 192.168.1.1: status ping
> Feb 3 02:06:45 server1 heartbeat: [6150]: info: Link server2:eth0 up.
>
>
> =====================
> Feb 3 01:12:43 server2 heartbeat: [6016]: info: Enabling logging daemon
> Feb 3 01:12:43 server2 heartbeat: [6016]: info: logfile and debug file
> are those specified in logd config file (default /etc/logd.cf)
> Feb 3 01:12:43 server2 heartbeat: [6016]: WARN: logd is enabled but
> logfile/debugfile/logfacility is still configured in ha.cf
> Feb 3 01:12:43 server2 heartbeat: [6016]: info:
> **************************
> Feb 3 01:12:43 server2 heartbeat: [6016]: info: Configuration validated.
> Starting heartbeat 2.0.8
> Feb 3 01:12:43 server2 heartbeat: [6017]: info: heartbeat: version 2.0.8
> Feb 3 01:12:43 server2 heartbeat: [6017]: info: Heartbeat generation: 14
> Feb 3 01:12:43 server2 heartbeat: [6017]: info:
> G_main_add_TriggerHandler: Added signal manual handler
> Feb 3 01:12:43 server2 heartbeat: [6017]: info:
> G_main_add_TriggerHandler: Added signal manual handler
> Feb 3 01:12:43 server2 heartbeat: [6017]: info: Removing
> /var/run/heartbeat/rsctmp failed, recreating.
> Feb 3 01:12:43 server2 heartbeat: [6017]: info: glib: ping heartbeat
> started.
> Feb 3 01:12:43 server2 heartbeat: [6017]: info: glib: UDP Broadcast
> heartbeat started on port 694 (694) interface eth0
> Feb 3 01:12:43 server2 heartbeat: [6017]: info: glib: UDP Broadcast
> heartbeat closed on port 694 interface eth0 - Status: 1
> Feb 3 01:12:43 server2 heartbeat: [6017]: info: glib: UDP Broadcast
> heartbeat started on port 694 (694) interface eth1
> Feb 3 01:12:43 server2 heartbeat: [6017]: info: glib: UDP Broadcast
> heartbeat closed on port 694 interface eth1 - Status: 1
> Feb 3 01:12:43 server2 heartbeat: [6017]: info: G_main_add_SignalHandler:
> Added signal handler for signal 17
> Feb 3 01:12:44 server2 heartbeat: [6017]: info: Local status now set to:
> 'up'
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Link 192.168.1.1:192.168.1.1
> up.
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Status update for node
> 192.168.1.1: status ping
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Link server1:eth0 up.
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Status update for node
> server1: status up
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Link server2:eth0 up.
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Link server1:eth1 up.
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Link server2:eth1 up.
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Comm_now_up(): updating
> status to active
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Local status now set to:
> 'active'
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Starting child client
> "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" (0,0)
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Starting child client
> "/usr/lib/heartbeat/ccm" (100,101)
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Starting child client
> "/usr/lib/heartbeat/cib" (100,101)
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Starting child client
> "/usr/lib/heartbeat/lrmd -r" (0,0)
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Starting child client
> "/usr/lib/heartbeat/stonithd" (0,0)
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Starting child client
> "/usr/lib/heartbeat/attrd" (100,101)
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Starting child client
> "/usr/lib/heartbeat/crmd" (100,101)
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Starting child client
> "/usr/lib/heartbeat/mgmtd -v" (0,0)
> Feb 3 01:12:45 server2 heartbeat: [6017]: WARN: G_CH_dispatch_int:
> Dispatch function for read child took too long to execute: 60 ms (> 50 ms)
> (GSource: 0x9f568d0)
> Feb 3 01:12:45 server2 heartbeat: [6031]: info: Starting
> "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" as uid 0 gid 0 (pid
> 6031)
> Feb 3 01:12:45 server2 heartbeat: [6032]: info: Starting
> "/usr/lib/heartbeat/ccm" as uid 100 gid 101 (pid 6032)
> Feb 3 01:12:45 server2 heartbeat: [6033]: info: Starting
> "/usr/lib/heartbeat/cib" as uid 100 gid 101 (pid 6033)
> Feb 3 01:12:45 server2 cib: [6033]: WARN: crm_is_writable:
> /var/lib/heartbeat/crm/cib.xml should be owned and r/w by group haclient
> Feb 3 01:12:45 server2 cib: [6033]: info: readCibXmlFile: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml
> Feb 3 01:12:45 server2 cib: [6033]: info: log_data_element:
> readCibXmlFile: [on-disk] <primitive id="ip_resource" class="ocf"
> type="IPaddr" provider="heartbeat">
> Feb 3 01:12:45 server2 heartbeat: [6034]: info: Starting
> "/usr/lib/heartbeat/lrmd -r" as uid 0 gid 0 (pid 6034)
> Feb 3 01:12:45 server2 heartbeat: [6035]: info: Starting
> "/usr/lib/heartbeat/stonithd" as uid 0 gid 0 (pid 6035)
> Feb 3 01:12:45 server2 heartbeat: [6017]: info: Status update for node
> server1: status active
> Feb 3 01:12:45 server2 heartbeat: [6036]: info: Starting
> "/usr/lib/heartbeat/attrd" as uid 100 gid 101 (pid 6036)
> Feb 3 01:12:45 server2 heartbeat: [6037]: info: Starting
> "/usr/lib/heartbeat/crmd" as uid 100 gid 101 (pid 6037)
> Feb 3 01:12:45 server2 heartbeat: [6038]: info: Starting
> "/usr/lib/heartbeat/mgmtd -v" as uid 0 gid 0 (pid 6038)
> Feb 3 01:12:45 server2 stonithd: [6035]: info: Signing in with heartbeat.
> Feb 3 01:12:45 server2 stonithd: [6035]: notice:
> /usr/lib/heartbeat/stonithd start up successfully.
> Feb 3 01:12:50 server2 heartbeat: [6017]: WARN: 1 lost packet(s) for
> [server1] [45:47]
> Feb 3 01:12:50 server2 heartbeat: [6017]: info: No pkts missing from
> server1!
> Feb 3 01:12:52 server2 heartbeat: [6017]: WARN: 1 lost packet(s) for
> [server1] [54:56]
> Feb 3 01:12:52 server2 heartbeat: [6017]: info: No pkts missing from
> server1!
> Feb 3 01:12:57 server2 heartbeat: [6017]: WARN: 1 lost packet(s) for
> [server1] [64:66]
> Feb 3 01:12:57 server2 heartbeat: [6017]: info: No pkts missing from
> server1!
> Feb 3 01:15:04 server2 pengine: [6049]: info: native_print: ip_resource
> (heartbeat::ocf:IPaddr): Stopped
> Feb 3 01:15:04 server2 pengine: [6049]: info: process_pe_message:
> Transition 0: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-
> input-29.bz2
> Feb 3 01:15:05 server2 pengine: [6049]: info: native_print: ip_resource
> (heartbeat::ocf:IPaddr): Stopped
> Feb 3 01:15:05 server2 pengine: [6049]: info: process_pe_message:
> Transition 1: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-
> input-30.bz2
> Feb 3 01:15:07 server2 pengine: [6049]: info: native_print: ip_resource
> (heartbeat::ocf:IPaddr): Started server1
> Feb 3 01:15:07 server2 pengine: [6049]: info: process_pe_message:
> Transition 2: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-
> input-31.bz2
> Feb 3 01:18:17 server2 heartbeat: [6017]: info: Link server1:eth0 dead.
> Feb 3 01:18:24 server2 pengine: [6049]: info: native_print: ip_resource
> (heartbeat::ocf:IPaddr): Started server1
> Feb 3 01:18:24 server2 pengine: [6049]: info: process_pe_message:
> Transition 3: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-
> input-32.bz2
>
> =====================
>
>