[Linux-HA] heartbeat 2.0.8: still not working
greno at verizon.net
Mon Feb 5 14:57:40 MST 2007
Ok, getting a little further...
Now the IP starts on grp-01-30-01 and not on grp-01-30-02. But if I pull the network cable from grp-01-30-01 so that it cannot reach the ping nodes, the IP gets stopped and restarted, yet it stays on grp-01-30-01 rather than migrating over to grp-01-30-02. This must have something to do with the scores, but I'm not sure how all the scores interact, and changing them has made no difference so far.
details again:
ha.cf:
=========================
logfacility daemon
keepalive 1
deadtime 10
warntime 5
initdead 20
udpport 694
ping 192.168.1.1
bcast eth0 eth1
auto_failback off
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score
apiauth cibmon uid=hacluster
respawn hacluster /usr/lib/heartbeat/cibmon -d
node grp-01-30-01
node grp-01-30-02
use_logd yes
compression bz2
compression_threshold 2
crm yes
=========================
cib.xml:
=========================
<cib have_quorum="true" admin_epoch="0" ignore_dtd="false" num_peers="1" cib_feature_revision="1.3" ccm_transition="3" generated="true" dc_uuid="67b0bfa7-0165-4a8c-9c0f-ec82e0ae2c91" epoch="13" num_updates="159" cib-last-written="Mon Feb 5 16:39:45 2007">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<attributes>
<nvpair id="cib-bootstrap-options-symmetric_cluster" name="symmetric_cluster" value="True"/>
<nvpair id="cib-bootstrap-options-default_resource_stickiness" name="default_resource_stickiness" value="50"/>
<nvpair id="cib-bootstrap-options-default_resource_failure_stickiness" name="default_resource_failure_stickiness" value="-50"/>
</attributes>
</cluster_property_set>
</crm_config>
<nodes>
<node id="67b0bfa7-0165-4a8c-9c0f-ec82e0ae2c91" uname="grp-01-30-01" type="normal"/>
<node id="29626f17-db1f-4139-aa33-5a6b4110da51" uname="grp-01-30-02" type="normal"/>
</nodes>
<resources>
<group id="GRP_webserver_ip_RG">
<primitive id="GRP_webserver_ip_R" class="ocf" type="IPaddr" provider="heartbeat">
<instance_attributes id="GRP_webserver_ip_R_instance_attrs">
<attributes>
<nvpair id="941b5590-c6b8-4465-882d-ce52ec4f63e8" name="ip" value="192.168.1.215"/>
</attributes>
</instance_attributes>
</primitive>
</group>
</resources>
<constraints>
<rsc_location id="GRP_webserver_ip_RG:not_connected" rsc="GRP_webserver_ip_RG">
<rule id="GRP_webserver_ip_RG:not_connected:rule" score="-INFINITY">
<expression id="GRP_webserver_ip_RG:not_connected:expr" attribute="pingd_score" operation="not_defined"/>
</rule>
</rsc_location>
</constraints>
</configuration>
</cib>
=========================
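A possible explanation, offered as a sketch rather than a confirmed diagnosis: the rule above only matches when pingd_score is not defined at all. If pingd keeps the attribute defined and sets it to 0 once the ping node is unreachable (with -m 100 and a single ping node the value would be either 100 or 0), then the -INFINITY score is never applied and the IP has no reason to leave grp-01-30-01. Under that assumption, a variant of the constraint that also matches a value of 0 could look like the following. The two expression ids ending in :undefined and :zero are made up for illustration; everything else is copied from the cib.xml above.

```xml
<rsc_location id="GRP_webserver_ip_RG:not_connected" rsc="GRP_webserver_ip_RG">
  <!-- boolean_op="or": the rule applies if EITHER expression is true -->
  <rule id="GRP_webserver_ip_RG:not_connected:rule" score="-INFINITY" boolean_op="or">
    <!-- original test: the attribute is missing entirely -->
    <expression id="GRP_webserver_ip_RG:not_connected:expr:undefined" attribute="pingd_score" operation="not_defined"/>
    <!-- assumed extra test: the attribute exists but no ping node is reachable -->
    <expression id="GRP_webserver_ip_RG:not_connected:expr:zero" attribute="pingd_score" operation="lte" value="0"/>
  </rule>
</rsc_location>
```

If the assumption holds, a node whose pingd_score is missing or 0 gets -INFINITY for the group, which the default_resource_stickiness of 50 cannot outweigh, so the group should move to the node that can still reach 192.168.1.1.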
log: grp-01-30-01
=========================
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7418]: info: Enabling logging daemon
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7418]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf)
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7418]: WARN: logd is enabled but logfile/debugfile/logfacility is still configured in ha.cf
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7418]: info: **************************
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7418]: info: Configuration validated. Starting heartbeat 2.0.8
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: heartbeat: version 2.0.8
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: Heartbeat generation: 50
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: Removing /var/run/heartbeat/rsctmp failed, recreating.
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: glib: ping heartbeat started.
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Feb 5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: Local status now set to: 'up'
Feb 5 16:28:32 grp-01-30-01 heartbeat: [7419]: info: Link 192.168.1.1:192.168.1.1 up.
Feb 5 16:28:32 grp-01-30-01 heartbeat: [7419]: info: Status update for node 192.168.1.1: status ping
Feb 5 16:28:32 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-01:eth0 up.
Feb 5 16:28:32 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-01:eth1 up.
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-02:eth0 up.
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-02:eth1 up.
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Status update for node grp-01-30-02: status up
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Comm_now_up(): updating status to active
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Local status now set to: 'active'
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" (0,0)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/cibmon -d" (100,101)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/ccm" (100,101)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/cib" (100,101)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/lrmd -r" (0,0)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/stonithd" (0,0)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/attrd" (100,101)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/crmd" (100,101)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/mgmtd -v" (0,0)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Status update for node grp-01-30-02: status active
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7434]: info: Starting "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" as uid 0 gid 0 (pid 7434)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7435]: info: Starting "/usr/lib/heartbeat/cibmon -d" as uid 100 gid 101 (pid 7435)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7436]: info: Starting "/usr/lib/heartbeat/ccm" as uid 100 gid 101 (pid 7436)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7437]: info: Starting "/usr/lib/heartbeat/cib" as uid 100 gid 101 (pid 7437)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7438]: info: Starting "/usr/lib/heartbeat/lrmd -r" as uid 0 gid 0 (pid 7438)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7439]: info: Starting "/usr/lib/heartbeat/stonithd" as uid 0 gid 0 (pid 7439)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7440]: info: Starting "/usr/lib/heartbeat/attrd" as uid 100 gid 101 (pid 7440)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7441]: info: Starting "/usr/lib/heartbeat/crmd" as uid 100 gid 101 (pid 7441)
Feb 5 16:28:38 grp-01-30-01 heartbeat: [7442]: info: Starting "/usr/lib/heartbeat/mgmtd -v" as uid 0 gid 0 (pid 7442)
Feb 5 16:28:38 grp-01-30-01 stonithd: [7439]: info: Signing in with heartbeat.
Feb 5 16:28:38 grp-01-30-01 cib: [7437]: WARN: crm_is_writable: /var/lib/heartbeat/crm/cib.xml should be owned and r/w by group haclient
Feb 5 16:28:38 grp-01-30-01 cib: [7437]: info: readCibXmlFile: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml
Feb 5 16:28:38 grp-01-30-01 cib: [7437]: info: log_data_element: readCibXmlFile: [on-disk] <primitive id="GRP_webserver_ip_R" class="ocf" type="IPaddr" provider="heartbeat">
Feb 5 16:28:38 grp-01-30-01 stonithd: [7439]: notice: /usr/lib/heartbeat/stonithd start up successfully.
Feb 5 16:28:44 grp-01-30-01 heartbeat: [7419]: WARN: 1 lost packet(s) for [grp-01-30-02] [20:22]
Feb 5 16:28:44 grp-01-30-01 heartbeat: [7419]: info: No pkts missing from grp-01-30-02!
Feb 5 16:28:45 grp-01-30-01 heartbeat: [7419]: WARN: 1 lost packet(s) for [grp-01-30-02] [27:29]
Feb 5 16:28:45 grp-01-30-01 heartbeat: [7419]: info: No pkts missing from grp-01-30-02!
Feb 5 16:28:54 grp-01-30-01 heartbeat: [7419]: WARN: 1 lost packet(s) for [grp-01-30-02] [42:44]
Feb 5 16:28:54 grp-01-30-01 heartbeat: [7419]: info: No pkts missing from grp-01-30-02!
Feb 5 16:29:04 grp-01-30-01 cibmon: [7435]: info: log_data_element: cib_apply_diff: + <lrm_resource id="GRP_webserver_ip_R" type="IPaddr" class="ocf" provider="heartbeat">
Feb 5 16:29:05 grp-01-30-01 cibmon: [7435]: info: log_data_element: cib_apply_diff: + <lrm_resource id="GRP_webserver_ip_R" type="IPaddr" class="ocf" provider="heartbeat">
PULLED grp-01-30-01 network cable here
Feb 5 16:34:47 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-02:eth0 dead.
Feb 5 16:34:48 grp-01-30-01 heartbeat: [7419]: WARN: node 192.168.1.1: is dead
Feb 5 16:34:48 grp-01-30-01 heartbeat: [7419]: info: Link 192.168.1.1:192.168.1.1 dead.
Feb 5 16:38:13 grp-01-30-01 heartbeat: [7419]: info: Link 192.168.1.1:192.168.1.1 up.
Feb 5 16:38:13 grp-01-30-01 heartbeat: [7419]: WARN: Late heartbeat: Node 192.168.1.1: interval 216020 ms
Feb 5 16:38:13 grp-01-30-01 heartbeat: [7419]: info: Status update for node 192.168.1.1: status ping
RECONNECTED grp-01-30-01 network cable here
Feb 5 16:38:14 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-02:eth0 up.
=========================
log: grp-01-30-02
=========================
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6284]: info: Enabling logging daemon
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6284]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf)
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6284]: WARN: logd is enabled but logfile/debugfile/logfacility is still configured in ha.cf
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6284]: info: **************************
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6284]: info: Configuration validated. Starting heartbeat 2.0.8
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: heartbeat: version 2.0.8
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: Heartbeat generation: 46
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: Removing /var/run/heartbeat/rsctmp failed, recreating.
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: glib: ping heartbeat started.
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Feb 5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: Local status now set to: 'up'
Feb 5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Link 192.168.1.1:192.168.1.1 up.
Feb 5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Status update for node 192.168.1.1: status ping
Feb 5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-01:eth0 up.
Feb 5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Status update for node grp-01-30-01: status up
Feb 5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-01:eth1 up.
Feb 5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-02:eth0 up.
Feb 5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-02:eth1 up.
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Comm_now_up(): updating status to active
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Local status now set to: 'active'
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" (0,0)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/cibmon -d" (100,101)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/ccm" (100,101)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/cib" (100,101)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/lrmd -r" (0,0)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/stonithd" (0,0)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/attrd" (100,101)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/crmd" (100,101)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/mgmtd -v" (0,0)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6299]: info: Starting "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" as uid 0 gid 0 (pid 6299)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6300]: info: Starting "/usr/lib/heartbeat/cibmon -d" as uid 100 gid 101 (pid 6300)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6301]: info: Starting "/usr/lib/heartbeat/ccm" as uid 100 gid 101 (pid 6301)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6302]: info: Starting "/usr/lib/heartbeat/cib" as uid 100 gid 101 (pid 6302)
Feb 5 16:28:38 grp-01-30-02 cib: [6302]: WARN: crm_is_writable: /var/lib/heartbeat/crm/cib.xml should be owned and r/w by group haclient
Feb 5 16:28:38 grp-01-30-02 cib: [6302]: info: readCibXmlFile: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml
Feb 5 16:28:38 grp-01-30-02 cib: [6302]: info: log_data_element: readCibXmlFile: [on-disk] <primitive id="GRP_webserver_ip_R" class="ocf" type="IPaddr" provider="heartbeat">
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6303]: info: Starting "/usr/lib/heartbeat/lrmd -r" as uid 0 gid 0 (pid 6303)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6304]: info: Starting "/usr/lib/heartbeat/stonithd" as uid 0 gid 0 (pid 6304)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6305]: info: Starting "/usr/lib/heartbeat/attrd" as uid 100 gid 101 (pid 6305)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6306]: info: Starting "/usr/lib/heartbeat/crmd" as uid 100 gid 101 (pid 6306)
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6307]: info: Starting "/usr/lib/heartbeat/mgmtd -v" as uid 0 gid 0 (pid 6307)
Feb 5 16:28:38 grp-01-30-02 stonithd: [6304]: info: Signing in with heartbeat.
Feb 5 16:28:38 grp-01-30-02 stonithd: [6304]: notice: /usr/lib/heartbeat/stonithd start up successfully.
Feb 5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Status update for node grp-01-30-01: status active
Feb 5 16:28:43 grp-01-30-02 heartbeat: [6285]: WARN: 1 lost packet(s) for [grp-01-30-01] [41:43]
Feb 5 16:28:43 grp-01-30-02 heartbeat: [6285]: info: No pkts missing from grp-01-30-01!
Feb 5 16:28:44 grp-01-30-02 heartbeat: [6285]: WARN: 1 lost packet(s) for [grp-01-30-01] [46:48]
Feb 5 16:28:44 grp-01-30-02 heartbeat: [6285]: info: No pkts missing from grp-01-30-01!
Feb 5 16:28:54 grp-01-30-02 heartbeat: [6285]: WARN: 1 lost packet(s) for [grp-01-30-01] [66:68]
Feb 5 16:28:54 grp-01-30-02 heartbeat: [6285]: info: No pkts missing from grp-01-30-01!
Feb 5 16:29:03 grp-01-30-02 pengine: [6315]: info: native_print: GRP_webserver_ip_R (heartbeat::ocf:IPaddr): Stopped
Feb 5 16:29:03 grp-01-30-02 pengine: [6315]: info: process_pe_message: Transition 0: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-97.bz2
Feb 5 16:29:03 grp-01-30-02 cibmon: [6300]: info: log_data_element: cib_update: + <lrm_resource id="GRP_webserver_ip_R" type="IPaddr" class="ocf" provider="heartbeat">
Feb 5 16:29:04 grp-01-30-02 cibmon: [6300]: info: log_data_element: cib_update: + <lrm_resource id="GRP_webserver_ip_R" type="IPaddr" class="ocf" provider="heartbeat">
Feb 5 16:29:04 grp-01-30-02 pengine: [6315]: info: native_print: GRP_webserver_ip_R (heartbeat::ocf:IPaddr): Stopped
Feb 5 16:29:04 grp-01-30-02 pengine: [6315]: info: process_pe_message: Transition 1: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-98.bz2
Feb 5 16:29:06 grp-01-30-02 pengine: [6315]: info: native_print: GRP_webserver_ip_R (heartbeat::ocf:IPaddr): Started grp-01-30-01
Feb 5 16:29:06 grp-01-30-02 pengine: [6315]: info: process_pe_message: Transition 2: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-99.bz2
PULLED grp-01-30-01 network cable here
Feb 5 16:34:48 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-01:eth0 dead.
Feb 5 16:34:54 grp-01-30-02 pengine: [6315]: info: native_print: GRP_webserver_ip_R (heartbeat::ocf:IPaddr): Started grp-01-30-01
Feb 5 16:34:54 grp-01-30-02 pengine: [6315]: info: process_pe_message: Transition 3: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-100.bz2
RECONNECTED grp-01-30-01 network cable here
Feb 5 16:38:13 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-01:eth0 up.
Feb 5 16:38:20 grp-01-30-02 pengine: [6315]: info: native_print: GRP_webserver_ip_R (heartbeat::ocf:IPaddr): Started grp-01-30-01
Feb 5 16:38:20 grp-01-30-02 pengine: [6315]: info: process_pe_message: Transition 4: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-101.bz2
=========================
>From: greno at verizon.net
>Date: 2007/02/05 Mon PM 03:09:38 CST
>To: Andrew Beekhof <beekhof at gmail.com>, General Linux-HA mailing list <linux-ha at lists.linux-ha.org>
>Subject: Re: Re: [Linux-HA] heartbeat 2.0.8: still not working
>Sorry Andrew, I forgot to reload the firewalls after making the port changes. I'll try this again. And thanks for your help.