[Linux-HA] heartbeat 2.0.8: still not working

greno at verizon.net
Mon Feb 5 14:57:40 MST 2007


Ok, getting a little further...

Now the IP starts on grp-01-30-01 and not on grp-01-30-02. But if I pull the network cable out of grp-01-30-01 so that it cannot reach the ping nodes, the IP gets stopped and restarted, yet it stays on grp-01-30-01 rather than migrating over to grp-01-30-02. This must have something to do with the scores, but I'm not sure how all the scores interact, and changing them doesn't make a difference.

details again:

ha.cf:
=========================
logfacility     daemon
keepalive 1
deadtime 10
warntime 5
initdead 20
udpport 694
ping 192.168.1.1
bcast eth0 eth1
auto_failback off
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score
apiauth cibmon   uid=hacluster
respawn hacluster /usr/lib/heartbeat/cibmon -d

node    grp-01-30-01  
node    grp-01-30-02  
use_logd yes
compression     bz2
compression_threshold 2
crm yes

=========================

cib.xml:
=========================
 <cib have_quorum="true" admin_epoch="0" ignore_dtd="false" num_peers="1" cib_feature_revision="1.3" ccm_transition="3" generated="true" dc_uuid="67b0bfa7-0165-4a8c-9c0f-ec82e0ae2c91" epoch="13" num_updates="159" cib-last-written="Mon Feb  5 16:39:45 2007">
   <configuration>
     <crm_config>
       <cluster_property_set id="cib-bootstrap-options">
         <attributes>
           <nvpair id="cib-bootstrap-options-symmetric_cluster" name="symmetric_cluster" value="True"/>
           <nvpair id="cib-bootstrap-options-default_resource_stickiness" name="default_resource_stickiness" value="50"/>
           <nvpair id="cib-bootstrap-options-default_resource_failure_stickiness" name="default_resource_failure_stickiness" value="-50"/>
         </attributes>
       </cluster_property_set>
     </crm_config>
     <nodes>
       <node id="67b0bfa7-0165-4a8c-9c0f-ec82e0ae2c91" uname="grp-01-30-01" type="normal"/>
       <node id="29626f17-db1f-4139-aa33-5a6b4110da51" uname="grp-01-30-02" type="normal"/>
     </nodes>
     <resources>
       <group id="GRP_webserver_ip_RG">
         <primitive id="GRP_webserver_ip_R" class="ocf" type="IPaddr" provider="heartbeat">
           <instance_attributes id="GRP_webserver_ip_R_instance_attrs">
             <attributes>
               <nvpair id="941b5590-c6b8-4465-882d-ce52ec4f63e8" name="ip" value="192.168.1.215"/>
             </attributes>
           </instance_attributes>
         </primitive>
       </group>
     </resources>
     <constraints>
       <rsc_location id="GRP_webserver_ip_RG:not_connected" rsc="GRP_webserver_ip_RG">
         <rule id="GRP_webserver_ip_RG:not_connected:rule" score="-INFINITY">
           <expression id="GRP_webserver_ip_RG:not_connected:expr" attribute="pingd_score" operation="not_defined"/>
         </rule>
       </rsc_location>
     </constraints>
   </configuration>
 </cib>

=========================
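
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the rule above only matches while pingd_score is not defined. With "-m 100", pingd is expected to set the attribute to 100 times the number of reachable ping nodes, so after the cable is pulled the attribute would drop to 0 but stay defined, and the -INFINITY rule would never apply to grp-01-30-01. A minimal sketch of a constraint that also excludes nodes whose pingd_score is 0 or less might look like the following. The expression ids ending in ":undefined" and ":zero" are hypothetical names chosen for illustration; everything else is taken from the constraint shown above.

```xml
<!-- Sketch only: keep the group off any node that has no ping connectivity. -->
<rsc_location id="GRP_webserver_ip_RG:not_connected" rsc="GRP_webserver_ip_RG">
  <!-- boolean_op="or": the rule applies if either expression is true -->
  <rule id="GRP_webserver_ip_RG:not_connected:rule" score="-INFINITY" boolean_op="or">
    <!-- pingd has not published the attribute for this node (hypothetical id) -->
    <expression id="GRP_webserver_ip_RG:not_connected:expr:undefined" attribute="pingd_score" operation="not_defined"/>
    <!-- the attribute exists, but no ping node is reachable (hypothetical id) -->
    <expression id="GRP_webserver_ip_RG:not_connected:expr:zero" attribute="pingd_score" operation="lte" value="0"/>
  </rule>
</rsc_location>
```

This would replace the existing rsc_location rather than sit beside it, since the ids would otherwise collide. Whether it actually fixes the failover here has not been tested against these logs.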

log: grp-01-30-01
=========================
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7418]: info: Enabling logging daemon 
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7418]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf)
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7418]: WARN: logd is enabled but logfile/debugfile/logfacility is still configured in ha.cf
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7418]: info: **************************
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7418]: info: Configuration validated. Starting heartbeat 2.0.8
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: heartbeat: version 2.0.8
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: Heartbeat generation: 50
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: Removing /var/run/heartbeat/rsctmp failed, recreating.
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: glib: ping heartbeat started.
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Feb  5 16:28:31 grp-01-30-01 heartbeat: [7419]: info: Local status now set to: 'up'
Feb  5 16:28:32 grp-01-30-01 heartbeat: [7419]: info: Link 192.168.1.1:192.168.1.1 up.
Feb  5 16:28:32 grp-01-30-01 heartbeat: [7419]: info: Status update for node 192.168.1.1: status ping
Feb  5 16:28:32 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-01:eth0 up.
Feb  5 16:28:32 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-01:eth1 up.
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-02:eth0 up.
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-02:eth1 up.
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Status update for node grp-01-30-02: status up
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Comm_now_up(): updating status to active
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Local status now set to: 'active'
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" (0,0)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/cibmon -d" (100,101)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/ccm" (100,101)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/cib" (100,101)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/lrmd -r" (0,0)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/stonithd" (0,0)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/attrd" (100,101)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/crmd" (100,101)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Starting child client "/usr/lib/heartbeat/mgmtd -v" (0,0)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7419]: info: Status update for node grp-01-30-02: status active
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7434]: info: Starting "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" as uid 0  gid 0 (pid 7434)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7435]: info: Starting "/usr/lib/heartbeat/cibmon -d" as uid 100  gid 101 (pid 7435)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7436]: info: Starting "/usr/lib/heartbeat/ccm" as uid 100  gid 101 (pid 7436)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7437]: info: Starting "/usr/lib/heartbeat/cib" as uid 100  gid 101 (pid 7437)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7438]: info: Starting "/usr/lib/heartbeat/lrmd -r" as uid 0  gid 0 (pid 7438)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7439]: info: Starting "/usr/lib/heartbeat/stonithd" as uid 0  gid 0 (pid 7439)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7440]: info: Starting "/usr/lib/heartbeat/attrd" as uid 100  gid 101 (pid 7440)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7441]: info: Starting "/usr/lib/heartbeat/crmd" as uid 100  gid 101 (pid 7441)
Feb  5 16:28:38 grp-01-30-01 heartbeat: [7442]: info: Starting "/usr/lib/heartbeat/mgmtd -v" as uid 0  gid 0 (pid 7442)
Feb  5 16:28:38 grp-01-30-01 stonithd: [7439]: info: Signing in with heartbeat.
Feb  5 16:28:38 grp-01-30-01 cib: [7437]: WARN: crm_is_writable: /var/lib/heartbeat/crm/cib.xml should be owned and r/w by group haclient
Feb  5 16:28:38 grp-01-30-01 cib: [7437]: info: readCibXmlFile: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml
Feb  5 16:28:38 grp-01-30-01 cib: [7437]: info: log_data_element: readCibXmlFile: [on-disk]         <primitive id="GRP_webserver_ip_R" class="ocf" type="IPaddr" provider="heartbeat">
Feb  5 16:28:38 grp-01-30-01 stonithd: [7439]: notice: /usr/lib/heartbeat/stonithd start up successfully.
Feb  5 16:28:44 grp-01-30-01 heartbeat: [7419]: WARN: 1 lost packet(s) for [grp-01-30-02] [20:22]
Feb  5 16:28:44 grp-01-30-01 heartbeat: [7419]: info: No pkts missing from grp-01-30-02!
Feb  5 16:28:45 grp-01-30-01 heartbeat: [7419]: WARN: 1 lost packet(s) for [grp-01-30-02] [27:29]
Feb  5 16:28:45 grp-01-30-01 heartbeat: [7419]: info: No pkts missing from grp-01-30-02!
Feb  5 16:28:54 grp-01-30-01 heartbeat: [7419]: WARN: 1 lost packet(s) for [grp-01-30-02] [42:44]
Feb  5 16:28:54 grp-01-30-01 heartbeat: [7419]: info: No pkts missing from grp-01-30-02!
Feb  5 16:29:04 grp-01-30-01 cibmon: [7435]: info: log_data_element: cib_apply_diff: +           <lrm_resource id="GRP_webserver_ip_R" type="IPaddr" class="ocf" provider="heartbeat">
Feb  5 16:29:05 grp-01-30-01 cibmon: [7435]: info: log_data_element: cib_apply_diff: +           <lrm_resource id="GRP_webserver_ip_R" type="IPaddr" class="ocf" provider="heartbeat">

PULLED grp-01-30-01 network cable here

Feb  5 16:34:47 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-02:eth0 dead.
Feb  5 16:34:48 grp-01-30-01 heartbeat: [7419]: WARN: node 192.168.1.1: is dead
Feb  5 16:34:48 grp-01-30-01 heartbeat: [7419]: info: Link 192.168.1.1:192.168.1.1 dead.
Feb  5 16:38:13 grp-01-30-01 heartbeat: [7419]: info: Link 192.168.1.1:192.168.1.1 up.
Feb  5 16:38:13 grp-01-30-01 heartbeat: [7419]: WARN: Late heartbeat: Node 192.168.1.1: interval 216020 ms
Feb  5 16:38:13 grp-01-30-01 heartbeat: [7419]: info: Status update for node 192.168.1.1: status ping

RECONNECTED grp-01-30-01 network cable here

Feb  5 16:38:14 grp-01-30-01 heartbeat: [7419]: info: Link grp-01-30-02:eth0 up.

=========================

log: grp-01-30-02
=========================
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6284]: info: Enabling logging daemon 
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6284]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf)
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6284]: WARN: logd is enabled but logfile/debugfile/logfacility is still configured in ha.cf
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6284]: info: **************************
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6284]: info: Configuration validated. Starting heartbeat 2.0.8
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: heartbeat: version 2.0.8
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: Heartbeat generation: 46
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: Removing /var/run/heartbeat/rsctmp failed, recreating.
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: glib: ping heartbeat started.
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Feb  5 16:28:36 grp-01-30-02 heartbeat: [6285]: info: Local status now set to: 'up'
Feb  5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Link 192.168.1.1:192.168.1.1 up.
Feb  5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Status update for node 192.168.1.1: status ping
Feb  5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-01:eth0 up.
Feb  5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Status update for node grp-01-30-01: status up
Feb  5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-01:eth1 up.
Feb  5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-02:eth0 up.
Feb  5 16:28:37 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-02:eth1 up.
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Comm_now_up(): updating status to active
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Local status now set to: 'active'
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" (0,0)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/cibmon -d" (100,101)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/ccm" (100,101)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/cib" (100,101)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/lrmd -r" (0,0)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/stonithd" (0,0)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/attrd" (100,101)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/crmd" (100,101)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Starting child client "/usr/lib/heartbeat/mgmtd -v" (0,0)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6299]: info: Starting "/usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd_score" as uid 0  gid 0 (pid 6299)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6300]: info: Starting "/usr/lib/heartbeat/cibmon -d" as uid 100  gid 101 (pid 6300)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6301]: info: Starting "/usr/lib/heartbeat/ccm" as uid 100  gid 101 (pid 6301)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6302]: info: Starting "/usr/lib/heartbeat/cib" as uid 100  gid 101 (pid 6302)
Feb  5 16:28:38 grp-01-30-02 cib: [6302]: WARN: crm_is_writable: /var/lib/heartbeat/crm/cib.xml should be owned and r/w by group haclient
Feb  5 16:28:38 grp-01-30-02 cib: [6302]: info: readCibXmlFile: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml
Feb  5 16:28:38 grp-01-30-02 cib: [6302]: info: log_data_element: readCibXmlFile: [on-disk]         <primitive id="GRP_webserver_ip_R" class="ocf" type="IPaddr" provider="heartbeat">
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6303]: info: Starting "/usr/lib/heartbeat/lrmd -r" as uid 0  gid 0 (pid 6303)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6304]: info: Starting "/usr/lib/heartbeat/stonithd" as uid 0  gid 0 (pid 6304)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6305]: info: Starting "/usr/lib/heartbeat/attrd" as uid 100  gid 101 (pid 6305)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6306]: info: Starting "/usr/lib/heartbeat/crmd" as uid 100  gid 101 (pid 6306)
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6307]: info: Starting "/usr/lib/heartbeat/mgmtd -v" as uid 0  gid 0 (pid 6307)
Feb  5 16:28:38 grp-01-30-02 stonithd: [6304]: info: Signing in with heartbeat.
Feb  5 16:28:38 grp-01-30-02 stonithd: [6304]: notice: /usr/lib/heartbeat/stonithd start up successfully.
Feb  5 16:28:38 grp-01-30-02 heartbeat: [6285]: info: Status update for node grp-01-30-01: status active
Feb  5 16:28:43 grp-01-30-02 heartbeat: [6285]: WARN: 1 lost packet(s) for [grp-01-30-01] [41:43]
Feb  5 16:28:43 grp-01-30-02 heartbeat: [6285]: info: No pkts missing from grp-01-30-01!
Feb  5 16:28:44 grp-01-30-02 heartbeat: [6285]: WARN: 1 lost packet(s) for [grp-01-30-01] [46:48]
Feb  5 16:28:44 grp-01-30-02 heartbeat: [6285]: info: No pkts missing from grp-01-30-01!
Feb  5 16:28:54 grp-01-30-02 heartbeat: [6285]: WARN: 1 lost packet(s) for [grp-01-30-01] [66:68]
Feb  5 16:28:54 grp-01-30-02 heartbeat: [6285]: info: No pkts missing from grp-01-30-01!
Feb  5 16:29:03 grp-01-30-02 pengine: [6315]: info: native_print:     GRP_webserver_ip_R        (heartbeat::ocf:IPaddr):        Stopped 
Feb  5 16:29:03 grp-01-30-02 pengine: [6315]: info: process_pe_message: Transition 0: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-97.bz2
Feb  5 16:29:03 grp-01-30-02 cibmon: [6300]: info: log_data_element: cib_update: +           <lrm_resource id="GRP_webserver_ip_R" type="IPaddr" class="ocf" provider="heartbeat">
Feb  5 16:29:04 grp-01-30-02 cibmon: [6300]: info: log_data_element: cib_update: +           <lrm_resource id="GRP_webserver_ip_R" type="IPaddr" class="ocf" provider="heartbeat">
Feb  5 16:29:04 grp-01-30-02 pengine: [6315]: info: native_print:     GRP_webserver_ip_R        (heartbeat::ocf:IPaddr):        Stopped 
Feb  5 16:29:04 grp-01-30-02 pengine: [6315]: info: process_pe_message: Transition 1: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-98.bz2
Feb  5 16:29:06 grp-01-30-02 pengine: [6315]: info: native_print:     GRP_webserver_ip_R        (heartbeat::ocf:IPaddr):        Started grp-01-30-01
Feb  5 16:29:06 grp-01-30-02 pengine: [6315]: info: process_pe_message: Transition 2: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-99.bz2

PULLED grp-01-30-01 network cable here

Feb  5 16:34:48 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-01:eth0 dead.
Feb  5 16:34:54 grp-01-30-02 pengine: [6315]: info: native_print:     GRP_webserver_ip_R        (heartbeat::ocf:IPaddr):        Started grp-01-30-01
Feb  5 16:34:54 grp-01-30-02 pengine: [6315]: info: process_pe_message: Transition 3: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-100.bz2

RECONNECTED grp-01-30-01 network cable here

Feb  5 16:38:13 grp-01-30-02 heartbeat: [6285]: info: Link grp-01-30-01:eth0 up.
Feb  5 16:38:20 grp-01-30-02 pengine: [6315]: info: native_print:     GRP_webserver_ip_R        (heartbeat::ocf:IPaddr):        Started grp-01-30-01
Feb  5 16:38:20 grp-01-30-02 pengine: [6315]: info: process_pe_message: Transition 4: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-101.bz2

=========================





>From: greno at verizon.net
>Date: 2007/02/05 Mon PM 03:09:38 CST
>To: Andrew Beekhof <beekhof at gmail.com>, General Linux-HA mailing list <linux-ha at lists.linux-ha.org>
>Subject: Re: Re: [Linux-HA] heartbeat 2.0.8: still not working

>Sorry Andrew, I forgot to reload the firewalls after making the port changes.  I'll try this again.  And thanks for your help.


