[Linux-HA] pingd failover in active/standby cluster

Matt Zagrabelny mzagrabe at d.umn.edu
Wed Oct 3 10:39:51 MDT 2007


Hello,

I have a problem with Heartbeat not failing over resources in my
active/standby cluster when a ping node goes "down". I have been
reading the archives for the past couple of days, and this looks like a
fairly common question. Unfortunately, even after reading the list
archives, I am still unable to solve my problem.

Scenario:

Active/Standby firewall cluster.

                 Internet

+---eth0---+                 +---eth0---+
|          |                 |          |
|       eth2-----------------eth2       |
|  (cody)  |     Heartbeat   |   (tim)  |
| /dev/ttyS0-----------------/dev/ttyS0 |
|          |                 |          |
+---eth1---+                 +---eth1---+

                 Intranet


'cody' is the primary firewall box and 'tim' is the backup.

A single resource group:

Resource Group: monolith_resources
    external_VIP        (heartbeat::ocf:IPaddr2):       Started cody
    internal_VIP        (heartbeat::ocf:IPaddr2):       Started cody

There is a ping node on each interface's network to verify connectivity
for that interface.
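
Each pingd clone instance counts how many of these addresses its node
can reach and publishes the result as a transient "pingd" node
attribute (scaled by the multiplier). As a quick manual check that the
ping nodes themselves are answering, I run something like this from
each cluster node:

# for ip in 131.212.4.158 192.168.115.38; do
>     ping -c 3 -W 2 $ip >/dev/null && echo "$ip up" || echo "$ip DOWN"
> done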

Some relevant configs:

# cat /etc/ha.d/ha.cf
use_logd on

keepalive 1
deadtime 5
initdead 120

udpport 694
baud 115200
serial /dev/ttyS0
bcast eth2

node cody
node tim

ping 131.212.4.158
ping 192.168.115.38

crm on

# cat /var/lib/heartbeat/crm/cib.xml
<?xml version="1.0"?>
<cib admin_epoch="0" epoch="0" num_updates="0">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <attributes>
          <nvpair id="cib-bootstrap-options-symmetric-cluster"
name="symmetric-cluster"                   value="true"/>
          <nvpair id="cib-bootstrap-options-no-quorum-policy"
name="no-quorum-policy"                    value="stop"/>
          <nvpair id="cib-bootstrap-options-default-resource-stickiness"
name="default-resource-stickiness"         value="0"/>
          <nvpair
id="cib-bootstrap-options-default-resource-failure-stickiness"
name="default-resource-failure-stickiness" value="0"/>
          <nvpair id="cib-bootstrap-options-stonith-enabled"
name="stonith-enabled"                     value="false"/>
          <nvpair id="cib-bootstrap-options-stonith-action"
name="stonith-action"                      value="reboot"/>
          <nvpair id="cib-bootstrap-options-startup-fencing"
name="startup-fencing"                     value="true"/>
          <nvpair id="cib-bootstrap-options-stop-orphan-resources"
name="stop-orphan-resources"               value="true"/>
          <nvpair id="cib-bootstrap-options-stop-orphan-actions"
name="stop-orphan-actions"                 value="true"/>
          <nvpair id="cib-bootstrap-options-remove-after-stop"
name="remove-after-stop"                   value="false"/>
          <nvpair id="cib-bootstrap-options-short-resource-names"
name="short-resource-names"                value="true"/>
          <nvpair id="cib-bootstrap-options-transition-idle-timeout"
name="transition-idle-timeout"             value="5min"/>
          <nvpair id="cib-bootstrap-options-default-action-timeout"
name="default-action-timeout"              value="30s"/>
          <nvpair id="cib-bootstrap-options-is-managed-default"
name="is-managed-default"                  value="true"/>
          <nvpair id="cib-bootstrap-options-pe-input-series-max"
name="pe-input-series-max"                 value="400"/>
        </attributes>
      </cluster_property_set>
    </crm_config>
    <nodes/>
    <resources>
      <group id="monolith_resources">
        <primitive id="external_VIP" class="ocf" provider="heartbeat"
type="IPaddr2">
          <operations>
            <op id="external_VIP_mon" name="monitor" interval="5s"
timeout="5s"/>
          </operations>
          <instance_attributes id="external_VIP_inst_attr">
            <attributes>
              <nvpair id="external_VIP_ip_assignment" name="ip"
value="131.212.4.153"/>
            </attributes>
          </instance_attributes>
        </primitive>
        <primitive id="internal_VIP" class="ocf" provider="heartbeat"
type="IPaddr2">
          <operations>
            <op id="internal_VIP_mon" name="monitor" interval="5s"
timeout="5s"/>
          </operations>
          <instance_attributes id="internal_VIP_inst_attr">
            <attributes>
              <nvpair id="internal_VIP_ip_assignment" name="ip"
value="192.168.115.33"/>
            </attributes>
          </instance_attributes>
        </primitive>
      </group>
      <clone id="pingd" globally_unique="false">
        <instance_attributes id="pingd_inst_attr">
          <attributes>
            <nvpair id="pingd-clone_max"      name="clone_max"
value="2"/>
            <nvpair id="pingd-clone_node_max" name="clone_node_max"
value="1"/>
            <nvpair id="pingd-dampen"         name="dampen"
value="5s"/>
            <nvpair id="pingd-multiplier"     name="multiplier"
value="100"/>
          </attributes>
        </instance_attributes>
        <primitive id="pingd-child" provider="heartbeat" class="ocf"
type="pingd">
          <operations>
            <op id="pingd-child-monitor" name="monitor" interval="20s"
timeout="40s" prereq="nothing"/>
            <op id="pingd-child-start" name="start" prereq="nothing"/>
          </operations>
        </primitive>
      </clone>
    </resources>
    <constraints>
      <rsc_location id="monolith_resources_location"
rsc="monolith_resources">
        <rule id="prefered_location_monolith_resources" score="50">
          <expression id="prefered_location_host_cody"
attribute="#uname" operation="eq" value="cody"/>
        </rule>
      </rsc_location>
      <rsc_location id="monolith_resources_connected"
rsc="monolith_resources">
        <rule id="monolith_resources_connected_rule"
score_attribute="pingd" >
          <expression id="connected_via_ping" attribute="pingd"
operation="defined"/>
        </rule>
      </rsc_location>
    </constraints>
  </configuration>
  <status/>
</cib>
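
For reference, the per-node pingd values that the "connected"
constraint consumes show up as transient attributes in the status
section. I have been inspecting them like this (the crm_attribute
flags are my best guess for this release, so double-check them):

# cibadmin -Q -o status | grep 'name="pingd"'
# crm_attribute -G -t status -U cody -n pingd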


In a previous thread on the list, Andrew Beekhof wrote:

>> if you can get the cluster into a state where:
>> - both nodes are online
>> - both nodes can see each other
>> - only one node can see the ping node
>> - the resource isnt running on the machine that can see the ping node

>> then run "cibadmin -Q" and attach the results.
>> that will be enough to tell me roughly where the problem lies.

So that is what I did. Below is the output of the "cibadmin -Q"
command. Note that I ran it on the node that cannot see the ping node
but still holds the resources (cody, whose transient pingd attribute
is 1, versus 2 on tim).

<cib admin_epoch="0" epoch="0" num_updates="32" generated="true"
have_quorum="true" ignore_dtd="false" num_peers="2"
cib-last-written="Tue Oct  2 16:14:04 2007" ccm_transition="2"
cib_feature_revision="1.3"
dc_uuid="e723a418-ba24-470e-9540-fbb568b9bcb4">
   <configuration>
     <crm_config>
       <cluster_property_set id="cib-bootstrap-options">
         <attributes>
           <nvpair id="cib-bootstrap-options-symmetric-cluster"
name="symmetric-cluster" value="true"/>
           <nvpair id="cib-bootstrap-options-no-quorum-policy"
name="no-quorum-policy" value="stop"/>
           <nvpair
id="cib-bootstrap-options-default-resource-stickiness"
name="default-resource-stickiness" value="0"/>
           <nvpair
id="cib-bootstrap-options-default-resource-failure-stickiness"
name="default-resource-failure-stickiness" value="0"/>
           <nvpair id="cib-bootstrap-options-stonith-enabled"
name="stonith-enabled" value="false"/>
           <nvpair id="cib-bootstrap-options-stonith-action"
name="stonith-action" value="reboot"/>
           <nvpair id="cib-bootstrap-options-startup-fencing"
name="startup-fencing" value="true"/>
           <nvpair id="cib-bootstrap-options-stop-orphan-resources"
name="stop-orphan-resources" value="true"/>
           <nvpair id="cib-bootstrap-options-stop-orphan-actions"
name="stop-orphan-actions" value="true"/>
           <nvpair id="cib-bootstrap-options-remove-after-stop"
name="remove-after-stop" value="false"/>
           <nvpair id="cib-bootstrap-options-short-resource-names"
name="short-resource-names" value="true"/>
           <nvpair id="cib-bootstrap-options-transition-idle-timeout"
name="transition-idle-timeout" value="5min"/>
           <nvpair id="cib-bootstrap-options-default-action-timeout"
name="default-action-timeout" value="5s"/>
           <nvpair id="cib-bootstrap-options-is-managed-default"
name="is-managed-default" value="true"/>
           <nvpair id="cib-bootstrap-options-pe-input-series-max"
name="pe-input-series-max" value="400"/>
         </attributes>
       </cluster_property_set>
     </crm_config>
     <nodes>
       <node id="e723a418-ba24-470e-9540-fbb568b9bcb4" uname="tim"
type="normal"/>
       <node id="ccca855c-2191-4aa8-8707-88237b72112c" uname="cody"
type="normal"/>
     </nodes>
     <resources>
       <group id="monolith_resources">
         <primitive id="external_VIP" class="ocf" provider="heartbeat"
type="IPaddr2">
           <operations>
             <op id="external_VIP_mon" name="monitor" interval="5s"
timeout="5s"/>
           </operations>
           <instance_attributes id="external_VIP_inst_attr">
             <attributes>
               <nvpair id="external_VIP_ip_assignment" name="ip"
value="131.212.4.153"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <primitive id="internal_VIP" class="ocf" provider="heartbeat"
type="IPaddr2">
           <operations>
             <op id="internal_VIP_mon" name="monitor" interval="5s"
timeout="5s"/>
           </operations>
           <instance_attributes id="internal_VIP_inst_attr">
             <attributes>
               <nvpair id="internal_VIP_ip_assignment" name="ip"
value="192.168.115.33"/>
             </attributes>
           </instance_attributes>
         </primitive>
       </group>
       <clone id="pingd" globally_unique="false">
         <instance_attributes id="pingd_inst_attr">
           <attributes>
             <nvpair id="pingd-clone_max" name="clone_max" value="2"/>
             <nvpair id="pingd-clone_node_max" name="clone_node_max"
value="1"/>
             <nvpair id="pingd-dampen" name="dampen" value="5s"/>
             <nvpair id="pingd-multiplier" name="multiplier"
value="100"/>
           </attributes>
         </instance_attributes>
         <primitive id="pingd-child" provider="heartbeat" class="ocf"
type="pingd">
           <operations>
             <op id="pingd-child-monitor" name="monitor" interval="20s"
timeout="40s" prereq="nothing"/>
             <op id="pingd-child-start" name="start" prereq="nothing"/>
           </operations>
         </primitive>
       </clone>
     </resources>
     <constraints>
       <rsc_location id="monolith_resources_location"
rsc="monolith_resources">
         <rule id="prefered_location_monolith_resources" score="50">
           <expression id="prefered_location_host_cody"
attribute="#uname" operation="eq" value="cody"/>
         </rule>
       </rsc_location>
       <rsc_location id="monolith_resources_connected"
rsc="monolith_resources">
         <rule id="monolith_resources_connected_rule"
score_attribute="pingd">
           <expression id="connected_via_ping" attribute="pingd"
operation="defined"/>
         </rule>
       </rsc_location>
     </constraints>
   </configuration>
   <status>
     <node_state id="e723a418-ba24-470e-9540-fbb568b9bcb4" uname="tim"
crmd="online" crm-debug-origin="do_update_resource" shutdown="0"
in_ccm="true" ha="active" join="member" expected="member">
       <lrm id="e723a418-ba24-470e-9540-fbb568b9bcb4">
         <lrm_resources>
           <lrm_resource id="pingd-child:0" type="pingd" class="ocf"
provider="heartbeat">
             <lrm_rsc_op id="pingd-child:0_monitor_0"
operation="monitor" crm-debug-origin="do_update_resource"
transition_key="5:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:7;5:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="4" crm_feature_set="1.0.9" rc_code="7" op_status="0"
interval="0" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
             <lrm_rsc_op id="pingd-child:0_start_0" operation="start"
crm-debug-origin="do_update_resource"
transition_key="13:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:0;13:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="6" crm_feature_set="1.0.9" rc_code="0" op_status="0"
interval="0" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
             <lrm_rsc_op id="pingd-child:0_monitor_20000"
operation="monitor" crm-debug-origin="do_update_resource"
transition_key="14:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:0;14:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="7" crm_feature_set="1.0.9" rc_code="0" op_status="0"
interval="20000" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
           </lrm_resource>
           <lrm_resource id="external_VIP" type="IPaddr2" class="ocf"
provider="heartbeat">
             <lrm_rsc_op id="external_VIP_monitor_0" operation="monitor"
crm-debug-origin="do_update_resource"
transition_key="3:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:7;3:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="2" crm_feature_set="1.0.9" rc_code="7" op_status="0"
interval="0" op_digest="295d3e33fcc839a733bd86ee666491f6"/>
           </lrm_resource>
           <lrm_resource id="internal_VIP" type="IPaddr2" class="ocf"
provider="heartbeat">
             <lrm_rsc_op id="internal_VIP_monitor_0" operation="monitor"
crm-debug-origin="do_update_resource"
transition_key="4:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:7;4:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="3" crm_feature_set="1.0.9" rc_code="7" op_status="0"
interval="0" op_digest="c309120cbcd5acf8308b6af00c3fd33c"/>
           </lrm_resource>
           <lrm_resource id="pingd-child:1" type="pingd" class="ocf"
provider="heartbeat">
             <lrm_rsc_op id="pingd-child:1_monitor_0"
operation="monitor" crm-debug-origin="do_update_resource"
transition_key="3:1:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:7;3:1:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="5" crm_feature_set="1.0.9" rc_code="7" op_status="0"
interval="0" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
           </lrm_resource>
         </lrm_resources>
       </lrm>
       <transient_attributes id="e723a418-ba24-470e-9540-fbb568b9bcb4">
         <instance_attributes
id="status-e723a418-ba24-470e-9540-fbb568b9bcb4">
           <attributes>
             <nvpair
id="status-e723a418-ba24-470e-9540-fbb568b9bcb4-probe_complete"
name="probe_complete" value="true"/>
             <nvpair
id="status-e723a418-ba24-470e-9540-fbb568b9bcb4-pingd" name="pingd"
value="2"/>
           </attributes>
         </instance_attributes>
       </transient_attributes>
     </node_state>
     <node_state id="ccca855c-2191-4aa8-8707-88237b72112c" uname="cody"
crmd="online" crm-debug-origin="do_update_resource" in_ccm="true"
ha="active" join="member" expected="member" shutdown="0">
       <lrm id="ccca855c-2191-4aa8-8707-88237b72112c">
         <lrm_resources>
           <lrm_resource id="external_VIP" type="IPaddr2" class="ocf"
provider="heartbeat">
             <lrm_rsc_op id="external_VIP_monitor_0" operation="monitor"
crm-debug-origin="do_update_resource"
transition_key="7:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:7;7:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="2" crm_feature_set="1.0.9" rc_code="7" op_status="0"
interval="0" op_digest="295d3e33fcc839a733bd86ee666491f6"/>
             <lrm_rsc_op id="external_VIP_start_0" operation="start"
crm-debug-origin="do_update_resource"
transition_key="5:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:0;5:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="7" crm_feature_set="1.0.9" rc_code="0" op_status="0"
interval="0" op_digest="295d3e33fcc839a733bd86ee666491f6"/>
             <lrm_rsc_op id="external_VIP_monitor_5000"
operation="monitor" crm-debug-origin="do_update_resource"
transition_key="6:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:0;6:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="9" crm_feature_set="1.0.9" rc_code="0" op_status="0"
interval="5000" op_digest="295d3e33fcc839a733bd86ee666491f6"/>
           </lrm_resource>
           <lrm_resource id="pingd-child:0" type="pingd" class="ocf"
provider="heartbeat">
             <lrm_rsc_op id="pingd-child:0_monitor_0"
operation="monitor" crm-debug-origin="do_update_resource"
transition_key="9:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:7;9:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="4" crm_feature_set="1.0.9" rc_code="7" op_status="0"
interval="0" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
           </lrm_resource>
           <lrm_resource id="internal_VIP" type="IPaddr2" class="ocf"
provider="heartbeat">
             <lrm_rsc_op id="internal_VIP_monitor_0" operation="monitor"
crm-debug-origin="do_update_resource"
transition_key="8:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:7;8:0:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="3" crm_feature_set="1.0.9" rc_code="7" op_status="0"
interval="0" op_digest="c309120cbcd5acf8308b6af00c3fd33c"/>
             <lrm_rsc_op id="internal_VIP_start_0" operation="start"
crm-debug-origin="do_update_resource"
transition_key="7:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:0;7:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="10" crm_feature_set="1.0.9" rc_code="0" op_status="0"
interval="0" op_digest="c309120cbcd5acf8308b6af00c3fd33c"/>
             <lrm_rsc_op id="internal_VIP_monitor_5000"
operation="monitor" crm-debug-origin="do_update_resource"
transition_key="8:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:0;8:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="12" crm_feature_set="1.0.9" rc_code="0" op_status="0"
interval="5000" op_digest="c309120cbcd5acf8308b6af00c3fd33c"/>
           </lrm_resource>
           <lrm_resource id="pingd-child:1" type="pingd" class="ocf"
provider="heartbeat">
             <lrm_rsc_op id="pingd-child:1_monitor_0"
operation="monitor" crm-debug-origin="do_update_resource"
transition_key="5:1:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:7;5:1:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="6" crm_feature_set="1.0.9" rc_code="7" op_status="0"
interval="0" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
             <lrm_rsc_op id="pingd-child:1_start_0" operation="start"
crm-debug-origin="do_update_resource"
transition_key="15:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:0;15:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="8" crm_feature_set="1.0.9" rc_code="0" op_status="0"
interval="0" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
             <lrm_rsc_op id="pingd-child:1_monitor_20000"
operation="monitor" crm-debug-origin="do_update_resource"
transition_key="16:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
transition_magic="0:0;16:2:5dfc6976-0cfa-491f-b6c7-80b0c9e3f212"
call_id="11" crm_feature_set="1.0.9" rc_code="0" op_status="0"
interval="20000" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
           </lrm_resource>
         </lrm_resources>
       </lrm>
       <transient_attributes id="ccca855c-2191-4aa8-8707-88237b72112c">
         <instance_attributes
id="status-ccca855c-2191-4aa8-8707-88237b72112c">
           <attributes>
             <nvpair
id="status-ccca855c-2191-4aa8-8707-88237b72112c-probe_complete"
name="probe_complete" value="true"/>
             <nvpair
id="status-ccca855c-2191-4aa8-8707-88237b72112c-pingd" name="pingd"
value="1"/>
           </attributes>
         </instance_attributes>
       </transient_attributes>
     </node_state>
   </status>
 </cib>


Here are some log file snippets too:

From cody (primary node)

heartbeat[29689]: 2007/10/03_10:49:18 WARN: node 192.168.115.38: is dead
crmd[29708]: 2007/10/03_10:49:18 notice: crmd_ha_status_callback: Status
update: Node 192.168.115.38 now has status [dead]
heartbeat[29689]: 2007/10/03_10:49:18 info: Link
192.168.115.38:192.168.115.38 dead.
pingd[29865]: 2007/10/03_10:49:18 notice: pingd_nstatus_callback: Status
update: Ping node 192.168.115.38 now has status [dead]
pingd[29865]: 2007/10/03_10:49:18 info: send_update: 1 active ping nodes
pingd[29865]: 2007/10/03_10:49:18 notice: pingd_lstatus_callback: Status
update: Ping node 192.168.115.38 now has status [dead]
pingd[29865]: 2007/10/03_10:49:18 notice: pingd_nstatus_callback: Status
update: Ping node 192.168.115.38 now has status [dead]
pingd[29865]: 2007/10/03_10:49:18 info: send_update: 1 active ping nodes
crmd[29708]: 2007/10/03_10:49:18 WARN: get_uuid: Could not calculate
UUID for 192.168.115.38
attrd[29707]: 2007/10/03_10:49:19 info: attrd_trigger_update: Sending
flush op to all hosts for: pingd
attrd[29707]: 2007/10/03_10:49:19 info: attrd_ha_callback: flush message
from cody
attrd[29707]: 2007/10/03_10:49:19 info: attrd_perform_update: Sent
update 6: pingd=1
cib[29704]: 2007/10/03_10:51:26 info: cib_stats: Processed 111
operations (4864.00us average, 0% utilization) in the last 10min


From tim (backup node)

attrd[27093]: 2007/10/03_10:49:19 info: attrd_ha_callback: flush message
from cody
attrd[27093]: 2007/10/03_10:49:19 info: attrd_perform_update: Sent
update 7: pingd=2
tengine[27097]: 2007/10/03_10:49:20 info: extract_event: Aborting on
transient_attributes changes for ccca855c-2191-4aa8-8707-88237b72112c
tengine[27097]: 2007/10/03_10:49:20 info: update_abort_priority: Abort
priority upgraded to 1000000
tengine[27097]: 2007/10/03_10:49:20 info: te_update_diff: Aborting on
transient_attributes deletions
crmd[27094]: 2007/10/03_10:49:20 info: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_IPC_MESSAGE origin=route_message ]
crmd[27094]: 2007/10/03_10:49:20 info: do_state_transition: All 2
cluster nodes are eligible to run resources.
pengine[27098]: 2007/10/03_10:49:20 notice: cluster_option: Using
default value '60s' for cluster option 'cluster-delay'
pengine[27098]: 2007/10/03_10:49:20 notice: cluster_option: Using
default value '-1' for cluster option 'pe-error-series-max'
pengine[27098]: 2007/10/03_10:49:20 notice: cluster_option: Using
default value '-1' for cluster option 'pe-warn-series-max'
pengine[27098]: 2007/10/03_10:49:20 info: determine_online_status: Node
tim is online
pengine[27098]: 2007/10/03_10:49:20 info: determine_online_status: Node
cody is online
pengine[27098]: 2007/10/03_10:49:20 info: unpack_find_resource:
Internally renamed pingd-child:0 on cody to pingd-child:1
pengine[27098]: 2007/10/03_10:49:20 info: group_print: Resource Group:
monolith_resources
pengine[27098]: 2007/10/03_10:49:20 info: native_print:     external_VIP
(heartbeat::ocf:IPaddr2):       Started cody
pengine[27098]: 2007/10/03_10:49:20 info: native_print:     internal_VIP
(heartbeat::ocf:IPaddr2):       Started cody
pengine[27098]: 2007/10/03_10:49:20 info: clone_print: Clone Set: pingd
pengine[27098]: 2007/10/03_10:49:20 info: native_print:
pingd-child:0       (heartbeat::ocf:pingd): Started tim
pengine[27098]: 2007/10/03_10:49:20 info: native_print:
pingd-child:1       (heartbeat::ocf:pingd): Started cody
pengine[27098]: 2007/10/03_10:49:20 notice: NoRoleChange: Leave resource
external_VIP   (cody)
pengine[27098]: 2007/10/03_10:49:20 notice: NoRoleChange: Leave resource
internal_VIP   (cody)
pengine[27098]: 2007/10/03_10:49:20 notice: NoRoleChange: Leave resource
pingd-child:0  (tim)
pengine[27098]: 2007/10/03_10:49:20 notice: NoRoleChange: Leave resource
pingd-child:1  (cody)
crmd[27094]: 2007/10/03_10:49:20 info: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=route_message ]
tengine[27097]: 2007/10/03_10:49:20 info: unpack_graph: Unpacked
transition 5: 0 actions in 0 synapses
crmd[27094]: 2007/10/03_10:49:20 info: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_IPC_MESSAGE origin=route_message ]
tengine[27097]: 2007/10/03_10:49:20 info: run_graph: Transition 5:
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0)
tengine[27097]: 2007/10/03_10:49:20 info: notify_crmd: Transition 5
status: te_complete - <null>
pengine[27098]: 2007/10/03_10:49:20 info: process_pe_message: Transition
5: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-127.bz2
cib[27090]: 2007/10/03_10:51:29 info: cib_stats: Processed 60 operations
(11000.00us average, 0% utilization) in the last 10min

The "NoRoleChange" sticks out as being a possible problem and I am
guessing that the cause of that has to be with the scores for the
resources in the constraints section of the cib.xml. However, I am
unable to determine what exactly is the issue.
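
Working the scores out by hand (using the pingd values from the status
section above, and assuming score_attribute simply uses each node's
pingd attribute as that node's score for the rule) seems to explain
the NoRoleChange:

cody: 50 (prefered_location rule) + 1 (pingd) = 51
tim:   0                          + 2 (pingd) =  2

51 beats 2, so the group stays on cody. If the multiplier of 100 were
actually reflected in the attribute, it would be 50 + 100 = 150 for
cody versus 200 for tim, and the group would move. The status section
shows raw counts (1 and 2) rather than 100 and 200, so I suspect the
multiplier I set on the clone is not reaching the pingd children. If
that reading is plausible, one thing I could try (an untested sketch,
based on the pingd resource agent's documented dampen/multiplier
parameters) is moving those attributes into the primitive itself:

<primitive id="pingd-child" provider="heartbeat" class="ocf"
type="pingd">
  <instance_attributes id="pingd-child_inst_attr">
    <attributes>
      <nvpair id="pingd-child-dampen"     name="dampen"     value="5s"/>
      <nvpair id="pingd-child-multiplier" name="multiplier" value="100"/>
    </attributes>
  </instance_attributes>
  <operations>
    <op id="pingd-child-monitor" name="monitor" interval="20s"
timeout="40s" prereq="nothing"/>
    <op id="pingd-child-start" name="start" prereq="nothing"/>
  </operations>
</primitive>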

I guess that is about it. Help is much appreciated, thank you.

-- 
Matt Zagrabelny - mzagrabe at d.umn.edu - (218) 726 8844
University of Minnesota Duluth
Information Technology Systems & Services
PGP key 1024D/84E22DA2 2005-11-07
Fingerprint: 78F9 18B3 EF58 56F5 FC85  C5CA 53E7 887F 84E2 2DA2

He is not a fool who gives up what he cannot keep to gain what he cannot
lose.
-Jim Elliot