[Linux-HA] heartbeat 2.0.8: pingd failover
greno at verizon.net
Sat Feb 3 23:44:08 MST 2007
>From: Andreas Kurz <andreas.kurz at gmail.com>
>Date: 2007/02/03 Sat PM 04:53:10 CST
>To: General Linux-HA mailing list <linux-ha at lists.linux-ha.org>
>Subject: Re: [Linux-HA] heartbeat 2.0.8: pingd failover
>Try to add a monitor operation to the ip_resource to check it regularly.
>
>Is the "pingd_score" completely removed from the cib, in the case of
>an unplugged interface, or is the value of the disconnected node zero?
>... This would explain why the constraint is not working correctly ...
>then a constraint depending on the value of pingd_score would be more
>helpful.
>
>Regards,
>Andreas
>
Ok, I changed things around to try this with a resource group since that is what I really need to end up with.
Same ha.cf and modified cib.xml:
==================================
<cib admin_epoch="0" have_quorum="true" ignore_dtd="false" num_peers="2" cib_feature_revision="1.3" generated="true" epoch="17" num_updates="213" cib-last-written="Sun Feb 4 01:00:05 2007" ccm_transition="4" dc_uuid="29626f17-db1f-4139-aa33-5a6b4110da51">
  <configuration>
    <crm_config/>
    <nodes>
      <node id="29626f17-db1f-4139-aa33-5a6b4110da51" uname="server2" type="normal"/>
      <node id="67b0bfa7-0165-4a8c-9c0f-ec82e0ae2c91" uname="server1" type="normal"/>
    </nodes>
    <resources>
      <group id="GRP_webserver_RG">
        <primitive id="GRP_webserverip_R" class="ocf" type="IPaddr" provider="heartbeat">
          <instance_attributes id="GRP_webserver_RA">
            <attributes>
              <nvpair id="GRP_webserverip_RA_ip" name="ip" value="192.168.1.215"/>
            </attributes>
          </instance_attributes>
        </primitive>
      </group>
    </resources>
    <constraints>
      <rsc_location id="run_GRP_webserver_RG" rsc="GRP_webserver_RG">
        <rule id="pref_run_GRP_webserver_RG" score="100">
          <expression id="pref_run_GRP_webserver_RG_expr1" attribute="#uname" operation="eq" value="server1"/>
        </rule>
      </rsc_location>
      <rsc_location id="GRP_webserver_RG:not_connected" rsc="GRP_webserver_RG">
        <rule id="GRP_webserver_RG:not_connected:rule" score="-INFINITY">
          <expression id="GRP_webserver_RG:not_connected:expr" attribute="pingd_score" operation="not_defined"/>
        </rule>
      </rsc_location>
    </constraints>
  </configuration>
</cib>
==================================
What I get now is that the resource fails over from server1 to server2. Then, when connectivity is restored to server1, the log on server2 shows "Link server1:eth0 up." and "Started server2", but the resource actually started on server1, which I confirmed by logging into the webserver virtual IP. So I got the failover, but with auto_failback set to no I expected the IP to remain on server2. The log told me that it started on server2, but it is actually on server1.
So now, with the IP on server1, I pulled the network cable. The log on server2 shows that server1:eth0 is dead and says Starting server1. And if I try to log in to the IP from server1 itself (which is disconnected at this point), it is on server1. This makes no sense.
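For reference, here is a rough sketch of how the two suggestions quoted above (a monitor operation on the IP resource, and a constraint that looks at the value of pingd_score rather than only whether it is defined) might be written in the CIB. This is untested. The op id, the interval and timeout values, and the rule and expression ids are made up for illustration, and it assumes pingd writes a numeric value into pingd_score that drops to 0 when no ping node is reachable.
==================================
<!-- Sketch only: ids, interval and timeout are assumed values, not taken from a working setup -->
<primitive id="GRP_webserverip_R" class="ocf" type="IPaddr" provider="heartbeat">
  <operations>
    <!-- Check the IP resource regularly, as suggested -->
    <op id="GRP_webserverip_R_monitor" name="monitor" interval="10s" timeout="20s"/>
  </operations>
  <instance_attributes id="GRP_webserver_RA">
    <attributes>
      <nvpair id="GRP_webserverip_RA_ip" name="ip" value="192.168.1.215"/>
    </attributes>
  </instance_attributes>
</primitive>

<!-- Keep the group off any node where pingd_score is undefined OR not above zero -->
<rsc_location id="GRP_webserver_RG:connected" rsc="GRP_webserver_RG">
  <rule id="GRP_webserver_RG:connected:rule" score="-INFINITY" boolean_op="or">
    <expression id="GRP_webserver_RG:connected:expr:undefined" attribute="pingd_score" operation="not_defined"/>
    <expression id="GRP_webserver_RG:connected:expr:zero" attribute="pingd_score" operation="lte" value="0"/>
  </rule>
</rsc_location>
==================================
The second rsc_location would take the place of the GRP_webserver_RG:not_connected constraint in the cib.xml above, so that a node whose pingd_score has fallen to zero is excluded as well as one where the attribute is missing entirely.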