[Linux-HA] problem with locations depending on pingd
Andrew Beekhof
beekhof at gmail.com
Fri Nov 9 09:43:14 MST 2007
On Nov 9, 2007, at 1:44 PM, Sebastian Reitenbach wrote:
> Hi,
>
> I changed the resources to look like this:
>
> <rsc_location id="NFS_SW_PLACE" rsc="NFS_SW">
> <rule id="prefered_NFS_SW_PING" score="-INFINITY" boolean_op="or">
> <expression id="NFS_SW_PLACE_defined" attribute="pingd"
> operation="not_defined"/>
> <expression id="NFS_SW_PLACE_lte" attribute="pingd"
> operation="lte"
> value="0"/>
> </rule>
> <rule id="prefered_NFS_SW_HOST" score="100" boolean_op="or">
> <expression id="NFS_SW_HOST_uname" attribute="#uname"
> operation="eq"
> value="ppsnfs101"/>
> </rule>
> </rsc_location>
>
>
> It seems to work well on startup, but I still have the same problem: the
> attribute that pingd sets is not reset to 0 when pingd stops receiving
> ping answers from the ping node.
Looks like heartbeat didn't notice that the ping node went away.
Until it does, the score won't change.
Are you sure you made the right change?
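
To illustrate why a stale attribute leaves the placement unchanged, here is a
rough Python sketch of how the two rules quoted above combine into a per-node
score. It is a simplified model, not the actual policy engine: the function
name nfs_sw_score and the dict of node attributes are made up for
illustration, and -INFINITY is modelled as a float.

```python
import math

NEG_INFINITY = -math.inf  # stand-in for the CIB score "-INFINITY"


def nfs_sw_score(node_attrs, uname):
    """Simplified model of the NFS_SW_PLACE constraint (hypothetical helper)."""
    score = 0
    pingd = node_attrs.get("pingd")
    # Rule prefered_NFS_SW_PING (boolean_op="or"):
    # pingd not defined OR pingd <= 0 gives -INFINITY.
    if pingd is None or int(pingd) <= 0:
        score += NEG_INFINITY
    # Rule prefered_NFS_SW_HOST: prefer ppsnfs101 with a score of 100.
    if uname == "ppsnfs101":
        score += 100
    return score


# A node whose pingd attribute is still 100 keeps its old score,
# even if it can no longer reach the ping node.
print(nfs_sw_score({"pingd": "100"}, "ppsnfs101"))
print(nfs_sw_score({"pingd": "0"}, "ppsnfs101"))
```

As long as pingd stays at 100 on the disconnected node, the first rule never
matches there, so nothing tells the resource to move.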
>
> I created a bugzilla entry, with a hb_report attached:
> http://old.linux-foundation.org/developer_bugzilla/show_bug.cgi?id=1770
>
> kind regards
> Sebastian
>
>
> Sebastian Reitenbach <sebastia at l00-bugdead-prods.de>, General Linux-HA
> mailing list <linux-ha at lists.linux-ha.org> wrote:
>> Hi Dejan,
>>
>> thank you very much for your helpful hints, I got it mostly working. I
>> initially generated the constraints via the GUI and did not notice the
>> subtle differences. I changed them manually to look like what you
>> suggested in your first example. I have to admit, I have not yet tried
>> the -INFINITY example you gave, where the resources will refuse to run
>> on a node without connectivity, because I think it would not work,
>> given my observations:
>>
>> In the beginning, after cluster startup, node
>> 262387d6-3ba0-4001-95c6-f394d1ba243f is not able to ping, while node
>> 15854123-86ef-46bb-bf95-79c99fb62f46 is able to ping the defined ping node.
>> cibadmin -Q -o status | grep ping
>> <lrm_resource id="PING:0" type="pingd" class="ocf" provider="heartbeat">
>> <lrm_resource id="PING:1" type="pingd" class="ocf" provider="heartbeat">
>> <nvpair id="status-262387d6-3ba0-4001-95c6-f394d1ba243f-pingd" name="pingd" value="0"/>
>> <lrm_resource id="PING:0" type="pingd" class="ocf" provider="heartbeat">
>> <lrm_resource id="PING:1" type="pingd" class="ocf" provider="heartbeat">
>> <nvpair id="status-15854123-86ef-46bb-bf95-79c99fb62f46-pingd" name="pingd" value="100"/>
>>
>> Then all four resources are on host 15854123-86ef-46bb-bf95-79c99fb62f46,
>> so everything is as I expected.
>> Then I changed the firewall to stop answering pings from
>> 15854123-86ef-46bb-bf95-79c99fb62f46 and instead answer pings from
>> 262387d6-3ba0-4001-95c6-f394d1ba243f. It took a few seconds, and the
>> output changed to:
>>
>> cibadmin -Q -o status | grep ping
>> <nvpair id="status-262387d6-3ba0-4001-95c6-f394d1ba243f-pingd" name="pingd" value="100"/>
>> <lrm_resource id="PING:0" type="pingd" class="ocf" provider="heartbeat">
>> <lrm_resource id="PING:1" type="pingd" class="ocf" provider="heartbeat">
>> <nvpair id="status-15854123-86ef-46bb-bf95-79c99fb62f46-pingd" name="pingd" value="100"/>
>> <lrm_resource id="PING:0" type="pingd" class="ocf" provider="heartbeat">
>> <lrm_resource id="PING:1" type="pingd" class="ocf" provider="heartbeat">
>>
>> and two of the resources moved over to node
>> 262387d6-3ba0-4001-95c6-f394d1ba243f.
>>
>> But even after several more minutes, the output of
>> cibadmin -Q -o status | grep ping
>> did not change again. I had expected it to look like this:
>> <nvpair id="status-262387d6-3ba0-4001-95c6-f394d1ba243f-pingd" name="pingd" value="100"/>
>> <lrm_resource id="PING:0" type="pingd" class="ocf" provider="heartbeat">
>> <lrm_resource id="PING:1" type="pingd" class="ocf" provider="heartbeat">
>> <nvpair id="status-15854123-86ef-46bb-bf95-79c99fb62f46-pingd" name="pingd" value="0"/>
>> <lrm_resource id="PING:0" type="pingd" class="ocf" provider="heartbeat">
>> <lrm_resource id="PING:1" type="pingd" class="ocf" provider="heartbeat">
>> and that the two resources on 15854123-86ef-46bb-bf95-79c99fb62f46 would
>> migrate to node 262387d6-3ba0-4001-95c6-f394d1ba243f.
>>
>> My assumption is that the -INFINITY example would only work if the value
>> for the id status-15854123-86ef-46bb-bf95-79c99fb62f46-pingd were reset
>> to 0 at some point, but it is not. Therefore I have not tried it.
>>
>>
>> Below are my constraints, the ping clone resource, and an example Xen
>> resource.
>>
>> <constraints>
>> <rsc_order id="FIS_DB_before_MGMT_DB" from="FIS_DB" type="before"
>> to="MGMT_DB" action="start" symmetrical="false" score="0"/>
>> <rsc_order id="FIS_DB_before_NFS_MH" from="FIS_DB" type="before"
>> to="NFS_MH" action="start" symmetrical="false" score="0"/>
>> <rsc_order id="FIS_DB_before_NFS_SW" from="FIS_DB" type="before"
>> to="NFS_SW" action="start" symmetrical="false" score="0"/>
>> <rsc_order id="MGMT_DB_before_NFS_SW" from="MGMT_DB" type="before"
>> to="NFS_SW" action="start" symmetrical="false" score="0"/>
>> <rsc_order id="MGMT_DB_before_NFS_MH" from="MGMT_DB" type="before"
>> to="NFS_MH" action="start" symmetrical="false" score="0"/>
>> <rsc_order id="NFS_MH_before_NFS_SW" from="NFS_MH" type="before"
>> to="NFS_SW" action="start" symmetrical="false" score="0"/>
>> <rsc_location id="FIS_DB_PLACE" rsc="FIS_DB">
>> <rule id="prefered_FIS_DB_PLACE" score_attribute="pingd">
>> <expression attribute="pingd"
>> id="e248586f-284b-4d6e-86a1-86ac54cecb3d" operation="defined"/>
>> </rule>
>> </rsc_location>
>> <rsc_location id="NFS_SW_PLACE" rsc="NFS_SW">
>> <rule id="prefered_NFS_SW_PLACE" score_attribute="pingd">
>> <expression attribute="pingd"
>> id="ccd4c85c-7b30-48c5-806e-d37a42e3db5b" operation="defined"/>
>> </rule>
>> </rsc_location>
>> <rsc_location id="MGMT_DB_PLACE" rsc="MGMT_DB">
>> <rule id="prefered_MGMT_DB_PLACE" score_attribute="pingd">
>> <expression attribute="pingd"
>> id="ff209e83-ac2e-4dad-901b-f6496c652f3b" operation="defined"/>
>> </rule>
>> </rsc_location>
>> <rsc_location id="NFS_MH_PLACE" rsc="NFS_MH">
>> <rule id="prefered_NFS_MH_PLACE" score_attribute="pingd">
>> <expression attribute="pingd"
>> id="4349f298-2f36-4bfa-9318-ed9863ab32bb" operation="defined"/>
>> </rule>
>> </rsc_location>
>> </constraints>
>>
>>
>>
>> <clone id="PING_CLONE" globally_unique="false">
>> <meta_attributes id="PING_CLONE_meta_attrs">
>> <attributes>
>> <nvpair id="PING_CLONE_metaattr_target_role"
>> name="target_role"
>> value="started"/>
>> <nvpair id="PING_CLONE_metaattr_clone_max" name="clone_max"
>> value="2"/>
>> <nvpair id="PING_CLONE_metaattr_clone_node_max"
>> name="clone_node_max" value="1"/>
>> <nvpair id="PING_CLONE_metaattr_globally_unique"
>> name="globally_unique" value="false"/>
>> </attributes>
>> </meta_attributes>
>> <primitive id="PING" class="ocf" type="pingd"
>> provider="heartbeat">
>> <instance_attributes id="PING_instance_attrs">
>> <attributes>
>> <nvpair id="8381fc80-bdfa-4cf2-9832-be8ff5c7375f"
>> name="pidfile"
>> value="/tmp/PING.pid"/>
>> <nvpair id="142b69d4-2145-4095-afb2-4859a0bb2cee"
>> name="user"
>> value="root"/>
>> <nvpair id="d313ca32-d470-43d2-a234-7c240246d9c9"
>> name="host_list" value="192.168.102.199"/>
>> <nvpair id="57f26ccf-b90d-44b8-a8f2-9e5ab91f2bc3"
>> name="name"
>> value="pingd"/>
>> <nvpair id="159a60eb-e838-4a74-9186-b58e9bf3b3f9"
>> name="dampen"
>> value="5s"/>
>> <nvpair id="aad4e482-0560-4781-bdaf-e24b69bdb7c8"
>> name="multiplier" value="100"/>
>> </attributes>
>> </instance_attributes>
>> </primitive>
>> </clone>
>>
>>
>>
>> <primitive class="ocf" type="Xen" provider="heartbeat" id="NFS_MH">
>> <instance_attributes id="NFS_MH_instance_attrs">
>> <attributes>
>> <nvpair id="0b43c873-b1fe-4a5e-a542-33dff1de9eff"
>> name="xmfile"
>> value="/etc/xen/vm/NFS_MH"/>
>> <nvpair id="ef00900d-1413-47db-9ad2-e7df01db49f4"
>> name="reserved_Dom0_memory" value="512"/>
>> <nvpair id="3fcabbb2-020e-44f7-a204-2e3e8eab32ae"
>> name="allow_mem_management" value="1"/>
>> <nvpair id="df2f7960-75fa-4321-b694-86db72386cbe"
>> name="monitor_scripts" value="/root/bin/check_nfsmh.sh"/>
>> </attributes>
>> </instance_attributes>
>> <meta_attributes id="NFS_MH_meta_attrs">
>> <attributes>
>> <nvpair name="target_role" id="NFS_MH_metaattr_target_role"
>> value="started"/>
>> </attributes>
>> </meta_attributes>
>> <operations>
>> <op id="d0ad49d1-39a7-4881-95fe-d95413011c8b" name="monitor"
>> description="Monitor MH" interval="10" timeout="30" start_delay="60"
>> disabled="false" role="Started" prereq="nothing" on_fail="restart"/>
>> </operations>
>> </primitive>
>>
>>
>>
>> kind regards
>> Sebastian
>>
>> Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
>>> Hi,
>>>
>>> On Wed, Nov 07, 2007 at 06:31:54PM +0100, Sebastian Reitenbach wrote:
>>>> Hi,
>>>>
>>>> I tried to follow http://www.linux-ha.org/pingd, the section
>>>> "Quickstart - Only Run my_resource on Nodes with Access to at Least One
>>>> Ping Node"
>>>>
>>>> therefore I have created the following pingd resources:
>>>>
>>>> <clone id="PING_CLONE">
>>>
>>> <clone id="PING_CLONE" globally_unique="false">
>>>
>>> because all the clones will be equal.
>>>
>>>> <meta_attributes id="PING_CLONE_meta_attrs">
>>>> <attributes>
>>>> <nvpair id="PING_CLONE_metaattr_target_role" name="target_role"
>>>> value="started"/>
>>>> <nvpair id="PING_CLONE_metaattr_clone_max" name="clone_max"
>>>> value="2"/>
>>>> <nvpair id="PING_CLONE_metaattr_clone_node_max"
>>>> name="clone_node_max" value="1"/>
>>>> </attributes>
>>>> </meta_attributes>
>>>> <primitive id="PING" class="ocf" type="pingd" provider="heartbeat">
>>>> <instance_attributes id="PING_instance_attrs">
>>>> <attributes>
>>>> <nvpair id="8381fc80-bdfa-4cf2-9832-be8ff5c7375f" name="pidfile"
>>>> value="/tmp/PING.pid"/>
>>>> <nvpair id="142b69d4-2145-4095-afb2-4859a0bb2cee" name="user"
>>>> value="root"/>
>>>> <nvpair id="d313ca32-d470-43d2-a234-7c240246d9c9"
>>>> name="host_list" value="192.168.102.199"/>
>>>> <nvpair id="57f26ccf-b90d-44b8-a8f2-9e5ab91f2bc3" name="name"
>>>> value="pingd"/>
>>>
>>> add these two
>>> <nvpair id="..." name="dampen" value="5s"/>
>>> <nvpair id="..." name="multiplier" value="100"/>
>>>
>>>> </attributes>
>>>> </instance_attributes>
>>>> </primitive>
>>>> </clone>
>>>>
>>>>
>>>> and here is my location constraint (entered via hb_gui, which is why
>>>> there is a value there):
>>>>
>>>> <rsc_location id="NFS_MH_PLACE" rsc="NFS_MH">
>>>> <rule id="prefered_NFS_MH_PLACE" score="100">
>>>> <expression attribute="pingd"
>>>> id="4349f298-2f36-4bfa-9318-ed9863ab32bb" operation="defined"
>>>> value="af"/>
>>>> </rule>
>>>
>>> Looks somewhat strange. There are quite a few better examples on
>>> the page you quoted:
>>>
>>> <rsc_location id="my_resource:connected" rsc="my_resource">
>>> <rule id="my_resource:connected:rule" score_attribute="pingd">
>>> <expression id="my_resource:connected:expr:defined"
>>> attribute="pingd" operation="defined"/>
>>> </rule>
>>> </rsc_location>
>>>
>>> or, perhaps better:
>>>
>>> <rsc_location id="my_resource:connected" rsc="my_resource">
>>> <rule id="my_resource:connected:rule" score="-INFINITY" boolean_op="or">
>>> <expression id="my_resource:connected:expr:undefined" attribute="pingd"
>>> operation="not_defined"/>
>>> <expression id="my_resource:connected:expr:zero" attribute="pingd"
>>> operation="lte" value="0"/>
>>> </rule>
>>> </rsc_location>
>>>
>>> The latter will give a score of -INFINITY to all nodes that
>>> don't have the attribute or whose value is zero, thus preventing the
>>> resource from running there.
>>>
>>>> The 192.168.102.199 is just an OpenBSD host, pingable from both cluster
>>>> nodes. The NFS_MH resource is a Xen domU.
>>>> On startup of the two cluster nodes, the NFS_MH resource went to node1.
>>>> Then I reconfigured the firewall of the ping node to only answer
>>>> pings from node2.
>>>> In the cluster itself, nothing happened, but I expected the resource to
>>>> relocate to the node with connectivity. I must still be doing something
>>>> wrong, I think. Any hints?
>>>
>>> If you want to check if pingd really works, just do cibadmin -Q
>>> and check the status section of nodes for pingd attribute.
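
A small Python sketch of that check, assuming the status section is
well-formed XML and that the attribute ids follow the
status-<node uuid>-pingd pattern seen elsewhere in this thread. The function
name pingd_values and the trimmed SAMPLE document are hypothetical; a real
status section contains much more than the two nvpair elements shown.

```python
import xml.etree.ElementTree as ET


def pingd_values(status_xml):
    """Map node uuid -> pingd value found in a CIB status section."""
    root = ET.fromstring(status_xml)
    values = {}
    for nv in root.iter("nvpair"):
        if nv.get("name") != "pingd":
            continue
        node_id = nv.get("id", "")
        # Strip the "status-" prefix and "-pingd" suffix to get the node uuid.
        if node_id.startswith("status-") and node_id.endswith("-pingd"):
            node_id = node_id[len("status-"):-len("-pingd")]
        values[node_id] = int(nv.get("value", "0"))
    return values


# Trimmed, made-up sample; real input would be the cibadmin output.
SAMPLE = """<status>
  <nvpair id="status-262387d6-3ba0-4001-95c6-f394d1ba243f-pingd" name="pingd" value="100"/>
  <nvpair id="status-15854123-86ef-46bb-bf95-79c99fb62f46-pingd" name="pingd" value="0"/>
</status>"""

for node, value in pingd_values(SAMPLE).items():
    print(node, value)
```

On a cluster node the input could instead be captured from
cibadmin -Q -o status, for example via subprocess.run, and passed to the
same function.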
>>>
>>> Thanks,
>>>
>>> Dejan
>>>
>>>>
>>>> kind regards
>>>> Sebastian
>>>>
>>>> _______________________________________________
>>>> Linux-HA mailing list
>>>> Linux-HA at lists.linux-ha.org
>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>> See also: http://linux-ha.org/ReportingProblems
>>>