[Linux-HA] Two nodes cluster failover configuration
Andrew Beekhof
beekhof at gmail.com
Fri Nov 9 10:16:48 MST 2007
On Nov 6, 2007, at 6:38 PM, Igor Neves wrote:
> On Tue, 6 Nov 2007 18:14:03 +0100
> Andrew Beekhof <beekhof at gmail.com> wrote:
>
>>
>> On Nov 6, 2007, at 5:43 PM, Igor Neves wrote:
>>
>>> On Tue, 6 Nov 2007 16:23:34 +0100
>>> Andrew Beekhof <beekhof at gmail.com> wrote:
>>>
>>>>
>>>> On Nov 6, 2007, at 12:44 PM, Igor Neves wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm brand new to heartbeat, and I'm still learning all the
>>>>> configuration options.
>>>>>
>>>>> I'm using this rule:
>>>>>
>>>>> <rsc_location id="my_resource:connected" rsc="my_resource">
>>>>> <rule id="my_resource:connected:rule" score_attribute="pingd" >
>>>>> <expression id="my_resource:connected:expr:defined"
>>>>> attribute="pingd" operation="defined"/>
>>>>> </rule>
>>>>> </rsc_location>
>>>>>
>>>>> This works fine for the example it was written for.
>>>>>
>>>>> I need something different, because both of my machines (2 nodes)
>>>>> have the same weight, so either can do the job, since this is only
>>>>> a failover cluster.
>>>>> Knowing this, I set resource stickiness to INFINITY, to stop the
>>>>> resource group I have created from migrating back to the main
>>>>> node.
>>>>>
>>>>> For example, I start both nodes and my resource is on node1. I take
>>>>> away network connectivity from node1, and node2 takes over all the
>>>>> resources. That works fine, but when node1 is up again, the resource
>>>>> should not migrate back to node1; it should stay on node2.
>>>>>
>>>>> My problem is that if I have resource_stickiness set to
>>>>> 'INFINITY' and I use this rsc_location rule, which reads the score
>>>>> from the pingd daemon, then when network connectivity goes down on
>>>>> node1, where the resources were running, they do not migrate to
>>>>> node2.
>>>>
>>>> You need to add this rule.
>>>>
>>>>> <rsc_location id="my_resource:connected" rsc="my_resource">
>>>>> <rule id="my_resource:connected:rule" score="-INFINITY" >
>>>>> <expression id="my_resource:connected:expr:defined"
>>>>> attribute="pingd" operation="not_defined"/>
>>>>> </rule>
>>>>> </rsc_location>
>>>>
>>>>
>>>> which excludes any node from running my_resource if it doesn't have
>>>> a value for "pingd"
>>>
>>> Yes, it works like that, but it's not what I want. I want
>>> my_resource to run only on the host with the higher pingd value.
>>
>> This contradicts
>
> With the rule:
>
> <rsc_location id="my_resource:connected" rsc="my_resource">
> <rule id="my_resource:connected:rule" score="-INFINITY" >
> <expression id="my_resource:connected:expr:defined"
> attribute="pingd" operation="not_defined"/>
> </rule>
> </rsc_location>
>
> This means that if node1 is primary (score 200), node2 is secondary
> (score 200), and the network link between node1 and the switch fails,
> node1 will have score 100 and node2 will have score 200, but
> my_resource will still be on node1.
>
> I need some way to say that my_resource should always run where pingd
> has the higher score.
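You can't have this AND set stickiness > 100: in your example losing one
ping node only costs node1 100 points, so a larger stickiness outweighs it.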
You need to decide which behavior you want.
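
As a rough illustration of why the two conflict, here is a small Python
sketch. It assumes a simplified model, not the real policy engine: a node's
score is its pingd value, the node currently running the resource gets the
stickiness added on top, and ties leave the resource where it is. The
function name place() and the dictionary layout are made up for this example.

```python
# Simplified placement model (an assumption for illustration only).
# Other constraints, group membership and failure counts are ignored.

INFINITY = float("inf")


def place(current, pingd, stickiness):
    """Return the node that should run the resource.

    current    -- name of the node running the resource now
    pingd      -- dict mapping node name to its pingd attribute value
    stickiness -- bonus added to the score of the current node
    """
    scores = {
        node: value + (stickiness if node == current else 0)
        for node, value in pingd.items()
    }
    best = max(scores.values())
    # Stay put when the current node is at least as good as any other.
    if scores[current] == best:
        return current
    return max(scores, key=scores.get)


if __name__ == "__main__":
    degraded = {"node1": 100, "node2": 200}   # node1 lost the switch
    healthy = {"node1": 200, "node2": 200}    # node1 is back

    # With INFINITY stickiness the resource never leaves node1.
    print(place("node1", degraded, INFINITY))

    # With a stickiness below 100 (50 is an arbitrary pick) it moves
    # to node2, and stays there once node1 recovers.
    print(place("node1", degraded, 50))
    print(place("node2", healthy, 50))
```

Under these assumptions, a stickiness smaller than the value of one ping
node lets the resource move when node1 loses the switch and keeps it on
node2 after node1 recovers, while INFINITY pins it to node1. Treat the
numbers as illustrative only.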
>
>
>>
>>>>> That works fine, but when node1 is up again, the resource
>>>>> should not migrate back to node1; it should stay on node2.
>>
>>
>> You need to decide which behavior you want.
>>
>>> The purpose of this is for the cluster to survive network failures.
>>>
>>> My idea is to set the ping hosts to:
>>> * node1: 192.168.1.53 (switch), 10.0.0.52 (node2 heartbeat NIC)
>>> * node2: 192.168.1.53 (switch), 10.0.0.51 (node1 heartbeat NIC)
>>>
>>> With this setup, when the switch fails (burns out or something :) ) the
>>> cluster will know the switch has failed, but will not stop the
>>> resource, because as long as we have the heartbeat link pingd will
>>> always have some score, and if the switch fails, the resource should
>>> not be stopped or moved.
>>>
>>> Eg1 (normal behaviour, machine failure):
>>> a) node1 primary -> node1 pingd=200, node2 pingd=200
>>> b) node1 fail -> node1 pingd=0, node2 pingd=200
>>> c) node2 primary -> node1 pingd=0, node2 pingd=200
>>> d) node1 online -> node1 pingd=200, node2 pingd=200
>>>
>>> Eg2 (switch failure):
>>> a) node1 primary -> node1 pingd=200, node2 pingd=200
>>> b) switch fail -> node1 pingd=100, node2 pingd=100
>>> c) my_resource -> remains on node1 because of the stickiness
>>>
>>> Eg3 (network fail on node1):
>>> a) node1 primary -> node1 pingd=200, node2 pingd=200
>>> b) node1 network fail -> node1 pingd=100, node2 pingd=200
>>> c) node2 primary -> node1 pingd=100, node2 pingd=200
>>> d) node1 network up -> node1 pingd=200, node2 pingd=200
>>> e) my_resource -> remains on node2 because of the stickiness
>>>
>>>>
>>>>>
>>>>>
>>>>> If I set resource_stickiness to '0', this works very well, but
>>>>> when node1 comes up again, heartbeat migrates the resources back
>>>>> to node1.
>>>>>
>>>>> The help I need is: how do I get this behaviour, but disable the
>>>>> migration after node1 comes up again?
>>>>>
>>>>> The idea is to survive and always provide service if public
>>>>> network connectivity is lost on any of the nodes, and to avoid
>>>>> migrating resources when not needed.
>>>>>
>>>>> Thanks.
>>>
>>> Thanks for the help.
>>> ---------
>>> Igor Neves <igor.neves at 3gnt.net>
>>> 3GNTW - Tecnologias de Informação, Lda
>>>
>>> sip igor at 3gnt.net jid igor at jabber.3gnt.org
>>> icq 249075444 tlm 00351914503611
>>>
>>
>
>