[Linux-HA] clone behavior when set on-fail to "restart"

Andrew Beekhof beekhof at gmail.com
Tue Nov 13 03:27:02 MST 2007


On Nov 13, 2007, at 9:11 AM, Junko IKEDA wrote:

>>> Hi,
>>>
>>> I set up one clone and one group like this;
>>> (heartbeat version is 2.1.3, changeset:8cb7664b7bfd)
>>> ----------------------------------------------------------------
>>> Node: prec370e (9d9ca527-cea9-470c-9e03-e49fe5630bba): online
>>> Node: prec370d (6a42b4b7-7a31-4ae7-b241-ed4ad4267ec6): online
>>>
>>> Clone Set: clone
>>>   c-Dummy:0   (heartbeat::ocf:Dummy): Started prec370e
>>>   c-Dummy:1   (heartbeat::ocf:Dummy): Started prec370d
>>> Resource Group: group
>>>   g-Dummy-1   (heartbeat::ocf:Dummy): Started prec370d
>>>   g-Dummy-2   (heartbeat::ocf:Dummy): Started prec370d
>>> ----------------------------------------------------------------
>>>
>>> When I broke the clone on prec370e (by removing the state file),
>>> the clone instances switched positions with each other.
>>>
>>> ----------------------------------------------------------------
>>> Node: prec370e (9d9ca527-cea9-470c-9e03-e49fe5630bba): online
>>> Node: prec370d (6a42b4b7-7a31-4ae7-b241-ed4ad4267ec6): online
>>>
>>> Clone Set: clone
>>>   c-Dummy:0   (heartbeat::ocf:Dummy): Started prec370d
>>>   c-Dummy:1   (heartbeat::ocf:Dummy): Started prec370e
>>> Resource Group: group
>>>   g-Dummy-1   (heartbeat::ocf:Dummy): Started prec370d
>>>   g-Dummy-2   (heartbeat::ocf:Dummy): Started prec370d
>>> ----------------------------------------------------------------
>>>
>>> Is this the expected behavior for on-fail="restart"?
>>
>> yes, it's juggling the locations so that both clones can remain active
>>
>> (yes, it could possibly be even smarter and just start c-Dummy:0 on
>> prec370e again, but at least both are active, unlike the example
>> below, which needlessly makes :0 unavailable)
>>
>>> I got a different result in 2.1.2 with the same cib.xml.
>>> This sounds more reasonable to me.
>>
>> not sure why this is more reasonable... it's behaving more like
>> on-fail=stop
>
> yeah.. I'll set it to on-fail="stop".
>
> but when I set up the clone without the group resources,
> the clone couldn't remain active despite its on-fail="restart".
>
> ----------------------------------------------------------------
> Clone Set: clone
>    c-Dummy:0   (heartbeat::ocf:Dummy): Started prec370e
>    c-Dummy:1   (heartbeat::ocf:Dummy): Started prec370d
> ----------------------------------------------------------------
>
> # rm -f /var/run/heartbeat/rsctmp/Dummy-c-Dummy\:0.state
> ----------------------------------------------------------------
> Clone Set: clone
>    c-Dummy:0   (heartbeat::ocf:Dummy): Stopped	<= ???
>    c-Dummy:1   (heartbeat::ocf:Dummy): Started prec370e
> ----------------------------------------------------------------


hmm... I'd have expected:

> ----------------------------------------------------------------
> Clone Set: clone
>    c-Dummy:0   (heartbeat::ocf:Dummy): Stopped
>    c-Dummy:1   (heartbeat::ocf:Dummy): Started prec370d
> ----------------------------------------------------------------

Fixed in:
    http://hg.beekhof.net/lha/crm-dev/rev/7f413e830a74


