[Linux-HA] How to make a colocation rule between a Master/Slave resource's Master and another resource?

Andrew Beekhof abeekhof at suse.de
Wed Apr 4 09:19:12 MDT 2007


On Mar 20, 2007 at 4:37 PM Alan Robertson <alanr at unix.sh> wrote:

>
> Andrew Beekhof wrote:
>> On 3/18/07, Alan Robertson <alanr at unix.sh> wrote:
>>> Lars Marowsky-Bree wrote:
>>> > On 2007-03-16T10:38:25, Alan Robertson <alanr at unix.sh> wrote:
>>> >
>>> >> Here's the rule I don't like...
>>> >>
>>> >>  <rsc_colocation id="fs_on_drbd0" to="drbd0-partition" to_role="master"
>>> >>      from="mount-drbd0" score="1000" />
>>> >>
>>> >> The value 1000 is arbitrary, and weird.
>>> >
>>> > Well, weights _are_ arbitrary. That much is true. They only make sense
>>> > in the context of a specific configuration.
>>>
>>> Right.  Which is not a good thing when they can be avoided.  Because
>>> they're hard to explain, and hard to learn, and lead to certain kinds of
>>> human errors over time.
>>>
>>> For something like this where it's a VERY common thing to want to do and
>>> an easy thing to understand, it ought to be easy to do, and not be error
>>> prone.
>>>
>>> >> Is this a valid rule?
>>> >> <rsc_colocation id="fs_on_drbd0" to="drbd0-partition" to_role="stopped"
>>> >>   from="mount-drbd0" score="-infinity" />
>>> >
>>> > I guess so, yes.
>>> >
>>> >> This is what is now on the web page as the alternative rule...
>>> >
>>> > Can we go back a step and have you explain to me again what the use-case
>>> > is you're trying to achieve?
>>>
>>> Create a set of rules which do ONLY this, without side-effects
>>>
>>>         run only on a node which resource X is master
>>>                 (regardless of how many masters are in the arrangement)
>>>
>>> By "ONLY this, without side effects" I mean have NO other influence on
>>> where things run than this. +INFINITY commonly has side-effects in some
>>> configurations because it overwhelms all other resource placement
>>> considerations and causes them to be ignored.  By the way, DRBD 8 could
>>> be modeled as a multi-master resource quite nicely.
>>>
>>> For example, if we model DRBD 8 as a multi-master resource, and one side
>>> is on site X and one is on site Y, then there may be VERY GOOD reasons
>>> to prefer site X to site Y (or vice versa), but once you use +INFINITY
>>> then all other possible considerations about where to run the resource
>>> become irrelevant - because +INFINITY +/- anything is +INFINITY.
>>
>> not true
>>
>> as i tried to explain on the phone, colocation with any type of
>> clone (including master/slave) does not swamp the resource's original
>> node preferences.
>>
>> when choosing a clone instance to collocate with, we look for
>> instances on nodes preferred (using rsc_location constraints) by the
>> original resource (and in the order it prefers them).
>>
>> only after the clone instance and the resource are linked is the
>> INFINITY score applied.
>> but as you remain unconvinced, can you use ptest to construct a case
>> we don't handle correctly?
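
The selection order described above can be sketched roughly as follows. This is only an illustration of the description in this thread, not the PE's actual code: the function name, the node names, and the scores are all hypothetical, and it assumes every candidate node has an rsc_location score for the resource.

```python
def pick_instance_node(location_scores, master_nodes):
    """Return the node the resource prefers most among nodes hosting a master.

    location_scores: hypothetical {node: rsc_location score} for the resource.
    master_nodes:    hypothetical set of nodes where a master instance runs.
    """
    # Walk the nodes in the resource's own order of preference (best first).
    for node in sorted(location_scores, key=location_scores.get, reverse=True):
        if node in master_nodes:
            # The resource is linked to this instance; per the description,
            # the INFINITY colocation score is applied only after this choice.
            return node
    return None  # no master instance anywhere, so nothing to colocate with


prefs = {"siteX": 200, "siteY": 100}  # the resource prefers siteX
print(pick_instance_node(prefs, {"siteX", "siteY"}))  # both are masters
print(pick_instance_node(prefs, {"siteY"}))           # only siteY is master
```

With masters on both sites the preferred site is chosen; with a single master the resource follows it regardless of preference.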
>
>
> 1) On the phone I came up with a counterexample to this which you agreed
>   was right (after you pointed out exactly how the algorithms work).
>   I've honestly forgotten the details of it.

if you remember it, then grab ptest and let's restart this discussion.

at the moment I consider this a theoretical problem only

>
> 2) This is relying on a current implementation detail - which is subject
>   to change in the future.

this is a red herring... even though the mechanism may change again
one day, the effect (i.e. non-swamping) is unlikely to.

> There is nothing in the XML or the DTD
>   which would give one this idea.  Your detailed knowledge of the
>   current algorithm lets you know this.  If it ever stops working
>   in the future, one could not consider this a bug - since this
>   specific behavior isn't implied by the XML or DTD.  And, documenting
>   this as the suggested solution will prevent certain kinds of
>   future redesigns, because people were relying on this implementation
>   detail.  Therefore, this is not an optimal idea.
>
> 3) My suggested alternative does not rely on implementation details,

of course it does, and worse, it basically amounts to voodoo.

the ONLY way to _guarantee_ we will place resources together is to  
use INFINITY. period.
anything else is a hint which may or may not be complied with.
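
to make the guarantee-versus-hint distinction concrete, here is a minimal Python sketch of saturating score arithmetic. the numeric value used for INFINITY, the helper names, and the node names and scores are assumptions for illustration only; this is not the PE's code.

```python
INFINITY = 1_000_000  # assumed numeric stand-in for INFINITY in the CIB


def add_scores(a, b):
    """Saturating add: -INFINITY beats everything, then +INFINITY."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


def best_node(location, colocation_score, partner_node):
    """Pick the node with the highest total score.

    The colocation score is added only on the node where the partner
    resource (e.g. the master) is running.
    """
    totals = {
        node: add_scores(score, colocation_score if node == partner_node else 0)
        for node, score in location.items()
    }
    return max(totals, key=totals.get)


location = {"node1": 0, "node2": 5000}  # hypothetical rsc_location scores
print(best_node(location, 1000, "node1"))      # finite score: only a hint
print(best_node(location, INFINITY, "node1"))  # INFINITY: cannot be outweighed
```

with a score of 1000 the resource lands on node2, away from its partner, because the location preference there is larger; with INFINITY it always lands on node1.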

>   and works at least as well, because it relies on putting all
>   the right constraints in the XML.  If this set of constraints
>   breaks in the future, it would be because the implementation
>   was broken, because one of the explicit constraints would have to be
>   violated.  IMHO, this is a better approach to suggest to people
>   than one which relies on knowing the details of the current
>   algorithms.
>
>
>>> I really think that avoiding +INFINITY is a really good idea in  
>>> general.
>>
>> no, please do not go around suggesting this.
>
> I'm suggesting this as a general rule of thumb for what seem like good
> reasons to me.  Help me with this.  Most of the times I've seen someone
> put in a +INFINITY, it turned out to be better replaced with a negated
> condition and -INFINITY.

suggesting a -INFINITY alternative is fine for rsc_location  
constraints but absolutely not for colocation.

>
> I don't mean there are no circumstances under which +INFINITY is the
> right answer, just that there are many fewer than it might at first
> seem, and honestly, I haven't yet seen any where it was the right choice.

see above, colocation.

>
> If you have two nodes, and no sophistication in your rules, and you
> really really want it to run on only one node, then you won't be
> disappointed with +INFINITY - in the short term.  But, if you have any
> non-INFINITY considerations, and more than one node with +INFINITY on
> it, then this is probably not the solution for you.
>
> And, even for those cases, a negated condition with -INFINITY is at
> least as good, and it allows you to update/upgrade your rules in the
> future with minimal disruption.
>
> For example, let's say we have my favorite attribute "has_fc" and it's
> either 0 or 1.

then we're not talking about colocation anymore which is what the  
discussion has been about until now

>
> If you give +INFINITY to has_fc == 1, then you can't write any more
> rules to decide which one is better -- among the set of nodes for which
> has_fc == 1.  But, if you write a rule which assigns -INFINITY for
> has_fc == 0, then you can add other preferential rules all you want,
> without them being "swamped" by the +INFINITY score.
>
> This is what I'm talking about above.
>
> If you start off with the +INFINITY rule, then want to add another rule
> for choosing the machine with the most RAM to run this on, you have to
> replace the +INFINITY rule with the -INFINITY negated rule before the
> rule on RAM size can have any effect.  So, why not just start there?
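
The has_fc example above can be worked through with a small Python sketch. It assumes saturating score arithmetic with an arbitrary numeric stand-in for INFINITY; the node names, the RAM sizes, and the idea of scoring RAM by its size in GB are hypothetical and only serve the illustration.

```python
INFINITY = 1_000_000  # assumed numeric stand-in for INFINITY


def add_scores(a, b):
    """Saturating add: -INFINITY beats everything, then +INFINITY."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


nodes = {  # hypothetical cluster
    "a": {"has_fc": 1, "ram_gb": 4},
    "b": {"has_fc": 1, "ram_gb": 16},
    "c": {"has_fc": 0, "ram_gb": 64},
}


def plus_rule(attrs):
    """+INFINITY where has_fc == 1."""
    return INFINITY if attrs["has_fc"] == 1 else 0


def minus_rule(attrs):
    """The negated form: -INFINITY where has_fc == 0."""
    return -INFINITY if attrs["has_fc"] == 0 else 0


def ram_rule(attrs):
    """Prefer nodes with more RAM (score = GB, an arbitrary choice)."""
    return attrs["ram_gb"]


def totals(fc_rule):
    return {n: add_scores(fc_rule(a), ram_rule(a)) for n, a in nodes.items()}


print(totals(plus_rule))   # a and b both saturate; RAM cannot break the tie
print(totals(minus_rule))  # c is excluded; b outranks a on RAM
```

Under the +INFINITY rule, nodes a and b end up with identical scores, so the RAM rule has no effect; under the negated -INFINITY rule, node c is ruled out and the RAM rule still ranks b above a.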
>
> If you have a predicate which by definition can only hold true for one
> machine, then +INFINITY will work fine for you.  But, so will -INFINITY
> and a negated condition - unless you can't express the negated condition
> according to the DTD.
>
> By avoiding +INFINITY you also completely eliminate the possibility of
> the (INFINITY-INFINITY) error condition.  So far, this is all goodness
> to me.

except when we decide not to put two resources together that the  
admin sort-of-kind-of-maybe wanted on the same node.

i'm pretty sure there'll be howls of frustration when that happens...

>
> The evidence I'm familiar with suggests that this is indeed a good
> general rule.  Not a rule without exceptions, but one that creates a
> good approach in general.

the other way around, there are exceptions (such as rsc_location)
where this applies, but in general it's bad advice

> So, what have I missed?

a complete understanding of the PE.
which is not meant to be an insult, it's just a fact of life that the  
person that writes something as complex as the PE will understand it  
better.


