[Linux-HA] speed up fail over time

Andrew Beekhof beekhof at gmail.com
Fri Jul 11 03:56:14 MDT 2008


On Jul 11, 2008, at 9:29 AM, Junko IKEDA wrote:

>>> Hi,
>>>
>>> We are now trying to show a good performance report to a potential
>>> customer.
>>> Our customer's requirements are:
>>> * There are more than 100 resources on one node.
>>> * 100 resources are included in one group, so they would start/stop
>>> sequentially.
>>> * Fail over for all of 100 resources should complete within 1 minute.
>>
>> that's less than a second per resource (since members of a group are
>> started sequentially)... is your resource capable of starting so
>> quickly?
>>
>> in truth, i think that for a group that size, 1 minute is an
>> unrealistic deadline (assuming it's not just full of Dummy resources)
>>
>>> * Heartbeat stable 2.1 (maybe release as 2.1.4, soon)
>>> It took about 4 minutes for fail over.
>>>
>>> * Heartbeat-dev(5072025b79b8) + Pacemaker-0.7(ee6832884524)
>>> It took about 3 minutes for fail over.
>>> It's getting better!
>>> Is this some effect of the new xml parser?
>>
>> possible - but more likely the performance optimization i've been
>> doing over the last couple of weeks and months.
>>
>> did you cause the DC to fail, or another node?
>> because the load spikes generated by electing a new DC have been
>> reduced by 70-80% (no, that's not a typo)
>>
>> and before you ask, no, these changes will never be part of 2.1.x
>>
>>> The hb_report files are huge, so I created a bugzilla entry as an enhancement.
>>> http://developerbugs.linux-foundation.org/show_bug.cgi?id=1935
>>>
>>> Do you have any good idea to speed up fail over time?
>>
>> split the group up :)
>
> That's what I thought...
> I got more detail: the target system would have 9 nodes, 8 active
> and 1 standby.
> Each node has one group which contains 15 resources.
> As a test, I tried 1 active + 1 standby with 8 groups (each group has
> 15 resources).
> It took about 70 - 80 seconds to fail over.
> Is that a reasonable time?

Impossible to say without knowing the characteristics of your resources.

For every item in groupX, add
  the minimum time it takes to start the resource

If the failure isn't one that will cause a STONITH operation, also add
  the minimum time it takes to stop the resource

Do that for every group you have and take the maximum value.

Assuming there is zero overhead in the crm, and that no other  
resources needed to be moved in order for the group to be relocated,  
then this is the minimum possible failover time.
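
The rule above can be written down as a small calculation. This is only a sketch of the arithmetic, not anything the CRM itself runs; the function names and the (start, stop) tuple layout are made up for illustration, and the one-second timings are placeholder values.

```python
def group_time(resources, stonith=False):
    """Minimum time to relocate one group.

    resources: list of (start_seconds, stop_seconds) pairs, one per
    group member. Members start/stop sequentially, so times add up.
    If the failure triggers STONITH, no stop operations are counted.
    """
    total = 0.0
    for start, stop in resources:
        total += start
        if not stonith:
            total += stop
    return total


def min_failover_time(groups, stonith=False):
    """Zero-overhead lower bound: the slowest group decides."""
    return max(group_time(g, stonith) for g in groups)


# Hypothetical input: 8 groups of 15 resources, 1s to start and 1s to stop each.
groups = [[(1.0, 1.0)] * 15 for _ in range(8)]
print(min_failover_time(groups))                # 30.0
print(min_failover_time(groups, stonith=True))  # 15.0
```

Taking the maximum here assumes all groups can move fully in parallel, which is the same zero-overhead assumption stated above.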


Assuming one second to start or stop each resource in your scenario,
that's 30s per group (15 resources x (1s stop + 1s start)).
Factor in that the lrmd limits us to 4 actions at a time, and that means
that we can only stop/start four of the groups at any one time.

So the absolute minimum time to failover all 8 groups is 60s (two
rounds of four groups, 30s each).

However we also need to factor in how long it took for the CRM to  
notice that a migration was required... if you caused it by failing a  
resource, then it could take up to a whole monitor interval which  
would need to be added to the 60s.
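
Putting the parallelism limit and the detection delay together, the estimate looks roughly like this. Again just a sketch of the reasoning in this mail: the function and parameter names are invented, it uses the simplification that each group occupies one of the four lrmd slots for its whole stop/start sequence, and the 10s monitor interval is a made-up example value.

```python
import math


def failover_lower_bound(n_groups, group_seconds, max_parallel=4,
                         detection_delay=0.0):
    """Rough lower bound on failover time in seconds.

    n_groups:        number of groups that have to move
    group_seconds:   time to stop + start one whole group
    max_parallel:    how many groups can be worked on at once
                     (4, following the lrmd limit mentioned above)
    detection_delay: time until the CRM notices the failure, up to
                     one monitor interval for a failed resource
    """
    rounds = math.ceil(n_groups / max_parallel)
    return detection_delay + rounds * group_seconds


print(failover_lower_bound(8, 30))                      # 60.0
print(failover_lower_bound(8, 30, detection_delay=10))  # 70.0
```

With a worst-case 10s detection delay the bound is already 70s, before any CRM overhead or load effects are counted.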

And remember that's the _minimum_ time... how do your resources behave
under high CPU/disk/network loads?

> Is there any tunable value?
> For example; MAXMSG, MAXUNCOMPRESSED, max_child_count...

max_child_count (lrmd) would have an effect, as would batch-limit  
(see: pengine metadata) and reducing the size of the groups.
All of these things affect the amount of parallelism in the cluster.

>>> It would be best if the performance improvement is available with
>>> Heartbeat 2.1.4.
>>> I know this kind of performance improvement is not so easy, but this
>>> is a matter of the greatest urgency. If it comes in after the nearest
>>> release, we are planning to backport it to 2.1.4 for our customer
>>> individually.
>>
>> i think you'll find that is an extremely non-trivial task
>
> Yeah, I know we should recommend Heartbeat 3.0(?) + Pacemaker 1.0,
> but it seems that we have no time to wait for them this time.
>
> (it's just our schedule)

When is your deadline?  Perhaps I can alter the Pacemaker schedule to  
accommodate your plans.


