[Linux-HA] Measuring time interval required for failover between 2 nodes.

Peter Wong peter.wong at mobidia.com
Fri Jan 5 11:10:23 MST 2007


Greetings:

I have the typical active-standby two-node setup with a 
resource group consisting of an application and an IP 
address running on the active node.

I'm looking for ways of measuring the average time it 
takes for the system to fail over from the active node 
to the standby node.

I have been asked to have the system fail over from 
node A to node B and then, after node B has run for a 
while, fail over from node B back to node A. 
This flip-flop scenario would be repeated, say, 
1000 times.

I have the following questions regarding this scenario:

1. Has anyone done this sort of measurement before?

2. Can Heartbeat handle flip-flopping between the nodes 
   more than 1000 times?

3. Are there any scripts/code within the Heartbeat 
   package that would assist in this situation?

4. What is the correct way of measuring the time 
   interval between the moment one node becomes 
   non-operational and the moment the other node 
   becomes active?

5. In the log files produced by Heartbeat (ha-debug, 
   ha-log), the timestamps have a resolution of one 
   second. Is it possible to get a finer resolution, 
   say milliseconds?
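
To make question 4 concrete, one approach I can imagine 
is to probe the service IP from a third machine and time 
the gap between the first failed probe and the next 
successful one. Below is a minimal sketch, not a tested 
tool. It assumes the application accepts TCP connections; 
the address 192.0.2.10, port 80, and all function names 
are placeholders of mine and have nothing to do with the 
Heartbeat package.

```python
import socket
import time


def probe(host, port, timeout=0.2):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_durations(samples):
    """Turn time-ordered (timestamp, is_up) samples into outage lengths.

    Each outage runs from the first failed probe to the next successful
    probe. An outage still open at the end of the samples is ignored.
    """
    durations = []
    down_since = None
    for ts, is_up in samples:
        if not is_up and down_since is None:
            down_since = ts                      # outage starts
        elif is_up and down_since is not None:
            durations.append(ts - down_since)    # outage ends
            down_since = None
    return durations


def watch(host, port, interval=0.05, run_for=60.0):
    """Probe repeatedly for run_for seconds; return (timestamp, is_up) pairs."""
    samples = []
    end = time.monotonic() + run_for
    while time.monotonic() < end:
        # Monotonic clock: unaffected by NTP or manual clock changes.
        samples.append((time.monotonic(), probe(host, port)))
        time.sleep(interval)
    return samples


if __name__ == "__main__":
    # Hypothetical service address; replace with the cluster IP and port.
    gaps = outage_durations(watch("192.0.2.10", 80))
    if gaps:
        print("outages:", len(gaps), "mean seconds:", sum(gaps) / len(gaps))
    else:
        print("no completed outage observed")
```

The figure obtained this way is what a client sees, not 
what Heartbeat logs, and its accuracy is limited by the 
probe interval plus the connection timeout, so it can be 
off by roughly that amount per failover.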

Thanks!

Peter.


