[Linux-HA] Measuring time interval required for failover between
2 nodes.
vini.bill at gmail.com
Fri Jan 5 11:48:48 MST 2007
The method I use here (it requires the cluster machines plus one fictitious
client machine), with the cluster running smoothly:
- On the "fictitious client", ping the active IP (it should be a virtual IP).
- Pull the network cable out of the main node.
- The main node will stop responding to our fictitious client for a while.
- The secondary node takes over the cluster and the client gets responses
from the cluster again.
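The steps above could be scripted on the client roughly as follows. This is
only a minimal sketch, not something shipped with Heartbeat: the names
ping_once and measure_outage are made up for illustration, and the ping flags
(-c, -W) assume the Linux iputils ping.

```python
import subprocess
import time


def ping_once(ip, timeout_s=1):
    """Return True if a single ICMP echo to `ip` is answered.

    Assumes the Linux iputils `ping` (-c = count, -W = reply timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def measure_outage(probe, clock=time.monotonic, sleep=time.sleep,
                   interval_s=1.0, max_probes=600):
    """Probe repeatedly; time the gap between the first failure and recovery.

    Returns the seconds between the start of the first failed probe and the
    start of the first successful probe after it, or None if no complete
    outage was seen within `max_probes` probes.
    """
    outage_start = None
    for _ in range(max_probes):
        now = clock()
        ok = probe()
        if not ok and outage_start is None:
            outage_start = now          # first lost reply: outage begins
        elif ok and outage_start is not None:
            return now - outage_start   # replies are back: outage is over
        sleep(interval_s)
    return None
```

Usage would look like measure_outage(lambda: ping_once("192.168.0.100")),
where the address is a placeholder for your virtual IP. Start it, pull the
cable, and wait for the result. The resolution is no better than the probe
interval plus the ping timeout.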
Answering your questions:
1) Yes. Here, the failure lasted 5 seconds.
2) Certainly, but it's a bit tricky since you have to use the CIB to create
some rules.
3) No. Heartbeat does everything on its own. Of course, there are always
the /etc/init.d/ or the /etc/ha.d/resources scripts, which are the ones that
initialize a given resource for Heartbeat.
4) The number of timeouts you get on the client is the number of seconds
the cluster failed to respond (ping sends one request per second by default).
5) Yes. In the ha.cf file, change every value you have defined in seconds
(or with no unit at all) to milliseconds (adding ms after the number).
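For the repeated flip-flop test, point 4 can be turned into numbers with a few
lines of Python. This is only a sketch: outage_seconds and summarize are
made-up helper names, and the counts of lost replies have to be collected by
hand or by your own script.

```python
from statistics import mean


def outage_seconds(lost_replies, interval_s=1.0):
    """Estimate outage length from the number of unanswered pings.

    `interval_s` is the time between pings (1 second by default for ping).
    """
    return lost_replies * interval_s


def summarize(lost_per_run, interval_s=1.0):
    """Return (min, mean, max) outage in seconds over several failover runs."""
    outages = [outage_seconds(n, interval_s) for n in lost_per_run]
    return min(outages), mean(outages), max(outages)
```

For example, summarize([5, 4, 6]) describes three hypothetical runs that lost
5, 4 and 6 replies. A ping interval shorter than one second gives a finer
estimate, as long as interval_s is set to match.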
Hope it helps.
Vinicius Menezes
On 1/5/07, Peter Wong <peter.wong at mobidia.com> wrote:
>
> Greetings:
>
> I have the typical active-standby two node setup with a
> resource group consisting of an application and an IP
> address running on the active node.
>
> I'm looking for ways of measuring the average time it
> takes for the system to failover from the active node
> to the standby node.
>
> I have been asked to have the system fail over from
> node A to node B and then after node B runs for a
> while the system would fail over from node B to node A.
> This flip-flop scenario would be carried out for say
> 1000 times.
>
> I have the following questions regarding this scenario:
>
> 1. Has anyone done this sort of measurements before?
>
> 2. Can Heartbeat handle this flip-flopping of >1000
> times between the nodes?
>
> 3. Are there any scripts/code within the Heartbeat
> package that would assist in this situation?
>
> 4. What is the correct way of measuring this time
> interval, between one node becomes non-operational
> and the other node becomes active?
>
> 5. In the log files produced by Heartbeat (ha-debug,
> ha-log), the time stamps have resolution in seconds.
> Is it possible to get a finer resolution, say
> milliseconds?
>
> Thanks!
>
> Peter.
>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA at lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
--
... Vinicius Menezes ...