[Linux-HA] Couple of beginner questions
Christian Iversen
chrivers at iversen-net.dk
Tue May 20 11:51:34 MDT 2008
Dear $ALL
I've just started using heartbeat, which seems like a really nifty
program overall. I'm not sure why I didn't start using it much earlier,
because it's really great.
I have a couple of beginner questions, though. My test-setup is two
nodes, test1 and test2, sitting behind a router, router0. The two
machines currently talk to each other through the router, but they
can/will get a dedicated Ethernet channel between them (crossed link).
1) I'm using the following settings:
keepalive 200ms
deadtime 1000ms
No matter what kind of load I put on the machines, this never seems to
break down. Timings this tight allow me to achieve a 5-second failover
time for an HA-NFS server. My question is this: Is there some (perhaps
non-obvious) reason this might be a bad idea? All the documentation
suggests higher times, so I'm wondering.
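For reference, here is the arithmetic behind those two settings as I
understand them, assuming keepalive is the interval between heartbeats
and deadtime is the silence after which the peer is declared dead. The
function name is just something I made up for illustration:

```python
def missed_heartbeats_before_dead(keepalive_ms, deadtime_ms):
    """Number of whole keepalive intervals that fit into deadtime.

    Assumption: the peer is declared dead once no heartbeat has been
    seen for deadtime_ms, and heartbeats are sent every keepalive_ms.
    """
    if keepalive_ms <= 0:
        raise ValueError("keepalive must be positive")
    return deadtime_ms // keepalive_ms


# My settings: keepalive 200ms, deadtime 1000ms
print(missed_heartbeats_before_dead(200, 1000))
```

So with my settings a node can lose about five heartbeats in a row
before the other side gives up on it, which is the margin I'm asking
about.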
2) If I unplug eth0 from test1, the cluster ends up split-brained,
because neither node can decide whether to be primary or to fail. I've
read that "ipfaild" can be used to detect missing-link situations, and
react differently. Can anyone point to some examples, or help me set it
up? And is it even the right tool?
3) If I forkbomb test1, it is (of course) completely dead service-wise,
but still sending out heartbeats(!). I've read that a service monitoring
daemon can solve this, by checking reasonable access times to (say) NFS.
Can someone recommend examples or documentation? Or, can someone help
set this up? :)
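To make question 3 concrete, here is a rough sketch of the kind of
check I have in mind: time a simple access to a path on the NFS export
and call the service unhealthy if it errors out or takes too long. The
function names, the path and the threshold are all made up by me, and
how to feed the result back into heartbeat is exactly the part I don't
know.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def timed_access(path):
    """Stat `path` and return how many seconds that took."""
    start = time.monotonic()
    os.stat(path)
    return time.monotonic() - start


def service_is_healthy(path, max_seconds, probe=timed_access):
    """Return True if probe(path) finishes within max_seconds without error.

    The probe runs in a worker thread so that a hung access only makes
    the check fail instead of blocking the caller. Note that a probe
    which never returns leaves that worker thread stuck; this sketch
    only reports the problem, it does not clean it up.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(probe, path)
    try:
        future.result(timeout=max_seconds)
        return True
    except (FutureTimeout, OSError):
        return False
    finally:
        # Do not wait for a slow probe before returning the verdict.
        pool.shutdown(wait=False)


if __name__ == "__main__":
    # Hypothetical mount point and threshold, purely for illustration.
    print(service_is_healthy("/mnt/nfs-export", max_seconds=2.0))
```

Something like this would run in a loop on each node, but I don't know
whether that is how people normally do it.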
Thanks in advance.
--
Kind regards
Christian Iversen