[Linux-HA] Standby Node Refuses to Take Over
davies147 at gmail.com
Tue Oct 5 04:47:37 MDT 2010
On 1 October 2010 16:59, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
> On Mon, Sep 27, 2010 at 09:43:37AM -0700, Robinson, Eric wrote:
>> The primary node hung and the applications became unresponsive, but DRBD
>> status was good and up to date on both nodes, so I did a hb_takeover on
>> the standby node. Following is all that appeared in the ha-debug.log on
>> the standby. (I could not see the log on the primary because I could not
>> log in to it.)
>> heartbeat: 2010/09/27_05:47:12 debug: }/*G_remove_client;*/
>> heartbeat: 2010/09/27_05:50:29 debug: StartNextRemoteRscReq() -
>> calling hook
>> heartbeat: 2010/09/27_05:50:29 debug: notify_world: invoking
>> harc: OLD status: active
>> heartbeat: 2010/09/27_05:50:29 debug: Process [hb_takeover]
>> started pid 23304
>> heartbeat: 2010/09/27_05:50:29 debug: Starting notify process
>> heartbeat: 2010/09/27_05:50:29 debug: notify_world: setting
>> SIGCHLD Handler to SIG_DFL
>> heartbeat: 2010/09/27_05:50:29 debug: notify_world: Running harc
>> harc: 2010/09/27_05:50:29 info: Running
>> /etc/ha.d/rc.d/hb_takeover hb_takeover
>> heartbeat: 2010/09/27_05:50:29 info: Managed hb_takeover process
>> 23304 exited with return code 0.
>> heartbeat: 2010/09/27_05:50:29 debug: RscMgmtProc 'hb_takeover'
>> exited code 0
> This is haresources mode; the resource management model is simplistic.
> It thought it successfully took over, and marked itself as holding
> "all resources". hb_takeover was over very quickly, so possibly it
> thought it already held all resources, for whatever reason.
> Hm. Maybe it is even worse: hb_takeover is actually implemented as
> sending a "please shut down your resources" message to the other node,
> then waiting for its "thanks, I went standby on my resources please
> proceed" answer. So there is no "forceful takeover" here, only
> cooperative takeover, and if one refuses to cooperate, then nothing
> happens.
> I'm not sure what happens if A sent that takeover request,
> B is too busy to respond, then B finally dies, while A is still waiting
> for that standby message. Possibly a "node dead" event is not
> deemed good enough while waiting for a "I'm standby now" message?
> Probably exactly your situation.
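For what it's worth, the cooperative handshake you describe could be
sketched roughly like this; send_msg and wait_for_msg are stand-ins
for the real message plumbing, not actual heartbeat calls:

```shell
#!/bin/sh
# Sketch of the cooperative takeover handshake: node A asks node B to
# stand down, then blocks waiting for B's acknowledgement. There is no
# forceful path, so a dead or hung peer stalls the takeover forever.

send_msg()     { echo "-> $1"; }            # pretend to message the peer
wait_for_msg() { [ -n "$PEER_REPLIES" ]; }  # pretend to wait for a reply

hb_takeover_sketch() {
    send_msg "please shut down your resources"
    # If the peer hangs or dies before replying, this wait never
    # completes -- the "node dead" event does not substitute for the
    # expected "I went standby" message.
    if wait_for_msg "peer-went-standby"; then
        echo "takeover complete"
    else
        echo "still waiting: peer never acknowledged" >&2
        return 1
    fi
}

PEER_REPLIES=""
hb_takeover_sketch || echo "takeover stalled"
```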
>> I went so far as to turn off the primary, but the standby still never
>> took over. When I brought the power on the primary back up, it came up
>> secondary and I had to do a hb_takeover on it, but after that all was
>> well.
> The rebooted node joined the cluster, the still running node told it it
> held all resources, both thought there was nothing to do.
> Then you asked the rebooted node to take over, they both ran their
> scripts again, and this time actually started something.
> Seemingly a limitation of the simplistic haresources model.
Yes, the haresources model is simple. I have encountered the issue
above, as well as other similar ones.
So far I have discovered three failure situations, and I plan to work
around them (comments from more experienced HA users are welcome):
Fail 1) A node needs to go active, but this fails, which causes an
attempt to fall back to standby. The RM does not record that the node
is inactive unless it can speak to the other node.
Solution 1) Really? I just stopped everything, so of course I should
no longer be active! I plan to have the RM record that the node is
inactive even after the failed ha_standby request, or perhaps
beforehand (I'll add a timeout, I guess). This will have knock-on
effects, which will need chasing down :)
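Roughly, Solution 1 amounts to something like this; the state file
path and the notify_peer helper are illustrative assumptions, not real
heartbeat internals:

```shell
#!/bin/sh
# Sketch of Solution 1: record "inactive" in a local state file before
# (and regardless of) telling the peer, so a failed ha_standby exchange
# cannot leave the RM believing it is still active.

STATEFILE="${STATEFILE:-/tmp/hb_rsc_state.$$}"

notify_peer() { return 1; }   # pretend the peer is unreachable

go_standby() {
    # Record locally first: we really did stop everything, so we are
    # no longer active whether or not the other node hears about it.
    echo "none" > "$STATEFILE"
    if ! notify_peer "standby"; then
        echo "peer unreachable; local state already marked inactive" >&2
    fi
}

go_standby
cat "$STATEFILE"   # -> none
```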
Fail 2) Split-brain. This restarts the heartbeat daemons on both
nodes, and will kill a perfectly working node.
Solution 2) An understandable default, but sometimes we can be
cleverer. I hope to add an F_SPLITBRAIN message that includes a
SETWEIGHT. This will run an rc script on each node and allow the two
nodes to fight it out; if that fails, we fall back to the restart. In
its simplest form the script can of course just restart the heartbeat
daemon.
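A rough sketch of that arbitration idea, assuming a hypothetical
F_SPLITBRAIN/SETWEIGHT exchange (neither the message nor the weighting
policy exists in heartbeat today):

```shell
#!/bin/sh
# Sketch of Solution 2: on a hypothetical F_SPLITBRAIN message carrying
# the peer's SETWEIGHT, each node compares weights and only the loser
# restarts its heartbeat daemon (here just printed, not executed).

node_weight() {
    # Simplest useful policy: strongly prefer the node currently
    # holding resources, so a working node is not killed.
    if [ "$HOLDS_RESOURCES" = "yes" ]; then echo 100; else echo 10; fi
}

on_splitbrain() {              # $1 = peer's advertised weight
    w=$(node_weight)
    if [ "$w" -gt "$1" ]; then
        echo "keep-running"    # we win: keep our resources
    else
        echo "restart"         # we lose (or tie): restart heartbeat
    fi
}

HOLDS_RESOURCES=yes
on_splitbrain 10    # -> keep-running
```

On a tie both sides choose "restart", which simply reproduces the
current behaviour.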
Fail 3) The two nodes get split, but also get out of sync. Split-brain
is not recognised, and when they reconnect, an "Active" message is
exchanged and logged, but ignored.
Solution 3) The "Active" message already causes a 'status' script to
run. I plan to extend this script to raise a split-brain alert when
appropriate, triggering the same resolution as in 2) above.
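The extended status hook might check something like this; the state
names are illustrative only:

```shell
#!/bin/sh
# Sketch of Solution 3: when the peer announces "active" while we also
# believe we are active, escalate to the split-brain resolution instead
# of logging and ignoring the message.

LOCAL_STATE="active"

on_peer_status() {             # $1 = state announced by the peer
    if [ "$1" = "active" ] && [ "$LOCAL_STATE" = "active" ]; then
        echo "splitbrain"      # both active: trigger resolution as in 2)
    else
        echo "ok"
    fi
}

on_peer_status active    # -> splitbrain
```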
Note, all of the above are theoretical solutions, and I do not know
when I might get round to implementing them; I just thought it might
be useful to publish my findings so far, given that they seem to
relate to the problem described in this thread.
The "old" resource manager is beautifully lightweight, and does not
/require/ hundreds of megabytes of Python and XML libraries to
operate. I am working on keeping it lightweight so it can be used in
small systems. Wish me luck :)