[Linux-HA] Standby Node Refuses to Take Over
Steve Davies
davies147 at gmail.com
Tue Oct 5 04:47:37 MDT 2010
On 1 October 2010 16:59, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
> On Mon, Sep 27, 2010 at 09:43:37AM -0700, Robinson, Eric wrote:
>> The primary node hung and the applications became unresponsive, but DRBD
>> status was good and up to date on both nodes, so I did a hb_takeover on
>> the standby node. Following is all that appeared in the ha-debug.log on
>> the standby. (I could not see the log on the primary because I could not
>> login to it.)
>>
>> heartbeat[15853]: 2010/09/27_05:47:12 debug: }/*G_remove_client;*/
>> heartbeat[15853]: 2010/09/27_05:50:29 debug: StartNextRemoteRscReq() -
>> calling hook
>> heartbeat[15853]: 2010/09/27_05:50:29 debug: notify_world: invoking
>> harc: OLD status: active
>> heartbeat[15853]: 2010/09/27_05:50:29 debug: Process [hb_takeover]
>> started pid 23304
>> heartbeat[15853]: 2010/09/27_05:50:29 debug: Starting notify process
>> [hb_takeover]
>> heartbeat[23304]: 2010/09/27_05:50:29 debug: notify_world: setting
>> SIGCHLD Handler to SIG_DFL
>> heartbeat[23304]: 2010/09/27_05:50:29 debug: notify_world: Running harc
>> hb_takeover
>> harc[23304]: 2010/09/27_05:50:29 info: Running
>> /etc/ha.d/rc.d/hb_takeover hb_takeover
>> heartbeat[15853]: 2010/09/27_05:50:29 info: Managed hb_takeover process
>> 23304 exited with return code 0.
>> heartbeat[15853]: 2010/09/27_05:50:29 debug: RscMgmtProc 'hb_takeover'
>> exited code 0
>
> This is haresources mode, resource management model is simplistic.
>
> It thought it successfully took over, and marked itself as holding
> "all resources". hb_takeover was over very quickly, so possibly it
> thought it held all resources already, for whatever reason.
>
> Hm. Maybe it is even worse: hb_takeover is actually implemented as
> sending a "please shut down your resources" message to the other node,
> then waiting for its "thanks, I went standby on my resources please
> proceed" answer. So there is no "forceful takeover" here, only
> cooperative takeover, and if one refuses to cooperate, then nothing
> moves.
>
> I'm not sure what happens if A sent that takeover request,
> B is too busy to respond, then B finally dies, while A is still waiting
> for that standby message. Possibly a "node dead" event is not
> deemed good enough while waiting for a "I'm standby now" message?
>
> Probably exactly your situation.
>
>> I went so far as to turn off the primary, but the standby still never
>> took over. When I brought the power on the primary back up, it came up
>> secondary and I had to do a hb_takeover on it, but after that all was
>> well.
>
> The rebooted node joined the cluster, the still running node told it it
> held all resources, both thought there was nothing to do.
> Then you asked the rebooted node to take over, they both ran their
> scripts again, and this time actually started something.
>
> Seemingly a limitation of the simplistic haresources model.
>
Yes, the haresources model is simple. I have encountered the issue
above, and other similar issues.

So far I have discovered 3 failure situations, and I plan to work
around them (comments from more experienced HA users are welcome):

Fail 1) A node needs to go active, but this fails, so it attempts to
go back to slave. The RM does not record that it is no longer active
unless it can speak to the other node.
Solution 1) Really? I just stopped everything... Of course I should
no longer be active! I plan to have the RM record that I am inactive
even after the failed ha_standby request, or perhaps beforehand (I'll
add a timeout, I guess). This will have knock-on effects, which will
need chasing down :)
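
A minimal sketch of Solution 1, in Python purely for illustration (the
real change would live in heartbeat's C resource manager). The names
ResourceState, request_standby and send_standby_request are
hypothetical, not heartbeat APIs, and the timeout value is arbitrary.
The only point is that the local state becomes inactive whether or not
the peer answers in time:

```python
import time


class ResourceState:
    """Hypothetical local view of whether this node holds resources."""

    def __init__(self):
        self.active = True
        self.peer_acked = False

    def request_standby(self, send_standby_request, timeout=5.0, poll=0.05):
        """Go inactive locally, then wait (bounded) for the peer's answer.

        send_standby_request is an assumed callable that sends the request
        and returns a zero-argument function reporting whether the peer
        has answered yet.
        """
        # Resources were already stopped locally, so record that first
        # instead of waiting for the peer to confirm it.
        self.active = False
        has_answered = send_standby_request()
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if has_answered():
                self.peer_acked = True
                break
            time.sleep(poll)
        return self.peer_acked


# A peer that never answers (hung or dead) no longer leaves us "active".
state = ResourceState()
acked = state.request_standby(lambda: (lambda: False), timeout=0.2)
print(state.active, acked)
```

Recording the inactive state before waiting corresponds to the
"perhaps beforehand" variant; the knock-on effects mentioned above are
not modelled here.
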
Fail 2) Split-brain. This restarts the heartbeat daemons on both
nodes, and will kill a perfectly working node.
Solution 2) The restart is an understandable response, but sometimes
we can be more clever. I hope to add a F_SPLITBRAIN message that
includes a SETWEIGHT. This will then run an rc script on each node,
and allow the 2 nodes to fight it out. If that fails, then we'll do
the restart. The script in its simplest form can of course just do a
heartbeat daemon restart :)
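
A rough sketch of how the two nodes might "fight it out", again in
Python only for illustration. How each node computes its weight, and
the function name resolve_splitbrain, are assumptions; only the
F_SPLITBRAIN/SETWEIGHT idea and the fallback to a restart come from the
plan above:

```python
def resolve_splitbrain(my_weight, peer_weight):
    """Decide what this node does once a split-brain is detected.

    Returns "keep" (stay active), "release" (give up resources), or
    "restart" (fall back to restarting heartbeat, as happens today).
    """
    if my_weight > peer_weight:
        return "keep"
    if my_weight < peer_weight:
        return "release"
    # Equal weights: the fight has no winner, so fall back to the restart.
    return "restart"


# Each node runs the same rule with the roles swapped, so the outcomes
# are complementary: exactly one node keeps the resources.
node_a = resolve_splitbrain(my_weight=10, peer_weight=3)
node_b = resolve_splitbrain(my_weight=3, peer_weight=10)
tie = resolve_splitbrain(my_weight=5, peer_weight=5)
print(node_a, node_b, tie)
```

Because both sides apply the same deterministic rule to the same pair
of weights, they cannot both decide to keep the resources.
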
Fail 3) If 2 nodes get split but also get out of sync, the split brain
is not recognised, and when they reconnect, an "Active" message is
exchanged/logged, but ignored.
Solution 3) The "Active" message already causes a 'status' script to
run. I plan to extend this script to raise a split-brain alert when
appropriate, triggering the same resolution as in 2) above.
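
The check in Solution 3 could look something like the sketch below.
It is Python for illustration only; the real hook is an rc 'status'
script, and the names on_peer_status and raise_splitbrain, as well as
the lower-case status string, are made up for this example:

```python
def on_peer_status(local_active, peer_status, raise_splitbrain):
    """Hypothetical 'status' hook, called when the peer's status arrives.

    If the peer says it is active while we also believe we are active,
    both nodes hold resources, so raise the split-brain alert.
    """
    if local_active and peer_status == "active":
        raise_splitbrain()
        return True
    return False


alerts = []
# Both sides think they are active: this is the unrecognised split-brain.
detected = on_peer_status(True, "active", lambda: alerts.append("splitbrain"))
# We are standby and the peer is active: nothing to do.
normal = on_peer_status(False, "active", lambda: alerts.append("splitbrain"))
print(detected, normal, alerts)
```
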

Note that all of the above are theoretical solutions, and I do not know
when I might get round to implementing them. I just thought it might be
useful to publish my findings so far, given that they seem to relate to
this thread.

The "old" resource manager is beautifully lightweight, and does not
/require/ hundreds of megabytes of Python and XML libraries to
operate. I am working on keeping it lightweight so it can be used in
small systems. Wish me luck :)

Regards,
Steve