[Linux-HA] Forcing node to rejoin cluster after Split-Brain
Andrew Beekhof
beekhof at gmail.com
Wed Nov 1 07:56:57 MST 2006
On 10/31/06, Max Hofer <max.hofer at apus.co.at> wrote:
> OK, it should not happen, but sometimes it does (especially
> during testing periods). A split-brain occurred on one of my 4 test nodes.
> (The network went down for a couple of minutes and the dead-time kicked in.
> Somehow the three other nodes managed to rejoin (or maybe never left
> the group), but management1 failed to rejoin the cluster.)
>
> crm_mon shows me:
>
> Node: routing2 (c5e1bda1-b00b-42e8-89e1-702b2d715c76): online
> Node: routing1 (a98d68fb-807a-4b22-af3c-82e60064aa95): online
> Node: management2 (a69b64a2-4de8-4d4a-b4ba-8107136eec4b): online
> Node: management1 (0044b88e-c148-4269-9d39-449324bf65b8): OFFLINE
>
> heartbeat is running on management1, and the ha-log on that machine shows:
>
> crmd[1185]: 2006/10/31_13:22:42 WARN: do_dc_join_finalize:join_dc.c join-4: We are still in a transition. Delaying until the TE completes.
> crmd[1185]: 2006/10/31_13:22:46 WARN: do_dc_join_finalize:join_dc.c join-4: We are still in a transition. Delaying until the TE completes.
> crmd[1185]: 2006/10/31_13:22:48 WARN: do_dc_join_finalize:join_dc.c join-4: We are still in a transition. Delaying until the TE completes.
> crmd[1185]: 2006/10/31_13:22:48 WARN: do_dc_join_finalize:join_dc.c join-4: We are still in a transition. Delaying until the TE completes.
> crmd[1185]: 2006/10/31_13:22:51 WARN: do_dc_join_finalize:join_dc.c join-4: We are still in a transition. Delaying until the TE completes.
> crmd[1185]: 2006/10/31_13:22:53 WARN: do_dc_join_finalize:join_dc.c join-4: We are still in a transition. Delaying until the TE completes.
>
> Is there a "simple way" to tell node "management1" to rejoin the cluster without
> shutting down heartbeat (rebooting) on that machine?
Which version are you running?
It will probably also depend on why the node is not part of the cluster.