Stephen C. Tweedie
Wed, 5 Apr 2000 22:49:34 +0100
On Tue, Apr 04, 2000 at 09:12:43PM -0600, Alan Robertson wrote:
> It is reliable in that it *won't* provide bad data. In the proposal I
> described before, the machine would not come up automatically in the
> circumstances I believe you have described.
What prevents it?
> Although I do confess to
> not being quite sure exactly what you were describing.
5 machines: A,B,C,D,E. A drbd device replicated on A, B, C, and D.
The cluster partitions into (A,B,E) and (C,D). The first partition
has quorum. We then take another fault and repartition into
(A,B) and (C,D,E). The second partition now has quorum, and has
two copies of the drbd data which (as far as it is concerned) are
still recent, because neither C nor D has seen the new data on A or
B. However, that new quorate partition has stale data. Bad news.
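To make the failure concrete, here is a small Python sketch of that sequence of partitions. It assumes quorum means a strict majority of the five voters, and that each data copy carries a generation counter stored only on the nodes holding the data. The names `has_quorum`, `simulate_local_only` and the counter itself are illustrative, not drbd's actual mechanism.

```python
VOTERS = frozenset("ABCDE")      # five voting nodes
REPLICAS = frozenset("ABCD")     # nodes holding a copy of the drbd device


def has_quorum(partition):
    """Strict majority of all voters (an assumed quorum rule)."""
    return len(set(partition) & VOTERS) > len(VOTERS) // 2


def simulate_local_only():
    # Hypothetical generation counter, kept only next to each data copy.
    gen = {n: 1 for n in REPLICAS}

    first = {"A", "B", "E"}
    assert has_quorum(first)
    for n in first & REPLICAS:   # new writes reach A and B only
        gen[n] += 1

    second = {"C", "D", "E"}     # E has moved; this side now has quorum
    assert has_quorum(second)
    newest_seen = max(gen[n] for n in second & REPLICAS)
    newest_real = max(gen.values())
    return newest_seen, newest_real


seen, real = simulate_local_only()
print(f"(C,D,E) sees generation {seen}; the real latest is {real}")
```

Nothing inside (C,D,E) contradicts the copies on C and D, so the partition has no way to tell that they are stale.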
If you keep the updated sequence number for the device replicated
on all voting nodes, then in this scenario, the (A,B,E) partition
will record the new sequence on all nodes, including E; and in the
second half of the situation, the (C,D,E) partition knows that
neither C nor D is up to date, because it can see the incremented
sequence number held by E.
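A rough Python sketch of that idea follows. It assumes a strict-majority quorum rule and a single integer sequence number per device; the `Cluster` class, its fields and its method names are hypothetical, chosen only to illustrate the bookkeeping.

```python
VOTERS = frozenset("ABCDE")      # five voting nodes
REPLICAS = frozenset("ABCD")     # nodes holding a copy of the drbd device


def has_quorum(partition):
    """Strict majority of all voters (an assumed quorum rule)."""
    return len(set(partition) & VOTERS) > len(VOTERS) // 2


class Cluster:
    def __init__(self):
        # Sequence number of the data actually stored on each replica.
        self.data_seq = {n: 1 for n in REPLICAS}
        # Latest sequence number each voter has heard of, data or no data.
        self.known_seq = {n: 1 for n in VOTERS}

    def commit_write(self, partition):
        """Bump the sequence on every voter in a quorate partition."""
        if not has_quorum(partition):
            raise RuntimeError("no quorum")
        new_seq = max(self.known_seq[n] for n in partition) + 1
        for n in partition:
            self.known_seq[n] = new_seq
            if n in REPLICAS:
                self.data_seq[n] = new_seq
        return new_seq

    def usable_replicas(self, partition):
        """Replicas whose data matches the newest sequence known here."""
        if not has_quorum(partition):
            return set()
        latest = max(self.known_seq[n] for n in partition)
        return {n for n in partition
                if n in REPLICAS and self.data_seq[n] == latest}


cluster = Cluster()
cluster.commit_write({"A", "B", "E"})            # E records the new sequence
usable = cluster.usable_replicas({"C", "D", "E"})
print("usable replicas in (C,D,E):", sorted(usable))
```

Because E carries the higher sequence number into (C,D,E), that partition finds no usable copy and can refuse to bring the device up rather than serve stale data.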
> Hopefully this (CRACC) is a rare occurrence :-) A complete shutdown for
> some administrative reason is not too surprising, but hopefully a
> complete and catastrophic failure resulting in no partition having
> quorum happens only rarely.
Yes, but the scenario above, in which we have a partition and where
one node migrates from one partition to another taking quorum with
it, is not at all uncommon if you have dodgy ethernet cabling or
bridging. This scenario is just as bad as the complete cluster
reboot case if you don't allow the moving node to hold a sufficient
record of the cluster state of the last quorate partition it was a