[Linux-HA] DRBD - split brain - and HA is happily migrating
Dominik Klein
dk at in-telegence.net
Wed Jan 2 01:58:28 MST 2008
> Thanks for your help. It looks like everything works as desired:
>
> (postgres-02) [~] ifconfig eth1 down
> (postgres-02) [~] cat /proc/drbd
> version: 8.2.1 (api:86/proto:86-87)
> GIT-hash: 318925802fc2638479ad090b73d7af45503dd184 build by tg at apache-01, 2007-12-29 17:37:25
> 0: cs:WFConnection st:Secondary/Unknown ds:Outdated/DUnknown C r---
> ns:60499852 nr:713732 dw:713732 dr:60499852 al:0 bm:3693 lo:0 pe:0 ua:0 ap:0
> resync: used:0/31 hits:3777806 misses:3724 starving:0 dirty:0 changed:3724
> act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
> (postgres-02) [~] ifconfig eth1 up
> (postgres-02) [~] cat /proc/drbd
> version: 8.2.1 (api:86/proto:86-87)
> GIT-hash: 318925802fc2638479ad090b73d7af45503dd184 build by tg at apache-01, 2007-12-29 17:37:25
> 0: cs:WFConnection st:Secondary/Unknown ds:Outdated/DUnknown C r---
> ns:60499852 nr:713732 dw:713732 dr:60499852 al:0 bm:3693 lo:0 pe:0 ua:0 ap:0
> resync: used:0/31 hits:3777806 misses:3724 starving:0 dirty:0 changed:3724
> act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
> (postgres-02) [~] cat /proc/drbd
> version: 8.2.1 (api:86/proto:86-87)
> GIT-hash: 318925802fc2638479ad090b73d7af45503dd184 build by tg at apache-01, 2007-12-29 17:37:25
> 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
> ns:60499852 nr:715292 dw:715292 dr:60499852 al:0 bm:3705 lo:0 pe:0 ua:0 ap:0
> resync: used:0/31 hits:3777942 misses:3736 starving:0 dirty:0 changed:3736
> act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
>
> When I put the crosslink link down, the disk on postgres-02 gets outdated, if I
> put it back on, it syncs in no-time.

You should be aware of one thing though:

If you have a DRBD split brain now and your primary crashes while the
nodes are still split, heartbeat will never be able to start your
resource on the secondary node, because the DRBD resource there is
outdated. Read: your resource will not run at all, not even with old
data.
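
To see whether a node is in that state, look at the ds: field in
/proc/drbd (the local disk state is the part before the slash). A
minimal sketch, assuming the 8.2-style /proc/drbd layout from the quoted
output and device minor 0; local_disk_state is a made-up helper name,
not something DRBD ships:

```shell
# Hypothetical helper: print the local disk state of DRBD minor 0.
# Reads /proc/drbd-style text on stdin and extracts the part of the
# "ds:" field before the slash (e.g. "Outdated" or "UpToDate").
local_disk_state() {
    sed -n 's/^ *0: .* ds:\([^/]*\)\/.*/\1/p'
}
```

Something like "local_disk_state < /proc/drbd" on the secondary would
then print Outdated while the link is down.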

You will have to manually do something like
"drbdadm -- --overwrite-data-of-peer primary $resource" to get the
device into primary state on an outdated, disconnected secondary.

When the crashed primary comes back, you need to run
"drbdadm -- --discard-my-data connect $resource" (on the crashed
primary) to get it in sync again. Heartbeat is not able to do that on
its own (which is good: it shouldn't know a way to force a device to
primary state).

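
As a rough sketch, the two manual steps could be wrapped like this.
RESOURCE, DRY_RUN and the function names are invented for illustration;
only the two drbdadm invocations come from the text above. It defaults
to printing the commands instead of running them, since both throw away
data on one side:

```shell
#!/bin/sh
# Illustrative names only: RESOURCE and DRY_RUN are not DRBD settings.
RESOURCE="${RESOURCE:-r0}"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the command, anything else = run it

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: on the outdated, disconnected secondary, force it to primary.
promote_outdated_secondary() {
    run drbdadm -- --overwrite-data-of-peer primary "$RESOURCE"
}

# Step 2: on the crashed primary once it is back, drop its own changes
# and reconnect so it resyncs from the node promoted in step 1.
rejoin_crashed_primary() {
    run drbdadm -- --discard-my-data connect "$RESOURCE"
}
```

Run step 1 only after you are sure the old primary is really down, and
step 2 only on the node whose data you are willing to lose.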
Regards
Dominik