[Linux-HA] Heartbeat doesn't start resources on proper node after DRBD syncing.

Andrew Beekhof beekhof at gmail.com
Mon Oct 8 01:06:10 MDT 2007


On 10/7/07, Michael <misha at onet.ru> wrote:
> On 10/5/07, Andrew Beekhof <beekhof at gmail.com> wrote:
> > either this is becoming a common theme or I was speaking to you
> > yesterday on IRC...
> Yeah, that was me you were talking to, thanks for your kindness ;)
>
> > the OCF agent that comes with heartbeat does not understand how to use
> > drbd8 (apparently the strings it looks for have changed)
> >
> > so you'd either need to fix the RA (and hopefully send us a patch) or
> > use the LSB script that comes with drbd.  not being a drbd user myself,
> > that's about as much info as I can offer.
>
> Well, I did some investigation on that topic, since drbd is a vital
> part of my system. The existing RA works pretty well with drbd8 (the
> only real issue I found is that it is not aware of the
> Unknown/TOO_LARGE and SyncTarget states).
> The failure to promote a master only happens while drbd is syncing a
> large amount of data (for example, about 1 GB).
> I tried to reproduce that without heartbeat at all, and what I found
> is that sometimes drbd refuses to become primary and answers something
> like this:
>
> tolstoy(7):/home/misha# drbdsetup /dev/drbd0 primary
> No response from the DRBD driver! Is the module loaded?
> State change failed: (0)unknown error.
>
> This happens while syncing is in progress, and the command succeeds
> only after a few tries. Sometimes the node gets promoted, but the
> other side still shows the old state.
>
> So it's strange to expect correct behaviour from the RA if drbd itself
> doesn't work correctly.
>
> Anyway, I have a question about how heartbeat behaves: imagine a
> situation where Node1 is coming up from shutdown and, while it was off,
> Node2 has changed data on its drbd resource, so they need to sync.
> On top of that, in cib.xml Node1 has +INFINITY for being master.
> While the nodes are syncing, the RA on Node1 will do crm_master -v 5 -l
> reboot, and the RA on Node2 will do crm_master -v 100 -l reboot.
> As I understand it, Node2 gets promoted, since 100>5.

right, as writing a value of 5 will replace the existing value of +INFINITY
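
To illustrate the overwrite, here is a toy Python model of the scores. It is only a sketch, not how the CRM is implemented: the names set_master_score and pick_master are hypothetical, and +INFINITY is modelled as 1000000 purely for illustration.

```python
# Toy model of per-node master preferences (not Heartbeat's real code).
INFINITY = 1_000_000  # stand-in for +INFINITY, chosen for illustration


def set_master_score(scores, node, value):
    """Mimic the effect of `crm_master -v VALUE`: the new value replaces the old one."""
    updated = dict(scores)
    updated[node] = value
    return updated


def pick_master(scores):
    """Return the node with the highest master score."""
    return max(scores, key=scores.get)


# Starting point from the question: node1 prefers to be master.
scores = {"node1": INFINITY, "node2": 0}
before = pick_master(scores)

# During the sync each RA writes its own value, replacing what was there.
scores = set_master_score(scores, "node1", 5)
scores = set_master_score(scores, "node2", 100)
after = pick_master(scores)

print(before, after)
```

The point is that the 5 does not get added to +INFINITY; it takes its place, so node2's 100 wins.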

> But the sync
> will finish at some point, so the question is: will heartbeat redo the
> promotion, this time on Node1?

only if the crm_master scores change again
or if you create a rsc_location constraint with role=Master
score=INFINITY for node1
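
A rough Python sketch of why such a constraint would help. It assumes that a constraint score and a crm_master score for the same node simply add up, with +INFINITY saturating; that is an assumption made for this example, as are the names add_scores and pick_master and the value 1000000 standing in for +INFINITY.

```python
# Toy model: constraint scores combined with crm_master scores.
# Assumption for illustration: scores add, and INFINITY saturates.
INFINITY = 1_000_000


def add_scores(a, b):
    """Add two scores, capping the result at INFINITY."""
    return min(INFINITY, a + b)


def pick_master(master_scores, constraint_scores):
    """Return the node with the highest combined score."""
    totals = {
        node: add_scores(score, constraint_scores.get(node, 0))
        for node, score in master_scores.items()
    }
    return max(totals, key=totals.get)


# Values written by the RAs while the nodes are syncing.
syncing = {"node1": 5, "node2": 100}

no_constraint = pick_master(syncing, {})
with_constraint = pick_master(syncing, {"node1": INFINITY})

print(no_constraint, with_constraint)
```

Under these assumptions the constraint keeps node1 preferred for the master role no matter what the RAs write during the sync.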

> Btw, yes, it's possible to set Primary on Node1 while it is syncing
> with Node2 (as I understand it, blocks that are marked dirty and
> requested during the sync will be fetched over the network), but it's
> hard because of the problem described above: drbd does not always want
> to become Primary, and things start to break :(