[Linux-HA] Resend: Stopped working between 8c5da9553636 and 2.1.2-4

Andrew Beekhof beekhof at gmail.com
Tue Oct 9 02:35:08 MDT 2007


On 10/9/07, Andrew Beekhof <beekhof at gmail.com> wrote:
> On 10/9/07, David S. Madole <david at madole.net> wrote:
> > I notice that the body of my message went missing on my last post (maybe because I accidentally sent HTML) -- here is another attempt:
> >
> > I had been running on 8c5da9553636 which happened to be the latest when I grabbed it on 9/23, and I just upgraded to 2.1.2-4 and now my configuration doesn't work.
> >
> > Only one of my two DRBD master/slaves becomes active on either node, and none of the other dependent resources come up even after the one DRBD does become active. It just seems like it's stuck.
> >
> > Did something break or change in between, or was my configuration always defective and it just happened to work under the older snapshot?
> >
> > cibadmin -Q > cib.xml is attached. Any insight would be appreciated.
>
> The relevant change is likely to be this one:
>    http://hg.beekhof.net/lha/crm-dev/rev/702e4f418ca8
>
> Take this constraint in your config:
>           <rsc_order id="cyrus_drbd_after_address" from="cyrus_drbd" action="promote" type="after" to="cyrus_address" to_action="start"/>
>
> Prior to the change, if cyrus_address couldn't be started then
> cyrus_drbd would still be promoted.  To prevent cyrus_drbd from
> starting in such situations one had to add score=INFINITY to the
> constraint.
>
> We decided that this was counterintuitive and changed the default.
>
> btw. are you sure you don't mean the other way around?  promote drbd
> then add the address?
>
> Resource ordering is described in detail here:
>    http://oss.beekhof.net/~beekhof/heartbeat/docs/Ordering-Explained.pdf
>
>
> Now, in your case named_address is running (and needs to be moved) but
> named_drbd_node is not master anywhere, which currently means it
> can't be demoted, which in turn means named_address can't be stopped
> (which is bad).
>
> Most people have it the other way around (as suggested above) so there
> is no problem, but this should also work.  I need to think a little on
> how to fix this....
>

Fixed in http://hg.beekhof.net/lha/crm-dev/rev/4d8f7cfb7188
(with a regression test added to prevent the problem from recurring)
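
For reference, the two variants discussed above might look roughly like the
sketch below. It reuses only the attribute names that appear in the quoted
constraint, plus the score attribute mentioned in the text. The second id
(cyrus_address_after_drbd) is hypothetical, and the exact semantics depend on
which snapshot or release is running, so treat this as an illustration rather
than a tested configuration.

```xml
<!-- Variant 1: the original direction (promote cyrus_drbd after cyrus_address
     has started), with the score stated explicitly. Per the explanation above,
     older snapshots needed score="INFINITY" to make the ordering mandatory;
     after the change this is the default. -->
<rsc_order id="cyrus_drbd_after_address" from="cyrus_drbd" action="promote"
           type="after" to="cyrus_address" to_action="start" score="INFINITY"/>

<!-- Variant 2: the more common direction suggested above (start the address
     only after cyrus_drbd has been promoted). The id is a made-up name. -->
<rsc_order id="cyrus_address_after_drbd" from="cyrus_address" action="start"
           type="after" to="cyrus_drbd" to_action="promote"/>
```

Only one of the two directions would normally be present for a given pair of
resources; having both would make each action wait on the other.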

