Release Policy (was: Re: [Linux-HA] 2.1.2 and failover of colocated resources)

Dejan Muhamedagic dejanmm at fastmail.fm
Wed Sep 19 12:18:03 MDT 2007


Hi,

On Wed, Sep 19, 2007 at 06:43:21PM +0200, Max Hofer wrote:
> On Wednesday 19 September 2007, Raoul Bhatia [IPAX] wrote:
> > hi,
> >
> > Andrew Beekhof wrote:
> >  > shortly I'll be releasing a revised implementation (including
> >  > documentation!) of colocation which will make it much more intuitive
> >  > and remove the need for hacks like symmetrical=true
> >  >
> >  > if anyone wants to try it sooner rather than later, grab the latest
> >  > from http://hg.linux-ha.org/dev and ping me for the current version of
> >  > the docs.
> >
> > i recently joined this mailing list and saw that there is a lot of
> > activity going on. after reading the previous statement, i wonder
> > what the release policy for linux-ha is like.
> >
> > will 2.1.3 include the new colocation implementation or will this become
> > available in 2.2.0? will i have to carefully test each "minor" release
> > (e.g. 2.1.3, 2.1.4, etc.) or is precaution taken (apart from the test
> > suite) that things won't break?
> >
> Just to make one thing clear - I would never ever upgrade to a new version
> before testing it extensively on a test system.
> 
> What I do on a new release in my test environment:
> I) Test new version with clean system configuration (on all nodes):
> - make cluster standby on all nodes
> - cibadmin -E
> - shutdown heartbeat on all nodes
> - manually remove CIB file on all nodes
> - upgrade to new release on all nodes
> - start heartbeat on all nodes
> - load the previous CIB configuration
> - now test

This is quite good. Perhaps you could write it up at linux-ha.org.
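
To make the ordering of the clean-configuration test easier to check, here is
a minimal sketch in Python that only builds the list of steps; it does not
run anything. The function name, the node names and the "cluster:" label are
made up for illustration, and running cibadmin -E once for the whole cluster
(rather than on each node) is an assumption.

```python
def clean_config_test_plan(nodes):
    """Return the ordered steps of the clean-configuration test (list I)."""
    plan = []

    def on_all(action):
        # Each per-node step is finished on every node before the next begins.
        plan.extend(f"{node}: {action}" for node in nodes)

    on_all("set node to standby")
    plan.append("cluster: cibadmin -E")  # assumed to be run once, not per node
    on_all("shut down heartbeat")
    on_all("remove CIB file")
    on_all("upgrade to new release")
    on_all("start heartbeat")
    plan.append("cluster: load the previous CIB configuration")
    plan.append("cluster: run tests")
    return plan


if __name__ == "__main__":
    # Hypothetical two-node cluster, for illustration only.
    for number, step in enumerate(clean_config_test_plan(["node1", "node2"]), start=1):
        print(f"{number:2d}. {step}")
```

Note that the list presupposes a saved copy of the previous CIB: once the
CIB has been erased and the files removed, there is nothing left on the
nodes to load it from.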

> After all my tests passed I'm at least sure the upgrade cycle had no impact on 
> the cluster.
> 
> NOTE: If you have to change CIB to get the old cluster behaviour your upgrade 
> process will be more painful.
> 
> II) Testing Upgrade Process:
> 
> 1) You did not have to change the CIB:
> 
> a) back to old version:
> - clean heartbeat (set all nodes standby, cibadmin -E, heartbeat stop, delete
> CIB file on all nodes, remove new heartbeat version)
> - install old heartbeat version on all nodes
> - start heartbeat on all nodes
> - load CIB
> 
> b) now perform this for all nodes (one after the other):
> - set the node to standby
> - upgrade to new heartbeat
> - activate node again
> 
> c) test again ---> if all tests passed "Hurray"
> 
> 2) Let's assume you had to change the CIB to get the same cluster behaviour
> as you had with the old heartbeat version.
> 
> Old CIB: cib-old.xml
> New CIB: cib-new.xml
> 
> Goal: based on the two CIBs, you have to find a way to upgrade heartbeat with
> the shortest downtime. This is not a trivial task: what you can and should do
> depends on the difference between the CIBs and is specific to your
> configuration.
> 
> 1.) standby-upgrade-activate process: this is the safest way to upgrade,
> but the cluster downtime may be several minutes
> - set one node standby
> - upgrade heartbeat on the standby node
> - set all nodes to standby (downtime starts as soon as the last node is in standby)
> - load cib-new.xml 
> - activate node with new heartbeat version (here you have the cluster up 
> again)
> - upgrade heartbeat on all other nodes
> 
> 2.) set node/resources to unmanaged mode: this assumes you have a small
> CIB delta and you really know what effect the change has
> (ptest is your friend for that).
> 
> So far I have always used option 1) because it seems to be more robust. I
> tried 2) a couple of times, but it sometimes ended with heartbeat dying on a
> subsequent stop (this may be fixed in newer heartbeat versions).

I'd actually opt for this one and so far haven't experienced
problems with it, though I haven't done it that often. The big
advantage is that the resources don't have to be moved around.
Please post the logs, etc, in case Heartbeat gives up.
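
For comparison, the ordering of option 1) (standby-upgrade-activate) can be
sketched as a small Python plan builder, which makes the downtime window
explicit. This is only an illustration of the list quoted above: the function
names, node names and the "cluster" label are hypothetical, and nothing here
talks to a real cluster.

```python
def standby_upgrade_activate_plan(nodes, first):
    """Return (target, action) steps for option 1), upgrading `first` up front."""
    if first not in nodes:
        raise ValueError(f"{first!r} is not one of the cluster nodes")
    others = [node for node in nodes if node != first]

    plan = [(first, "set node to standby"), (first, "upgrade heartbeat")]
    plan += [(node, "set node to standby") for node in others]
    plan.append(("cluster", "load cib-new.xml"))
    plan.append((first, "activate node"))
    plan += [(node, "upgrade heartbeat") for node in others]
    return plan


def downtime_window(plan):
    """Return (start, end) step indices: last standby through first activation."""
    start = max(i for i, (_, action) in enumerate(plan)
                if action == "set node to standby")
    end = next(i for i, (_, action) in enumerate(plan)
               if action == "activate node")
    return start, end


if __name__ == "__main__":
    # Hypothetical three-node cluster, for illustration only.
    plan = standby_upgrade_activate_plan(["node1", "node2", "node3"], "node1")
    start, end = downtime_window(plan)
    for index, (target, action) in enumerate(plan):
        marker = "  <- cluster down" if start <= index < end else ""
        print(f"{index}: {target}: {action}{marker}")
```

The sketch stops where the quoted list stops, with the remaining nodes
upgraded but still in standby. Only loading cib-new.xml falls inside the
downtime window; the first node's upgrade happens before it and the other
nodes' upgrades after it.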

Thanks.

Dejan

> kind regards Max


