Release Policy (was: Re: [Linux-HA] 2.1.2 and failover of colocated resources)
dejanmm at fastmail.fm
Wed Sep 19 12:18:03 MDT 2007
On Wed, Sep 19, 2007 at 06:43:21PM +0200, Max Hofer wrote:
> On Wednesday 19 September 2007, Raoul Bhatia [IPAX] wrote:
> > hi,
> > Andrew Beekhof wrote:
> > > shortly I'll be releasing a revised implementation (including
> > > documentation!) of colocation which will make it much more intuitive
> > > and remove the need for hacks like symmetrical=true
> > >
> > > if anyone wants to try it sooner rather than later, grab the latest
> > > from http://hg.linux-ha.org/dev and ping me for the current version of
> > > the docs.
> > i recently joined this mailinglist and saw that there is a lot of
> > activity going on. after reading the previous statement, i wonder
> > what the release policy for linux-ha is like.
> > will 2.1.3 include the new colocation implementation or will this become
> > available in 2.2.0? will i have to carefully test each "minor" release
> > (e.g. 2.1.3, 2.1.4, etc.) or is precaution taken (apart from the test
> > suite) that things won't break?
> Just to make one thing clear - I would never ever upgrade to a new version
> before testing it extensively on a test system.
> What I do for a new release in my test environment:
> I.) Test new version with clean system configuration (on all nodes):
> - make cluster standby on all nodes
> - cibadmin -E
> - shutdown heartbeat on all nodes
> - manually remove CIB file on all nodes
> - upgrade to new release on all nodes
> - start heartbeat on all nodes
> - load the previous CIB configuration
> - now test
This is quite good. Perhaps you could write it up on linux-ha.org.
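
The clean-configuration test quoted above can be written down as an
ordered checklist. What follows is only a minimal Python sketch: the
function name clean_upgrade_plan, the node names and the action labels
are made up for illustration, and only "cibadmin -E" is taken literally
from the post. It does not run anything on a cluster, it just produces
the order of the steps.

```python
from typing import List, Tuple

# (target, action): target is a node name or "cluster" for cluster-wide steps.
Step = Tuple[str, str]


def clean_upgrade_plan(nodes: List[str]) -> List[Step]:
    """Return the steps of the clean-configuration test, in order.

    Each per-node step is completed on all nodes before the next step
    starts, as in the quoted procedure. Action labels are descriptive
    placeholders, not real commands.
    """
    plan: List[Step] = []
    plan += [(n, "set standby") for n in nodes]
    # The only literal command, taken from the post itself.
    plan.append(("cluster", "cibadmin -E"))
    plan += [(n, "stop heartbeat") for n in nodes]
    plan += [(n, "remove CIB file") for n in nodes]
    plan += [(n, "install new release") for n in nodes]
    plan += [(n, "start heartbeat") for n in nodes]
    plan.append(("cluster", "load previous CIB"))
    plan.append(("cluster", "run tests"))
    return plan


if __name__ == "__main__":
    # Hypothetical two-node cluster.
    for target, action in clean_upgrade_plan(["node1", "node2"]):
        print(f"{target}: {action}")
```

Keeping the plan as plain data makes it easy to review the order before
touching a real test cluster.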
> After all my tests passed I'm at least sure the upgrade cycle had no impact on
> the cluster.
> NOTE: If you have to change the CIB to get the old cluster behaviour, your
> upgrade process will be more painful.
> II) Testing Upgrade Process:
> 1) You did not have to change the CIB:
> a) back to old version:
> - clean heartbeat (set all nodes standby, cibadmin -E, heartbeat stop, delete
> CIB file on all nodes, remove new heartbeat version)
> - install old heartbeat version on all nodes
> - start heartbeat on all nodes
> - load CIB
> b) now perform this for all nodes (one after the other):
> - set cluster to standby
> - upgrade to new heartbeat
> - activate node again
> c) test again ---> if all tests passed "Hurray"
> 2.) Let's assume you had to change the CIB to get the same cluster behaviour
> as you had with the old heartbeat version.
> Old CIB: cib-old.xml
> New CIB: cib-new.xml
> Goal: based on the two CIBs you have to find a way to upgrade
> heartbeat with the shortest downtime. This is not a trivial task because
> what you can and should do really depends on the difference between the
> CIBs and is specific to your configuration.
> 1.) standby-upgrade-activate process: this is the safest way to upgrade
> but the cluster downtime may be several minutes
> - set one node standby
> - upgrade heartbeat on the standby node
> - set all nodes to standby (downtime starts as soon as the last node is
> in standby)
> - load cib-new.xml
> - activate node with new heartbeat version (here you have the cluster up
> again)
> - upgrade heartbeat on all other nodes
> 2.) set node/resources to unmanaged mode: this means you have a small
> cib-delta-change and you really know what kind of effect the change has
> (ptest is your friend for that).
> So far I have always used option 1) because it seems to be more robust. I
> tried 2) a couple of times, but it sometimes ended with heartbeat dying on
> a subsequent stop (but this may be fixed in newer heartbeat versions).
I'd actually opt for this one and so far haven't experienced
problems with it, though I haven't done it very often. The big
advantage is that the resources don't have to be moved around.
Please post the logs, etc, in case Heartbeat gives up.
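
For comparison, the standby-upgrade-activate option quoted above can be
sketched the same way, including where the downtime window falls. This is
a rough Python sketch only: the function names, node names and action
labels are hypothetical, and only cib-new.xml comes from the post. The
post stops after upgrading the remaining nodes, so the sketch stops there
too.

```python
from typing import List, Tuple

# (target, action): target is a node name or "cluster" for cluster-wide steps.
Step = Tuple[str, str]


def standby_upgrade_activate(nodes: List[str]) -> List[Step]:
    """Return the steps of the standby-upgrade-activate option, in order."""
    if len(nodes) < 2:
        raise ValueError("need at least two nodes")
    first, rest = nodes[0], nodes[1:]
    plan: List[Step] = [
        (first, "set standby"),
        (first, "upgrade heartbeat"),
    ]
    # Downtime starts once the last remaining node is in standby.
    plan += [(n, "set standby") for n in rest]
    plan.append(("cluster", "load cib-new.xml"))
    # The cluster is up again once the upgraded node is activated.
    plan.append((first, "activate"))
    plan += [(n, "upgrade heartbeat") for n in rest]
    return plan


def downtime_window(plan: List[Step]) -> Tuple[int, int]:
    """Return (start, end) indices of the steps that bracket the downtime."""
    standby = [i for i, (_, action) in enumerate(plan) if action == "set standby"]
    activate = [i for i, (_, action) in enumerate(plan) if action == "activate"]
    return standby[-1], activate[0]


if __name__ == "__main__":
    # Hypothetical three-node cluster.
    plan = standby_upgrade_activate(["node1", "node2", "node3"])
    start, end = downtime_window(plan)
    for i, (target, action) in enumerate(plan):
        marker = " (cluster down)" if start <= i < end else ""
        print(f"{i}: {target}: {action}{marker}")
```

The window covers only the last standby and the CIB load, which is why
the downtime stays short even though every node is eventually upgraded.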
> kind regards Max