[Linux-HA] [ANNOUNCE] Interim heartbeat packages refreshed (2.1.2-24)

Andrew Beekhof beekhof at gmail.com
Mon Nov 26 04:49:08 MST 2007


Just a quick note to say that the packages at
    http://software.opensuse.org/download/server:/ha-clustering
were refreshed today after passing automated testing sufficiently well
(see the pending bugs below).

This will be the last interim release for 2007.  I hope these releases
have been useful; we'll be back again in 2008.


The current version is 2.1.2-24.1 for .rpm and 2.1.2-24 for .deb.
Note: Debian and Ubuntu 7.10 packages are now also available for x64.

The source for this version can be obtained from:
   http://hg.beekhof.net/lha/crm-dev/archive/obs-2.1.2-24.tar.bz2


The most important crm-related additions in this release are:
- master placement and colocation
- throttling of the TE to avoid pushing heartbeat beyond its limits
- fix for an additional shutdown hang when crm_resource -C is used
See the change log below for a full list of changes.


Some stats on the changes since the previous build (obs-2.1.2-15):
* Statistics:
     Changesets:  104
     Diff:        296 files changed, 3464 insertions(+), 3275 deletions(-)

* Test hardware:
     + 4-node VMware cluster (sles10-sp1/256MB/vmware stonith) on a
       single host (opensuse10.3/2GB/2.66GHz Quad Core2)
     + 7-node EMC Centera cluster (sles10/512MB/2GHz Xeon/ssh stonith)

* All testing was performed with STONITH enabled

* Pending bugs encountered during testing:
     1750 - logd: ipc_bufpool_update: magic number in head does not match

Lazy paste from the change log...
   * Changes since 2.1.2-15
   + High: CIB: Fix the behavior of update_attr() and delete_attr() when
     the command is ambiguous
   + High: CRM: In ccm_have_quorum(), an OC_EV_MS_PRIMARY_RESTORED event
     means we _do_ have quorum
   + High: PE: Bug 1765 - Prevent master-master colocation constraints
     from preventing slaves from starting
   + High: PE: Introduce a new API call for determining the location of
     complex resources
   + High: PE: Set next_role recursively so that group promotion will
     work
   + High: PE: increment_clone() did not overflow correctly 9->10,
     99->100, etc
   + High: crmd: Prevent shutdown hangs caused by pending ops that can't
     be cancelled - because they no longer exist in the lrm
   + High: heartbeat: Fix getnodes() compilation with gcc 4.3
   + High: Novell 293922: Don't use non-blocking writers, but instead
     don't block
   + High: Novell 293922: Heartbeat network processes would block on
     full buffers.
   + Medium: core: Do not disconnect blocking write processes, but
     discard traffic.
   + Medium: PE: All other things being equal, prefer to keep non-failed
     instances alive
   + Medium: PE: Create a syntactic shortcut for the common use-case of
     "{resource} prefers {node} with {score}"
   + Medium: PE: Don't make changes to location constraints, when
     applying to groups, persistent
   + Medium: TE: Allow the CRM to limit the number of resource actions
     the TE can execute in parallel.
   + Medium: TE: Pending operations shouldn't be processed
   + Medium: attrd: Bug 1776 - Attrd doesn't exit or reconnect when the
     CIB is respawned
   + Medium: crm: Don't generate core files for non-fatal asserts (i.e.
     the CRM_CHECK macro).
   + Medium: crmd: Don't remap LRM_OP_PENDING when building full lrm
     updates
   + Medium: crmd: Prevent shutdowns initiated immediately after a node
     is removed from the cluster with hb_delnode from stalling.
   + Medium: LF bugzilla 1757: apache resource agent grep methodology
     can't handle newlines - even if you change the pattern
   + Low: Admin: Do a full simulation when we have the live CIB - since
     we'll always have a status section
   + Low: Admin: crm_standby - Don't complain about missing values,
     print the default value instead
   + Low: PE: Actions for stonith agents should never default to
     requiring fencing or quorum
   + Low: PE: ptest - Handle malformed inputs more gracefully
   + Low: Rename example ha_logd.cf file to the proper name.
   + Low: cib: Increase the retry interval when connecting to the ccm
     to 3s
   + Low: crmd: Provide a more informative error when DCs detect other
     DCs during a join
   + Low: heartbeat: Use the correct define when choosing to enable
     valgrind - Keisuke MORI
   + Low: LF bug 1705 - cl_respawn core dumps when given -h or --help
   + Low: changed configure.in to force 64-bit objects when on gcc-based
     ppc64 platform
   + Low: configure.in - autoconf spells ppc64 funny...
   * Unclassified user impact
   + LF 1766: Heartbeat installs man pages in the wrong directories by
     default...
   + LF Bug 1662: Put in some RHEL changes provided by Keisuke MORI
   + LF bug 1393 - Cannot rename /var/lib/heartbeat/hostcache.tmp to
     /var/lib/heartbeat/hostcache (scope, risk and severity: all minor)
   + ccm (LF 1546): ensure that the membership instance number never
     decrements (thanks to Guochun Shi)
   + ccm: speed up ccm considerably in case a node is alone in the
     membership (thanks to MATSUDA, Daiki)
   + Debian: Build: Add dependency on gawk as it is required by the OCF
     IPaddr resource
   + Debian: Build: rename ha_logd.cf to logd.cf in
     debian/heartbeat.files
   + hb_report (LF 1763): multiple fixes
   + hb_report: add support for more packagers (thanks to Sebastian
     Reitenbach)
   + hb_report: fix handling of debug option
   + hb_report: include package verification
   + hb_report: reduce the number of ssh invocations by 1
   + hb_report: update documentation on extracting CTS tests
   + hb_report: use getopts instead of getopt(1) for the sake of
     portability
   + hbagent: fix memory leaks on dropping global resources (thanks to
     Keisuke MORI)
   + hbagent: removed the const qualifier from parameter
   + hbagent: reset gMembershipTable on dropping global resources
   + hbclient: fix memory leaks on signon/signoff (thanks to Keisuke
     MORI)
   + heartbeat: Increase MAXMSGHIST; reduce debugging output.
   + ldirectord: allow per-virtual checkinterval configuration
   + lvs-users: Patch for ldirectord when using mysql service
   + mgmtd (LF 1719): fix the pam file for suse
   + mysql RA (LF 1760): defaults for OpenBSD (thanks to Sebastian
     Reitenbach)
   + RA Xinetd (LF 1742): multiple fixes
   + RA tomcat: allow consecutive starts
   + stonithd: (LF 1727): fix dropping privileges
   + stonithd: check for storage size when copying the host list
   + stonithd: drop root privileges earlier
   + stonithd: shuffle dropping privileges around again


