[Linux-HA] [ANNOUNCE] Interim heartbeat packages refreshed (2.1.2-15)

Andrew Beekhof abeekhof at suse.de
Thu Oct 25 02:23:42 MDT 2007


Just a quick note to say that the packages at
     http://software.opensuse.org/download/server:/ha-clustering
were refreshed today after passing automated testing, apart from the
pending bugs listed below.


The most important crm-related additions in this release are:
  - the resolution of hanging shutdowns triggered by the use of  
crm_resource -C
  - the addition of hb_report which significantly eases the process  
of gathering data for issue reporting
  - further refinement of the colocation and ordering changes in the  
previous release


The current version is 2.1.2-15.1 for .rpm and 2.1.2-15 for .deb

The packages correspond to Heartbeat as of this revision:
    http://hg.linux-ha.org/dev/shortlog/c492f19cb583

The source for this version can be obtained from:
    http://hg.linux-ha.org/dev/archive/obs-2.1.2-15.tar.bz2


Some stats on the changes since the previous build (obs-2.1.2-4):
   * Statistics:
       Changesets:       158
       Diff:             173 files changed, 8712 insertions(+), 4787 deletions(-)


Lazy paste from the change log...

   * Test hardware:
     + 4-node VMware Server cluster (sles10-sp1/256MB) on a single
host (opensuse10.2/2GB/2.66GHz Quad Core2)
     + 6-node EMC Centera cluster (sles10/512MB/2GHz Xeon)
     + (lmb)    7-node Xen cluster on a single host
     + (debltc) 6-node ppc64 cluster


   * Pending bugs encountered during testing:
     1562 - heartbeat: hostcache file rename error at startup
     1737 - crmd: Inconsistent join state detected (Hard to  
reproduce, possible fix included)
     1750 - logd: ipc_bufpool_update: magic number in head does not  
match

   * Changes since 2.1.2-4
   + High: cib: G_main_del_IPC_Channel() doesn't like being called  
with a NULL pointer and crashes
   + High: crm: Bug 1749 - Make sure the diff-related digests contain  
the complete CIB, not just the first line
   + High: crmd: Prevent shutdown hangs by allowing the crmd to  
forget about pending actions for deleted resources
   + High: PE: Bug 1712 - Ensure mandatory ordering constraints can
cause complex resources to be shut down
   + High: PE: Clone colocation fixes
   + High: PE: Ensure that resources depending (by order) on a master  
are not promoted if no master is available
   + High: PE: Fix mandatory ordering with m/s resources
   + High: PE: Prevent an infinite pe/te loop when reprobing with  
resources in master mode
   + High: PE: Prevent use-of-NULL in crm_mon when date_expressions  
are used by ensuring that data_set->now is always set
   + High: PE: Prevent use-of-NULL when the admin creates a
colocation constraint with an empty group
   + High: PE: Relax an assumption regarding clones that is not true  
when they are unmanaged
   + High: PE: Remove an accidental commit that, when called, causes  
use-of-NULL
   + High: PE: Remove an errant call to exit() that prevented an
assert from being triggered
   + High: RA: Novell 329833 - dotted-quad netmask notation broken in  
findif
   + Medium: cib: Include a digest of the CIB that a diff was made from
and verify it when applying the diff
   + Medium: CRM: Re-evaluate appropriate IPC message queue lengths
and throttle IPC _clients_ that hit them
   + Medium: crmd: Bug 1737 - Inconsistent join state detected -  
Possible fix
   + Medium: PE: Bug 1722 - By default, exhibit the old
start-failures-are-fatal behaviour regardless of how
resource-failure-stickiness is set
   + Medium: PE: Enact a saner default for rsc_order.score
(s/0/INFINITY)
   + Medium: PE: Fix minor memory leak when stopping orphaned resources
   + Medium: PE: Master internal ordering enhancements. Added
stopped->start, stopped->promote.
   + Medium: Tools: Bug 1738 - the crmd ignores some requests from  
crm_resource because it exited too quickly
   + Low: contrib: dopd - Remove a pointless and annoying dependency
on the crm
   + Low: crmd: Increase the retry interval when stalling the FSA to  
2s (used when connecting to the ccm, lrmd, cib)
   + Low: crmd: Update the CIB with the node's version data when it  
becomes DC
   + Low: crmd: Use the unaltered rc for timed out operation events  
from the lrmd
   + Low: Debian: heartbeat doesn't depend on libperl-dev (probably  
confused by '-lperl' in configure.in)
   + Low: heartbeat: Some compilers feel that the fcli variable could  
be used uninitialized - shut them up
   + Low: PE: Avoid pointless copying of XML items (actions) during  
graph creation
   + Low: PE: Convert all the resource's boolean flags into a single  
bit-field
   + Low: PE: Produce a config error when a clone contains more than  
one resource/group to clone
   + Low: RA: PureFTPd - Support debian's pure-ftpd-wrapper script.   
Patch by Raoul Bhatia
   + Low: Stonith: external/rackpdu - Fix the metadata short  
descriptions
   + Low: ccm: Bug 1723 - fix logging (thanks to MATSUDA, Daiki)
   * Unclassified user impact
   + gui: Add "meta_attributes" support to mgmt
   + gui: Add boolean_op for location rules in mgmt
   + gui: Make several fields of rsc_location editable
   + gui: Prevent clone and master_slave from belonging to group
   + gui: Prevent haclient from falling into an error caused by blank  
resource metadata
   + gui: Provide "Default" setting for crm configurations
   + gui: Provide complete pengine and crmd configurations with  
dynamic rendering
   + gui: Rename "Places" to "Locations" to match with the CIB name
   + gui: Resolve target_role problem for sub-resource
   + gui: Specify the default type of added item to correspond to the  
object that the cursor focused on
   + heartbeat: Remove one unlink() of the temporary hostcache file.
   + LF bug 1502: udpport statement in ha.cf is ignored by heartbeat  
- trivial fix
   + LF bug 1589: ia64 heartbeat unaligned access messages  
(SGI965396) - copy fields in structures - minor fix
   + LF bug 1679: hardcoded heartbeat userid and group in init script  
(minor fix)
   + LF bug 1681: BSC fails to identify active interface on
OpenBSD (minor change)
   + LF bug 1700: HA Signon always reports success - fix has minor  
scope.
   + LF bug 1702: during emergency restarts, heartbeat doesn't close  
watchdog correctly.  Fix is minor.
   + LF bug 1712: ha_logd prevented node from shutting down - changed  
code to use the waitout() primitive - scope fix - trivial-minor
   + LF bug 1731: wrong location of libraries for OpenBSD 64Bit  
architectures, with patch
   + LF bug 1734 - removed duplicate return statement as per bug report.
   + lrm stonith plugin: fix exit code handling
   + lrmd (LF 1715): increase the max dispatch time for lrmd
   + lrmd (LF 1729): revise return codes
   + Put in a missing include of <memory.h> for the DRBD code.   
Change scope: trivial
   + RA: mysql, pgsql - use getent(1) instead of /etc/passwd (thanks  
to Raoul Bhatia)
   + RA: mysql: According to Geoff Harrison the sql check is broken  
without this
   + RA: mysql: Allow arbitrary commandline arguments for mysqld (by  
Raoul Bhatia)
   + RA: Raid1: Allow the homehost setting to be specified.
   + RA: Xinetd: Fix stop/monitor to not fail if service isn't  
available yet.
   + stonith/ibmrsa: allow ',' in the hostlist
   + STONITH: external/ssh: Disable StrictHostKeyChecking and  
PasswordAuthentication by default.
   + stonithd (LF 1714,1727): fix a race condition
   + stonithd (LF 1726): a hostlist may be empty
   + stonithd (LF 1727): change attach shmem seg to read only
   + stonithd/lib: replace cookie generation with uuid
   + stonithd: have -a really mean startup alone
   + stonithd: increase the maxdispatchtime
   + stonithd: library code review and cleanup
   + tools: Add hb_report - a reporting utility
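
For anyone wondering what the two digest-related cib/crm entries above
amount to, here is a rough sketch of the idea. It is not the actual cib
code (which is C): the function names, the dict-based "diff" and the
choice of MD5 are all assumptions made purely for illustration. The
point is that the digest covers the complete CIB text, and that a diff
is only applied to the CIB it was made from.

```python
import hashlib


def cib_digest(cib_xml: str) -> str:
    """Digest of the complete serialized CIB, not just its first line.

    MD5 is an assumption for this sketch, not a statement about the
    hash the real cib uses.
    """
    return hashlib.md5(cib_xml.encode("utf-8")).hexdigest()


def make_diff(old_cib: str, new_cib: str) -> dict:
    """Hypothetical diff: records the digest of the CIB it was made from.

    A real diff would carry only the changes; carrying the whole target
    keeps this sketch short.
    """
    return {"source_digest": cib_digest(old_cib), "target": new_cib}


def apply_diff(current_cib: str, diff: dict) -> str:
    """Apply the diff only if it was made from the CIB we currently hold."""
    if cib_digest(current_cib) != diff["source_digest"]:
        raise ValueError("diff was made from a different CIB, not applying")
    return diff["target"]
```

Two CIBs that share the same first line but differ further down get
different digests here, which is the case Bug 1749 is about.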


