[Linux-HA] Heartbeat, DRBD, Named-chroot, Fedora Core 4

Alan Robertson alanr at unix.sh
Mon Jul 25 08:30:28 MDT 2005


Dave wrote:
> Hey folks,
> 
> I've got a strange situation for you; maybe someone can shed some
> light on it for me.
> 
> I've got a Fedora Core 4 cluster running DRBD and Heartbeat.  This
> cluster will be doing DNS for us, using BIND.  We're actually replacing
> a current cluster (which is on old, failing hardware) running BIND in
> the same environment.
> 
> So here's the trick.  BIND now runs in a chroot, and I have this chroot
> on a DRBD disk.
> 
> When I attempt to fail this cluster over, Heartbeat cannot unmount the
> DRBD disk, because it's still busy.  I've pinned it down to the /proc
> mount that is part of the named chroot setup.
> 
> This is what happens:
> =-=-=- - HA-log
> heartbeat: 2005/07/25_09:07:41 info: Running
> /usr/local/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop
> heartbeat: 2005/07/25_09:07:41 ERROR: Couldn't unmount /data
> heartbeat: 2005/07/25_09:07:41 ERROR: Return code 1 from
> /usr/local/etc/ha.d/resource.d/Filesystem
> heartbeat: 2005/07/25_09:07:41 CRIT: Resource STOP failure. Reboot required!
> heartbeat: 2005/07/25_09:07:41 CRIT: Killing heartbeat ungracefully!
> =-=-=-  HA-DEBUG
> heartbeat: 2005/07/25_09:07:41 debug: Starting
> /usr/local/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop
> /data:               25029c
> umount: /data: device is busy
> umount: /data: device is busy
> heartbeat: 2005/07/25_09:07:41 debug:
> /usr/local/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop
> done. RC=1
> =-=-=-
> 
> When it says "Reboot required", it really does reboot.  At that point,
> the secondary node has to wait for the deadtime to expire before it can
> take over the alias and start up its processes, which seems like days
> when it comes to DNS. ;)
> 
> Has anyone run into this?  Is there a way to get rid of this /proc
> mount in the chroot once the chrooted named is stopped?
> 
> Dave Mullen
> ISO New England

OK...

Is the /proc mount one of the resources in the resource group being
taken over?  If not, add it to the resource group after the /data mount
and ahead of BIND.  Heartbeat stops resources in the reverse of their
start order, so the chroot's /proc would then be unmounted before /data.

If the unmount of /proc fails, then you have a different problem to 
solve ;-)
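
To check which mounts are still sitting under /data before the stop
runs, something like the sketch below might help.  It is only an
illustration, not part of Heartbeat: the function name nested_mounts and
the sample chroot path /data/named/chroot/proc are assumptions (the
original post does not give the real chroot path).  It parses text in
the /proc/mounts format and lists every mount point below a parent,
deepest first, which is the order they would need to be unmounted in.

```python
def nested_mounts(mounts_text, parent):
    """Return mount points strictly below `parent`, deepest first.

    `mounts_text` is text in /proc/mounts format: one mount per line,
    with the mount point in the second whitespace-separated field.
    Hypothetical helper for illustration only.
    """
    parent = parent.rstrip("/") or "/"
    # Match on "parent/" so that /database is not treated as being under /data.
    prefix = parent if parent.endswith("/") else parent + "/"
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        if mount_point != parent and mount_point.startswith(prefix):
            found.append(mount_point)
    # Deeper paths contain more slashes; unmount those first.
    return sorted(found, key=lambda p: p.count("/"), reverse=True)


# Sample data only; the chroot path here is an assumed example.
SAMPLE = """\
/dev/drbd0 /data ext3 rw 0 0
proc /data/named/chroot/proc proc rw 0 0
proc /proc proc rw 0 0
"""

print(nested_mounts(SAMPLE, "/data"))
```

On a live node the same function could be fed the contents of
/proc/mounts instead of SAMPLE.  Anything it reports under /data has to
be unmounted (or listed as its own resource) before the Filesystem stop
for /data can succeed.  Note that /proc/mounts escapes spaces in paths
as \040, which this sketch does not decode.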



-- 
     Alan Robertson <alanr at unix.sh>

"Openness is the foundation and preservative of friendship...  Let me 
claim from you at all times your undisguised opinions." - William 
Wilberforce


