[Linux-HA] Heartbeat, DRBD, Named-chroot, Fedora Core 4

Dave ghaniba at gmail.com
Mon Jul 25 09:01:11 MDT 2005


Heh, sure enough it was my mounted chroot proc... I found that if I just
tweaked the named init script to actually unmount its chroot proc
mount, it worked like a charm...  and of course I found this about 13
seconds after I had clicked send on my mail to this list! /sigh

Thanks for the feedback, it was 100% on target.  I didn't know I could
unmount that proc it made... and it didn't even list it in a 'df' -
Thanks again!
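
For anyone who finds this in the archives, here is a minimal sketch of
that kind of init-script tweak. It is not the actual Fedora named script:
the chroot path /data/named/chroot and both function names are made up
for illustration. It reads the kernel's mount table rather than df
output, on the assumption that a mount missing from df can still show
up in /proc/mounts.

```shell
#!/bin/sh
# Hypothetical chroot location on the DRBD-backed filesystem (an assumption).
CHROOT_DIR="${CHROOT_DIR:-/data/named/chroot}"

# Print every mount point at or below directory $1, children before
# parents, from a /proc/mounts-style table ($2, default /proc/mounts).
# Mount points containing spaces are not handled in this sketch.
mounts_under() {
    dir="${1%/}"
    table="${2:-/proc/mounts}"
    awk -v d="$dir" '$2 == d || index($2, d "/") == 1 { print $2 }' "$table" |
        LC_ALL=C sort -r
}

# Unmount whatever is still mounted inside the chroot (for example its
# proc), so the filesystem holding the chroot can be unmounted later.
release_chroot_mounts() {
    for mp in $(mounts_under "$1" "$2"); do
        umount "$mp" || return 1
    done
}

# Intended use, in the stop branch of the init script after named exits:
#   release_chroot_mounts "$CHROOT_DIR"
```

With something like this in the stop path, the chroot's proc is gone
before Heartbeat's Filesystem resource tries to unmount /data.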

Dave
Iso New England

On 7/25/05, Alan Robertson <alanr at unix.sh> wrote:
> Dave wrote:
> > Hey folks,
> >
> > I've got a strange situation for you; maybe someone can shed some
> > light on it for me.
> >
> > I've got a Fedora 4 cluster, running DRBD & Heartbeat.  This cluster
> > will be doing dns for us, using Bind.  We're actually replacing a
> > current cluster (which is on old and busted hardware) running Bind in the
> > same environment.
> >
> > So here's the trick.  Now Bind is in a Chroot.  I have this chroot
> > under a drbd disk.
> >
> > When I attempt to fail this cluster over, it cannot unmount the drbd
> > disk, because it's still active.  I've got it pinned down to the /proc
> > that is part of the named chroot setup.
> >
> > This is what happens:
> > =-=-=- - HA-log
> > heartbeat: 2005/07/25_09:07:41 info: Running
> > /usr/local/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop
> > heartbeat: 2005/07/25_09:07:41 ERROR: Couldn't unmount /data
> > heartbeat: 2005/07/25_09:07:41 ERROR: Return code 1 from
> > /usr/local/etc/ha.d/resource.d/Filesystem
> > heartbeat: 2005/07/25_09:07:41 CRIT: Resource STOP failure. Reboot required!
> > heartbeat: 2005/07/25_09:07:41 CRIT: Killing heartbeat ungracefully!
> > =-=-=-  HA-DEBUG
> > heartbeat: 2005/07/25_09:07:41 debug: Starting
> > /usr/local/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop
> > /data:               25029c
> > umount: /data: device is busy
> > umount: /data: device is busy
> > heartbeat: 2005/07/25_09:07:41 debug:
> > /usr/local/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop
> > done. RC=1
> > =-=-=-
> >
> > When it says Reboot required.... it does it.  At that point, the
> > secondary node has to wait for the deadtime to expire before it can
> > take over the alias and start up its processes, which seems like days
> > when it comes to dns. ;)
> >
> > Has anyone run into this?  Is there a way to kill this /proc in the
> > chroot once the chroot is stopped?
> >
> > Dave Mullen
> > Iso New England
> 
> OK...
> 
> Is the /proc mount part of the resources in the resource group being
> taken over?  If not, then add the /proc mount to the resource group
> after mounting the /data partition and ahead of starting bind, etc.
> 
> If the unmount of /proc fails, then you have a different problem to
> solve ;-)
> 
> 
> 
> --
>      Alan Robertson <alanr at unix.sh>
> 
> "Openness is the foundation and preservative of friendship...  Let me
> claim from you at all times your undisguised opinions." - William
> Wilberforce
>


