Heartbeat with dual SCSI config
Alan Robertson
alanr@unix.sh
Tue, 05 Mar 2002 14:47:33 -0700
Hi Roberto,
I've CCed this reply to the linux-ha mailing list.
Roberto Zini wrote:
> Hi Alan !
>
> In the past few days I tried your heartbeat solution on a couple of
> Linux boxes and so far I was impressed by the results I got !
Thanks!
> I'm trying to use the HB (heartbeat) solution to allow a given
> process (eg, Apache running several CGI scripts) to operate on a
> SCSI disk which is shared between the above 2 boxes. Just to provide
> you with some numbers, I'm using a couple of Adaptec 29160 HA (configured as
> ID=14 on the first box and ID=13 on the second one) whose secondary bus is
> connected to an external SCSI disk configured as ID=5 (the primary SCSI
> bus is being used for the boot/root disk).
>
> Let me preface that I'm neither a SCSI expert nor a Linux one but in
> my tests I've seen that both boxes are able to access the same
> shared HD (which has been prepared with a Linux partition) at the same
> time (ie, they can "mount" it without problems) which can lead to data
> corruption if both OSes try to write the same chunk of data.
>
> I'm wondering if there is an HW/SW solution which prevents the "failover"
> box (from the HB point of view) from mounting the external disk when it's
> already being mounted by the primary box.
You could write a resource script which removes the /dev entry when the
other side has the disk mounted. Or you could do the equivalent at the
kernel level like this:
    echo "scsi remove-single-device A 0 D 0" > /proc/scsi/scsi
to make the kernel believe the device is gone, and then do
    echo "scsi add-single-device A 0 D 0" > /proc/scsi/scsi
to make it come back just before takeover.
Maybe someone should write such a resource script...
In this case A is the number of the SCSI adapter (host) according to
/proc/scsi/scsi, and D is the logical drive number (I think this is the
SCSI target ID). The four fields are host, channel, ID and LUN, in that order.
Thanks to the IBM ServeRAID team for this cool tip.
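As a starting point, a resource script along these lines might do it. This
is only a rough sketch: the name "scsidisk", the argument order (HOST ID
action) and the SCSI_PROC override are all made up for illustration, and it
assumes channel 0 and LUN 0 as in the commands above. It writes the command
in the form "scsi remove-single-device host channel id lun", which is the
spelling the kernel's /proc/scsi/scsi interface accepts.

```shell
#!/bin/sh
# Hypothetical heartbeat resource script sketch: hide or re-add a shared
# SCSI disk.  Usage: scsidisk HOST ID {start|stop}
# SCSI_PROC can be overridden to try the script against a scratch file.
SCSI_PROC="${SCSI_PROC:-/proc/scsi/scsi}"

# scsi_cmd VERB HOST ID: write "scsi VERB HOST 0 ID 0" to the proc file
# (channel 0 and LUN 0 are assumptions).
scsi_cmd() {
    echo "scsi $1 $2 0 $3 0" > "$SCSI_PROC"
}

scsidisk() {
    host="$1"; id="$2"; action="$3"
    case "$action" in
        start) scsi_cmd add-single-device "$host" "$id" ;;     # takeover
        stop)  scsi_cmd remove-single-device "$host" "$id" ;;  # release
        *)     echo "usage: scsidisk HOST ID {start|stop}" >&2
               return 1 ;;
    esac
}

# Dispatch only when called with arguments, e.g. "scsidisk 1 5 start".
if [ $# -gt 0 ]; then
    scsidisk "$@"
fi
```

If I remember the haresources conventions right, an entry such as
scsidisk::1::5 placed ahead of the filesystem resource would call the
script as "scsidisk 1 5 start" on takeover and "scsidisk 1 5 stop" on
release. The standby node would presumably also need to run the "stop"
action once at boot so that it starts out without the disk.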
But, you should still use a STONITH device to *GUARANTEE* that the other
machine isn't using the disk. A STONITH (Shoot The Other Node In The Head)
device is a smart power switch that lets one node power off the other.
Heartbeat supports around 8-10 types of STONITH devices.
-- Alan Robertson
alanr@unix.sh