[Linux-HA] order troubles

Lars Ellenberg lars.ellenberg at linbit.com
Thu Mar 22 03:34:28 MDT 2012


On Thu, Mar 22, 2012 at 10:16:49AM +0100, Roman Haefeli wrote:
> Hi all
> 
> I only started diving into linux-ha recently and am currently working on
> a test setup, where a 2-node cluster manages a bunch of OpenVZ
> containers. The Containers are running on a cloned nfs mount, which is
> also managed by the cluster. In normal operation mode, everything works
> fine. I can put one node to 'standby' and all resources are migrated to
> the remaining node. If I set it back to 'online', all resources with a
> location preference for the second node are moved back to their original
> node. So far, so good.
> 
> However, if nodeB is not running corosync and I start the corosync
> service on nodeB while it is already set to 'online' in the crm, it will
> try to start the resources (as if it were going from 'standby' to
> 'online'), but starting the containers then fails. The crm shows
> this message for each container that failed to start:
> 
> ---
> ve992_start_0 (node=nodeB, call=19, rc=1, status=complete): unknown error
> ---
> 
> In the corosync log (on Debian: /var/log/syslog), I find this message:
> 
> ---
> Mar 22 09:59:18 nodeB ManageVE[26793]: ERROR: Starting container ... Container is mounted Invalid kernel, or some kernel modules are not loaded Container start failed Container is unmounted
> ---
> 
> This is actually the error message printed by vzctl, when the service
> provided by /etc/init.d/vz is not running.
> 
> From what I understand, lsb:vz should already be running when starting
> the containers (ocf:heartbeat:ManageVE), according to the order
> constraints I put:
> 
> ---
> order o_nfs_before_vz 0: cl_fs_nfs cl_vz
> order o_vz_before_ve992 0: cl_vz ve992

A score of "0" is roughly equivalent to
"if you happen to plan to do both operations
 in the same transition, would you please consider
 doing them in this order, pretty please, if you see fit"

(no native speaker here, obviously, but you get the idea)

You want "inf" for order constraints that *must* be followed.

Many of your "0" scores below should be "inf".
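
For the two constraints quoted above, that could look roughly like this.
It is a sketch only: the constraint and resource names are taken from your
config, nothing here was tested against your cluster, and the "#" lines
are annotations for the reader (drop them if your crm shell does not
accept comments):

```
# mandatory ordering: cl_vz is started only after cl_fs_nfs,
# and ve992 is started only after cl_vz
order o_nfs_before_vz inf: cl_fs_nfs cl_vz
order o_vz_before_ve992 inf: cl_vz ve992
```

With "inf:" the ordering no longer depends on both starts happening to
land in the same transition.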

> ---
> 
> Shouldn't this ensure that the vz service is only started after nfs was
> started, and that the containers are only started _after_ vz was
> started? It seems to work for:
>  
>   'standby' -> 'online'
> 
> or:
> 
>   'offline' -> 'standby' -> 'online'
> 
> but not for:
> 
>   'offline' -> 'online'
> 
> This is on / with:
> * Debian 6.0.4
> * corosync 1.4.2 (from Debian backports)
> * pacemaker 1.0.9.1 (from Debian backports)
> 
> My current cib is attached.
> 
> Roman
> 
>  
> 

> node virtue4 \
>         attributes standby="off"
> node virtue5 \
>         attributes standby="off"
> primitive p_fs_nfs ocf:heartbeat:Filesystem \
>         params device="10.10.10.201:/vol/virtueprivate/virtueprivate" directory="/virtual" fstype="nfs" options="rsize=32767,wsize=32767" \
>         op monitor interval="30" OCF_CHECK_LEVEL="20"
> primitive p_sbd lsb:sbd \
>         op monitor interval="30"
> primitive p_vz lsb:vz \
>         op monitor interval="30"
> primitive stonith_sbd stonith:external/sbd \
>         params sbd_device="/dev/disk/by-id/scsi-360a98000572d4c73526f696553506366" \
>         meta target-role="Stopped"
> primitive ve1104 ocf:heartbeat:ManageVE \
>         params veid="1104" \
>         op monitor interval="30" \
>         op start interval="0" timeout="75s" \
>         op stop interval="0" timeout="75s" \
>         meta target-role="Started"
> primitive ve1105 ocf:heartbeat:ManageVE \
>         params veid="1105" \
>         op start interval="0" timeout="75s" \
>         op stop interval="0" timeout="75s" \
>         op monitor interval="30" \
>         meta allow-migrate="true" target-role="Started"
> primitive ve2010 ocf:heartbeat:ManageVE \
>         params veid="2010" \
>         op monitor interval="30" \
>         op start interval="0" timeout="75s" \
>         op stop interval="0" timeout="75s" \
>         meta allow-migrate="true" target-role="started"
> primitive ve2100 ocf:heartbeat:ManageVE \
>         params veid="2100" \
>         op start interval="0" timeout="75s" \
>         op stop interval="0" timeout="75s" \
>         op monitor interval="30" \
>         meta target-role="started"
> primitive ve2101 ocf:heartbeat:ManageVE \
>         params veid="2101" \
>         op start interval="0" timeout="75s" \
>         op stop interval="0" timeout="75s" \
>         op monitor interval="30" \
>         meta target-role="Started"
> primitive ve2102 ocf:heartbeat:ManageVE \
>         params veid="2102" \
>         op monitor interval="30" \
>         op start interval="0" timeout="75s" \
>         op stop interval="0" timeout="75s" \
>         meta target-role="Started"
> primitive ve991 ocf:heartbeat:ManageVE \
>         params veid="991" \
>         op monitor interval="30" \
>         op start interval="0" timeout="75s" \
>         op stop interval="0" timeout="75s" \
>         meta target-role="Started"
> primitive ve992 ocf:heartbeat:ManageVE \
>         params veid="992" \
>         op monitor interval="30" \
>         op start interval="0" timeout="75s" \
>         op stop interval="0" timeout="75s" \
>         meta allow-migrate="true" target-role="Started"
> clone cl_fs_nfs p_fs_nfs \
>         meta target-role="Started"
> clone cl_sbd p_sbd \
>         meta target-role="started"
> clone cl_vz p_vz \
>         meta target-role="Started"
> location cli-prefer-stonith_sbd stonith_sbd \
>         rule $id="cli-prefer-rule-stonith_sbd" inf: #uname eq virtue4
> location cli-prefer-ve1104 ve1104 \
>         rule $id="cli-prefer-rule-ve1104" inf: #uname eq virtue5
> location cli-prefer-ve1105 ve1105 \
>         rule $id="cli-prefer-rule-ve1105" inf: #uname eq virtue4
> location cli-prefer-ve2010 ve2010 \
>         rule $id="cli-prefer-rule-ve2010" inf: #uname eq virtue4
> location cli-prefer-ve2100 ve2100 \
>         rule $id="cli-prefer-rule-ve2100" inf: #uname eq virtue5
> location cli-prefer-ve2101 ve2101 \
>         rule $id="cli-prefer-rule-ve2101" inf: #uname eq virtue4
> location cli-prefer-ve2102 ve2102 \
>         rule $id="cli-prefer-rule-ve2102" inf: #uname eq virtue5
> location cli-prefer-ve991 ve991 \
>         rule $id="cli-prefer-rule-ve991" inf: #uname eq virtue4
> location cli-prefer-ve992 ve992 \
>         rule $id="cli-prefer-rule-ve992" inf: #uname eq virtue5

Hm... are you sure you want those inf location constraints to stick around?
Maybe rather use a slight, non-inf preference, and/or a
resource-stickiness?
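
Something along these lines, as a sketch: the constraint id
"l_prefer_ve992" and the scores 100 and 200 are made up for illustration,
so pick names and values that fit your setup. It assumes the existing
"cli-prefer-ve992" constraint has been removed first.

```
# hypothetical id and scores, for illustration only
# mild preference: ve992 would rather run on virtue5, but may run elsewhere
location l_prefer_ve992 ve992 100: virtue5
# stickiness above the preference: a running resource stays where it is
# instead of moving back as soon as virtue5 rejoins
rsc_defaults resource-stickiness="200"
```

If you do want automatic failback to the preferred node, keep the
stickiness below the location score (or leave it at 0).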

Below, all "0:" should be "inf:".

> order o_nfs_before_vz 0: cl_fs_nfs cl_vz
> order o_sbd_before_nfs 0: cl_sbd cl_fs_nfs
> order o_vz_before_ve1001 0: cl_vz ve991
> order o_vz_before_ve1002 0: cl_vz ve992
> order o_vz_before_ve1104 0: cl_vz ve1104
> order o_vz_before_ve1105 0: cl_vz ve1105
> order o_vz_before_ve2010 0: cl_vz ve2010
> order o_vz_before_ve2100 0: cl_vz ve2100
> order o_vz_before_ve2101 0: cl_vz ve2101
> order o_vz_before_ve2102 0: cl_vz ve2102
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
> 


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com


