[Linux-HA] Is this the right config?

Dejan Muhamedagic dejanmm at fastmail.fm
Mon Oct 1 05:24:22 MDT 2007


Hi,

On Sun, Sep 30, 2007 at 05:37:10PM -0700, Kelly Byrd wrote:
> I'm running 2.1.2 on two nodes. I want heartbeat to manage 22 VMware VMs
> across the two nodes. In terms of heartbeat resources, each VM is:
> - drbd ocf master-slave resource
> - Filesystem ocf resource (XFS)
> - VM ocf resource (my own ocf script)
> 
> 
> I'm looking for advice on how to group these resources since they all
> depend on each other. I'm testing with a config very similar to the
> DRBD/HowTov2 example at: http://www.linux-ha.org/DRBD/HowTov2.
> 
> I have a drbd master/slave resource (ms-drbd1), and then a group
> (group_vm1). The group contains a filesystem resource (vm1-fs) and my VM
> (vm1-vm) resource. I have an rsc_order constraint saying group_vm1 should
> only run where ms-drbd1 has been promoted. I also have an rsc_colocation
> constraint saying group_vm1 follows ms-drbd1.

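You mean the other way around: exchange rsc_order and
rsc_colocation. The rsc_colocation constraint says where group_vm1
may run (on the node where ms-drbd1 has been promoted), and the
rsc_order constraint says when it may start (after the promote).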

> Finally I have a location
> constraint saying ms-drbd1 prefers node1.
> 
> When testing this with two VMs (add ms-drbd2 and group_vm2, preferring
> node2), things don't always work out as planned. Sometimes, with only one
> node running, if I "/etc/init.d/heartbeat start" on node1, ms-drbd1 and
> group_vm1 will try to migrate over to node1, fail, then return to
> node2. It's not clear to me what's failing.

What do the logs say?

> I feel like sometimes I end up
> in a state where the drbd resource starts, but the filesystem doesn't and
> therefore the VM resource doesn't. Maybe I need a delay between resource
> starts?

You can insert a Delay resource between the two. However, a delay
should not be needed.

> Should I be grouping these differently?

The config looks ok to me.

> I'm going to be creating
> 22 of these "group of three" resources, with three constraints for each.
> Is there an easier set of XML to configure this? I want half to prefer one
> node, and half to prefer the other. Finally, if both nodes are up and
> group_vm1 failed to start on a node, will it retry later? Actually, that's
> more important to me in the single node case as there is no other place
> for the failed resource to live.

This has been planned. Probably in the next release, heartbeat
will forget about the failed start after a while. For now, you
will have to clean up the failed resource yourself with
crm_resource -C.
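
As for an easier way to produce the XML for 22 of these: the three
constraints per VM are identical apart from the index and the preferred
node, so they can be generated. What follows is only a rough sketch, not a
tested configuration. The attribute names (from, to, action, to_action,
to_role, score, and the rule/expression on #uname) are assumed to follow
the 2.x constraint syntax used on the DRBD/HowTov2 page; the constraint
ids, the location score of 100, and the function names are made up for
illustration. Check the result against your own working vm1 constraints
before loading anything.

```python
import xml.etree.ElementTree as ET


def vm_constraints(n, node):
    """Build the order, colocation and location constraints for VM number n.

    Attribute names are an assumption based on the DRBD/HowTov2 example;
    the ids and the score of 100 are hypothetical.
    """
    ms = f"ms-drbd{n}"
    group = f"group_vm{n}"

    # When: start the group only after the drbd resource has been promoted.
    order = ET.Element("rsc_order", {
        "id": f"{ms}_before_{group}",
        "from": group, "action": "start",
        "to": ms, "to_action": "promote",
    })

    # Where: run the group on the node where the drbd resource is master.
    colocation = ET.Element("rsc_colocation", {
        "id": f"{group}_on_{ms}",
        "from": group, "to": ms, "to_role": "master",
        "score": "infinity",
    })

    # Preference: the drbd resource prefers the given node.
    location = ET.Element("rsc_location", {
        "id": f"{ms}_prefers_{node}", "rsc": ms,
    })
    rule = ET.SubElement(location, "rule", {
        "id": f"{ms}_prefers_{node}_rule", "score": "100",
    })
    ET.SubElement(rule, "expression", {
        "id": f"{ms}_prefers_{node}_expr",
        "attribute": "#uname", "operation": "eq", "value": node,
    })

    return [order, colocation, location]


def build_constraints(count=22, nodes=("node1", "node2")):
    """Return a <constraints> element, alternating the preferred node."""
    root = ET.Element("constraints")
    for n in range(1, count + 1):
        node = nodes[(n - 1) % len(nodes)]
        root.extend(vm_constraints(n, node))
    return root


if __name__ == "__main__":
    print(ET.tostring(build_constraints(), encoding="unicode"))
```

With the defaults, odd-numbered VMs prefer node1 and even-numbered VMs
prefer node2, which gives the half-and-half split. The resources
themselves (ms-drbdN, group_vmN) could be generated the same way from
your existing vm1 definitions.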

Thanks,

Dejan
