[Linux-HA] Heartbeat 2.0.7 with CRM and IBM ServeRAID

Andrew Beekhof beekhof at gmail.com
Tue Oct 10 08:43:55 MDT 2006


On 10/10/06, Jon Fanti <Jon.Fanti at unique.com> wrote:
> >>>> beekhof at gmail.com 10/10/2006 12:04:43 >>>
> >
> >We send a monitor (interval=0) to see what state the resource is in.
> >Assuming it reports 7 (not running), we then start the resource.
> >Assuming it reports 0 (success), we then start any recurring monitor
> >actions that we specified for the resource.
>
> Okay, so let me try to explain what this causes the ServeRAID to do:
>
> The ServeRAID RA has to issue an "unmerge" before it can "merge" the
> configurations onto the active node. If this is not actioned, the
> script will error, which I believe is not correct for Heartbeat (or for
> any LSB script, in fact).
>
> While a merge command is being executed, Heartbeat will do as you have
> described above and issue a second start command. This command locks
> and waits for the first start command to finish. Once it has finished,
> the second command can issue an unmerge and then a merge. In this time
> Heartbeat has decided that the service is still not running and has
> issued another start command, which will unmerge before merging.
> Eventually we get lucky and the device is started without another script
> waiting to be executed. This is the behaviour I am trying to get around.
> I've tried setting a start timeout, a monitor timeout of "start timeout
> + 10s", and additionally a monitor start-delay of 60s against the
> ServeRAID native. None of these has had the desired effect of preventing
> many starts from occurring, or of causing Heartbeat to wait until one
> has completed before issuing subsequent starts.

Can you attach your current configuration?
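
For reference, the probe/start/monitor ordering quoted at the top of this
message can be sketched roughly as below. This is only an illustration of
the ordering, not Heartbeat's actual code: the function bring_up, the class
FakeResource, and the log tuples are hypothetical names made up for the
sketch. Only the return codes (0 = success, 7 = not running) and the
interval=0 probe come from the description above.

```python
# Return codes named in the thread: 0 = success, 7 = not running.
OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7


def bring_up(resource, recurring_intervals):
    """Probe, start, then schedule recurring monitors, in that order.

    `resource` is a hypothetical object whose monitor() and start()
    methods return integer exit codes. Returns a log of what was done.
    """
    log = []

    # One-off probe (interval=0) to see what state the resource is in.
    rc = resource.monitor()
    log.append(("monitor", 0, rc))
    if rc != OCF_NOT_RUNNING:
        # Already running (or in some other state): no start is issued here.
        return log

    # The probe reported "not running", so start the resource.
    rc = resource.start()
    log.append(("start", None, rc))
    if rc != OCF_SUCCESS:
        return log

    # Only after a successful start are the recurring monitors scheduled.
    for interval in recurring_intervals:
        log.append(("schedule_monitor", interval, None))
    return log


class FakeResource:
    """Stand-in resource that is stopped until start() is called."""

    def __init__(self):
        self.running = False

    def monitor(self):
        return OCF_SUCCESS if self.running else OCF_NOT_RUNNING

    def start(self):
        self.running = True
        return OCF_SUCCESS
```

In this sketch a second call to bring_up on a resource that is already
running stops after the probe, so no further start is issued.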

