[Linux-HA] ERROR: EvmsSCC: vs4 (local) not on active list!
John Lange
john.lange at open-it.ca
Fri Feb 2 14:00:25 MST 2007
This problem with evms_activate taking soooooooo loooooong to start is
now solved!
The delay in starting evms appears to have had more to do with a
multi-path problem than anything to do with heartbeat itself.
In desperation we removed the second Fibre Channel QLogic card from all
of the nodes and reconfigured everything for single path. After all the
nodes were restarted, evms_activate now returns very quickly and, as a
result, it's also working with heartbeat!
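For anyone who wants to check whether evms_activate is the slow step on their own nodes, here is a minimal sketch in Python of how its runtime could be measured. The helper name time_command and the 60-second limit are illustrative assumptions, not anything heartbeat or EVMS provides:

```python
import subprocess
import time


def time_command(cmd, timeout_s=60.0):
    """Run cmd (a list of arguments) and report how long it took.

    Returns (elapsed_seconds, return_code). return_code is None when the
    command was killed for exceeding timeout_s (an arbitrary limit chosen
    for this sketch, not a heartbeat default).
    """
    start = time.monotonic()
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # if it has not finished within timeout_s seconds.
        result = subprocess.run(cmd, timeout=timeout_s)
        return_code = result.returncode
    except subprocess.TimeoutExpired:
        return_code = None
    return time.monotonic() - start, return_code
```

Calling time_command(["evms_activate"]) as root on one node, before and after a multipath change, would give a rough comparison. It does not replace the EVMS debug logs.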
That's not to say that everything is hunky-dory but those are topics for
another email ;)
John
On Thu, 2007-02-01 at 02:59 -0700, Robert Wipfel wrote:
> >>> On Wed, Jan 31, 2007 at 1:53 PM, in message
> <1170276804.4681.49.camel at ibmlaptop.darkcore.net>, John Lange
> <john.lange at open-it.ca> wrote:
> > I hate to be a pain here but does anyone have any suggestions?
>
> Sorry for the delay...
>
> > We have a cluster that is running only on one node at the moment while
> > we attempt to get heartbeat+evms+ocfs2+nfs in working order.
> >
> > Any suggestions would be much appreciated.
>
> below...
>
> > On Mon, 2007-01-29 at 14:18 -0600, John Lange wrote:
> >> Ok, thanks for the suggestions.
> >>
> >> I've made the suggested change to the EvmsSCC cloneset and indeed it
> >> seems to be getting closer to working but it still will not start.
> >>
> >> For the sake of completeness I have included my entire cib.xml and below
> >> that is what I hope is the relevant portion of the log file.
> >>
> >> From what I can see the evms_activate command is timing out. I think
> >> this then causes heartbeat to bounce the evms_activate around to the
> >> other nodes causing them to lock each other out.
>
> Starting evms_activate with -d should generate some EVMS logs
> that'll help diagnose what EVMS is doing; details here:
> http://marc.theaimsgroup.com/?l=evms-devel&m=116804016310808&w=2
>
> >> I have one other concern about evms; is EVMS even viable for redundancy?
> >> When a node is down evms will not start complaining that it can't
> >> contact one of the nodes. How do you make it work in a cluster when one
> >> or more of the nodes may be down? Perhaps it makes more sense to back
>
> Hmm... this sounds odd; the whole point of the EVMS Cluster Extension
> plugin for Heartbeat is to cooperate with Heartbeat's view of cluster membership.
> Nodes that come and go are handled by Heartbeat - there's no requirement
> for all nodes to be available for EVMS to work properly - it works with whatever
> nodes have quorum according to Heartbeat.
>
> There have been some evmsd fixes for larger clusters:
> http://marc.theaimsgroup.com/?l=evms-devel&m=115747653214402&w=2
>
> >> out of EVMS and go with straight LVM on the shared storage with OCFS2 on
> >> top of that?
>
> Hmm, well, since you want all nodes to share the disks because you have
> a cluster aware file system - OCFS2 - you could bypass the CSM's shared public
> container this way... though you'd give up EVMS coordination across nodes
> if you want to make changes.
>
> Hth,
> Robert