[Linux-HA] Re: Re:Re:Re:Problems with resources failing over and other little problems

Chris Gallo chrisagallo at gmail.com
Tue Sep 5 09:46:58 MDT 2006


Alright, here is my new cib file: http://isthesuck.com/cib.xml

My ha.cf has not changed:
> > Here it is; this is pretty much what was in the walkthrough.
> > debugfile /var/log/ha-debug
> > logfile /var/log/ha-log
> > logfacility syslog
> > keepalive 2
> > deadtime 7
> > warntime 8
> > initdead 15
> > baud    19200
> > serial  /dev/ttyS0      # Linux
> > bcast   eth1            # Linux
> > watchdog /dev/watchdog
> > node    ldap-1.ev1servers.net
> > node    ldap-2.ev1servers.net
> > ping 207.218.204.193
> > crm yes


> > >Third: set different rsc_location scores for different nodes. The node
> > >with the higher score will be the primary node.
> >
> > Well, I wanted ldap to run on both nodes at once (so the database gets
> > updated on both), which is why the score is the same for both nodes. For
> > the IP address, however, the primary is 100 and the secondary is 0, so it
> > should move back to the primary when the primary comes back up. That is
> > not what happens.
>
>  Take a look at clones: http://www.linux-ha.org/v2/Concepts/Clones

I put that in the cib, but my problem continues. ldap1 starts up fine
and brings my resources up, but when I bring ldap2 up, ldap2 just sits
there. This is all that ldap2 writes to the logs when it starts up:

heartbeat[22466]: 2006/09/05_10:16:13 info: Configuration validated. Starting heartbeat 2.0.4
heartbeat[22467]: 2006/09/05_10:16:13 info: heartbeat: version 2.0.4
heartbeat[22467]: 2006/09/05_10:16:13 info: Heartbeat generation: 60
heartbeat[22467]: 2006/09/05_10:16:13 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[22467]: 2006/09/05_10:16:13 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[22467]: 2006/09/05_10:16:13 info: Removing /var/run/heartbeat/rsctmp failed, recreating.
heartbeat[22467]: 2006/09/05_10:16:13 info: glib: Starting serial heartbeat on tty /dev/ttyS0 (19200 baud)
heartbeat[22467]: 2006/09/05_10:16:13 info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
heartbeat[22467]: 2006/09/05_10:16:13 info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
heartbeat[22467]: 2006/09/05_10:16:13 info: glib: ping heartbeat started.
heartbeat[22467]: 2006/09/05_10:16:13 ERROR: Cannot open watchdog device: /dev/watchdog
heartbeat[22467]: 2006/09/05_10:16:13 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[22467]: 2006/09/05_10:16:13 info: Local status now set to: 'up'
heartbeat[22467]: 2006/09/05_10:16:13 info: Exiting write_hostcachedata process 22477 returned rc 0.
heartbeat[22467]: 2006/09/05_10:16:14 info: Link ldap-1.ev1servers.net:/dev/ttyS0 up.
heartbeat[22467]: 2006/09/05_10:16:14 info: Status update for node ldap-1.ev1servers.net: status active
heartbeat[22467]: 2006/09/05_10:16:15 info: Link ldap-1.ev1servers.net:eth1 up.
heartbeat[22467]: 2006/09/05_10:16:15 info: Link 207.218.204.193:207.218.204.193 up.
heartbeat[22467]: 2006/09/05_10:16:15 info: Status update for node 207.218.204.193: status ping
heartbeat[22467]: 2006/09/05_10:16:15 info: Link ldap-2.ev1servers.net:eth1 up.

After that, it just waits for ldap1 to die or lose its connection. My
main question is: why doesn't ldap2 start anything or read its config
the way ldap1 does? One thing I have noticed is that even when both
nodes have been up for a while, the cib.xml file still shows
num_peers=1. Shouldn't this be 2?
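
A quick way to check the num_peers value on each node without reading
the file by eye is sketched below in Python. It assumes num_peers is an
attribute of the top-level <cib> element. The sample XML string and the
function name read_num_peers are made up for illustration and are not
taken from the cib.xml linked above.

```python
import xml.etree.ElementTree as ET


def read_num_peers(cib_xml_text):
    """Return num_peers from the root <cib> element as an int, or None if absent.

    Assumption: num_peers is stored as an attribute on the root element.
    """
    root = ET.fromstring(cib_xml_text)
    value = root.get("num_peers")
    return int(value) if value is not None else None


# Hypothetical, heavily trimmed stand-in for a real cib.xml.
sample = '<cib num_peers="1"><configuration/><status/></cib>'

peers = read_num_peers(sample)
if peers is not None and peers < 2:
    print("num_peers is %d; expected 2 with both nodes up" % peers)
```

To run it against a real file, read the file contents into a string and
pass that string to read_num_peers in place of the sample.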



Another concern, although not quite as important: how would I go
about decreasing the time between these two log entries?
crmd[22491]: 2006/09/05_10:41:25 info: mask(utils.c:crm_timer_popped): Wait Timer (I_NULL) just popped!
crmd[22491]: 2006/09/05_10:42:25 info: mask(utils.c:crm_timer_popped): Election Trigger (I_DC_TIMEOUT) just popped!

When the node starts up, it waits 60s after starting the HA services
before starting my services. I can't seem to find anything on
decreasing this time. Is it possible?
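
For reference, the 60s figure comes straight from the timestamps in the
two crmd entries above. Here is a minimal Python sketch of that
arithmetic, assuming the YYYY/MM/DD_HH:MM:SS timestamp format seen in
these logs. The helper names log_time and gap_seconds are hypothetical
and not part of heartbeat.

```python
import re
from datetime import datetime

# Timestamp as it appears in the logs above, e.g. 2006/09/05_10:41:25
STAMP_RE = re.compile(r"\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2}")


def log_time(line):
    """Extract the first timestamp found in a log line."""
    match = STAMP_RE.search(line)
    if match is None:
        raise ValueError("no timestamp in line: %r" % line)
    return datetime.strptime(match.group(0), "%Y/%m/%d_%H:%M:%S")


def gap_seconds(first_line, second_line):
    """Seconds elapsed between two log lines."""
    return (log_time(second_line) - log_time(first_line)).total_seconds()


wait_timer = ("crmd[22491]: 2006/09/05_10:41:25 info: "
              "mask(utils.c:crm_timer_popped): Wait Timer (I_NULL) just popped!")
election = ("crmd[22491]: 2006/09/05_10:42:25 info: "
            "mask(utils.c:crm_timer_popped): Election Trigger (I_DC_TIMEOUT) just popped!")

print(gap_seconds(wait_timer, election))  # prints 60.0
```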


I really appreciate all the help so far. I feel that I have this
ALMOST working... so close.

-Chris

