[Linux-HA] heartbeat version 1 behavior

Pamela Rock prock111 at yahoo.com
Fri Feb 10 10:18:26 MST 2006


Johan,

Thank you.  I modified the squid init script so that it
does not try to start squid if it is already running.  That
being said, is there anything I need to return to the OS
after I run this check (return 0, return 1)?
Admittedly, this confuses me.
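
As a rough sketch of the convention in question: under the LSB init-script rules, "start" on a service that is already running counts as success (exit status 0), as does "stop" on a service that is already stopped, and the log quoted below shows heartbeat giving up resources when the script returned 1. The function names here (is_running, do_start, do_stop, service_action) are hypothetical placeholders and are not taken from the real squid init script; the running check is simulated with a variable so the logic can be read in isolation.

```shell
#!/bin/sh
# Sketch only: is_running, do_start, do_stop, and service_action are
# hypothetical names, not part of the actual /etc/init.d/squid script.

SERVICE_STATE=stopped   # stand-in for a real check (pid file, process lookup)

is_running() {
    [ "$SERVICE_STATE" = running ]
}

do_start() {
    SERVICE_STATE=running   # a real script would launch the daemon here
}

do_stop() {
    SERVICE_STATE=stopped   # a real script would shut the daemon down here
}

service_action() {
    case "$1" in
        start)
            if is_running; then
                return 0            # already running: still a success
            fi
            do_start || return 1    # non-zero only if the start really failed
            return 0
            ;;
        stop)
            if ! is_running; then
                return 0            # already stopped: still a success
            fi
            do_stop || return 1
            return 0
            ;;
        status)
            if is_running; then
                echo "running"
                return 0
            fi
            echo "stopped"
            return 3                # LSB status code: program is not running
            ;;
        *)
            echo "Usage: $0 {start|stop|status}" >&2
            return 2                # LSB: invalid or excess argument
            ;;
    esac
}
```

In a real init script the last lines would be something like `service_action "$1"` followed by `exit $?`, so that the value returned here becomes the script's exit status. Running `/etc/init.d/squid start; echo $?` twice in a row by hand is a quick way to see what heartbeat would see.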

Thank you in advance.

Pamela

--- Johan De Meersman <jdm at operamail.com> wrote:

> 
> ResourceManager[3363]:  2006/02/08_12:48:32 info: Running /etc/init.d/squid  start
> ResourceManager[3363]:  2006/02/08_12:48:32 ERROR: Return code 1 from /etc/init.d/squid
> ResourceManager[3363]:  2006/02/08_12:48:32 CRIT: Giving up resources due to failure of squid
> 
> 
> First guess would be that the squid init script
> doesn't return OK when
> asked to start while already running.
> 
> 
> Pamela Rock wrote:
> 
> >I have a weird scenario (HB version 1 config) that I'm
> >hoping someone can help with.  I'm testing heartbeat,
> >and all but one of my tests work.
> >
> >The one test that fails is when I bring up the
> >secondary node while the primary is active.
> >Heartbeat fails.
> >
> >If both nodes are up and I shut down the primary, the
> >secondary takes over as expected.  If I bring the
> >secondary back on line, the primary takes over as
> >expected.  If I shut down the secondary, the primary
> >node remains operational (again as expected).  But
> >after I attempt to bring the secondary node back up,
> >heartbeat stops working.  To me this means that the
> >secondary remains a single point of failure for the
> >entire system.
> >
> >I'm not sure if this is relevant, but we are using a
> >second NIC for the heartbeat.  My better refuses to
> >use the recommended serial cable.
> >
> >The error in /var/log/ha-log is "ERROR: Return code 1
> >from /etc/init.d/squid" (complete error log below).
> >
> >(Incidentally, Squid works fine by itself; that is, the
> >start, stop, and status processes work as expected.)
> >
> >I hope someone can help.  The following is some info
> >regarding my setup and config.
> >
> >Running RH ES3 with HB version 2.0.2.
> >
> >The config on the primary node (the secondary is very
> >similar) is:
> >
> >haresources:
> >server1 10.15.0.15 squid sfagent_control sfserver_control
> >
> >ha.cf:
> >debugfile /var/log/ha-debug
> >logfile /var/log/ha-log
> >logfacility     local0
> >keepalive 2
> >deadtime 30
> >warntime 10
> >initdead 120
> >bcast   eth1 
> >auto_failback on
> >node    server1
> >node    server2
> >ping 10.15.0.254
> >respawn hacluster /usr/lib/heartbeat/ipfail
> >
> >ha-log on the primary node:
> >
> >heartbeat[2582]: 2006/02/08_12:43:33 info: Link server2:eth1 up.
> >heartbeat[2599]: 2006/02/08_12:43:33 info: pid 2599 locked in memory.
> >heartbeat[2600]: 2006/02/08_12:43:33 info: pid 2600 locked in memory.
> >heartbeat[2582]: 2006/02/08_12:43:33 info: Status update for node server2: status active
> >heartbeat[2582]: 2006/02/08_12:43:33 info: Link 10.15.0.254:10.15.0.254 up.
> >heartbeat[2582]: 2006/02/08_12:43:33 info: Status update for node 10.15.0.254: status ping
> >heartbeat[2582]: 2006/02/08_12:43:33 info: Local status now set to: 'active'
> >heartbeat[2582]: 2006/02/08_12:43:33 info: Starting child client "/usr/lib/heartbeat/ipfail" (502,502)
> >harc[2607]:     2006/02/08_12:43:33 info: Running /etc/ha.d/rc.d/status status
> >heartbeat[2608]: 2006/02/08_12:43:34 info: Starting "/usr/lib/heartbeat/ipfail" as uid 502  gid 502 (pid 2608)
> >heartbeat[2582]: 2006/02/08_12:43:34 info: Link server1:eth1 up.
> >heartbeat[2582]: 2006/02/08_12:43:34 info: remote resource transition completed.
> >heartbeat[2582]: 2006/02/08_12:43:34 info: remote resource transition completed.
> >heartbeat[2582]: 2006/02/08_12:43:34 info: Local Resource acquisition completed. (none)
> >heartbeat[2582]: 2006/02/08_12:43:34 info: server2 wants to go standby [foreign]
> >heartbeat[2582]: 2006/02/08_12:43:42 info: standby: acquire [foreign] resources from server2
> >heartbeat[2687]: 2006/02/08_12:43:42 info: acquire local HA resources (standby).
> >ResourceManager[2697]:  2006/02/08_12:43:42 info: Acquiring resource group: server1 10.15.0.15 squid sfagent_control sfserver_control
> >ResourceManager[2697]:  2006/02/08_12:43:42 info: Running /etc/ha.d/resource.d/IPaddr 10.15.0.15 start
> >IPaddr[2755]:   2006/02/08_12:43:43 info: /sbin/ifconfig eth0:0 10.15.0.15 netmask 255.255.0.0 broadcast 10.15.255.255
> >IPaddr[2755]:   2006/02/08_12:43:43 info: Sending Gratuitous Arp for 10.15.0.15 on eth0:0 [eth0]
> >IPaddr[2755]:   2006/02/08_12:43:43 /usr/lib/heartbeat/send_arp -i 500 -r 10 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-10.15.0.15 eth0 10.15.0.15 auto 10.15.0.15 ffffffffffff
> >ResourceManager[2697]:  2006/02/08_12:43:44 info: Running /etc/init.d/squid  start
> >ResourceManager[2697]:  2006/02/08_12:43:46 info: Running /etc/init.d/sfagent_control  start
> >ResourceManager[2697]:  2006/02/08_12:43:47 info: Running /etc/init.d/sfserver_control  start
> >heartbeat[2687]: 2006/02/08_12:43:49 info: local HA resource acquisition completed (standby).
> >heartbeat[2582]: 2006/02/08_12:43:49 info: Standby resource acquisition done [foreign].
> >heartbeat[2582]: 2006/02/08_12:43:50 info: Initial resource acquisition complete (auto_failback)
> >heartbeat[2582]: 2006/02/08_12:43:50 info: remote resource transition completed.
> >heartbeat[2582]: 2006/02/08_12:45:53 WARN: node server2: is dead
> >heartbeat[2582]: 2006/02/08_12:45:53 WARN: No STONITH device configured.
> >heartbeat[2582]: 2006/02/08_12:45:53 WARN: Shared disks are not protected.
> >heartbeat[2582]: 2006/02/08_12:45:53 info: Resources being acquired from server2.
> >heartbeat[2582]: 2006/02/08_12:45:53 info: Link server2:eth1 dead.
> >harc[3236]:     2006/02/08_12:45:53 info: Running /etc/ha.d/rc.d/status status
> >mach_down[3248]:        2006/02/08_12:45:53 info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
> >heartbeat[2582]: 2006/02/08_12:45:53 info: mach_down takeover complete.
> >mach_down[3248]:        2006/02/08_12:45:53 info: mach_down takeover complete for node server2.
> >heartbeat[3238]: 2006/02/08_12:45:53 info: Local Resource acquisition completed.
> >heartbeat[2582]: 2006/02/08_12:48:19 info: Heartbeat restart on node server2
> >heartbeat[2582]: 2006/02/08_12:48:19 info: Link server2:eth1 up.
> >heartbeat[2582]: 2006/02/08_12:48:19 info: Status update for node server2: status init
> >heartbeat[2582]: 2006/02/08_12:48:19 info: Status update for node server2: status up
> >harc[3313]:     2006/02/08_12:48:19 info: Running /etc/ha.d/rc.d/status status
> >harc[3323]:     2006/02/08_12:48:19 info: Running /etc/ha.d/rc.d/status status
> >heartbeat[2582]: 2006/02/08_12:48:20 info: Status update for node server2: status active
> >heartbeat[2582]: 2006/02/08_12:48:20 info: remote resource transition completed.
> >heartbeat[2582]: 2006/02/08_12:48:20 info: server1 wants to go standby [foreign]
> >heartbeat[2582]: 2006/02/08_12:48:20 info: standby: server2 can take our foreign resources
> >heartbeat[3334]: 2006/02/08_12:48:20 info: give up foreign HA resources (standby).
> >harc[3333]:     2006/02/08_12:48:20 info: Running /etc/ha.d/rc.d/status status
> >heartbeat[3334]: 2006/02/08_12:48:20 info: foreign HA resource release completed (standby).
> >heartbeat[2582]: 2006/02/08_12:48:20 info: Local standby process completed [foreign].
> 
=== message truncated ===

