[Linux-HA] stopping heartbeat on passive node starts all resources already started on active node

Alberto xagonzalezm at gmail.com
Thu Oct 13 09:30:04 MDT 2005


Hi,

I have a two-node setup running the heartbeat 2.0.2 RPM for RHEL 3.

1) If I stop heartbeat on the passive node, the active one tries to start
the resources it already owns again. Why?

2) Also, if there is a split-brain and both nodes have all resources up,
then when they sync again both nodes shut down heartbeat and the resources,
and the resources are started again on one node. Is this the expected
behavior? Shouldn't the resources just be stopped on one node and left up
on the other one?


node2# /etc/init.d/heartbeat stop


Oct 13 17:20:07 node1 heartbeat: [9050]: info: acquire local HA resources
(standby).
Oct 13 17:20:07 node1 heartbeat: [31940]: info: Received shutdown notice
from 'node2'.
Oct 13 17:20:07 node1 heartbeat: [31940]: info: Resources being acquired
from node2.
Oct 13 17:20:07 node1 ResourceManager[9070]: info: Acquiring resource group:
node1 IEL IPaddr::10.64.110.70/24/eth0
Oct 13 17:20:08 node1 modprobe: modprobe: Can't locate module char-major-203
Oct 13 17:20:11 node1 last message repeated 15 times
Oct 13 17:20:15 node1 heartbeat: [9052]: info: Local Resource acquisition
completed.
Oct 13 17:20:15 node1 ResourceManager[9070]: info: Running
/etc/ha.d/resource.d/IEL start
Oct 13 17:20:19 node1 heartbeat: [31940]: WARN: node node2: is dead
Oct 13 17:20:19 node1 heartbeat: [31940]: info: Dead node node2 gave up
resources.
Oct 13 17:20:19 node1 ipfail: [31947]: info: Status update: Node node2 now
has status dead
Oct 13 17:20:19 node1 heartbeat: [31940]: info: Link node2:eth0 dead.
Oct 13 17:20:19 node1 ipfail: [31947]: info: NS: We are still alive!
Oct 13 17:20:20 node1 modprobe: modprobe: Can't locate module char-major-203
Oct 13 17:20:20 node1 last message repeated 3 times
Oct 13 17:20:20 node1 ipfail: [31947]: info: Link Status update: Link
node2/eth0 now has status dead
Oct 13 17:20:20 node1 ipfail: [31947]: info: Asking other side for ping node
count.
Oct 13 17:20:20 node1 ipfail: [31947]: info: Checking remote count of ping
nodes.
Oct 13 17:20:29 node1 modprobe: modprobe: Can't locate module char-major-203
Oct 13 17:20:29 node1 last message repeated 3 times
Oct 13 17:20:37 node1 heartbeat: [9050]: info: local HA resource acquisition
completed (standby).
Oct 13 17:20:37 node1 heartbeat: [31940]: ERROR: Ignored standby message
'done' from node1 in state 0
Oct 13 17:20:37 node1 harc[9553]: info: Running /etc/ha.d/rc.d/status status
Oct 13 17:20:37 node1 mach_down[9563]: info: /usr/lib/heartbeat/mach_down:
nice_failback: foreign resources acquired
Oct 13 17:20:37 node1 heartbeat: [31940]: info: mach_down takeover complete.
Oct 13 17:20:37 node1 mach_down[9563]: info: mach_down takeover complete for
node node2.
Oct 13 17:20:37 node1 harc[9590]: info: Running
/etc/ha.d/rc.d/ip-request-resp ip-request-resp
Oct 13 17:20:37 node1 ip-request-resp[9590]: received ip-request-resp IEL OK
yes
Oct 13 17:20:37 node1 ResourceManager[9605]: info: Acquiring resource group:
node1 IEL IPaddr::10.64.110.70/24/eth0
Oct 13 17:20:37 node1 modprobe: modprobe: Can't locate module char-major-203
Oct 13 17:20:41 node1 last message repeated 7 times
Oct 13 17:20:45 node1 ResourceManager[9605]: info: Running
/etc/ha.d/resource.d/IEL start
Oct 13 17:20:50 node1 modprobe: modprobe: Can't locate module char-major-203
Oct 13 17:20:58 node1 modprobe: modprobe: Can't locate module char-major-203
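
To make the repeated start in question 1 easier to spot in a log like the one above, here is a minimal sketch that lists every ResourceManager "Acquiring resource group" message. It is not part of heartbeat: the function name find_acquisitions and the SAMPLE list are illustrative, and the regular expression only assumes the syslog prefix seen in this excerpt (the group text itself wraps onto the next line in the mail, so it is not matched).

```python
import re
from collections import Counter

# Matches the syslog prefix of a ResourceManager acquisition message.
# Assumption: lines look like the ones in this excerpt; the resource group
# text is not captured because it wraps onto the following line.
ACQUIRE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"ResourceManager\[(?P<pid>\d+)\]: info: Acquiring resource group:"
)


def find_acquisitions(lines):
    """Return a (timestamp, host, pid) tuple for each acquisition message."""
    hits = []
    for line in lines:
        m = ACQUIRE.match(line)
        if m:
            hits.append((m.group("ts"), m.group("host"), m.group("pid")))
    return hits


# A few lines copied from the excerpt above, kept wrapped as in the mail.
SAMPLE = [
    "Oct 13 17:20:07 node1 ResourceManager[9070]: info: Acquiring resource group:",
    "node1 IEL IPaddr::10.64.110.70/24/eth0",
    "Oct 13 17:20:15 node1 ResourceManager[9070]: info: Running",
    "/etc/ha.d/resource.d/IEL start",
    "Oct 13 17:20:37 node1 ResourceManager[9605]: info: Acquiring resource group:",
    "node1 IEL IPaddr::10.64.110.70/24/eth0",
]

hits = find_acquisitions(SAMPLE)
per_host = Counter(host for _, host, _ in hits)

for ts, host, pid in hits:
    print(ts, host, "ResourceManager pid", pid)
```

On the sample lines this reports two acquisitions on node1, by two different ResourceManager processes, which is the double start described in question 1. To run it against a real log, pass the open file object (for example /var/log/messages, if that is where syslog writes on the node) instead of SAMPLE.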