[Linux-HA] Newbie's resources not flying

J. B. Schatz jschatz at linux-nexus.com
Mon Mar 21 06:18:24 MST 2005


Helmut Wollmersdorfer wrote:

> J. B. Schatz wrote:
>
>> Helmut Wollmersdorfer wrote:
>
>>> Does your service-IP 201.225.35.18 exist, is created by HA?
>>
>> No, it is not created by HA. Also, it does not exist before I run HA:
>> there is no IP on eth0, the IFACE I want to use for the service address.
>> Only IFACEs eth1 and eth2 are active during my tests.
>
>
> On both nodes?
>
> Hmm, from a quick reading of /etc/ha.d/resource.d/IPaddr I understand, 
> that it tries to set up an _alias_ IP, and the main interface/IP must 
> be up as precondition.
>
> Just to exclude problems in your network try it manually like this:
>
> xp2400:# ifconfig eth0 201.225.35.18
> xp2400:# ifconfig eth0:0 201.225.35.19
> xp2400:# ifconfig
> eth0      Link encap:Ethernet  HWaddr 00:30:BD:6B:D6:8F
>           inet addr:201.225.35.18  Bcast:201.225.35.255  Mask:255.255.255.0
>           [...]
> eth0:0    Link encap:Ethernet  HWaddr 00:30:BD:6B:D6:8F
>           inet addr:201.225.35.19  Bcast:201.225.35.255  Mask:255.255.255.0
>           [...]
>
> Then configure some IP on eth0 permanently. Something like "ifconfig
> eth0:0 201.225.35.19" (set up an alias) will _not_ work if eth0 is not
> up.

Ok, I see that my confusion re alias_IP creation stemmed from an unclear 
treatment of the topic in GettingStarted.html together with my 
unfamiliarity with the heartbeat scripts. That's why I failed to see 
that alias-creation is mandatory instead of optional. Thanks very much 
to you and Alan for clearing up that issue.

Unfortunately, I have confirmed that the problem does not relate to 
creating the cluster IP alias; in fact, the problem is with starting any 
resource at all, be it the cluster IP, DRBD, or any of the other 
services to be started out of haresources. There must be something 
blocking heartbeat's start-up of all resources although, again, I do not 
recognize any clues in the log files. Let me explain.

I can create IP aliases manually using Helmut's example above. Also, I 
can set the DRBD disk to primary by running heartbeat's script, 
/etc/ha.d/resource.d/drbddisk manually. And, of course, the individual 
services to be run from haresources can be run normally via their 
respective init scripts. But when I run the init script for heartbeat
(/etc/init.d/heartbeat start), none of these services actually start.
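
For repeating the manual checks above in a systematic way, here is a minimal sketch that turns one haresources line into the per-resource commands one would try by hand. It assumes the usual "node resource::arg::arg ..." layout; the function name expand_haresources_line and the DRBD resource name r0 in the usage note are hypothetical, and the bare-IP shorthand for IPaddr is not handled. It only prints the script name, its arguments, and the action; locating each script (for example under /etc/ha.d/resource.d or /etc/init.d) is left to the reader.

```shell
# Hypothetical helper: print one "script args... action" line per resource
# found on a single haresources line. Nothing is executed.
expand_haresources_line() {
    line=$1
    action=${2:-start}
    # Split the line into words; the first word is the preferred node name.
    set -- $line
    if [ "$#" -gt 0 ]; then
        shift
    fi
    for res in "$@"; do
        # "script::arg1::arg2" becomes "script arg1 arg2".
        printf '%s %s\n' "$(printf '%s' "$res" | sed 's/::/ /g')" "$action"
    done
}
```

Calling it as expand_haresources_line "node1 IPaddr::201.225.35.18 drbddisk::r0 samba" lists three commands in the order they appear on the line, which makes it easier to run each one by hand and note its exit code.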

I have experimented with different haresources content, including on one 
occasion having only a cluster IP appear there, on another occasion only 
DRBD, and on another occasion only samba, and I can confirm that 
heartbeat is not starting any of them no matter if they appear 
separately or in different combinations or orders. And I re-confirm that 
haresources is identical on both nodes during all testing.
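
One way to double-check that two copies of haresources really match is to compare only their effective entries. The sketch below assumes the other node's file has already been copied to a local path; the function names effective_lines and same_haresources are hypothetical, and comments and blank lines are ignored on purpose.

```shell
# Print a file with comments, trailing whitespace, and blank lines removed.
effective_lines() {
    sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1"
}

# Succeed (exit status 0) when both files have the same effective entries.
same_haresources() {
    [ "$(effective_lines "$1")" = "$(effective_lines "$2")" ]
}
```

For example, same_haresources /etc/ha.d/haresources /tmp/haresources.node2 && echo identical, where /tmp/haresources.node2 is an assumed location for the copy from the second node.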

The logs do continue to register a warning, but is it significant? 
Again, thanks to everyone for all the help I've received.

snip from node1 (full log text attached):
Mar 21 07:17:11 [heartbeat] debug: Starting notify process [status]
Mar 21 07:17:11 [heartbeat] info: AnnounceTakeover(local 0, foreign 1, reason 'HB_R_BOTHSTARTING' (0))
Mar 21 07:17:11 [heartbeat] debug: process_resources: other now unstable
Mar 21 07:17:11 [heartbeat] debug: Sending hold resources msg: none, stable=0 # <none>
Mar 21 07:17:11 [heartbeat] info: AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES' (0))
Mar 21 07:17:11 [heartbeat] info: STATE 1 => 3
Mar 21 07:17:11 [heartbeat] debug: hb_rsc_isstable: ResourceMgmt_child_count: 1, other_is_stable: 0, takeover_in_progress: 0, going_standby:0, standby running(ms): 0, resourcestate: 3
Mar 21 07:17:11 [heartbeat] info: STATE 3 => 2
Mar 21 07:17:11 [heartbeat] debug: hb_rsc_isstable: ResourceMgmt_child_count: 1, other_is_stable: 0, takeover_in_progress: 0, going_standby:0, standby running(ms): 0, resourcestate: 2
Mar 21 07:17:11 [heartbeat] debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Mar 21 07:17:11 [heartbeat] debug: notify_world: Running harc status
Mar 21 07:17:11 [heartbeat] WARN: Exiting status process 32138 returned rc 1.
Mar 21 07:17:11 [heartbeat] debug: RscMgmtProc 'status' exited code 1
Mar 21 07:17:22 [heartbeat] info: AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES' (0))
Mar 21 07:17:22 [heartbeat] info: remote resource transition completed.
Mar 21 07:17:22 [heartbeat] debug: Sending hold resources msg: none, stable=0 # <none>
Mar 21 07:17:22 [heartbeat] info: AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES' (0))
Mar 21 07:17:22 [heartbeat] info: STATE 2 => 3
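
To pull the suspicious lines out of a longer log like the one above, a small filter can help. This is only a sketch: the function name hb_problems is hypothetical, and it assumes that problems show up as WARN:, ERROR:, or CRIT: entries or as a non-zero "exited code", which is the pattern visible in the excerpt for WARN and the exit code.

```shell
# Print only log lines that look like problems: warning/error/critical
# entries, or a child process that exited with a non-zero code.
# Reads the named files, or standard input when no file is given.
hb_problems() {
    grep -E 'WARN:|ERROR:|CRIT:|exited code [1-9]' "$@"
}
```

Run against the excerpt, this leaves the "Exiting status process 32138 returned rc 1" warning and the "RscMgmtProc 'status' exited code 1" line, which are the two entries pointing at the failing harc status call.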
