[Linux-HA] HA of virtual machines

Andrew Beekhof beekhof at gmail.com
Tue Feb 5 01:23:57 MST 2008


On Feb 5, 2008, at 12:30 AM, Amos Shapira wrote:

> On Feb 4, 2008 7:32 PM, Andrew Beekhof <beekhof at gmail.com> wrote:
>
>> Crashing?
>> What was the subject?  I don't recall this.
>
>
> I couldn't make CentOS 5's 2.1.3 talk to another node when configuring
> with the version 2 style CRM; at some stage I learned that not all
> programs manage to start and stay up. Later I also found (I think) that
> "stonith -h" or something like this always bombs on some interrupt.
>
> I don't remember all the details but the thread where I asked about this
> is archived in
> http://lists.community.tummy.com/pipermail/linux-ha/2007-November/029068.html


Sorry, I must have missed this thread.

> I eventually switched to using the old-style haresources config file and
> things seem to work OK with that.

     24 heartbeat[17482]: 2007/11/29_07:12:41 info: Status update for node drbd01.test.spammatters.local: status up
     25 heartbeat[17482]: 2007/11/29_07:13:45 info: all clients are now paused

Line 25 is sure to be part of the problem, but I also don't see any
evidence that heartbeat even tried to start the crm processes.
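
One rough way to double-check that observation is to scan the log for any line that names a CRM-related daemon. The sketch below is only an illustration, not part of heartbeat's tooling: the function name `crm_mentions` is hypothetical, and the daemon names searched for (`crmd`, `cib`) are an assumption about what would show up in the log if the CRM had been started. The sample lines are the two quoted above, with the wrapped lines rejoined.

```python
import re

# Assumption: if heartbeat had started the CRM, its daemons would be named
# somewhere in the log. Adjust this tuple to match your installation.
CRM_DAEMONS = ("crmd", "cib")


def crm_mentions(log_lines, daemons=CRM_DAEMONS):
    """Return the log lines that mention any of the given daemon names."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, daemons)) + r")\b")
    return [line for line in log_lines if pattern.search(line)]


# The two log lines quoted in this message (log line numbers stripped).
sample = [
    "heartbeat[17482]: 2007/11/29_07:12:41 info: Status update for node "
    "drbd01.test.spammatters.local: status up",
    "heartbeat[17482]: 2007/11/29_07:13:45 info: all clients are now paused",
]

print(crm_mentions(sample))  # prints []
```

An empty result on the full log would be consistent with the CRM never having been launched, though it would not prove it.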

This is also interesting:
     13 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
     14 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound send socket to device: eth0
     15 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound receive socket to device: eth0
     16 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: started on port 695 interface eth0 to 192.168.0.248
     17 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
     18 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound send socket to device: eth0
     19 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound receive socket to device: eth0
     20 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: started on port 695 interface eth0 to 192.168.0.249

I wonder if the fact that there are two IPs on eth0 could have been  
causing problems.
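
To make the pattern in those lines easier to see, here is a minimal sketch that groups the ucast destination addresses by interface. It assumes the "started on port ... interface ... to ..." wording exactly as it appears in the log lines quoted in this message, with wrapped lines rejoined; the function name `ucast_peers` is hypothetical and this is not a heartbeat utility.

```python
import re
from collections import defaultdict

# Matches the "ucast: started on port N interface IF to ADDR" lines
# as they appear in the log excerpt quoted in this message.
UCAST_STARTED = re.compile(
    r"ucast: started on port (?P<port>\d+) "
    r"interface (?P<iface>\S+) to (?P<peer>\S+)"
)


def ucast_peers(log_lines):
    """Map each interface to the (address, port) pairs it sends ucast to."""
    peers = defaultdict(list)
    for line in log_lines:
        match = UCAST_STARTED.search(line)
        if match:
            peers[match.group("iface")].append(
                (match.group("peer"), int(match.group("port")))
            )
    return dict(peers)


# Log lines 16 and 20 from the excerpt (log line numbers stripped).
sample = [
    "heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: "
    "started on port 695 interface eth0 to 192.168.0.248",
    "heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: "
    "started on port 695 interface eth0 to 192.168.0.249",
]

print(ucast_peers(sample))
# prints {'eth0': [('192.168.0.248', 695), ('192.168.0.249', 695)]}
```

Any interface that ends up with more than one entry is one where the same setup as in the excerpt applies.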

Oh, and the reason crm_mon was taking so long is related to your
choice of deadtime, which was quite high.

>
>
> Thanks,
>
> --Amos


