[Linux-HA] logging out of control

Suman Hanmandla suman_hanmandla at persistent.co.in
Wed Jul 8 05:36:33 MDT 2009


Hi, try changing the /etc/ha.d/ha.cf file: replace bcast with ucast directed at specific nodes,
instead of broadcasting the messages to every node in the cluster and also to nodes outside it (the other DB HA cluster) running on the same default port, 694.

ucast eth0 <nodename1> 
ucast eth0 <nodename2>

Here nodename1 and nodename2 are the two nodes in the cluster.

Or just try adding "udpport 695" to move this cluster off the default port.
Then restart the services; things should work fine, and the log should make sense again without these errors.
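
For illustration, here is roughly what the two options could look like in ha.cf. This is only a sketch: the addresses are placeholders for nodename1 and nodename2, and as far as I know ucast expects the peer's IP address rather than its node name. I also believe a ucast line that points at the local machine is ignored, so the same file could be used on both nodes, but please check that against the ha.cf documentation for your version.

```
# /etc/ha.d/ha.cf (fragment, placeholder values)

# Option 1: unicast to each peer instead of bcast.
# 192.0.2.11 and 192.0.2.12 stand in for nodename1 and nodename2.
ucast eth0 192.0.2.11
ucast eth0 192.0.2.12

# Option 2: keep bcast, but use a port no other cluster uses.
#udpport 695
#bcast eth0
```

If the cluster uses mcast rather than bcast, my understanding is that the port is given on the mcast line itself (after the multicast group), so that is probably where it would need to change.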

Thanks,
Suman




-----Original Message-----
From: linux-ha-bounces at lists.linux-ha.org [mailto:linux-ha-bounces at lists.linux-ha.org] On Behalf Of David Lang
Sent: Tuesday, July 07, 2009 11:48 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] logging out of control

On Tue, 7 Jul 2009, Dejan Muhamedagic wrote:

> Hi,
>
> On Tue, Jul 07, 2009 at 09:13:35AM -0700, Michael Hutchins wrote:
>> Seems as though since mcast is on it is getting feedback from
>> other boxes. Can I make it not log those?
>
> No, I don't think so. Why don't you use different ports for
> different clusters?

Yep, this is a longstanding problem. With broadcast (and apparently with
multicast) heartbeats, you need to make sure that no clusters that can hear each
other use the same ports.

If they do, the machines log that they are seeing heartbeat messages that aren't
appropriate for the cluster, which is what you are seeing.
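
To see which foreign nodes are being heard, and how often, the "bad node" lines can be tallied. Below is a minimal Python sketch; the function name count_bad_nodes and the regular expression are my own and assume the log lines look exactly like the sample quoted further down.

```python
import re
from collections import Counter

# Matches lines such as:
#   heartbeat[4514]: ... ERROR: process_status_message: bad node [dcwvm-drbdnode-2] in message
BAD_NODE = re.compile(r"process_status_message: bad node \[([^\]]+)\] in message")


def count_bad_nodes(lines):
    """Return a Counter mapping each rejected node name to its number of log entries."""
    counts = Counter()
    for line in lines:
        match = BAD_NODE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines taken from the log sample in this thread.
sample = [
    "heartbeat[4514]: 2009/07/07_09:11:36 ERROR: process_status_message: bad node [dcwvm-drbdnode-2] in message",
    "heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[0] : [t=status]",
    "heartbeat[4514]: 2009/07/07_09:11:37 ERROR: process_status_message: bad node [dcwvm-drbdnode-1] in message",
    "heartbeat[4514]: 2009/07/07_09:11:38 ERROR: process_status_message: bad node [dcwvm-drbdnode-2] in message",
]

for node, hits in count_bad_nodes(sample).most_common():
    print(node, hits)
```

For a real log, pass the open log file to count_bad_nodes() instead of the sample list; where that file lives depends on the logging settings in ha.cf. Any node names that show up are machines from another cluster sharing the same port.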

David Lang

> Thanks,
>
> Dejan
>
>> heartbeat[4514]: 2009/07/07_09:11:35 ERROR: MSG[4] : [src=dcwvm-drbdnode-1]
>> heartbeat[4514]: 2009/07/07_09:11:35 ERROR: MSG[5] : [(1)srcuuid=0x8137360(36 27)]
>> heartbeat[4514]: 2009/07/07_09:11:35 ERROR: MSG[6] : [seq=7e73f]
>> heartbeat[4514]: 2009/07/07_09:11:35 ERROR: MSG[7] : [hg=4a4131fc]
>> heartbeat[4514]: 2009/07/07_09:11:35 ERROR: MSG[8] : [ts=4a5373fd]
>> heartbeat[4514]: 2009/07/07_09:11:35 ERROR: MSG[9] : [ld=0.00 0.00 0.00 1/64 5665]
>> heartbeat[4514]: 2009/07/07_09:11:35 ERROR: MSG[10] : [ttl=3]
>> heartbeat[4514]: 2009/07/07_09:11:35 ERROR: MSG[11] : [auth=3 1909c816855795dc7291ea49a1713f87]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: process_status_message: bad node [dcwvm-drbdnode-2] in message
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG: Dumping message with 12 fields
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[0] : [t=status]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[1] : [st=active]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[2] : [dt=7530]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[3] : [protocol=1]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[4] : [src=dcwvm-drbdnode-2]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[5] : [(1)srcuuid=0x813d568(36 27)]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[6] : [seq=7e1e4]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[7] : [hg=4a4131f6]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[8] : [ts=4a5373fe]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[9] : [ld=0.00 0.00 0.00 1/63 9885]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[10] : [ttl=3]
>> heartbeat[4514]: 2009/07/07_09:11:36 ERROR: MSG[11] : [auth=3 2f323b164c1769d2c987cf82660dbbd9]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: process_status_message: bad node [dcwvm-drbdnode-1] in message
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG: Dumping message with 12 fields
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[0] : [t=status]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[1] : [st=active]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[2] : [dt=7530]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[3] : [protocol=1]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[4] : [src=dcwvm-drbdnode-1]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[5] : [(1)srcuuid=0x81390e8(36 27)]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[6] : [seq=7e740]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[7] : [hg=4a4131fc]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[8] : [ts=4a5373ff]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[9] : [ld=0.00 0.00 0.00 1/64 5665]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[10] : [ttl=3]
>> heartbeat[4514]: 2009/07/07_09:11:37 ERROR: MSG[11] : [auth=3 986058a23ee6f1a7ed6fd8d35d1b1046]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: process_status_message: bad node [dcwvm-drbdnode-2] in message
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG: Dumping message with 12 fields
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[0] : [t=status]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[1] : [st=active]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[2] : [dt=7530]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[3] : [protocol=1]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[4] : [src=dcwvm-drbdnode-2]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[5] : [(1)srcuuid=0x8139c50(36 27)]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[6] : [seq=7e1e5]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[7] : [hg=4a4131f6]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[8] : [ts=4a537400]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[9] : [ld=0.00 0.00 0.00 1/63 9885]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[10] : [ttl=3]
>> heartbeat[4514]: 2009/07/07_09:11:38 ERROR: MSG[11] : [auth=3 4d5bd4a69f02a1b6abc4fc8098eabca2]
>>
>> -----Original Message-----
>> From: linux-ha-bounces at lists.linux-ha.org [mailto:linux-ha-bounces at lists.linux-ha.org] On Behalf Of Dejan Muhamedagic
>> Sent: Tuesday, July 07, 2009 9:00 AM
>> To: General Linux-HA mailing list
>> Subject: Re: [Linux-HA] logging out of control
>>
>> Hi,
>>
>> On Mon, Jul 06, 2009 at 02:29:01PM -0700, Michael Hutchins wrote:
>>> Holy cow, how do I turn logging down? My ha-log was 800 megs over the weekend...
>>
>> Wow. Do you have debug turned on? If not, then there must be something
>> going on with the cluster all the time. Did you take a look? Can
>> you post a sample?
>>
>> Thanks,
>>
>> Dejan
>>
>>>
>>> _______________________________________________
>>> Linux-HA mailing list
>>> Linux-HA at lists.linux-ha.org
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>



