[Linux-HA] stand_alone_ping: Node xx.yy.zz.ww is unreachable (read)
jarach at op.pl
Tue Jul 28 08:37:22 MDT 2009
Does anybody have a clue what is going on? Is this a bug in pingd, or a real connection problem that the system ping does not report?
Is there any way to replace pingd with, e.g., a bash script that checks connectivity and reports the connection status to heartbeat (e.g. resource is stopped or resource has failed), so that the score for a given resource group is recalculated and, in case of connectivity problems, the resource group is relocated to another machine? Is this practical with a crm-style configuration?
E.g. bash subroutine:
check_connection () {
    node=$1
    [ -z "$node" ] && return 1
    NPACKETS=3
    stat=0
    ping -n -q -c $NPACKETS "$node" >/dev/null 2>&1
    if [ "$?" -ne 0 ]; then
        echo "ERROR: Ping node $node does not answer to ICMP pings"
        stat=1
    else
        echo "INFO: Ping node $node answers to ICMP pings"
    fi
    return $stat
}
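To make the idea more concrete, here is a minimal sketch of how such a script might feed the cluster. It assumes that publishing a node attribute with attrd_updater (shipped with Pacemaker) is an acceptable substitute for what pingd sends in its "Sent update: pingd=1000" messages; I have not verified that against this setup. The names ATTR_NAME, MULTIPLIER, DAMPEN, node_alive, connectivity_score and publish_score are made up for illustration.

```shell
#!/bin/bash
# Hypothetical pingd replacement: count reachable ping nodes with the
# system ping and publish the resulting score as a node attribute.

ATTR_NAME=${ATTR_NAME:-pingd}     # attribute name, as in "pingd -a pingd"
MULTIPLIER=${MULTIPLIER:-1000}    # score per reachable node, as in "pingd -m 1000"
DAMPEN=${DAMPEN:-10s}             # dampening delay, as in "pingd -d 10"
NPACKETS=${NPACKETS:-3}           # ICMP echo requests sent per check

# Return 0 if the given node answers ICMP echo requests.
node_alive () {
    ping -n -q -c "$NPACKETS" "$1" >/dev/null 2>&1
}

# Print the score for the ping nodes given as arguments:
# (number of reachable nodes) * MULTIPLIER.
connectivity_score () {
    local active=0 node
    for node in "$@"; do
        if node_alive "$node"; then
            active=$((active + 1))
        fi
    done
    echo $((active * MULTIPLIER))
}

# Hand the score to the cluster as a node attribute.
publish_score () {
    attrd_updater -n "$ATTR_NAME" -v "$1" -d "$DAMPEN"
}
```

It could then be run periodically on each cluster node, e.g. publish_score "$(connectivity_score 3.27.60.1)", with the resource group tied to the attribute by a location rule in the same way as for pingd.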
I would be grateful for help,
Jarek
"General Linux-HA mailing list" <linux-ha at lists.linux-ha.org> wrote:
>
> In addition, I found the error message shown below. Please advise.
>
> Thanks
> Jarek
>
> pingd[6890]: 2009/07/24_14:47:15 debug: stand_alone_ping: Node 3.27.60.1 is alive
> pingd[6890]: 2009/07/24_14:47:15 debug: debug2: ping_close: Closed connection to 3.27.60.1
> pingd[6890]: 2009/07/24_14:47:15 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> pingd[6890]: 2009/07/24_14:47:16 debug: debug2: stand_alone_ping: Checking connectivity
> pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_open: Opened connection to 3.27.60.1
> pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> pingd[6890]: 2009/07/24_14:47:16 debug: debug2: ping_read: Got 59 bytes
> No error message: -1: Resource temporarily unavailable (11)
> pingd[6890]: 2009/07/24_14:47:16 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> pingd[6890]: 2009/07/24_14:47:16 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1238, seq=18367, id=11669, dest=3.27.60.1, data=pingd-v4): Echo Reply
> pingd[6890]: 2009/07/24_14:47:16 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> pingd[6890]: 2009/07/24_14:47:17 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> pingd[6890]: 2009/07/24_14:47:17 debug: debug2: ping_read: Got 59 bytes
> No error message: -1: Resource temporarily unavailable (11)
> pingd[6890]: 2009/07/24_14:47:17 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> pingd[6890]: 2009/07/24_14:47:17 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1239, seq=1238, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> pingd[6890]: 2009/07/24_14:47:17 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_close: Closed connection to 3.27.60.1
> pingd[6890]: 2009/07/24_14:47:18 debug: send_update: Sent update: pingd=0 (0 active ping nodes)
> pingd[6890]: 2009/07/24_14:47:18 debug: debug2: stand_alone_ping: Checking connectivity
> pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_open: Opened connection to 3.27.60.1
> pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_read: Got 59 bytes
> pingd[6890]: 2009/07/24_14:47:18 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1240, seq=1240, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> pingd[6890]: 2009/07/24_14:47:18 debug: stand_alone_ping: Node 3.27.60.1 is alive
> pingd[6890]: 2009/07/24_14:47:18 debug: debug2: ping_close: Closed connection to 3.27.60.1
> p
>
> "General Linux-HA mailing list" <linux-ha at lists.linux-ha.org> wrote:
> > Below is part of the output, including the error message, produced by the command:
> > /usr/lib64/heartbeat/pingd -VVV -a pingd -d 10 -m 1000 -h 3.27.60.1
> >
> > The machine has three network interfaces and is connected to three different subnets (3.27.x.x, 192.168.x.x - cluster subnet, 172.22.x.x - dedicated for heartbeat).
> >
> > pingd[6890]: 2009/07/24_14:44:36 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:36 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: stand_alone_ping: Checking connectivity
> > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:37 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_read: Got 59 bytes
> > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1080, seq=1080, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > pingd[6890]: 2009/07/24_14:44:38 debug: stand_alone_ping: Node 3.27.60.1 is alive
> > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:38 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: stand_alone_ping: Checking connectivity
> > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:38 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:39 debug: debug2: ping_read: Got 262 bytes
> > No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:44:39 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:44:39 debug: debug2: dump_v4_echo: Echo from 172.22.10.2 (exp=1081, seq=0, id=0, dest=3.27.60.1, data=E?): Unreachable Port
> > pingd[6890]: 2009/07/24_14:44:39 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: ping_read: Got 262 bytes
> > No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:44:40 debug: process_icmp_error: No error message: -1: Resource temporarily unavailable (11)
> > pingd[6890]: 2009/07/24_14:44:40 debug: debug2: dump_v4_echo: Echo from 192.168.0.5 (exp=1082, seq=0, id=0, dest=3.27.60.1, data=E?): Unreachable Port
> > pingd[6890]: 2009/07/24_14:44:40 info: stand_alone_ping: Node 3.27.60.1 is unreachable (read)
> > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:41 debug: send_update: Sent update: pingd=0 (0 active ping nodes)
> > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: stand_alone_ping: Checking connectivity
> > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_open: Got address 3.27.60.1 for 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_open: Opened connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_write: Sent 39 bytes to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_read: Got 59 bytes
> > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: dump_v4_echo: Echo from 3.27.60.1 (exp=1083, seq=1083, id=6890, dest=3.27.60.1, data=pingd-v4): Echo Reply
> > pingd[6890]: 2009/07/24_14:44:41 debug: stand_alone_ping: Node 3.27.60.1 is alive
> > pingd[6890]: 2009/07/24_14:44:41 debug: debug2: ping_close: Closed connection to 3.27.60.1
> > pingd[6890]: 2009/07/24_14:44:41 debug: send_update: Sent update: pingd=1000 (1 active ping nodes)
> >
> > Thanks
> > Jarek
> >
> > "General Linux-HA mailing list" <linux-ha at lists.linux-ha.org> wrote:
> > > 2009/7/24 <jarach at op.pl>:
> > > >
> > > > Rpm built for RHEL5:
> > > > heartbeat-common-2.99.2-8.1
> > > > libheartbeat2-2.99.2-8.1
> > > > heartbeat-2.99.2-8.1
> > > > heartbeat-resources-2.99.2-8.1
> > > > pacemaker-1.0.3-2.2
> > > > pacemaker-mgmt-client-1.99.1-2.1
> > > > libpacemaker3-1.0.3-2.2
> > > > pacemaker-mgmt-1.99.1-2.1
> > > >
> > > > If I start pingd manually (alongside the running heartbeat+pacemaker), it prints the following whenever "stand_alone_ping: Node xx.yy.zz.ww is unreachable (read)" appears in /var/log/ha-debug:
> > > >
> > > > [root at gate2]# date ;/usr/lib64/heartbeat/pingd -a pingd -d 10 -m 1000 -h xx.yy.zz.ww; date
> > > > Thu Jul 23 19:25:24 CEST 2009
> > > > No error message: -1: Resource temporarily unavailable (11)
> > > > No error message: -1: Resource temporarily unavailable (11)
> > > > No error message: -1: Resource temporarily unavailable (11)
> > > > No error message: -1: Resource temporarily unavailable (11)
> > > > No error message: -1: Resource temporarily unavailable (11)
> > > > No error message: -1: Resource temporarily unavailable (11)
> > > > ...
> > > >
> > > > System ping reports no errors.
> > > >
> > >
> > > If you repeat that test with some extra -V arguments, you should see
> > > more information (which would be helpful).
> > > But it's pretty clear there must be a bug, so it's probably worth
> > > creating an entry in bugzilla.
> > > _______________________________________________
> > > Linux-HA mailing list
> > > Linux-HA at lists.linux-ha.org
> > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > > See also: http://linux-ha.org/ReportingProblems