[Linux-ha-dev] Add real monitoring capabilities to IPaddr2 resource agent
Robert Euhus
euhus-liste1 at rrzn.uni-hannover.de
Fri Feb 11 11:15:34 MST 2011
Thanks for your comments.
Lars Marowsky-Bree wrote:
> On 2011-02-05T00:16:45, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
>
>> I do like this packet counter monitoring.
>
> So do I, but I'd just casually suggest that this may make sense as a
> daemon, or at least per physical NIC - instead of per virtual IP.
>
I do understand the reasoning, but right now it's beyond my abilities to
code such a daemon. I have talked to a colleague, though, who said it
should be doable (for him) to write a small daemon that listens on the
kernel's netlink interface and acts on events such as link loss. This
would of course be a much cleaner solution, and faster too. We will see
whether we have time to go into this.
So for now I will just go with this approach, which has some benefits
too, since you have the monitoring right by the IP you care for...
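To make the per-IP approach concrete, here is a minimal sketch of a
packet-counter check. It is not taken from the patch: the function names
and the STATS_ROOT variable are hypothetical, and it assumes a Linux
system that exposes per-interface counters under
/sys/class/net/<iface>/statistics/.

```shell
#!/bin/sh
# Hypothetical helpers, not from the patch. STATS_ROOT can be overridden
# so that the functions can be exercised without a real NIC.
STATS_ROOT="${STATS_ROOT:-/sys/class/net}"

# Print the RX packet counter of interface $1 (Linux sysfs).
read_rx_packets() {
    cat "$STATS_ROOT/$1/statistics/rx_packets" 2>/dev/null
}

# Sample the counter of interface $1 twice, $2 seconds apart.
# Returns 0 if the counter changed, 1 if it did not,
# 2 if the counter could not be read.
rx_counter_changed() {
    before=$(read_rx_packets "$1") || return 2
    [ -n "$before" ] || return 2
    sleep "$2"
    after=$(read_rx_packets "$1") || return 2
    [ -n "$after" ] || return 2
    [ "$after" != "$before" ]
}
```

A monitor action could call something like `rx_counter_changed eth0 5`
and fall through to the more expensive arping checks only when the
counter did not move.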
> (Andrew, much to my dismay, has changed the pingd code to instead be a
> periodically executed RA, which IMHO is a detrimental change - we need
> to stop executing more.)
>
>>> +# 09: check for nonempty ARP cache
>>> +# 10: watch for packet counter changes
>>> +#
>>> +# 19: check arping_ip_list
>>> +# 20: check arping ARP cache entries
>>> +#
>>> +# 30: watch for packet counter changes in promiscios mode
>>> +#
>>> +# If unsuccessfull in levels 18 and above,
>>> +# the tests for higher check levels are run.
>
> As a general suggestion, I would not base this on the "monitor depth".
> These are not necessarily incremental, and we have the ability to pass
> arbitrary parameters to the monitor operation. You could either have one
> parameter that you treat like a flag list or individual ones.
This sounds reasonable to me. I will look into this as soon as I can.
Does anybody have any other opinions on that?
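As a rough illustration of the flag-list variant, the sketch below
selects checks by name instead of by numeric level. The parameter name
"monitor_checks" and the check names are hypothetical, and the sketch
assumes the value reaches the agent as an OCF_RESKEY_ environment
variable, the way resource parameters do.

```shell
#!/bin/sh
# Hypothetical parameter: a comma- or space-separated list of checks,
# e.g. monitor_checks="arp_cache,pkt_count".
monitor_checks="${OCF_RESKEY_monitor_checks:-arp_cache}"

# Succeeds if check name $1 appears in the flag list.
check_enabled() {
    case " $(echo "$monitor_checks" | tr ',' ' ') " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

The monitor action would then read along the lines of
`check_enabled pkt_count && run_pkt_count_check`, where
run_pkt_count_check stands for whatever the individual test is called,
so the checks no longer need to be ordered or incremental.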
>
> And, clearly, one may want to try arping|pinging the default gateway
> too. So there's significant overlap here with the "ping" stuff.
It's not that clear to me (yet?) that this would be useful. Couldn't it
be that we just serve an IP for one subnet on one interface, but the
default gateway is on another subnet on another interface? If there is
a gateway on this interface, then I would definitely recommend that the
user add its IP to the arping_ip_list. Maybe I should add a comment
about this.
I don't see so much overlap here yet, but maybe that's because I haven't
looked at the ping RA yet.
> In particular, your changes capture a point in time, while ping, by
> virtue of going through attrd, is able to dampen changes. Your approach
> will immediately fail a NIC - possibly on all nodes - if a switch
> reboots, or if there really is a brief lack of traffic. I'm not quite
> sure that is a desirable property.
Me neither, but I'm also not sure that the opposite is true. If I have
the two (or more) nodes connected to different switches, then I might in
fact want the IP to switch over as soon as one fails. On the other
hand, if all nodes are connected to just one switch, then moving the IP
on a switch reboot would not be such a good idea. (Although one could
expect an administrator to disable the monitor option prior to rebooting
the switch... :)
Maybe one could keep that configurable (e.g. by a dampen option). I will
have a look at the ping RA and attrd.
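One simple way such an option might behave is sketched below: the
monitor reports a failure only after a number of consecutive failed
checks. This is a plain failure counter, not the time-based dampening
that attrd provides, and FAIL_THRESHOLD, STATE_FILE and dampened_result
are hypothetical names.

```shell
#!/bin/sh
# Hypothetical dampening: tolerate failures until FAIL_THRESHOLD
# consecutive failed checks have been seen.
FAIL_THRESHOLD="${FAIL_THRESHOLD:-3}"
STATE_FILE="${STATE_FILE:-/tmp/ipaddr2-monitor.failcount}"

# $1: result of the raw check (0 = ok, non-zero = failed).
# Returns 0 while the failure is still tolerated, 1 once the
# threshold is reached. A successful check resets the counter.
dampened_result() {
    if [ "$1" -eq 0 ]; then
        echo 0 > "$STATE_FILE"
        return 0
    fi
    count=$(cat "$STATE_FILE" 2>/dev/null)
    count=$(( ${count:-0} + 1 ))
    echo "$count" > "$STATE_FILE"
    [ "$count" -lt "$FAIL_THRESHOLD" ]
}
```

With a threshold of 1 the IP moves on the first failed check (the
different-switches case); a higher threshold rides out a short switch
reboot at the cost of slower failover.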
Yours,
Robert.