[Linux-HA] [right one] Please can anybody help: switch from v1 to v2 with 25 nodes

Ragnar Kjørstad linux-ha at ragnark.vestdata.no
Wed Mar 14 01:52:22 MDT 2007


Hi Alan.

On Tue, Mar 13, 2007 at 07:59:51AM -0600, Alan Robertson wrote:
> hb_generation is used for eliminating a security vulnerability called a
> "replay attack"
> 
> ...
> 
> To keep that from happening, packet sequence numbers consist basically
> of a pair of numbers (generation, seqno).  Each time a node does a
> protocol restart, it increments the generation number (which is stored
> on disk), while resetting the sequence number to 1.
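
If I read that description correctly, the receiver's check amounts to
something like the Python sketch below. This is only my reading, not
heartbeat's actual code: PacketId, ReplayFilter and accept() are names
made up for illustration, and it ignores details such as retransmission
or out-of-order delivery.

```python
from dataclasses import dataclass


@dataclass(order=True, frozen=True)
class PacketId:
    """Hypothetical packet identifier, compared as (generation, seqno)."""
    generation: int
    seqno: int


class ReplayFilter:
    """Hypothetical receiver-side check: accept only strictly newer packets."""

    def __init__(self):
        # Highest PacketId accepted so far, keyed by sending node name.
        self.last_seen = {}

    def accept(self, node, pkt):
        last = self.last_seen.get(node)
        if last is not None and pkt <= last:
            # Same or older (generation, seqno): treat as a replay.
            return False
        self.last_seen[node] = pkt
        return True
```

Under this reading, a reinstalled node that starts again from a low
generation looks exactly like a replay to its peers, which is the
problem described below.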

When exactly does a protocol restart occur?

We need a way to be able to automatically reinstall a node that is part
of a heartbeat cluster and allow it to join back in without manual
interaction, but the generation counter is a bit of a challenge.

One possibility would be to record the generation counter before the
reinstall so that it can be put back afterwards, but the problem is that
reinstallations cannot be anticipated. (Who knows when hardware will
break?) So, when does one need to record the generation counter? Is
every time heartbeat starts sufficient, or can it change within one
heartbeat session as well?

I suppose one workaround is to use time instead of the generation
counter, but the fact that adjusting the system clock may stop
heartbeat from operating is a problem.

Skipping generation counts is OK, right? So what about a hybrid
solution where hb_generation is set to the current time_t value when
heartbeat is installed, but is then handled as a regular generation
counter?
This way, over time, it would fall behind time_t again and give you some
room to adjust the clock without causing problems. (e.g. adjusting the
clock one hour for a system that has been up for days should work just
fine). Or is this solution too magic?
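
To make the idea concrete, here is a rough sketch in Python. The
function names are made up for illustration and none of this is
heartbeat's actual code; it assumes the counter goes up by one per
protocol restart and that any strictly larger value is accepted.

```python
import time


def initial_generation(now=None):
    """Seed used at (re)install time: the current Unix time in seconds."""
    return int(time.time() if now is None else now)


def next_generation(stored):
    """After the seed, behave like a plain counter: one step per restart."""
    return stored + 1


def slack_seconds(stored, now):
    """How far the clock may be set back before a fresh seed would no
    longer exceed the stored generation."""
    return now - stored
```

With an arbitrary install time T0 and k restarts since then, the stored
value is T0 + k. A reinstall at clock time t reseeds to t, which is
still larger as long as t > T0 + k. So the margin grows by roughly one
second per second of uptime and shrinks by one per restart.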

Finally, in datacenters where all the communication goes through cables
connected back to back between the two nodes in a safe environment,
completely disabling the feature would be the simplest option. Is that
possible?



In short, I miss a FAQ entry for "How do I reinstall one of my heartbeat
nodes?". So far we've discovered that hb_generation and hb_uuid must be
set. Anything else? 


-- 
Ragnar Kjørstad
Software Engineer
Scali - http://www.scali.com
Scaling the Linux Datacenter

