[Linux-HA] crm_failcount queries quite slow?
Andrew Beekhof
beekhof at gmail.com
Fri Apr 4 02:52:21 MDT 2008
On Fri, Apr 4, 2008 at 10:08 AM, Andrew Beekhof <beekhof at gmail.com> wrote:
> On Fri, Apr 4, 2008 at 8:25 AM, Dominik Klein <dk at in-telegence.net> wrote:
> > Lars Marowsky-Bree wrote:
> >
> > > On 2008-04-03T13:59:36, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> > >
> > >
> > > > Any crm* program is significantly slower on a non-DC node
> > > > regardless of whether something's happening in the cluster. It's
> > > > always been like that.
> > > >
> > >
> >
> > I can confirm that. It's been that way for me ever since I started using heartbeat.
>
> I have a theory. I'll test it with oprofile shortly and report back.
>
The remote calls do a bunch of extra work:
* Two network delays (request + response)
* The output gets packed + zipped on the DC
* The output gets unzipped + unpacked on the local machine
* The output gets copied a few extra times, both on the DC and locally
Oprofile confirms that this is basically where all the extra time goes.
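
To make the steps above concrete, here is a rough sketch that compares a
"local" lookup with a "remote" one that has to pack, zip, copy, unzip and
unpack the same data. It is only an illustration, not the heartbeat code:
json and zlib are stand-ins for the real packing and compression, all
function names and the fake status data are made up, and the two network
delays are not modeled at all.

```python
import json
import time
import zlib


def make_fake_status(n_resources=200):
    # Hypothetical stand-in for the cluster state a query returns.
    return {
        "resources": [
            {"id": "rsc_%d" % i, "node": "node%d" % (i % 2), "fail-count": i % 3}
            for i in range(n_resources)
        ]
    }


def local_query(status):
    # On the DC: the data is already in memory, nothing to encode.
    return status


def remote_query(status):
    # DC side: pack, then zip (json and zlib are assumptions, see above).
    packed = json.dumps(status).encode("utf-8")
    zipped = zlib.compress(packed)
    # Stand-in for the extra buffer copies made on each side.
    wire = bytes(bytearray(zipped))
    # Local side: unzip, then unpack.
    unzipped = zlib.decompress(wire)
    return json.loads(unzipped.decode("utf-8"))


def time_call(fn, status, repeats=100):
    # Total wall-clock seconds for `repeats` calls of fn(status).
    start = time.perf_counter()
    for _ in range(repeats):
        fn(status)
    return time.perf_counter() - start


if __name__ == "__main__":
    status = make_fake_status()
    print("local :", time_call(local_query, status))
    print("remote:", time_call(remote_query, status))
```

Running it prints the total time for each path; the absolute numbers mean
little, but the gap gives a feel for how much of the cost is encoding and
copying rather than the lookup itself.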
The good news is that there are some changes coming that reduce the
amount of copying (significantly) and adopt a smarter packing
strategy.
It may also be the case that the OpenAIS stack does better here, as its
packing, and its message layer generally, are implemented differently.
I'll check next week, unless someone else gets in first.