[Linux-HA] crm_mon vs cl_status
beekhof at gmail.com
Mon Mar 9 00:50:21 MDT 2009
On Thu, Mar 5, 2009 at 15:24, Harakiri <harakiri_23 at yahoo.com> wrote:
> --- On Thu, 3/5/09, Andrew Beekhof <beekhof at gmail.com> wrote:
>> From: Andrew Beekhof <beekhof at gmail.com>
>> Subject: Re: [Linux-HA] crm_mon vs cl_status
>> To: harakiri_23 at yahoo.com
>> Cc: "Linux-HA mailing list" <linux-ha at lists.linux-ha.org>
>> Date: Thursday, March 5, 2009, 6:46 AM
>> On Mar 5, 2009, at 12:39 PM, Harakiri wrote:
>> >> YES it _is_.
>> >> The log messages above indicate the order heartbeat starts
>> >> them in -
>> >> anything after that is up to the scheduler of your
>> >> Regardless, the crmd and cib both have loops that
>> >> opening
>> >> connections to the services they require - with
>> >> possible exception
>> >> of the cluster itself.
>> > But these loops don't work - as I said, on other systems like Debian the processes are executed in the right order, but not here.
>> > I can manually fix the opening of pipes by adding a while loop to ipcsocket.c for when the pipe does not exist yet - if they loop themselves to try again, why isn't it working? I don't see any reference to a loop to
>> > struct IPC_CHANNEL *
>> > socket_client_channel_new(GHashTable *ch_attrs)
>> > where is it?
>> the loops I'm talking about are at a much higher level - I've no knowledge of how the IPC code works.
>> e.g. do_cib_control() arranges for the crmd to try connecting to the cib up to 30 times before giving up.
>> it sounds like the Solaris equivalent of socket_client_channel_new() isn't failing properly.
> Yes - when I compile on sparc10 with sockets enabled instead of pipes, the loops work:
> cib: 2009/03/05_13:13:35 WARN: ccm_connect: CCM Activation failed
> cib: 2009/03/05_13:13:35 WARN: ccm_connect: CCM Connection failed 1 times (30 max)
> cib: 2009/03/05_13:13:38 WARN: ccm_connect: CCM Activation failed
> cib: 2009/03/05_13:13:38 WARN: ccm_connect: CCM Connection failed 2 times (30 max)
> but this never happens when pipes are used. Since pipes are also handled in the same socket_client_channel_new, there should be no difference - if either sockets or pipes fail, NULL is returned. In crm/crmd/ccm.c I found the retry code - I have no idea why it would fail - maybe an exception is thrown somewhere in between?!
No such thing in C.
Perhaps it's the function that calls socket_client_channel_new() that's
the problem... or the one that calls that...
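
To illustrate the kind of higher-level retry loop I mean, here is a rough sketch. It is not the actual crmd/cib code: struct channel, connect_fn, connect_with_retry() and flaky_connect() are all made-up names for illustration. The point is that the loop can only retry if the low-level call reports failure by returning NULL; if that call blocks or misbehaves instead, the loop never gets a chance to run again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a connection handle; NOT the real IPC_CHANNEL. */
struct channel {
    int fd;
};

/* Hypothetical connect callback. C has no exceptions, so failure is
 * reported the only way it can be: by returning NULL. */
typedef struct channel *(*connect_fn)(void *user_data);

/*
 * Call try_connect up to max_attempts times.
 * Returns the channel on success, or NULL once every attempt has failed.
 * If attempts_used is non-NULL it receives the number of calls made.
 */
struct channel *connect_with_retry(connect_fn try_connect, void *user_data,
                                   int max_attempts, int *attempts_used)
{
    struct channel *ch = NULL;
    int attempt;

    assert(try_connect != NULL && max_attempts > 0);

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        ch = try_connect(user_data);
        if (ch != NULL) {
            break; /* connected, stop retrying */
        }
        fprintf(stderr, "WARN: connection failed %d times (%d max)\n",
                attempt, max_attempts);
        /* A real daemon would wait here (e.g. on a timer) before retrying. */
    }

    if (attempts_used != NULL) {
        *attempts_used = (attempt <= max_attempts) ? attempt : max_attempts;
    }
    return ch;
}

/* Demo connector: fails until the counter behind user_data reaches zero. */
struct channel *flaky_connect(void *user_data)
{
    static struct channel demo = { 3 };
    int *failures_left = user_data;

    if (*failures_left > 0) {
        (*failures_left)--;
        return NULL;
    }
    return &demo;
}
```

With flaky_connect() set to fail twice, connect_with_retry() succeeds on the third call; set to fail more often than max_attempts, it gives up and returns NULL, which the caller then has to check for.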