[Linux-HA] pingd can't start up

Dejan Muhamedagic dejanmm at fastmail.fm
Sun Oct 28 16:01:47 MDT 2007


Hi,

On Sat, Oct 27, 2007 at 05:43:40PM +1300, Ivan wrote:
> Hi,
> 
> Trying to monitor my connectivity by pinging my router but pingd can't
> start up for some reason.
> 2 node cluster, HA-2.0.7, SLES10SP1, both configured in ha.cf:
> ping 10.29.36.254
> 
> Config:
> 
> <clone id="pingd">
>   <instance_attributes id="pingd-ia">
>     <attributes>
>       <nvpair id="pingd-ia-01" name="clone_node_max" value="1"/>
>     </attributes>
>   </instance_attributes>
>   <primitive id="pingd-child" provider="heartbeat" class="ocf" type="pingd">
>     <operations>
>       <op id="pingd-child-op-01" name="monitor" interval="20s"
>           timeout="40s" prereq="nothing"/>
>       <op id="pingd-child-op-02" name="start" prereq="nothing"/>
>     </operations>
>     <instance_attributes id="pingd-child-ia">
>       <attributes>
>          <nvpair id="pingd-child-ia-01" name="dampen" value="5s"/>
>          <nvpair id="pingd-child-ia-02" name="multiplier" value="100"/>
>          <nvpair id="pingd-child-ia-03" name="user" value="hacluster"/>
>          <nvpair id="pingd-child-ia-04" name="pidfile"
>                  value="/var/lib/heartbeat/cores/hacluster/pingd.pid"/>

This is not a good place for a pidfile. That directory is where the
core files go.

>       </attributes>
>     </instance_attributes>
>   </primitive>
> </clone>
> 
> Log:
> pingd[20957][20963]: 2007/10/27_16:40:11 ERROR: Could not run
> /usr/lib/heartbeat/pingd -D -p /var/tmp/pingd.pid -a pingd -d 5s  -m 100
> : rc=1

This looks like an error from the pingd RA. There should also have
been one from the pingd program itself. You can also try running that
command by hand to see what it says. Which shell does the hacluster
user have? /bin/false won't do.

> I did try with root user as well and with various pid locations too. Not
> sure actually how the default (/tmp/pingd.pid) location gets along with
> the tmp-cleaner scripts which wipe everything untouched for 3
> months...hence had to change the location to something where hacluster
> can write to.
> 
> Interesting thing is that I used the example first from the web with
> "globally_unique=false" option and got this:
> cib[6062]: 2007/10/27_16:07:25 ERROR: No declaration for attribute
> globally_unique of element instance_attributes

What does your entry look like? It should be an nvpair, like this:

<nvpair id="clone_1-meta-options-globally_unique" name="globally_unique" value="false"/>
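
For context, here is a rough sketch of where such an nvpair could sit.
It assumes the option goes next to clone_node_max in the clone's own
instance_attributes block, as in the config you posted. The id
"pingd-ia-02" is made up for illustration, and other versions may want
this in a different block, so check it against your crm.dtd:

```xml
<clone id="pingd">
  <instance_attributes id="pingd-ia">
    <attributes>
      <nvpair id="pingd-ia-01" name="clone_node_max" value="1"/>
      <!-- hypothetical id; the option is an nvpair of its own,
           not an XML attribute on the instance_attributes element -->
      <nvpair id="pingd-ia-02" name="globally_unique" value="false"/>
    </attributes>
  </instance_attributes>
  <!-- primitive "pingd-child" goes here, unchanged from the config above -->
</clone>
```

The DTD error quoted above is what you would get if
globally_unique="false" were written as an attribute of
<instance_attributes> instead.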


Thanks,

Dejan

> cib[6062]: 2007/10/27_16:07:25 ERROR: validate_with_dtd: CIB does not
> validate against /usr/lib/heartbeat/crm.dtd
> cib[6062]: 2007/10/27_16:07:25 ERROR: activateCibXml: Ignoring invalid CIB
> cib[6062]: 2007/10/27_16:07:25 WARN: activateCibXml: Reverting to last
> known CIB
> cib[6062]: 2007/10/27_16:07:25 WARN: cib_process_command: Activation failed
> cib[6062]: 2007/10/27_16:07:25 WARN: do_cib_notify: cib_create of
> <clone > FAILED: Activation Failed
> cib[6062]: 2007/10/27_16:07:25 WARN: cib_diff_notify: Update (client:
> 16376, call:2): 0.1.31 -> 0.1.32 (Activation Failed)
> cib[6062]: 2007/10/27_16:07:25 WARN: do_cib_notify: cib_create of
> <clone > FAILED: Activation Failed
> cib[6062]: 2007/10/27_16:07:25 ERROR: cib_process_request: cib_create
> operation failed: Activation Failed
> cib[6062]: 2007/10/27_16:07:25 info: crm_log_message_adv: #=========
> Input message message start ==========#
> cib[6062]: 2007/10/27_16:07:25 info: MSG: Dumping message with 9 fields
> cib[6062]: 2007/10/27_16:07:25 info: MSG[0] : [__name__=cib_command]
> cib[6062]: 2007/10/27_16:07:25 info: MSG[1] : [t=cib]
> cib[6062]: 2007/10/27_16:07:25 info: MSG[2] : [cib_op=cib_create]
> cib[6062]: 2007/10/27_16:07:25 info: MSG[3] : [cib_section=resources]
> cib[6062]: 2007/10/27_16:07:25 info: MSG[4] : [cib_callid=2]
> cib[6062]: 2007/10/27_16:07:25 info: MSG[5] : [cib_callopt=0]
> cib[6062]: 2007/10/27_16:07:25 info: MSG[6] :
> [(2)cib_calldata=0x8074e08(1074 1381)]
> cib[6062]: 2007/10/27_16:07:25 info:  <clone id="pingd"/>
> cib[6062]: 2007/10/27_16:07:25 info: MSG[7] :
> [cib_clientid=6a4d86fe-cc7c-492a-b1d4-65125bfe2303]
> cib[6062]: 2007/10/27_16:07:25 info: MSG[8] : [cib_clientname=16376]
> 
> Removing it worked, but pingd still can't start up. Is it my version,
> or is the example on the web wrong? If so, it would be good to correct it.
> 
> Thanks a lot,
> Ivan


