[Linux-HA] Orphan resource process(es) running

AR digger86 at hotmail.com
Thu Oct 7 18:40:59 MDT 2010


No, not for the cluster.  It is the listening adapter for the SOCKS proxy.

On Thu, 2010-10-07 at 13:52 -0300, mike wrote:
> I'm curious - is that 10.8.64.140 address the VIP address for the cluster?
> 
> 
> On 10-10-07 01:21 PM, AR wrote:
> > On Wed, 2010-10-06 at 21:30 -0700, AR wrote:
> >
> > Solved.
> >
> > The issue was that the 10.8.64.140 address was sticking to node1.  I
> > don't know why this was happening, but once I removed the address all is
> > working well.
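> >
> > Roughly how one would check for and remove such a stuck address by hand
> > (the interface name and prefix length below are assumptions, not taken
> > from this setup -- use whatever "ip addr show" reports on the node):
> >
> > # ip addr show | grep 10.8.64.140
> > # ip addr del 10.8.64.140/24 dev eth0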
> >
> >    
> >> On Wed, 2010-10-06 at 20:45 -0300, mike wrote:
> >>      
> >>> On 10-10-06 07:09 PM, AR wrote:
> >>>        
> >>>> Hi, First let me say thank you to those of you that support the project.
> >>>>
> >>>> It appears that there are orphan processes running.  How do I get rid of
> >>>> these?
> >>>>
> >>>> # crm_verify -LVV
> >>>> crm_verify[31892]: 2010/10/06_14:55:10 WARN: process_orphan_resource:
> >>>> Nothing known about resource vip_0:1 running on node2
> >>>> crm_verify[31892]: 2010/10/06_14:55:10 ERROR: unpack_rsc_op: Hard error
> >>>> - vip_0:1_monitor_0 failed with rc=2: Preventing vip_0:1 from
> >>>> re-starting on node2
> >>>>
> >>>> I have already rebuilt the cluster from scratch.
> >>>>
> >>>> # rm /var/lib/heartbeat/crm/*
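> >>>>
> >>>> A less drastic way to clear the stale history behind these orphan
> >>>> warnings is usually a resource cleanup rather than wiping the CIB files;
> >>>> a sketch, assuming the crm shell that ships with Pacemaker 1.0 (the
> >>>> orphan entry may need its old id, e.g. vip_0:1, passed explicitly):
> >>>>
> >>>> # crm resource cleanup vip node2
> >>>> # crm resource cleanup vip_0:1 node2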
> >>>>
> >>>> Current configuration:
> >>>>
> >>>> # crm configure show xml
> >>>> <?xml version="1.0" ?>
> >>>> <cib admin_epoch="0" crm_feature_set="3.0.1" dc-uuid="node2" epoch="66"
> >>>> have-quorum="1" num_updates="17" validate-with="pacemaker-1.0">
> >>>>     <configuration>
> >>>>       <crm_config>
> >>>>         <cluster_property_set id="cib-bootstrap-options">
> >>>>           <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
> >>>> value="1.0.3-0080ec086ae9c20ad5c4c3562000c0ad68374f0a"/>
> >>>>           <nvpair id="cib-bootstrap-options-expected-quorum-votes"
> >>>> name="expected-quorum-votes" value="2"/>
> >>>>           <nvpair id="nvpair-38d6c5a8-3510-4fc8-97fd-944e32f8fbfe"
> >>>> name="stonith-enabled" value="false"/>
> >>>>           <nvpair id="nvpair-9429cf6e-009d-465c-bb9a-5d7a90056680"
> >>>> name="no-quorum-policy" value="ignore"/>
> >>>>           <nvpair id="cib-bootstrap-options-last-lrm-refresh"
> >>>> name="last-lrm-refresh" value="1286398848"/>
> >>>>           <nvpair id="nvpair-1214a8eb-bf4a-41ae-9c4e-33d9aac8d07c"
> >>>> name="default-resource-stickiness" value="1"/>
> >>>>         </cluster_property_set>
> >>>>       </crm_config>
> >>>>       <rsc_defaults/>
> >>>>       <op_defaults/>
> >>>>       <nodes>
> >>>>         <node id="node1" type="normal" uname="node1">
> >>>>           <instance_attributes id="nodes-node1">
> >>>>             <nvpair id="standby-node1" name="standby" value="false"/>
> >>>>           </instance_attributes>
> >>>>         </node>
> >>>>         <node id="node2" type="normal" uname="node2">
> >>>>           <instance_attributes id="nodes-node2">
> >>>>             <nvpair id="standby-node2" name="standby" value="false"/>
> >>>>           </instance_attributes>
> >>>>         </node>
> >>>>       </nodes>
> >>>>       <resources>
> >>>>         <group id="vip_n_sockd">
> >>>>           <meta_attributes id="vip_n_sockd-meta_attributes">
> >>>>             <nvpair id="nvpair-e3e90b0b-161b-49c2-8723-98647feb7b6c"
> >>>> name="target-role" value="Started"/>
> >>>>             <nvpair id="vip_n_sockd-meta_attributes-is-managed"
> >>>> name="is-managed" value="true"/>
> >>>>           </meta_attributes>
> >>>>           <primitive class="ocf" id="vip" provider="heartbeat"
> >>>> type="IPaddr2">
> >>>>             <meta_attributes id="vip-meta_attributes">
> >>>>               <nvpair id="nvpair-cc694b48-ebd2-468f-a1bf-b3289d2cf28e"
> >>>> name="target-role" value="Started"/>
> >>>>               <nvpair id="vip-meta_attributes-is-managed"
> >>>> name="is-managed" value="true"/>
> >>>>             </meta_attributes>
> >>>>             <operations id="vip-operations">
> >>>>               <op id="vip-op-monitor-10s" interval="20s" name="monitor"
> >>>> start-delay="0s" timeout="10s"/>
> >>>>             </operations>
> >>>>             <instance_attributes id="vip-instance_attributes">
> >>>>               <nvpair id="nvpair-5d33bc8c-3c04-405d-b71b-3cb2174da8ba"
> >>>> name="ip" value="10.8.64.140"/>
> >>>>             </instance_attributes>
> >>>>           </primitive>
> >>>>           <primitive class="lsb" id="sockd" type="sockd">
> >>>>             <meta_attributes id="sockd-meta_attributes">
> >>>>               <nvpair id="nvpair-d6564710-29eb-4562-a77d-7997ef649764"
> >>>> name="target-role" value="Started"/>
> >>>>             </meta_attributes>
> >>>>           </primitive>
> >>>>         </group>
> >>>>       </resources>
> >>>>       <constraints/>
> >>>>     </configuration>
> >>>> </cib>
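> >>>>
> >>>> The same configuration in crm shell syntax is roughly the following (a
> >>>> sketch, not a verbatim "crm configure show"):
> >>>>
> >>>> primitive vip ocf:heartbeat:IPaddr2 \
> >>>>   params ip="10.8.64.140" \
> >>>>   op monitor interval="20s" timeout="10s" start-delay="0s"
> >>>> primitive sockd lsb:sockd
> >>>> group vip_n_sockd vip sockd \
> >>>>   meta target-role="Started"
> >>>> property stonith-enabled="false" no-quorum-policy="ignore" \
> >>>>   default-resource-stickiness="1"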
> >>>>
> >>>> Now the issue is that if I put node1 into standby, the resources go to
> >>>> unmanaged.
> >>>>
> >>>> # crm_verify -LVV
> >>>> crm_verify[377]: 2010/10/06_15:07:47 notice: unpack_config: On loss of
> >>>> CCM Quorum: Ignore
> >>>> crm_verify[377]: 2010/10/06_15:07:47 WARN: unpack_rsc_op: Operation
> >>>> vip_monitor_0 found resource vip active on node1
> >>>> crm_verify[377]: 2010/10/06_15:07:47 WARN: unpack_rsc_op: Processing
> >>>> failed op vip_stop_0 on node1: unknown error
> >>>> crm_verify[377]: 2010/10/06_15:07:47 WARN: process_orphan_resource:
> >>>> Nothing known about resource vip_0:1 running on node2
> >>>> crm_verify[377]: 2010/10/06_15:07:47 ERROR: unpack_rsc_op: Hard error -
> >>>> vip_0:1_monitor_0 failed with rc=2: Preventing vip_0:1 from re-starting
> >>>> on node2
> >>>> crm_verify[377]: 2010/10/06_15:07:47 notice: group_print: Resource
> >>>> Group: vip_n_sockd
> >>>> crm_verify[377]: 2010/10/06_15:07:47 notice: native_print:     vip
> >>>> (ocf::heartbeat:IPaddr2):	Started node1 (unmanaged) FAILED
> >>>> crm_verify[377]: 2010/10/06_15:07:47 notice: native_print:     sockd
> >>>> (lsb:sockd):	Stopped
> >>>> crm_verify[377]: 2010/10/06_15:07:47 WARN: common_apply_stickiness:
> >>>> Forcing vip away from node1 after 1000000 failures (max=1000000)
> >>>> crm_verify[377]: 2010/10/06_15:07:47 WARN: native_color: Resource sockd
> >>>> cannot run anywhere
> >>>> crm_verify[377]: 2010/10/06_15:07:47 notice: LogActions: Leave resource
> >>>> vip	(Started unmanaged)
> >>>> crm_verify[377]: 2010/10/06_15:07:47 notice: LogActions: Leave resource
> >>>> sockd	(Stopped)
> >>>> Warnings found during check: config may not be valid
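> >>>>
> >>>> When a resource is forced away after hitting the failure limit like this,
> >>>> the usual way to let it run there again is to clear its failcount on that
> >>>> node once the underlying problem is fixed; a sketch, assuming the crm
> >>>> shell:
> >>>>
> >>>> # crm resource failcount vip delete node1
> >>>>
> >>>> A cleanup of the resource on that node normally resets the failcount
> >>>> along with the operation history.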
> >>>>
> >>>>
> >>>> Thanks, Alex
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>          
> >>> Is the configuration identical on both nodes, i.e. is cib.xml exactly
> >>> the same?
> >>>        
> >> I just wiped all files in /var/lib/heartbeat/crm and copied cib.xml to both
> >> nodes.  Then:
> >> # chown hacluster:haclient *
> >> # chmod 600 *
> >> # rcopenais start
> >>
> >> Started on node2, all works fine.
> >> When I start openais on node1, the resources go down.  I do a cleanup and
> >> the resources come up on node1.
> >>
> >> I then try to put node1 into standby and the resources go down; a cleanup
> >> will not start them.  The resources will only start on node1.
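> >>
> >> For reference, that sequence corresponds roughly to the following commands
> >> (a sketch, assuming the crm shell):
> >>
> >> # rcopenais start                (on node1)
> >> # crm resource cleanup vip
> >> # crm node standby node1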
> >>
> >> # crm_verify -LVV
> >> crm_verify[21359]: 2010/10/06_21:01:32 notice: unpack_config: On loss of
> >> CCM Quorum: Ignore
> >> crm_verify[21359]: 2010/10/06_21:01:32 WARN: unpack_rsc_op: Operation
> >> vip_monitor_0 found resource vip active on node1
> >> crm_verify[21359]: 2010/10/06_21:01:32 WARN: unpack_rsc_op: Processing
> >> failed op vip_stop_0 on node1: unknown error
> >> crm_verify[21359]: 2010/10/06_21:01:32 notice: group_print: Resource
> >> Group: vip_n_sockd
> >> crm_verify[21359]: 2010/10/06_21:01:32 notice: native_print:     vip
> >> (ocf::heartbeat:IPaddr2):	Started node1 (unmanaged) FAILED
> >> crm_verify[21359]: 2010/10/06_21:01:32 notice: native_print:     sockd
> >> (lsb:sockd):	Stopped
> >> crm_verify[21359]: 2010/10/06_21:01:32 WARN: common_apply_stickiness:
> >> Forcing vip away from node1 after 1000000 failures (max=1000000)
> >> crm_verify[21359]: 2010/10/06_21:01:32 WARN: native_color: Resource
> >> sockd cannot run anywhere
> >> crm_verify[21359]: 2010/10/06_21:01:32 notice: LogActions: Leave
> >> resource vip	(Started unmanaged)
> >> crm_verify[21359]: 2010/10/06_21:01:32 notice: LogActions: Leave
> >> resource sockd	(Stopped)
> >> Warnings found during check: config may not be valid
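> >>
> >> A failed stop with stonith disabled normally leaves the resource unmanaged
> >> like this; the usual way out is to clear whatever is blocking the stop
> >> (here, the address left configured on node1) and then clean up the resource
> >> on that node, e.g. assuming the crm shell:
> >>
> >> # crm resource cleanup vip node1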
> >>      
> >>>
> >>>        
> >>
> >>
> >>      
> >
> >
> >    
> 
> 




