Re: [Linux-HA] Apache failover / renaming the binary

Ehlers, Kolja ehlers at clinresearch.com
Thu Jul 3 03:48:23 MDT 2008


Yes, you are right: at that point I had not applied the fix. I have now. When I rename httpd_ back to httpd on www1test and reboot www2test, the IP address fails over to www1test successfully, but apache is not started. Just before the reboot I tried to start apache manually (/usr/lib/ocf/resource.d/heartbeat/apache start) and it worked, yet heartbeat does not start it. The log is attached.
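
For reference, a minimal sketch of how the manual test can be brought closer to what the cluster does. OCF agents read their parameters from OCF_RESKEY_<name> environment variables and expect OCF_ROOT to be set. The helper name run_ra is made up for illustration, the /usr/lib/ocf default is an assumption based on the agent path above, and the two parameter values are the ones from the CIB quoted further down.

```shell
#!/bin/sh
# Sketch: run an OCF resource agent action by hand with the environment the
# cluster would normally provide, then report the agent's exit code.
# run_ra is a hypothetical helper name, not part of Heartbeat.
run_ra() {
    agent=$1
    action=$2
    # Defaults below are assumptions: OCF_ROOT from the agent's install path,
    # configfile/httpd from the CIB quoted in this thread.
    OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}" \
    OCF_RESKEY_configfile="${OCF_RESKEY_configfile:-/opt/apache2/conf/httpd.conf}" \
    OCF_RESKEY_httpd="${OCF_RESKEY_httpd:-/opt/apache2/bin/httpd}" \
    "$agent" "$action"
    rc=$?
    echo "$agent $action returned rc=$rc"
    return "$rc"
}

# Example call (not executed here):
#   run_ra /usr/lib/ocf/resource.d/heartbeat/apache start
```

The printed rc is the same kind of number that crm_mon lists under "Failed actions", so a manual run and the cluster's run can be compared directly.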

-----Original Message-----
From: linux-ha-bounces at lists.linux-ha.org
[mailto:linux-ha-bounces at lists.linux-ha.org] On behalf of Andrew
Beekhof
Sent: Thursday, July 3, 2008 11:07
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Apache failover / renaming the binary


On Thu, Jul 3, 2008 at 09:59, Ehlers, Kolja <ehlers at clinresearch.com> wrote:
> Thanks for the reply, but the problem remains.


Because you didn't follow his advice.

> Failed actions:
>    apache_2_start_0 (node=www1test, call=6, rc=6): complete

Your RA is still returning 6 (OCF_ERR_CONFIGURED) instead of 5
(OCF_ERR_INSTALLED) when the binary is missing.
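
To make the difference concrete, here is a minimal sketch of a binary check that returns 5 rather than 6. It is not the agent's actual code; check_httpd_binary is a hypothetical helper, and only the numeric values of the OCF return codes are taken as given. As the crm_verify output quoted below shows, rc=6 stops the resource from starting anywhere in the cluster, whereas rc=5 is meant to say only that this node lacks the software.

```shell
#!/bin/sh
# OCF return codes used here (numeric values as defined by the OCF RA API).
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_INSTALLED:=5}"
: "${OCF_ERR_CONFIGURED:=6}"

# Hypothetical helper, not taken from the apache agent: report a missing or
# non-executable binary as "not installed on this node" (5), not as a
# cluster-wide configuration error (6).
check_httpd_binary() {
    httpd_path=$1
    if [ ! -x "$httpd_path" ]; then
        echo "httpd binary $httpd_path not found or not executable" >&2
        return "$OCF_ERR_INSTALLED"
    fi
    return "$OCF_SUCCESS"
}

# Example call (not executed here):
#   check_httpd_binary /opt/apache2/bin/httpd || exit $?
```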

> If apache cannot be started or restarted, it is not failed over to the second node. I have two identical servers and want to run the virtual IP plus apache (grouped) on either node. To test the configuration I renamed httpd to httpd_ on one node, since I am not sure how else to simulate an apache that fails to start. Either way, when heartbeat is started, the apache start fails on www1test and nothing further happens. I have attached my CIB and the logs.
>
> This is what crm_mon gives me:
>
> Refresh in 1s...
>
> ============
> Last updated: Thu Jul  3 09:53:34 2008
> Current DC: www2test (5e0f97b7-6780-4487-baf9-6c36500b1276)
> 2 Nodes configured.
> 1 Resources configured.
> ============
>
> Node: www2test (5e0f97b7-6780-4487-baf9-6c36500b1276): online
> Node: www1test (3a325e23-2184-46ed-9e88-42a11f28c2be): online
>
> Resource Group: group_1
>    IPaddr_192_168_11_25        (ocf::heartbeat:IPaddr):        Started www1test
>    apache_2    (ocf::heartbeat:apache):        Stopped
>
> Failed actions:
>    apache_2_start_0 (node=www1test, call=6, rc=6): complete
>
>
>
> www1test:~ # crm_verify -VVVVL
> crm_verify[8124]: 2008/07/03_09:54:55 info: main: =#=#=#=#= Getting XML =#=#=#=#=
> crm_verify[8124]: 2008/07/03_09:54:55 info: main: Reading XML from: live cluster
> crm_verify[8124]: 2008/07/03_09:54:55 notice: main: Required feature set: 2.0
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value 'false' for cluster option 'stonith-enabled'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value 'reboot' for cluster option 'stonith-action'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value '0' for cluster option 'default-resource-failure-stickiness'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value '60s' for cluster option 'cluster-delay'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value '30' for cluster option 'batch-limit'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value '20s' for cluster option 'default-action-timeout'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value 'true' for cluster option 'stop-orphan-resources'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value 'true' for cluster option 'stop-orphan-actions'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value 'false' for cluster option 'remove-after-stop'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value '-1' for cluster option 'pe-error-series-max'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value '-1' for cluster option 'pe-warn-series-max'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value '-1' for cluster option 'pe-input-series-max'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value 'true' for cluster option 'startup-fencing'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cluster_option: Using default value 'true' for cluster option 'start-failure-is-fatal'
> crm_verify[8124]: 2008/07/03_09:54:55 debug: unpack_config: Default action timeout: 20s
> crm_verify[8124]: 2008/07/03_09:54:55 debug: unpack_config: Default stickiness: 1000000
> crm_verify[8124]: 2008/07/03_09:54:55 debug: unpack_config: Default failure stickiness: 0
> crm_verify[8124]: 2008/07/03_09:54:55 debug: unpack_config: STONITH of failed nodes is disabled
> crm_verify[8124]: 2008/07/03_09:54:55 debug: unpack_config: Cluster is symmetric - resources can run anywhere by default
> crm_verify[8124]: 2008/07/03_09:54:55 debug: unpack_config: On loss of CCM Quorum: Stop ALL resources
> crm_verify[8124]: 2008/07/03_09:54:55 info: determine_online_status: Node www2test is online
> crm_verify[8124]: 2008/07/03_09:54:55 info: determine_online_status: Node www1test is online
> crm_verify[8124]: 2008/07/03_09:54:55 debug: common_apply_stickiness: fail-count-apache_2: INFINITY
> crm_verify[8124]: 2008/07/03_09:54:55 ERROR: unpack_rsc_op: Hard error: apache_2_start_0 failed with rc=6.
> crm_verify[8124]: 2008/07/03_09:54:55 ERROR: unpack_rsc_op:   Preventing apache_2 from re-starting anywhere in the cluster
> crm_verify[8124]: 2008/07/03_09:54:55 WARN: unpack_rsc_op: Processing failed op apache_2_start_0 on www1test: Error
> crm_verify[8124]: 2008/07/03_09:54:55 WARN: unpack_rsc_op: Compatability handling for failed op apache_2_start_0 on www1test
> crm_verify[8124]: 2008/07/03_09:54:55 notice: group_print: Resource Group: group_1
> crm_verify[8124]: 2008/07/03_09:54:55 notice: native_print:     IPaddr_192_168_11_25    (ocf::heartbeat:IPaddr):        Started www1test
> crm_verify[8124]: 2008/07/03_09:54:55 notice: native_print:     apache_2        (ocf::heartbeat:apache):        Stopped
> crm_verify[8124]: 2008/07/03_09:54:55 debug: group_rsc_location: Processing rsc_location pref_run_apache_group for group_1
> crm_verify[8124]: 2008/07/03_09:54:55 debug: native_merge_weights: IPaddr_192_168_11_25: Rolling back scores from apache_2
> crm_verify[8124]: 2008/07/03_09:54:55 debug: native_assign_node: Assigning www1test to IPaddr_192_168_11_25
> crm_verify[8124]: 2008/07/03_09:54:55 debug: native_assign_node: All nodes for resource apache_2 are unavailable, unclean or shutting down
> crm_verify[8124]: 2008/07/03_09:54:55 WARN: native_color: Resource apache_2 cannot run anywhere
> crm_verify[8124]: 2008/07/03_09:54:55 notice: NoRoleChange: Leave resource IPaddr_192_168_11_25 (www1test)
> Warnings found during check: config may not be valid
> crm_verify[8124]: 2008/07/03_09:54:55 debug: cib_native_signoff: Signing out of the CIB Service
>
>
>
> -----Original Message-----
> From: linux-ha-bounces at lists.linux-ha.org
> [mailto:linux-ha-bounces at lists.linux-ha.org] On behalf of Dominik Klein
> Sent: Thursday, July 3, 2008 08:27
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] Apache failover / renaming the binary
>
>
> http://hg.linux-ha.org/dev/file/5072025b79b8/resources/OCF/apache
>
> lines 516-518
>
> Another example of how to use exit codes incorrectly.
>
> I'll commit a patch soon.
>
> In your script: Make line 518 look like this (on all nodes!):
> exit $OCF_ERR_INSTALLED
>
> Then clean up the resource or start the cluster from scratch and try
> again. That should fix it.
>
> Regards
> Dominik
>
>
> Ehlers, Kolja wrote:
>> Hello,
>>
>> my simple active/passive cluster seems to work, but when it is running and I do:
>>
>> /opt/apache2/bin/apachectl stop && mv /opt/apache2/bin/httpd /opt/apache2/bin/httpd_
>>
>> Heartbeat does not fail apache over to node2 (Hard error: apache_2_start_0 failed with rc=6.). This is really odd, because the log states "All 2 cluster nodes are eligible to run resources." but four lines further down it says "ERROR: unpack_rsc_op:   Preventing apache_2 from re-starting anywhere in the cluster". I am using a very simple CIB with one virtual IP and apache in a group. If I stop apache manually, heartbeat restarts it fine. By the way, can I configure it to fail over to the other node right away if apache is stopped or fails? When I stop heartbeat manually, the failover does work.
>>
>> So I am not sure which part of my configuration or logs you need to see. I guess I'm missing something important here.
>>
>> This is my cib
>>
>>  <cib admin_epoch="0" generated="true" have_quorum="true" ignore_dtd="false" num_peers="2" cib_feature_revision="2.0" crm_feature_set="2.0" epoch="38" num_updates="3" cib-last-written="Wed Jul  2 16:16:51 2008" ccm_transition="2" dc_uuid="5e0f97b7-6780-4487-baf9-6c36500b1276">
>>    <configuration>
>>      <crm_config>
>>        <cluster_property_set id="cib-bootstrap-options">
>>          <attributes>
>>            <nvpair id="cib-bootstrap-options-symmetric-cluster" name="symmetric-cluster" value="true"/>
>>            <nvpair id="cib-bootstrap-options-default-resource-stickiness" name="default-resource-stickiness" value="INFINITY"/>
>>            <nvpair id="cib-bootstrap-options-is-managed-default" name="is-managed-default" value="true"/>
>>            <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="stop"/>
>>            <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="2.1.3-node: a3184d5240c6e7032aef9cce6e5b7752ded544b3"/>
>>          </attributes>
>>        </cluster_property_set>
>>      </crm_config>
>>      <nodes>
>>        <node id="5e0f97b7-6780-4487-baf9-6c36500b1276" uname="www2test" type="normal"/>
>>        <node id="3a325e23-2184-46ed-9e88-42a11f28c2be" uname="www1test" type="normal"/>
>>      </nodes>
>>      <resources>
>>        <group id="group_1">
>>          <primitive class="ocf" id="IPaddr_192_168_11_25" provider="heartbeat" type="IPaddr">
>>            <operations>
>>              <op id="IPaddr_192_168_11_25_mon" interval="5s" name="monitor" timeout="5s"/>
>>            </operations>
>>            <instance_attributes id="IPaddr_192_168_11_25_inst_attr">
>>              <attributes>
>>                <nvpair id="IPaddr_192_168_11_25_attr_0" name="ip" value="192.168.11.25"/>
>>              </attributes>
>>            </instance_attributes>
>>          </primitive>
>>          <primitive class="ocf" id="apache_2" provider="heartbeat" type="apache">
>>            <operations>
>>              <op id="apache_2_mon" interval="5s" name="monitor" timeout="10s"/>
>>            </operations>
>>            <instance_attributes id="apache_2_inst_attr">
>>              <attributes>
>>                <nvpair id="apache_2_attr_0" name="configfile" value="/opt/apache2/conf/httpd.conf"/>
>>              </attributes>
>>            </instance_attributes>
>>            <instance_attributes id="apache_2">
>>              <attributes>
>>                <nvpair id="apache_2-httpd" name="httpd" value="/opt/apache2/bin/httpd"/>
>>              </attributes>
>>            </instance_attributes>
>>          </primitive>
>>        </group>
>>      </resources>
>>      <constraints>
>>        <rsc_location id="run_group1" rsc="group_1">
>>          <rule id="pref_run_apache_group" score="0">
>>            <expression attribute="#uname" operation="eq" value="www1test" id="7667baf9-522d-40ac-a901-195bfe84a3df"/>
>>          </rule>
>>        </rsc_location>
>>      </constraints>
>>    </configuration>
>>  </cib>
>>
>> Geschäftsführung: Dr. Michael Fischer, Reinhard Eisebitt
>> Amtsgericht Köln HRB 32356
>> Steuer-Nr.: 217/5717/0536
>> Ust.Id.-Nr.: DE 204051920
>> --
>> This email transmission and any documents, files or previous email
>> messages attached to it may contain information that is confidential or
>> legally privileged. If you are not the intended recipient or a person
>> responsible for delivering this transmission to the intended recipient,
>> you are hereby notified that any disclosure, copying, printing,
>> distribution or use of this transmission is strictly prohibited. If you
>> have received this transmission in error, please immediately notify the
>> sender by telephone or return email and delete the original transmission
>> and its attachments without reading or saving in any manner.
>>
>> _______________________________________________
>> Linux-HA mailing list
>> Linux-HA at lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>
>
>
> --
>
> IN-telegence GmbH & Co. KG
> Oskar-Jäger-Str. 125
> 50825 Köln
>
> Registergericht Köln - HRA 14064, USt-ID Nr. DE 194 156 373
> ph Gesellschafter: komware Unternehmensverwaltungsgesellschaft mbH,
> Registergericht Köln - HRB 38396
> Geschäftsführende Gesellschafter: Christian Plätke und Holger Jansen
>



-------------- next part --------------
A non-text attachment was scrubbed...
Name: ha-debug
Type: application/octet-stream
Size: 51205 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20080703/50f5555c/ha-debug-0001.obj

