[Linux-HA] ha.cf stonith command question
Chun Tian (binghe)
binghe.lisp at gmail.com
Fri Feb 22 12:27:59 MST 2008
Hi, Doug
> I tried Dejan's suggestion but I received the same result. Chun, have
> you had the cluster working and stonith _not_ complaining for a few
> days and _then_ it started to complain about the 256 code? Or maybe
> you are just now seeing that stonith is giving this message?
Some logs:
Feb 21 21:56:04 afs-6 stonithd: [32048]: debug: external_run_cmd: Calling '/usr/lib/stonith/plugins/external/ipmi status' returned 256
Feb 21 21:56:04 afs-6 stonithd: [32048]: debug: external_run_cmd: '/usr/lib/stonith/plugins/external/ipmi status' output: error executing ipmi command
Feb 21 21:56:04 afs-6 crmd: [9918]: ERROR: process_lrm_event: LRM operation resource_ipmi_afs-8:0_monitor_60000 (call=1721, rc=1) Error unknown error
Feb 21 21:56:08 afs-6 stonithd: [32093]: debug: external_run_cmd: Calling '/usr/lib/stonith/plugins/external/ipmi status' returned 256
Feb 21 21:56:08 afs-6 stonithd: [32093]: debug: external_run_cmd: '/usr/lib/stonith/plugins/external/ipmi status' output: error executing ipmi command
This happens only occasionally, even though 'ipmi status' is called every
minute (I set a monitor operation on these ipmi resources), and I DID verify
this monitor behavior by capturing the IPMI traffic with tcpdump.
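
One possible reading of the 256: if stonithd is logging the raw wait status
that pclose()/wait() hands back, rather than the script's exit code, then 256
is simply exit code 1 shifted into the high byte. That is an assumption, not
something the log confirms, but it would match the rc=1 that crmd reports for
the same monitor operation. A minimal Python sketch of the decoding (the helper
name describe_wait_status is made up for illustration):

```python
import os

def describe_wait_status(status: int) -> str:
    """Translate a raw wait status (as returned by wait/pclose/system) into text."""
    if os.WIFEXITED(status):
        # Normal exit: the exit code lives in the high byte of the status.
        return f"exited with code {os.WEXITSTATUS(status)}"
    if os.WIFSIGNALED(status):
        # The process was terminated by a signal instead of exiting.
        return f"killed by signal {os.WTERMSIG(status)}"
    return f"unrecognized status {status}"

print(describe_wait_status(256))  # exited with code 1
print(describe_wait_status(0))    # exited with code 0
```

If that reading is right, the plugin really did fail on those runs (its own
output says "error executing ipmi command"), and the 256 is only how the
failure is printed.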
>
>
> I believe the problem may be how stonith is interpreting the ipmi
> scripts return value.
>
> One thing I have seen in my googling is that there are not many
> examples of people using stonith and ipmi, at least not in the non-crm
> (version 1) way. Can anyone reading this post acknowledge that they
> are using it and offer any suggestions?
>
> In case it matters (because it might) I am running on CentOS 5.1
> (fully patched as of last week) on x86_64 system.
>
> thanks and regards,
>
> Doug
>
> On Fri, Feb 22, 2008 at 8:47 AM, Chun Tian (binghe)
> <binghe.lisp at gmail.com> wrote:
>> Hi, there
>>
>>
>>> I have tested stonith from the command line and was able to reset the
>>> target PC. On the command I used the following:
>>>
>>> stonith -t external/ipmi -T reset -p "capestor2 10.43.120.134 ADMIN mypassword" capestor2
>>>
>>> This worked marvelously! So then I move the stuff into the ha.cf.
>>> Not having much in the way of examples for ipmi this was my best guess
>>>
>>> stonith_host capestor1 external/ipmi capestor2 10.43.120.134 ADMIN mypassword
>>>
>>> Syslog gives me this:
>>> Feb 21 17:20:24 capestor2 heartbeat: [4133]: info: Checking status of STONITH device [IPMI STONITH device ]
>>> Feb 21 17:20:24 capestor2 heartbeat: [4133]: info: glib: external_run_cmd: Calling '/usr/lib64/stonith/plugins/external/ipmi status' returned 256
>>
>> I ran into this too.
>>
>> I guess this call actually returns 0, but SOMETIMES stonith/external
>> thinks the return value is 256...
>>
>> I have a running 4-node Heartbeat cluster with stonith enabled; a few
>> days ago, I got a return value of 256 when 'ipmi status' was called.
>>
>>
>>>
>>> Feb 21 17:20:24 capestor2 heartbeat: [4133]: ERROR: STONITH device IPMI STONITH device not operational!
>>> Feb 21 17:20:24 capestor2 heartbeat: [4111]: WARN: Managed STONITH-stat process 4133 exited with return code 1.
>>> Feb 21 17:20:24 capestor2 heartbeat: [4111]: ERROR: STONITH status operation failed.
>>> Feb 21 17:20:24 capestor2 heartbeat: [4111]: info: This may mean that the STONITH device has failed!
>>>
>>> I even went so far as to copy the ipmi plugin to test-ipmi and
>>> hardcoded the values for the variables that are passed in my
>>> heartbeat. That worked.
>>>
>>> Any ideas as to what I may be doing wrong?
>>>
>>> thanks
>>>
>>> Doug
>>>
>>>
>>>
>>>
>>> --
>>> What profits a man if he gains the whole world yet loses his soul?
>>
>>
>>> _______________________________________________
>>> Linux-HA mailing list
>>> Linux-HA at lists.linux-ha.org
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>>