[Linux-HA] troubles with external/riloe plugin.
Edward Clay
eclay at novell.com
Sun Jul 1 20:35:42 MDT 2007
I am looking for some help with the external/riloe STONITH plugin. I have been working with the one that ships in SLES 10 SP1 (heartbeat 2.0.8-0.19). I used the following XML to create the clone resource:
<clone id="CL_stonithset_node1">
  <instance_attributes id="CL_stonithset_node1">
    <attributes>
      <nvpair id="CL_stonithset_node1_clone_node_max" name="clone_node_max" value="1"/>
    </attributes>
  </instance_attributes>
  <primitive id="CL_stonith_node1" class="stonith" type="external/riloe" provider="heartbeat">
    <operations>
      <op name="monitor" interval="30s" timeout="20s" id="CL_stonith_node1_monitor"/>
      <op name="start" timeout="60s" id="CL_stonith_node1_start"/>
    </operations>
    <instance_attributes id="CL_stonith_node1">
      <attributes>
        <nvpair id="CL_stonith_node1_hostlist" name="hostlist" value="node1"/>
        <nvpair id="CL_stonith_node1_RI_HOSTRI" name="RI_HOSTRI" value="il-node1"/>
        <nvpair id="CL_stonith_node1_RI_LOGIN" name="RI_LOGIN" value="Administrator"/>
        <nvpair id="CL_stonith_node1_RI_PASSWORD" name="RI_PASSWORD" value="password"/>
      </attributes>
    </instance_attributes>
  </primitive>
</clone>
Sample errors from the messages log:
Jun 27 11:30:41 node1 haclient: on_event:evt:cib_changed
Jun 27 11:30:41 node1 stonithd: [5318]: info: Cannot get parameter hostname from StonithNVpair
Jun 27 11:30:41 node1 stonithd: [5318]: ERROR: Invalid config info for external/riloe device.
Jun 27 11:30:41 node1 lrmd: [12035]: ERROR: sending stonithRA op to stonithd failed.
Jun 27 11:30:41 node1 cib: [12048]: info: write_cib_contents: Wrote version 0.46.2095 of the CIB to disk
The following errors also show up a couple of times in a row:
Jun 27 11:30:41 node1 crmd: [5320]: ERROR: parse_xml: Error parsing token: couldnt find attr_name
Jun 27 11:30:41 node1 crmd: [5320]: ERROR: parse_xml: Error at or before: ="ilo_hostname" uniq
Jun 27 11:30:41 node1 crmd: [5320]: ERROR: parse_xml: Error parsing token: error parsing child
Jun 27 11:30:41 node1 crmd: [5320]: ERROR: parse_xml: Error at or before: <longdesc lang=en
Jun 27 11:30:41 node1 crmd: [5320]: ERROR: parse_xml: Error parsing token: error parsing child
Jun 27 11:30:41 node1 crmd: [5320]: ERROR: parse_xml: Error at or before: > <parameter name="
Jun 27 11:30:41 node1 crmd: [5320]: ERROR: parse_xml: Error parsing token: error parsing child
Jun 27 11:30:41 node1 crmd: [5320]: ERROR: parse_xml: Error at or before: c> <parameters> <pa
Jun 27 11:30:41 node1 crmd: [5320]: ERROR: crm_abort: find_xml_node: Triggered non-fatal assert at xml.c:75 : root != NULL
The resource is created OK, but I can't start it. It gives an error saying that it can't run anywhere. I also see errors about not being able to find the hostname parameter. So I did some digging in the riloe file, and it marks the RI_ entries as legacy. Lower in the file it shows some ilo_ values. So I tried creating the same resource as above with the new ilo_ equivalents:
<clone id="CL_stonithset_node1">
  <instance_attributes id="CL_stonithset_node1">
    <attributes>
      <nvpair id="CL_stonithset_node1_clone_node_max" name="clone_node_max" value="1"/>
    </attributes>
  </instance_attributes>
  <primitive id="CL_stonith_node1" class="stonith" type="external/riloe" provider="heartbeat">
    <operations>
      <op name="monitor" interval="30s" timeout="20s" id="CL_stonith_node1_monitor"/>
      <op name="start" timeout="60s" id="CL_stonith_node1_start"/>
    </operations>
    <instance_attributes id="CL_stonith_node1">
      <attributes>
        <nvpair id="CL_stonith_node1_hostlist" name="hostlist" value="node1"/>
        <nvpair id="CL_stonith_node1_ilo_hostname" name="ilo_hostname" value="il-node1"/>
        <nvpair id="CL_stonith_node1_ilo_user" name="ilo_user" value="Administrator"/>
        <nvpair id="CL_stonith_node1_ilo_password" name="ilo_password" value="password"/>
        <nvpair id="CL_stonith_node1_ilo_protocol" name="ilo_protocol" value="1.2"/>
      </attributes>
    </instance_attributes>
  </primitive>
</clone>
I get the same result: the resource is created but doesn't start. I can ping both the hostname (node1) and the iLO hostname (il-node1) from all boxes. I am also able to ssh and https to the iLO card and log in with the admin account. I have attached the riloe plugin that I am trying to use.
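Since stonithd complains that it cannot get a parameter called "hostname", one thing that can be checked offline is which nvpair names a given primitive actually defines. The following is only a rough sketch in Python, not something taken from heartbeat itself: the function names are made up for illustration, the snippet is a shortened copy of the primitive above, and the list of required names is an assumption based purely on the stonithd log line.

```python
import xml.etree.ElementTree as ET


def nvpair_names(primitive_xml):
    """Return the set of nvpair 'name' attributes found in a CIB snippet."""
    root = ET.fromstring(primitive_xml)
    return {nv.get("name") for nv in root.iter("nvpair")}


def missing_params(primitive_xml, required):
    """Return the required names that the snippet does not define, sorted."""
    return sorted(set(required) - nvpair_names(primitive_xml))


# Shortened copy of the second primitive from this mail.
SNIPPET = """
<primitive id="CL_stonith_node1" class="stonith" type="external/riloe" provider="heartbeat">
  <instance_attributes id="CL_stonith_node1">
    <attributes>
      <nvpair id="CL_stonith_node1_hostlist" name="hostlist" value="node1"/>
      <nvpair id="CL_stonith_node1_ilo_hostname" name="ilo_hostname" value="il-node1"/>
    </attributes>
  </instance_attributes>
</primitive>
"""

# ASSUMPTION: "hostname" is taken only from the stonithd log message
# ("Cannot get parameter hostname from StonithNVpair"); the real list of
# names the plugin wants may be different.
REQUIRED = ["hostname"]

if __name__ == "__main__":
    print("defined:", sorted(nvpair_names(SNIPPET)))
    print("missing:", missing_params(SNIPPET, REQUIRED))
```

If the names that stonithd asks for turn out to differ from the ones in the CIB, that would at least narrow the problem down to the parameter naming rather than the connection to the iLO card.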
The hardware is a dl350 running ilo firmware 1.22.
Does anyone know what type of connection the plugin makes to the iLO card?
Do I need to have the ilo2 device at a certain firmware version?
Do I need a driver loaded for the iLO card to work, or does the plugin communicate with it through ssh or https?
What can I do to troubleshoot this problem?
TIA
Edward
-------------- next part --------------
A non-text attachment was scrubbed...
Name: riloe
Type: application/octet-stream
Size: 6261 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20070701/da9e2abc/riloe.obj