[Linux-HA] stonith problem with apcmastersnmp

Alexander Kordecki alex at kordecki.de
Thu Sep 23 01:03:11 MDT 2004


Hi!

I have two nodes, "mailstore1" and "mailstore2", both connected to an
AP7921.

I use heartbeat 1.2.3.

STONITH with the "stonith" command works, but heartbeat gives me
this log:

-----------------------------------

heartbeat: 2004/09/22_19:15:10 WARN: node mailstore2: is dead
heartbeat: 2004/09/22_19:15:10 info: Local status now set to: 'active'
heartbeat: 2004/09/22_19:15:10 info: Starting child client "/usr/lib/heartbeat/ipfail" (511,99)
heartbeat: 2004/09/22_19:15:10 info: Resetting node mailstore2 with [APCMasterSNMP-Stonith]
heartbeat: 2004/09/22_19:15:10 ERROR: Host mailstore2 not reset!
heartbeat: 2004/09/22_19:15:10 info: Checking status of STONITH device [APCMasterSNMP-Stonith]
heartbeat: 2004/09/22_19:15:10 ERROR: STONITH device APCMasterSNMP-Stonith not operational!
heartbeat: 2004/09/22_19:15:10 info: Starting "/usr/lib/heartbeat/ipfail" as uid 511  gid 99 (pid 26880)
heartbeat: 2004/09/22_19:15:10 WARN: Exiting STONITH-stat process 26879 returned rc 1.
heartbeat: 2004/09/22_19:15:10 ERROR: STONITH status operation failed.
heartbeat: 2004/09/22_19:15:10 info: This may mean that the STONITH device has failed!
heartbeat: 2004/09/22_19:15:15 info: Resetting node (null) with [APCMasterSNMP-Stonith]
heartbeat: 2004/09/22_19:15:15 ERROR: Host (null) not reset!
heartbeat: 2004/09/22_19:15:15 ERROR: Exiting STONITH (null) process 26883 killed by signal 11.
heartbeat: 2004/09/22_19:15:15 ERROR: STONITH of (null) failed.  Retrying...
heartbeat: 2004/09/22_19:15:20 info: Resetting node 310^Y^K^H310^Y^K^Hpt
^Hpt
^Hx265^W at x265^W at 200265^W at 200265^W at 220^Z^K^H N^M^H 303
^H310303
^H230265^W at 230265^W at Hv^M^HHv^M^H250265^W at 250265^W at 260265^W at 260265^W at 2702 
65^W at 270
265^W at 300265^W at 300265^W at 310265^W at 310265^W@^W@^W@^W@^W at 340265^W at 340265^W@ 
350
265^W at 350265^W at 360265^W at 360265^W at 370265^W at 370265^W@ with
[APCMasterSNMP-Stonith]
heartbeat: 2004/09/22_19:15:20 ERROR: Host pt
^H310^Y^K^Hp265^W at p265^W at x265^W at x265^W at 200265^W at 200265^W@ N^M^H370261
^H 303
^H310303
^H230265^W at 230265^W at Hv^M^HHv^M^H250265^W at 250265^W at 260265^W at 260265^W at 2702 
65^W at 270
265^W at 300265^W at 300265^W at 310265^W at 310265^W@^W@^W@^W@^W at 340265^W at 340265^W@ 
350
265^W at 350265^W at 360265^W at 360265^W at 370265^W at 370265^W@ not reset!
heartbeat: 2004/09/22_19:15:20 WARN: ha_msg_add_nv: line doesn't contain
'='
heartbeat: 2004/09/22_19:15:20 info:pt

heartbeat: 2004/09/22_19:15:20 ERROR: NV failure (msgfromsteam): pt
]
heartbeat: 2004/09/22_19:15:20 WARN: Exiting STONITH
h265^W at h265^W at p265^W at p265^W
@310^Y^K^H310^Y^K^H200265^W at 200265^W at 220^Z^K^H N^M^H 303
^H310303
^H230265^W at 230265^W at Hv^M^HHv^M^H250265^W at 250265^W at 260265^W at 260265^W at 2702 
65^W at 270
265^W at 300265^W process 26884 returned rc 1.
heartbeat: 2004/09/22_19:15:20 ERROR: STONITH of
h265^W at h265^W at p265^W at p265^W at 310
^Y^K^H310^Y^K^H200265^W at 200265^W at 220^Z^K^H N^M^H 303
^H310303
^H230265^W at 230265^W at Hv^M^HHv^M^H250265^W at 250265^W at 260265^W at 260265^W at 2702 
65^W at 270
265^W at 300265^W at 300265^W at 310265^W at 310265^W@^W@^W@^W@^W at 340265^W at 340265^W@ 
350
265^W at 350265^W at 360265^W at 360265^W at 370265^W at 370265^W@ failed.  Retrying...

----------------------------------------

My ha.cf on mailstore1:

debugfile /var/log/ha-debug
logfile /var/log/ha-log
keepalive 2
deadtime 7
warntime 5
initdead 12
udpport 694
baud    38400
serial  /dev/ttyS0      # Linux
ucast eth0 192.168.2.101
auto_failback off
stonith_host * apcmastersnmp mailstore-power 161 private mailstore2
node    mailstore1
node    mailstore2
ping 192.168.2.103
ping 192.168.2.104
respawn hacluster /usr/lib/heartbeat/ipfail

I also tried it with:
stonith_host * apcmastersnmp mailstore-power 161 private
but this doesn't change anything.
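
To make the difference between the two lines explicit, here is a minimal Python sketch that splits a stonith_host directive into its fields, following the documented form "stonith_host <hostfrom> <stonith_type> <params...>". The function name parse_stonith_host is hypothetical and not part of heartbeat, and reading the parameters as PDU host, SNMP port and community string is my assumption about what apcmastersnmp expects.

```python
def parse_stonith_host(line):
    """Split an ha.cf stonith_host line into its whitespace-separated fields."""
    # Drop any trailing "#" comment, then split on whitespace.
    fields = line.split("#", 1)[0].split()
    if len(fields) < 3 or fields[0] != "stonith_host":
        raise ValueError("not a stonith_host directive: %r" % line)
    return {
        "hostfrom": fields[1],     # node allowed to use the device ("*" = any)
        "device_type": fields[2],  # STONITH plugin name
        "params": fields[3:],      # handed to the plugin unchanged
    }


# The two variants tried in this message.
for text in (
    "stonith_host * apcmastersnmp mailstore-power 161 private mailstore2",
    "stonith_host * apcmastersnmp mailstore-power 161 private",
):
    parsed = parse_stonith_host(text)
    print(parsed["device_type"], len(parsed["params"]), parsed["params"])
```

The first variant hands four parameters to the plugin (the trailing "mailstore2" is the extra one), the second hands three. This sketch only shows how the line is tokenized; it says nothing about how heartbeat itself treats the extra parameter.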


Can anyone help me?

alex


