[Linux-HA] stonith problem with apcmastersnmp
Alexander Kordecki
alex at kordecki.de
Thu Sep 23 01:03:11 MDT 2004
Hi!
I have two nodes, "mailstore1" and "mailstore2", both connected to an
AP7921.
I use heartbeat 1.2.3.
Stonithing with the "stonith" command works, but heartbeat gives me
this log:
-----------------------------------
heartbeat: 2004/09/22_19:15:10 WARN: node mailstore2: is dead
heartbeat: 2004/09/22_19:15:10 info: Local status now set to: 'active'
heartbeat: 2004/09/22_19:15:10 info: Starting child client
"/usr/lib/heartbeat/ipfail" (511,99)
heartbeat: 2004/09/22_19:15:10 info: Resetting node mailstore2 with
[APCMasterSNMP-Stonith]
heartbeat: 2004/09/22_19:15:10 ERROR: Host mailstore2 not reset!
heartbeat: 2004/09/22_19:15:10 info: Checking status of STONITH device
[APCMasterSNMP-Stonith]
heartbeat: 2004/09/22_19:15:10 ERROR: STONITH device
APCMasterSNMP-Stonith not operational!
heartbeat: 2004/09/22_19:15:10 info: Starting
"/usr/lib/heartbeat/ipfail" as uid 511 gid 99 (pid 26880)
heartbeat: 2004/09/22_19:15:10 WARN: Exiting STONITH-stat process 26879
returned rc 1.
heartbeat: 2004/09/22_19:15:10 ERROR: STONITH status operation failed.
heartbeat: 2004/09/22_19:15:10 info: This may mean that the STONITH
device has failed!
heartbeat: 2004/09/22_19:15:15 info: Resetting node (null) with
[APCMasterSNMP-Stonith]
heartbeat: 2004/09/22_19:15:15 ERROR: Host (null) not reset!
heartbeat: 2004/09/22_19:15:15 ERROR: Exiting STONITH (null) process
26883 killed by signal 11.
heartbeat: 2004/09/22_19:15:15 ERROR: STONITH of (null) failed.
Retrying...
heartbeat: 2004/09/22_19:15:20 info: Resetting node 310^Y^K^H310^Y^K^Hpt
^Hpt
^Hx265^W at x265^W at 200265^W at 200265^W at 220^Z^K^H N^M^H 303
^H310303
^H230265^W at 230265^W at Hv^M^HHv^M^H250265^W at 250265^W at 260265^W at 260265^W at 2702
65^W at 270
265^W at 300265^W at 300265^W at 310265^W at 310265^W@^W@^W@^W@^W at 340265^W at 340265^W@
350
265^W at 350265^W at 360265^W at 360265^W at 370265^W at 370265^W@ with
[APCMasterSNMP-Stonith]
heartbeat: 2004/09/22_19:15:20 ERROR: Host pt
^H310^Y^K^Hp265^W at p265^W at x265^W at x265^W at 200265^W at 200265^W@ N^M^H370261
^H 303
^H310303
^H230265^W at 230265^W at Hv^M^HHv^M^H250265^W at 250265^W at 260265^W at 260265^W at 2702
65^W at 270
265^W at 300265^W at 300265^W at 310265^W at 310265^W@^W@^W@^W@^W at 340265^W at 340265^W@
350
265^W at 350265^W at 360265^W at 360265^W at 370265^W at 370265^W@ not reset!
heartbeat: 2004/09/22_19:15:20 WARN: ha_msg_add_nv: line doesn't contain
'='
heartbeat: 2004/09/22_19:15:20 info:pt
heartbeat: 2004/09/22_19:15:20 ERROR: NV failure (msgfromsteam): pt
]
heartbeat: 2004/09/22_19:15:20 WARN: Exiting STONITH
h265^W at h265^W at p265^W at p265^W
@310^Y^K^H310^Y^K^H200265^W at 200265^W at 220^Z^K^H N^M^H 303
^H310303
^H230265^W at 230265^W at Hv^M^HHv^M^H250265^W at 250265^W at 260265^W at 260265^W at 2702
65^W at 270
265^W at 300265^W process 26884 returned rc 1.
heartbeat: 2004/09/22_19:15:20 ERROR: STONITH of
h265^W at h265^W at p265^W at p265^W at 310
^Y^K^H310^Y^K^H200265^W at 200265^W at 220^Z^K^H N^M^H 303
^H310303
^H230265^W at 230265^W at Hv^M^HHv^M^H250265^W at 250265^W at 260265^W at 260265^W at 2702
65^W at 270
265^W at 300265^W at 300265^W at 310265^W at 310265^W@^W@^W@^W@^W at 340265^W at 340265^W@
350
265^W at 350265^W at 360265^W at 360265^W at 370265^W at 370265^W@ failed. Retrying...
----------------------------------------
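In the log above, the first reset attempt names mailstore2, the second
names (null), and later ones name garbage. A minimal Python sketch for
pulling the reset targets out of such a log and flagging any that are
not configured nodes could look like the following. The function name
and the sample text are illustrative only (they are not part of
heartbeat), and the pattern assumes the "Resetting node ... with"
wording shown above, including mail-wrapped lines.

```python
import re

# Matches the target name in "Resetting node <name> with [...]".
# DOTALL lets a garbled name span several wrapped lines.
RESET_RE = re.compile(r"Resetting node (.*?) with\s", re.DOTALL)

def unexpected_reset_targets(log_text, known_nodes):
    """Return reset targets from log_text that are not in known_nodes."""
    targets = [m.group(1) for m in RESET_RE.finditer(log_text)]
    return [t for t in targets if t not in known_nodes]

# Two entries copied from the log above, wrapped as in the mail.
sample = (
    "heartbeat: 2004/09/22_19:15:10 info: Resetting node mailstore2 with\n"
    "[APCMasterSNMP-Stonith]\n"
    "heartbeat: 2004/09/22_19:15:15 info: Resetting node (null) with\n"
    "[APCMasterSNMP-Stonith]\n"
)

bad = unexpected_reset_targets(sample, {"mailstore1", "mailstore2"})
print(bad)
```

This only shows which retries lost the node name; it does not explain
why the name is lost.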
my ha.cf on mailstore1:
debugfile /var/log/ha-debug
logfile /var/log/ha-log
keepalive 2
deadtime 7
warntime 5
initdead 12
udpport 694
baud 38400
serial /dev/ttyS0 # Linux
ucast eth0 192.168.2.101
auto_failback off
stonith_host * apcmastersnmp mailstore-power 161 private mailstore2
node mailstore1
node mailstore2
ping 192.168.2.103
ping 192.168.2.104
respawn hacluster /usr/lib/heartbeat/ipfail
I also tried it with:
stonith_host * apcmastersnmp mailstore-power 161 private
but this didn't change anything.
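To make the difference between the two variants explicit: a
stonith_host line in ha.cf consists of the host the device is reachable
from, the STONITH type, and then the parameters handed to that type.
The Python sketch below splits both lines accordingly. The helper name
is hypothetical, and how many parameters apcmastersnmp actually expects
is not settled here; the plugin's own documentation is the authority.

```python
def parse_stonith_host(line):
    """Split an ha.cf stonith_host line into (hostfrom, type, params)."""
    # Drop a trailing "# comment", then split on whitespace.
    fields = line.split("#", 1)[0].split()
    if len(fields) < 3 or fields[0] != "stonith_host":
        raise ValueError("not a stonith_host directive: %r" % line)
    return fields[1], fields[2], fields[3:]

with_node = parse_stonith_host(
    "stonith_host * apcmastersnmp mailstore-power 161 private mailstore2")
without_node = parse_stonith_host(
    "stonith_host * apcmastersnmp mailstore-power 161 private")

# Show the parameter list each variant passes to the plugin.
for hostfrom, stonith_type, params in (with_node, without_node):
    print(hostfrom, stonith_type, len(params), params)
```

The first variant passes four parameters to the plugin and the second
passes three; everything else is identical.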
Can anyone help me?
alex