[Linux-HA] behavior of lrmd/crmd when lrmd process is killed
Junko IKEDA
ikedaj at intellilink.co.jp
Thu Jun 26 22:48:33 MDT 2008
Hi,
When I checked the following bug using the latest heartbeat-dev and
pacemaker-dev,
http://developerbugs.linux-foundation.org/show_bug.cgi?id=1924
I found some weird behavior.
The cluster has these five resources:
============
Last updated: Fri Jun 27 13:07:11 2008
Current DC: x3650b (db1e4cef-d242-419e-9393-bf5113384744)
2 Nodes configured.
1 Resources configured.
============
Node: x3650a (ce2caf3f-c150-4394-916d-3b4b635394d7): online
Node: x3650b (db1e4cef-d242-419e-9393-bf5113384744): online
Resource Group: grpPostgreSQLDB
    prmFsPostgreSQLDB1 (ocf::heartbeat:Filesystem): Started x3650a
    prmFsPostgreSQLDB2 (ocf::heartbeat:Filesystem): Started x3650a
    prmFsPostgreSQLDB3 (ocf::heartbeat:Filesystem): Started x3650a
    prmIpPostgreSQLDB (ocf::heartbeat:IPaddr): Started x3650a
    prmApPostgreSQLDB (ocf::heartbeat:pgsql): Started x3650a
When "lrmd" is killed, crmd does not notice the event, possibly because of a
glib problem.
hb_report-10/x3650a:line 616
heartbeat[24311]: 2008/06/27_12:57:55 WARN: Managed
/usr/lib64/heartbeat/lrmd -r process 24327 killed by signal 9 [SIGKILL -
Kill, unblockable].
However, if I stop pgsql like this,
# su - postgres
$ pg_ctl stop
waiting for server to shut down.... done
server stopped
the frozen process resumes, and crmd logs the connection failure.
hb_report-10/x3650a:line 657
crmd[24330]: 2008/06/27_13:09:36 CRIT: lrm_connection_destroy: LRM
Connection failed
Heartbeat 2.1.3 behaves the same way.
I wonder why the status of Postgres affects this.
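For reference, here is a minimal sketch of the behavior one would normally expect when a peer is killed with SIGKILL: the surviving side sees its socket become readable and reads EOF. This is plain Python, not heartbeat or glib code. It assumes Linux and uses an AF_UNIX socketpair as a stand-in for the crmd/lrmd IPC channel; the function name peer_kill_is_detected is made up for illustration.

```python
import os
import select
import signal
import socket


def peer_kill_is_detected(timeout_ms=2000):
    """Return True if poll() reports the peer's death after SIGKILL."""
    # Hypothetical stand-in for the IPC channel between client and daemon.
    client, server = socket.socketpair()

    pid = os.fork()
    if pid == 0:
        # Child: plays the daemon; keeps its end open and sleeps.
        client.close()
        signal.pause()
        os._exit(0)

    # Parent: plays the client; drop our copy of the daemon's end.
    server.close()
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)

    poller = select.poll()
    poller.register(client, select.POLLIN)
    events = poller.poll(timeout_ms)
    # A closed peer shows up as a ready socket whose read returns EOF.
    detected = bool(events) and client.recv(1) == b""
    client.close()
    return detected


if __name__ == "__main__":
    print("peer death detected:", peer_kill_is_detected())
```

The sketch only covers the simple case where the killed process is the sole holder of its end of the socket. It does not model the glib main loop that crmd uses.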
Thanks,
Junko
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hb_report-10.tar.gz
Type: application/octet-stream
Size: 74398 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20080627/ff095cd8/hb_report-10.tar-0001.obj