[Linux-ha-dev] Using ha_logd can cause blocking in clients
Andrew Beekhof (GMail)
beekhof at gmail.com
Mon Apr 4 05:13:59 MDT 2005
I think many of the delayed-message problems we've been having are
related to how clients use ha_logd.
Right now, if a client overflows its queue to ha_logd, it falls back to
a direct log (which in my case calls syslog directly), where it is
quite plausible that it will block.
It probably also explains why messages are logged out of order (if you
remember my bringing that up a while back), since logs sent directly to
syslog can easily come out before earlier ones still in the queue to
ha_logd.
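
To make the reordering concrete, here is a toy model of the fallback path. It is only a sketch: the queue depth, the struct names and the helper functions are all made up for illustration and are not the real cl_log/ha_logd code.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 3   /* hypothetical queue depth, chosen for illustration */
#define SINK_CAP  16  /* hypothetical size of the recorded output */

/* Stand-in for the final log: message ids in the order they were written. */
struct sink  { int ids[SINK_CAP];  size_t len; };

/* Stand-in for the client's queue to the logging daemon. */
struct queue { int ids[QUEUE_CAP]; size_t len; };

static void sink_write(struct sink *s, int id)
{
    assert(s->len < SINK_CAP);
    s->ids[s->len++] = id;
}

/* Queue a message; on overflow fall back to a direct write,
 * which is what the client does today. */
static void log_msg(struct queue *q, struct sink *s, int id)
{
    if (q->len < QUEUE_CAP)
        q->ids[q->len++] = id;
    else
        sink_write(s, id);  /* bypasses everything still queued */
}

/* The daemon drains the queue later, in FIFO order. */
static void daemon_drain(struct queue *q, struct sink *s)
{
    for (size_t i = 0; i < q->len; i++)
        sink_write(s, q->ids[i]);
    q->len = 0;
}
```

Logging ids 1 to 4 and then draining leaves the sink holding 4, 1, 2, 3: the overflow message lands ahead of the three that were queued before it.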
It would also explain why, once it starts, it gets progressively
worse: blocking causes late messages, which cause more logs,
which block in syslog, which cause more late messages, and so on.
I'm about to try the following patch and expect it to help
(basically it discards the overflow logs). Gshi, maybe you can improve
on it and commit something like it. Perhaps we need a ha.cf directive
for determining the behavior when the log queue is full
(discard/block/other?).
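
As a rough idea of what the directive handling could look like: the value strings, the enum and the function name below are all hypothetical, nothing like this exists in ha.cf today, and "direct" is just my label for the current fall-back-to-syslog behavior.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical policies for a full log queue. */
enum overflow_policy {
    OVERFLOW_DISCARD,  /* drop the message (what the patch below does) */
    OVERFLOW_BLOCK,    /* wait until the queue has room */
    OVERFLOW_DIRECT    /* log directly, possibly blocking (current behavior) */
};

/* Parse the value of the hypothetical directive.
 * Returns 0 on success, -1 if the value is not recognized. */
static int parse_overflow_policy(const char *value, enum overflow_policy *out)
{
    assert(value != NULL && out != NULL);

    if (strcmp(value, "discard") == 0)
        *out = OVERFLOW_DISCARD;
    else if (strcmp(value, "block") == 0)
        *out = OVERFLOW_BLOCK;
    else if (strcmp(value, "direct") == 0)
        *out = OVERFLOW_DIRECT;
    else
        return -1;  /* unknown value: leave *out untouched */
    return 0;
}
```

The overflow branch in cl_log() would then switch on the parsed value instead of hard-coding one behavior.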
While on the topic, I think it would be helpful to include some subtle
way of indicating whether a message was logged via ha_logd or directly
via syslog. That would expose problems like this a lot earlier.
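
One possible shape for such a marker, purely as a sketch: the "[direct] " prefix, the enum and the helper name are my own inventions here, not anything cl_log does now.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Which path a message took on its way to the log (hypothetical). */
enum log_path { VIA_LOGD, VIA_DIRECT };

/* Write msg into out, prefixed with a marker when it bypassed ha_logd.
 * Returns what snprintf() returns: the length the full string needs. */
static int format_tagged(char *out, size_t outlen,
                         enum log_path path, const char *msg)
{
    assert(out != NULL && msg != NULL);
    return snprintf(out, outlen, "%s%s",
                    path == VIA_DIRECT ? "[direct] " : "", msg);
}
```

Messages that went through ha_logd stay unchanged, so only the fallback cases stand out when reading the log.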
Andrew
Btw, the -dev mailing list seems to be misbehaving, hence the CCs.
--- cl_log.c 17 Mar 2005 09:16:06 +0100 1.43
+++ cl_log.c 04 Apr 2005 13:11:44 +0200
@@ -330,6 +330,8 @@
  * non-blocking IPC.
  */
+gboolean last_log_failed = FALSE;
+
 /* Cluster logging function */
 void
 cl_log(int priority, const char * fmt, ...)
@@ -376,15 +378,27 @@
         return_to_orig_privs();
     }
-    if ( use_logging_daemon &&
-         cl_log_depth <= 1 &&
-         LogToLoggingDaemon(priority, buf, nbytes + 1, TRUE) == HA_OK){
-        goto LogDone;
+    if ( use_logging_daemon && cl_log_depth <= 1) {
+        if(LogToLoggingDaemon(priority, buf, nbytes + 1, TRUE) != HA_OK){
+            /* uhm? */
+            char msg[] = "Logging overflow,"
+                " discarding logs until congestion eases";
+            if(last_log_failed == FALSE) {
+                cl_direct_log(LOG_WARNING, msg, TRUE, NULL,
+                              cl_process_pid, NULLTIME);
+                last_log_failed = TRUE;
+            }
+
+        } else {
+            last_log_failed = FALSE;
+        }


     }else {
+        /* this may cause blocking... maybe should make it optional? */
         cl_direct_log(priority, buf, TRUE, NULL, cl_process_pid, NULLTIME);
     }
- LogDone:
     cl_log_depth--;
     return;
 }
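
The warn-once logic in the patch can be modeled on its own, outside cl_log.c. This is only a sketch: on_send_result() and the counter are hypothetical stand-ins for the LogToLoggingDaemon() result and the single cl_direct_log() warning.

```c
#include <assert.h>
#include <stdbool.h>

static bool last_log_failed = false;
static int warnings_emitted = 0;  /* counts the stand-in warnings */

/* sent_ok models whether the non-blocking send to the daemon succeeded. */
static void on_send_result(bool sent_ok)
{
    if (!sent_ok) {
        if (!last_log_failed) {
            warnings_emitted++;      /* one warning per congestion episode */
            last_log_failed = true;
        }
        /* the overflowing message itself is discarded */
    } else {
        last_log_failed = false;     /* congestion eased: re-arm the warning */
    }
}
```

A run of fail, fail, ok, fail produces two warnings, one per episode. Note that in the patch the warning itself still goes through cl_direct_log(), so it can block once per episode.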
--
Andrew Beekhof
"No means no, and no means yes, and everything in between and all the
rest" - TISM