[Linux-HA] possible race condition in OCF apache RA monitor
BBuckingham at BroadViewNet.com
Fri Oct 15 11:16:21 MDT 2010
>> The race condition is that when apache is started, it is possible for
>> httpd to have written its PID file, but not yet completed its
>> initialization to the point where the wget would succeed. I was able to
>> work around this problem by placing a simple "sleep 5" after starting
>> httpd and before the first call to monitor_apache().
>If that's the case, then the start action should loop on
>monitor_apache internally until that returns Ok.
>That way, start will only return once monitoring does actually work.
>Bonus: you get a start failure already, if monitoring is not configured
>... looking at the code ...
>Wait. It does that already, since May 2007.
start_apache() only loops monitoring if monitor_apache() returns
$OCF_NOT_RUNNING (7). monitor_apache() returned 1 ($OCF_ERR_GENERIC)
due to the control flow described above.
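
To illustrate, here is a minimal sketch of that behaviour. It is a
simplified stand-in, not the actual RA code: start_loop() and the stubbed
monitor_apache() are hypothetical, and the stub simply assumes the state
where the PID file exists but HTTP is not answering yet.

```shell
#!/bin/sh
# Sketch only: simplified stand-in for the start loop, not the real apache RA.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical stub: PID file is present but the HTTP check fails,
# so monitoring reports a generic error instead of "not running".
monitor_apache() {
    return $OCF_ERR_GENERIC
}

# Hypothetical start loop: retries only while monitoring says "not running".
start_loop() {
    tries=0
    while :; do
        rc=0
        monitor_apache || rc=$?
        # Any code other than OCF_NOT_RUNNING ends the loop immediately.
        [ "$rc" -eq "$OCF_NOT_RUNNING" ] || break
        tries=$((tries + 1))
        # Give up after 5 attempts (arbitrary limit for the sketch).
        [ "$tries" -lt 5 ] || break
        sleep 1
    done
    return "$rc"
}

start_rc=0
start_loop || start_rc=$?
echo "start returned $start_rc after $tries retries"
```

With the stub above, the loop exits on the first pass with
$OCF_ERR_GENERIC and never retries, which matches the failure I am seeing.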
I think what is needed is specific monitoring logic for apache startup
which tolerates the PID file being present for some period of time before
an HTTP request succeeds. Once apache is running, I agree that the
monitor_apache() function, which requires the PID file, a process matching
the PID, and a successful wget, is OK.
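
A rough sketch of the kind of startup-specific logic I have in mind is
below. The helper wait_for_apache_startup(), its two parameters, and the
stubbed monitor_apache() are all hypothetical names for illustration; the
real RA would call its own monitor function and pick its own timeout.

```shell
#!/bin/sh
# Sketch only: startup-time polling that tolerates a generic monitor error.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

calls=0
# Hypothetical stub: fails twice with a generic error (PID file present,
# HTTP not answering yet), then succeeds on the third call.
monitor_apache() {
    calls=$((calls + 1))
    if [ "$calls" -ge 3 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERR_GENERIC
}

# Hypothetical helper: poll monitor_apache up to $1 times, sleeping $2
# seconds (default 1) between attempts.
wait_for_apache_startup() {
    max_tries=$1
    interval=${2:-1}
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        rc=0
        monitor_apache || rc=$?
        case "$rc" in
            "$OCF_SUCCESS") return $OCF_SUCCESS ;;
            # During startup, both codes are treated as "still starting".
            "$OCF_NOT_RUNNING"|"$OCF_ERR_GENERIC") ;;
            # Anything else is passed straight back to the caller.
            *) return "$rc" ;;
        esac
        tries=$((tries + 1))
        sleep "$interval"
    done
    # Grace period used up without a successful check.
    return $OCF_ERR_GENERIC
}

wait_rc=0
wait_for_apache_startup 10 0 || wait_rc=$?
echo "startup wait returned $wait_rc after $calls monitor calls"
```

The point of the design is that the relaxed handling applies only inside
the start action; the regular monitor action would keep its strict checks.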