GUI Application HA

Alan Robertson alanr@unix.sh
Thu, 20 Feb 2003 15:40:33 -0700


Matthew Tedder wrote:
> I've discussed and refined an idea for making HA possible for GUI applications 
> such as word processing, email, etc. So if one server fails, the next picks 
> it up and the user doesn't notice (except that his/her last few 
> keystrokes/mouse-clicks may be undone).
> 
> Question:  Does this appropriately fit into the Linux-HA mailing list?  Or does 
> it need its own home?  HA of GUI apps is of very high importance for 
> us--both in our Linux-based Cyber Cafe and for thin-client based solutions.
> 
> With constant public use from 11:30am to 11:00pm (and often beyond), the whole 
> Red Hat server has frozen about once every three weeks.  It may not be much, 
> compared with Windows, but such large outages of desktop systems can be 
> unacceptable.  One Linux server (dual Athlon 1.4GHz w/ 512MB) can accommodate 
> about 80 client workstations comfortably--nearly triple that number if 
> Konqueror is used instead of Mozilla.  

How long does it take to recover from these freezes?  Is it the outage 
that's the problem, or the length of the outage?

> Our typical usage is mostly web browsing, secondly games, and thirdly word 
> processing/spreadsheet.  In our customer's spaces, it's mostly word 
> processing/spreadsheet, web browsing, and email.

As was mentioned before, this is VERY hard - and a very long way from being 
a solved problem - by anyone using commodity hardware or OSes.

It's pretty easy to fix the client failure problem by using VNC, but the 
server failure problem (in this scenario) is extraordinarily difficult, and 
probably expensive.  Would you cut your ratio down to 10 per server, or 
maybe 20 per server to solve this problem?

In all likelihood, it will involve changing every single application 
involved to do checkpoint-restart every few seconds (or several times a 
second), or going with a highly-customized kernel which probably crashes 
every few days instead of every few weeks.
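
To make "checkpoint-restart" concrete, here is a minimal sketch in Python of 
what each application would have to do for itself.  Everything in it is 
hypothetical and for illustration only: the CheckpointedDocument class, the 
JSON file format, and the assumption that the checkpoint file lives on storage 
the standby server can also read.  A real GUI application has far more state 
than a string of text (windows, cursor position, undo history, open network 
connections), which is a large part of why this is so hard.

```python
import json
import os
import tempfile


class CheckpointedDocument:
    """Toy editor state that saves itself so another process can resume it."""

    def __init__(self, path):
        # Hypothetical checkpoint location; assumed reachable from both servers.
        self.path = path
        self.text = ""

    def type_text(self, s):
        # Stand-in for the user's keystrokes.
        self.text += s

    def checkpoint(self):
        # Write to a temporary file in the same directory, then rename it
        # over the old checkpoint, so a crash in the middle of a write never
        # leaves a half-written checkpoint behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump({"text": self.text}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    @classmethod
    def restart(cls, path):
        # Called on the surviving server: resume from the last checkpoint,
        # or start empty if none was ever written.
        doc = cls(path)
        if os.path.exists(path):
            with open(path) as f:
                doc.text = json.load(f)["text"]
        return doc
```

Anything typed after the last call to checkpoint() is lost when restart() runs 
on the other machine, which matches the "last few keystrokes may be undone" 
behaviour described above.  Calling checkpoint() several times a second 
shrinks that window, but the cost is paid by every application for every user.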

Whatever your final solution, you'll want to think carefully about what your 
minimum requirements are.  Letting your ambitions get out of hand will 
likely cost you a great deal and make the goal much more difficult to achieve.

Could you make the crashes go away by throwing more hardware at it, so that 
it isn't so heavily loaded (and hence less likely to crash)?  This would be 
MUCH MUCH cheaper than fixing the problem in software.

Have you considered running an old but very stable kernel like 2.0.36 or 
something?

-- 
     Alan Robertson <alanr@unix.sh>

"Openness is the foundation and preservative of friendship....  Let me claim 
from you at all times your undisguised opinions." - William Wilberforce