[SGVLUG] process monitor and control

Claude Felizardo cafelizardo at gmail.com
Mon Jun 3 13:03:21 PDT 2013


Hey all,

I'm looking for a package that will not only monitor processes but also
restart them if needed, preferably with configurable check intervals and
retry limits.
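
To make the requirement concrete, here is a minimal Python sketch of the
behavior described above: poll a process at a configurable interval and
restart it when it has exited, up to a retry limit. It is only an
illustration under assumptions, not how Monit or Nagios work; the
supervise function and its parameters (check_interval, max_restarts,
sleep) are hypothetical names.

```python
import subprocess
import time


def supervise(cmd, check_interval=5.0, max_restarts=3, sleep=time.sleep):
    """Run cmd and restart it whenever it exits, up to max_restarts times.

    Hypothetical helper for illustration only. Returns the number of
    restarts performed before the retry limit was reached.
    """
    restarts = 0
    proc = subprocess.Popen(cmd)
    while True:
        sleep(check_interval)
        if proc.poll() is None:
            continue  # still running, nothing to do this cycle
        if restarts >= max_restarts:
            return restarts  # retry limit reached; hand off to a human
        restarts += 1
        proc = subprocess.Popen(cmd)
```

Note that this only catches a process that has exited. It does nothing
for a process that is still running but no longer doing useful work.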

Most of the existing monitoring here uses Big Brother (BB), and they are
starting to migrate things to Nagios, but I'm not sure whether Nagios has
a restart-service capability.  For the stuff I work on, some process and
log monitoring has recently been added to BB, but most of it is still not
monitored.  When I do get a BB page, it's usually for an obvious problem,
like a process that has died or a log that hasn't been updated in a
while.  Quite often, though, the process is still running and the log is
still being updated, and only on close examination can you tell that
there is a problem.  Sometimes a restart would be overkill and I just
need to send the appropriate message into the system.  Some restarts need
to be coordinated, and it's annoying to get an alarm while you are
restarting part of the system.
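
For illustration, the "log hasn't been updated in a while" check can be
sketched in a few lines of Python. This is an assumption about what such
a check looks like, not how BB implements it; log_is_stale and its
parameters are hypothetical names.

```python
import os
import time


def log_is_stale(path, max_age_seconds, now=None):
    """Return True if path is missing or was last modified too long ago.

    Hypothetical helper for illustration only. `now` defaults to the
    current time and exists mainly to make the check easy to test.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True  # a missing or unreadable log counts as stale
    return (now - mtime) > max_age_seconds
```

A check like this only covers the obvious case. A log that is still
being written to by a misbehaving process passes it, which is why the
content has to be examined as well.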

About half of the programs are C++, but a lot of the newer ones are Java
started via a shell script.  I also need to monitor a bunch of ActiveMQ
servers, some of which are controlled by another group.  I still need to
know when those are offline so I can make sure my stuff is okay, or
restart some of my processes once the remote servers are back.

Someone suggested Monit, which from the descriptions sounds like it
might do the trick.

Has anyone used either Nagios or Monit, or can anyone recommend something
that handles restarts?  It needs to run on both Solaris and Linux.

thanks,
Claude