[SGVLUG] process monitor and control

Claude Felizardo cafelizardo at gmail.com
Fri Jun 7 10:55:05 PDT 2013


Rae, I started looking at your link but didn't see anything about Nagios or
Monit.

Looks like that's two votes for Monit so long as I watch out for the
restart timing.

Any comments about Nagios?

Claude


On Tue, Jun 4, 2013 at 11:57 AM, Rae Yip <rae.yip at gmail.com> wrote:

> Monit is okay, but can be a bit tricky to tune properly. It can get
> into some restart loops if your processes have some start-up delay or
> complicated initialization state. This can also result in a lot of
> notification spam. That said, there's not really any other solution
> exactly in that niche.
>
> That's why I tend to factor monitoring and notification as separate
> functions from auto-restart/process watchdog. The latter is relatively
> simple to do with a wrapper script or a cron job, and a lot more
> customizable for poll intervals. Then you can set the monitoring and
> notification to check and warn on longer time intervals, reducing
> alert spam.
>
> There are also more heavy-duty "daemon supervisor" systems, some of
> which may come by default with your distro:
>
> http://tech.cueup.com/blog/2013/03/08/running-daemons/
>
> There isn't a clear winner yet, and some solutions seem tightly bound
> to what language you're using (which IMHO seems wrong).
>
> -Rae.
>
> On 6/3/13, Michael Proctor-Smith <mproctor13 at gmail.com> wrote:
> > I use Monit. It does what you are asking for: it supports lots of
> > types of monitoring and has a configurable monitor interval, and you can
> > solve your restart problem by calling a local command or making an HTTP
> > call.
> >
> >
> > On Mon, Jun 3, 2013 at 1:03 PM, Claude Felizardo
> > <cafelizardo at gmail.com> wrote:
> >
> >> Hey all,
> >>
> >> I'm looking for a package that will not only monitor processes but also
> >> restart them if needed, preferably with configurable check intervals and
> >> retry limits.
> >>
> >> Most of the existing monitoring here has been using Big Brother and they
> >> are starting to migrate things to Nagios but I'm not sure if it has a
> >> restart service capability.  For the stuff I work on, some processes and
> >> log monitoring have recently been added to BB but most are not being
> >> monitored.  When I do get a BB page, it's usually an obvious problem like
> >> a process has died or a log hasn't been updated in a while but quite often
> >> the process is still running, the log still being updated but only upon
> >> close examination can you determine that there is a problem.  Sometimes a
> >> restart might be overkill, just need to send the appropriate message into
> >> the system.  Some restarts need to be coordinated and it's annoying to get
> >> an alarm while you are restarting part of the system.
> >>
> >> Now about half of the programs are C++ but a lot of the newer ones are
> >> Java started via a shell script.  I also need to monitor a bunch of
> >> ActiveMQ servers, some of which are controlled by another group but I do
> >> need to know when they are offline so I can make sure my stuff is okay or
> >> restart some of my processes when the remote servers are back.
> >>
> >> Someone had suggested Monit which from the descriptions sounds like it
> >> might do the trick.
> >>
> >> Has anyone used either Nagios or Monit or can recommend something that
> >> does restart?  Needs to run on both Solaris and Linux.
> >>
> >> thanks,
> >> Claude
> >>
> >>
> >
>
>
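
For reference, here is a rough sketch of the kind of wrapper-script watchdog
Rae mentions above, written in Python. It is only an illustration under some
assumptions: the function name supervise and its parameters (max_restarts,
restart_delay, sleep) are made up for this example, it treats any non-zero
exit code as a failure, and it is not how Monit or Nagios work internally.
The delay between restarts is there to avoid the tight restart loops
described in the thread, and the retry limit covers the "retry limits" part
of the original question.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, restart_delay=5.0, sleep=time.sleep):
    """Run cmd and restart it after each non-zero exit.

    Stops when the command exits with 0 or when max_restarts restarts
    have been used up. Returns (last_exit_code, restarts_performed).
    """
    restarts = 0
    while True:
        # subprocess.call blocks until the child process exits.
        exit_code = subprocess.call(cmd)
        if exit_code == 0:
            return exit_code, restarts
        if restarts >= max_restarts:
            # Retry limit reached; let the caller decide whether to alert.
            return exit_code, restarts
        # Pause before restarting so a slow-starting or crashing process
        # is not relaunched in a tight loop.
        sleep(restart_delay)
        restarts += 1
```

A caller might run something like supervise(["/path/to/start.sh"],
max_restarts=5, restart_delay=30), where the script path is a placeholder,
and only send a notification if the returned exit code is non-zero. That
keeps alerting separate from the restart logic, as suggested above.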