#375 Monit pro
Let’s say that we’ve successfully deployed a Rails application to production. How do we ensure that it stays up and running smoothly? We could set up a monitoring system to restart any processes that fail or which end up using too many resources and in this episode we’ll use Monit to do this. This can be configured to alert us or to restart a process when something on our server goes awry.
Installing Monit
First we’ll SSH in to the VPS that’s running our application. We’re using Ubuntu 12.04 on our server so the commands used might be different if you’re running a different version or distro. The easiest way to install Monit here is to use apt-get.
$ sudo apt-get update
$ sudo apt-get install monit
We now have Monit installed and running so let’s configure it. A configuration file will already have been set up at /etc/monit/monitrc. This file is well-commented and it’s worth taking the time to read through it. We’ll delete the comments in this file so that we can see what the default configuration options are.
set daemon 120            # check services at 2-minute intervals
set logfile /var/log/monit.log
set idfile /var/lib/monit/id
set statefile /var/lib/monit/state
set eventqueue
    basedir /var/lib/monit/events # set the base directory where events will be stored
    slots 100                     # optionally limit the queue size
include /etc/monit/conf.d/*
The first line above tells Monit to run in the background and to perform a check every two minutes. We’ll change this to 30 seconds so that the application is checked more frequently. The next few lines configure various file paths and then we have an eventqueue setting that tells Monit to remember any alert messages that it can’t send over email. The last line tells Monit to include any configuration files that appear in the conf.d directory. We don’t have any files in this directory but if we want to configure Monit for various processes such as Nginx or Unicorn this is a good place to put them.
That’s all we need to do to this configuration file and next we’ll instruct Monit to monitor our Rails application. The manual is a good resource when editing the configuration and there’s a Configuration Examples page that is useful. This tells us that we can use the check process command to do this, passing in a unique name and a pidfile. Monit will then check this process and we can tell it what to do if the process fails. We could set up this configuration directly on the server by adding files to the /etc/monit/conf.d directory, but it’s generally better to keep the configuration files within our Rails application under source control than to do extensive configuration on the server.
Monitoring Nginx
It’s a good idea to create Capistrano recipes for setting up the server, like we did in episode 337. This way we can create templates for the configuration files which can be generated and copied to any server using Capistrano. As there are multiple configuration files for Monit we’ll make a new monit directory under /config/recipes/templates. Our first file will be for configuring Nginx. Note that this is an erb file so that we can add dynamic content. Our goal here is to monitor Nginx and restart it if it fails. We do this by calling check process, giving it a name and passing in a pidfile option with the path to Nginx’s pid. We could make this path dynamic with some erb if we wanted to configure it through Capistrano, but we won’t do that here. We also need to tell Monit how to start and stop the program.
check process nginx with pidfile /var/run/nginx.pid
  start program = "/etc/init.d/nginx start"
  stop program = "/etc/init.d/nginx stop"
This is all the code that’s necessary to ensure that Nginx stays up and running. We need to copy this config file over to the server and we’ll do this through a Capistrano recipe. We’ll create a monit.rb file for this.
namespace :monit do
  desc "Install Monit"
  task :install do
    run "#{sudo} apt-get -y install monit"
  end
  after "deploy:install", "monit:install"

  desc "Setup all Monit configuration"
  task :setup do
    monit_config "nginx"
    syntax
    reload
  end
  after "deploy:setup", "monit:setup"

  task(:nginx, roles: :web) { monit_config "nginx" }
  task(:postgresql, roles: :db) { monit_config "postgresql" }
  task(:unicorn, roles: :app) { monit_config "unicorn" }

  %w[start stop restart syntax reload].each do |command|
    desc "Run Monit #{command} script"
    task command do
      run "#{sudo} service monit #{command}"
    end
  end
end

def monit_config(name, destination = nil)
  destination ||= "/etc/monit/conf.d/#{name}.conf"
  template "monit/#{name}.erb", "/tmp/monit_#{name}"
  run "#{sudo} mv /tmp/monit_#{name} #{destination}"
  run "#{sudo} chown root #{destination}"
  run "#{sudo} chmod 600 #{destination}"
end
Here we define several tasks, one to install Monit in case we’re installing this on a new server and another to set up Monit by copying over the configuration files. The setup task calls a monit_config method which is defined at the bottom of the file. We pass this the configuration file’s name, which determines the template that will be rendered. The method then moves the rendered file into the conf.d directory and sets its ownership and permissions. The setup task will then run two other tasks: syntax to check the syntax of the configuration file, then reload to reload the config files. Both of these tasks delegate to the service monit command with the task’s name passed in. If you’re unfamiliar with defining Capistrano tasks like this there’s more information about this in episode 337. To get this working we have to include it in our deploy.rb file so we’ll add it to the list of recipes that we load in.
load "config/recipes/monit"
To get this file over to the server we just need to run the cap monit:setup task. This will copy over the nginx.conf file, check the syntax and then reload Monit so that Nginx is monitored.
Let’s try this out. Our application is currently up and running but if we SSH into our server and kill nginx then it will no longer work. Monit will detect that Nginx is down within 30 seconds and if we reload the page 30 seconds later the site will be up again.
There’s a lot more that we can do inside this Nginx config file. A common line to use when checking Nginx is one to check that the number of child processes is greater than 250 and to restart the server if so. This way if Nginx gets stuck spawning child processes we can restart it.
check process nginx with pidfile /var/run/nginx.pid
start program = "/etc/init.d/nginx start"
stop program = "/etc/init.d/nginx stop"
if children > 250 then restart
Whenever we have conditions that trigger a restart it’s a good idea to handle the situations where that condition is continually met to prevent Monit from continually trying to restart the server. We can do this by adding another line to this file.
check process nginx with pidfile /var/run/nginx.pid
start program = "/etc/init.d/nginx start"
stop program = "/etc/init.d/nginx stop"
if children > 250 then restart
if 5 restarts within 5 cycles then timeout
Monit will now stop monitoring this process if it is restarting too often. A cycle here is equivalent to a check so the duration of this is dependent on what we set in the monitrc file. Earlier in this episode we set this to 30 seconds so Monit will stop monitoring our server if it restarts five times within two-and-a-half minutes.
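The length of that window is just the number of cycles multiplied by the daemon interval. As a quick sanity check, here is a minimal Ruby sketch of the arithmetic. The timeout_window method is a made-up helper for illustration and is not part of Monit.

```ruby
# Hypothetical helper, not part of Monit: converts a number of Monit
# cycles into seconds for a given "set daemon" interval.
def timeout_window(cycles, daemon_interval_seconds)
  cycles * daemon_interval_seconds
end

default_window = timeout_window(5, 120) # stock monitrc: 2-minute cycles
our_window     = timeout_window(5, 30)  # our monitrc: 30-second cycles

puts "default: #{default_window}s, ours: #{our_window}s"
```

With the stock two-minute interval the same rule would cover ten minutes, so it is worth revisiting cycle-based conditions whenever the daemon interval changes.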
Monitoring The Database
We’ll move on now to monitoring something else. Our application connects to a Postgres database so we’ll create a new file to monitor that. The code we need here is similar to the code for monitoring Nginx.
check process postgresql with pidfile <%= postgresql_pid %>
start program = "/etc/init.d/postgresql start"
stop program = "/etc/init.d/postgresql stop"
if failed host localhost port 5432 protocol pgsql then restart
if 5 restarts within 5 cycles then timeout
There are a couple of changes here. We’re using erb to pass in the path to the pid file (this is defined in the Postgres recipe file and includes the version number). This makes this more flexible if we change the version we’re running. We also check to see that the database is responding on port 5432. (We could move the port number into the recipe file, too, to make it easier to change the port number.) If Postgres doesn’t respond on that port we restart its process and again we stop monitoring if we have to restart it five times in five cycles.
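To see what this template turns into, and how the port could be pulled out as well, here is a rough sketch that renders it locally with Ruby’s standard ERB library. The pid path is only an assumed example value and postgresql_port is a hypothetical variable that the recipe in this episode does not define.

```ruby
require "erb"

# Assumed value for illustration only; in the episode this comes from the
# Postgres recipe file and depends on the installed version.
postgresql_pid = "/var/run/postgresql/9.1-main.pid"
# Hypothetical extra variable so that the port is no longer hard-coded.
postgresql_port = 5432

# Single-quoted heredoc so that Ruby leaves the ERB tags untouched.
template = <<~'MONIT'
  check process postgresql with pidfile <%= postgresql_pid %>
    start program = "/etc/init.d/postgresql start"
    stop program = "/etc/init.d/postgresql stop"
    if failed host localhost port <%= postgresql_port %> protocol pgsql then restart
    if 5 restarts within 5 cycles then timeout
MONIT

# binding gives the template access to the local variables above.
rendered = ERB.new(template).result(binding)
puts rendered
```

Running this prints the plain Monit configuration with both values filled in, which is a cheap way to check a template before deploying it.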
Now that we have a config file for Postgres we need to copy it to the server when we run the monit:setup task. We could just add this to the setup task with another call to monit_config but the database might be running on a different server from Nginx. We can handle this scenario with Capistrano roles by copying each config file in its own task that is restricted to the relevant role.
namespace :monit do
  desc "Install Monit"
  task :install do
    run "#{sudo} apt-get -y install monit"
  end
  after "deploy:install", "monit:install"

  desc "Setup all Monit configuration"
  task :setup do
    nginx
    postgresql
    syntax
    reload
  end
  after "deploy:setup", "monit:setup"

  task(:nginx, roles: :web) { monit_config "nginx" }
  task(:postgresql, roles: :db) { monit_config "postgresql" }
  # Rest of file omitted
end
If we do have multiple servers set up in Capistrano this will copy the correct Monit configs over to each server.
Unicorn
Next we’ll move on to monitoring Unicorn which is what runs our Rails app. We’ll create a new Monit config file containing the code to monitor the Unicorn process.
check process <%= application %>_unicorn with pidfile <%= unicorn_pid %>
  start program = "/etc/init.d/unicorn_<%= application %> start"
  stop program = "/etc/init.d/unicorn_<%= application %> stop"
There are a couple of things to note about this code. First we use the application name so that if we do have multiple applications running on the same server there won’t be any collision and they’ll each have a unique name. We’re keeping this monitoring simple as we’re more interested in monitoring the child processes which are more likely to leak memory and require a restart. The tricky part with this is that the child processes don’t have a pid file to point to but we can get around this issue by writing our own pidfile when a child process is spawned. We can do this in the Unicorn config file. In episode 373 we set up an after_fork block to handle zero-downtime deployment. We can use this to write a pidfile as well as it’s triggered whenever Unicorn spawns a worker.
after_fork do |server, worker|
  # Start up the database connection again in the worker
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection
  end
  # Write a pidfile per worker, e.g. unicorn.0.pid, so that Monit can find it
  child_pid = server.config[:pid].sub(".pid", ".#{worker.nr}.pid")
  system("echo #{Process.pid} > #{child_pid}")
end
The last two lines here take the path to the Unicorn master’s pidfile, insert the worker number into it and then write the worker’s process id to that path. This means that we’re able to reference each child process through our Monit config and we’ll add the code to do this now.
<% unicorn_workers.times do |n| %>
  <% pid = unicorn_pid.sub(".pid", ".#{n}.pid") %>
  check process <%= application %>_unicorn_worker_<%= n %> with pidfile <%= pid %>
    start program = "/bin/true"
    stop program = "/usr/bin/test -s <%= pid %> && /bin/kill -QUIT `cat <%= pid %>`"
    if mem > 200.0 MB for 1 cycles then restart
    if cpu > 50% for 3 cycles then restart
    if 5 restarts within 5 cycles then timeout
<% end %>
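Monit can only find a worker if the path generated in this template matches the one written in after_fork. As a rough check, the sketch below renders a trimmed-down copy of the loop (only the check line is kept) with Ruby’s standard ERB library and compares the result with the worker pid paths. The application name, pid path and worker count are assumed values for illustration. In the episode they come from the Capistrano recipes.

```ruby
require "erb"

# Assumed values for illustration only.
application     = "blog"
unicorn_pid     = "/home/deployer/apps/blog/current/tmp/pids/unicorn.pid"
unicorn_workers = 2

# Trimmed-down copy of the template loop. The single-quoted heredoc stops
# Ruby from interpolating #{n} before ERB evaluates it.
template = <<~'MONIT'
  <% unicorn_workers.times do |n| -%>
  <% pid = unicorn_pid.sub(".pid", ".#{n}.pid") -%>
  check process <%= application %>_unicorn_worker_<%= n %> with pidfile <%= pid %>
  <% end -%>
MONIT

rendered = ERB.new(template, trim_mode: "-").result(binding)

# The paths a worker-numbered pidfile scheme (unicorn.0.pid, unicorn.1.pid)
# would produce for the same master pid path.
worker_pids = unicorn_workers.times.map { |nr| unicorn_pid.sub(".pid", ".#{nr}.pid") }

puts rendered
puts worker_pids
```

If the two naming schemes ever drift apart, Monit will report the workers as not running even though Unicorn is healthy, so a check like this is cheap insurance.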
This template isn’t the prettiest code but it means that we can keep everything contained in one file. It works by looping through the number of Unicorn workers (this is a variable that’s set in the unicorn.rb recipe file) and checking each of these child workers. We point to a pid path that is generated in the same way as before, by taking the Unicorn master pid path and adding the worker number to it. To start the worker we run /bin/true which does essentially nothing: we don’t want Monit to manage starting up the child processes as the master Unicorn process handles this automatically for us. We’re more interested in how it quits the child processes. This is done by checking that the pidfile exists and is not empty, then sending the QUIT signal to the pid it contains. We then check to see if the process is using more than a specified amount of memory or CPU and restart it if so. Finally we check to see if the process is restarting too often and if so stop monitoring it. The final step is to adjust our Capistrano recipe so that it has a Unicorn Monit setup task on the app role.
task :setup do
  nginx
  postgresql
  unicorn
  syntax
  reload
end
after "deploy:setup", "monit:setup"

task(:nginx, roles: :web) { monit_config "nginx" }
task(:postgresql, roles: :db) { monit_config "postgresql" }
task(:unicorn, roles: :app) { monit_config "unicorn" }
We can now deploy these changes. Since we’ve adjusted the Unicorn config we’ll run the Unicorn setup task to copy the new configuration file over and also restart Unicorn before running the monit:setup task.
$ cap unicorn:setup unicorn:restart monit:setup
This will copy over the configuration files and restart Unicorn and Monit.
Monitoring Monit
If our Rails app acts up now Monit will spot this and will restart it automatically for us. That said we shouldn’t ignore Monit and have it blindly acting on its own. It’s a good idea to keep tabs on what it’s doing and there are several ways that we can do this. One way is to occasionally check the log file to see if any errors have occurred.
$ sudo tail /var/log/monit.log
This is a fairly passive way to monitor errors so instead we’ll set up an email alert system so that we’re notified immediately when an error occurs. We could configure this through the monitrc file but instead we’ll keep the configuration changes in our application’s source code and add a new monitrc.erb file to our Monit templates.
set daemon 30
set logfile /var/log/monit.log
set idfile /var/lib/monit/id
set statefile /var/lib/monit/state
set eventqueue
    basedir /var/lib/monit/events
    slots 100
set mailserver smtp.gmail.com port 587
    username "foo@example.com" password "secret"
    using tlsv1
    with timeout 30 seconds
set alert ryan@railscasts.com
include /etc/monit/conf.d/*
Most of this is the same as the other monitrc file apart from a couple of changes related to email. We set the mailserver settings to use Gmail’s SMTP server. The username and password are hard-coded here but we could set them through environment variables and use erb to read them here. We also set the alert to define who the alerts are sent to. We’ll need to copy this file to every server and we’ll do this in the setup task.
task :setup do
  monit_config "monitrc", "/etc/monit/monitrc"
  nginx
  postgresql
  unicorn
  syntax
  reload
end
We pass two arguments to monit_config. The second is a destination and this defaults to a filename in the /etc/monit/conf.d directory based on the name. We don’t want that default here so we’ve passed in the second argument. We have control over when the alerts are sent and can trigger them on given conditions as we would with a restart. Below is an example of how we could do a performance check on the entire system to see the load average, memory and CPU usage.
check system blog_server
  if loadavg(5min) > 2 for 2 cycles then alert
  if memory > 75% for 2 cycles then alert
  if cpu(user) > 75% for 2 cycles then alert
Some processes, such as the Unicorn workers, may be noisy in sending alerts. For these cases we can restrict the alerts to certain events. We can do this by calling alert, passing in an email address and telling it which events we want to be alerted on. We’ll tell Monit to only alert us about Unicorn changes if the pid changes two times within sixty seconds. This way we can be notified if our Unicorn servers are restarting frequently but not when minor things happen to the workers.
Another way to keep tabs on Monit is through a web interface. We can enable this in the monitrc file by adding some configuration like this.
set httpd port 2812
    allow admin:"secret"
This starts up the HTTPD server on the port that we specify and we can define the permissions for those users who are allowed to access the interface. Again we’ve hard-coded values here but we could use erb code to make this dynamic. We can try this by running monit:setup again to copy our monitrc file to the server. Once we’ve done so if we visit port 2812 on the server and enter the username and password we’ll be taken to the Monit Service Manager. This gives us details about the various processes that Monit is monitoring.
We can see further details by clicking on one of the listed processes.
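Both the mail server credentials and the web interface password in monitrc.erb were hard-coded. One way to make them dynamic, sketched here under assumptions, is to read them from environment variables inside the erb template. The variable names below are hypothetical, and the defaults are only set so that the sketch runs on its own. In practice the variables would be exported in the shell that runs Capistrano.

```ruby
require "erb"

# Hypothetical variable names. The fallback values exist only so that the
# sketch is runnable without any setup.
ENV["MONIT_SMTP_USER"]      ||= "foo@example.com"
ENV["MONIT_SMTP_PASSWORD"]  ||= "secret"
ENV["MONIT_HTTPD_PASSWORD"] ||= "secret"

# Excerpt of a monitrc.erb that reads its secrets from the environment.
# ENV.fetch raises a KeyError if a variable is missing, which fails the
# deploy early instead of writing an empty password.
template = <<~'MONIT'
  set mailserver smtp.gmail.com port 587
      username "<%= ENV.fetch("MONIT_SMTP_USER") %>" password "<%= ENV.fetch("MONIT_SMTP_PASSWORD") %>"
      using tlsv1
      with timeout 30 seconds
  set httpd port 2812
      allow admin:"<%= ENV.fetch("MONIT_HTTPD_PASSWORD") %>"
MONIT

rendered = ERB.new(template).result(binding)
puts rendered
```

This keeps the secrets out of source control while the rest of the template stays in the repository. The rendered file on the server still contains them in plain text, which is one reason monit_config sets the file’s permissions to 600.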
It’s also worth taking a look at M/Monit. This provides a slick interface for managing multiple Monit servers at once. We can handle alerts through this and there’s even an iPhone app available. This is a paid product, however, so you’ll need to read the licensing agreement before deciding whether or not to use it.
How does Monit compare to other monitoring solutions such as God or Bluepill? Both of these are written in Ruby and their configuration is much more flexible. They do require more memory to run, however, which may be reason enough for you to stick with Monit.