#375 Monit pro

Aug 16, 2012 | 16 minutes | Tools, Deployment, Production

Monit can help ensure your Rails app stays up and running smoothly. Here I show how to set it up, receive alerts, and keep tabs on it through a web interface.

Let’s say that we’ve successfully deployed a Rails application to production. How do we ensure that it stays up and running smoothly? One option is a monitoring system that restarts any process that fails or uses too many resources, and in this episode we’ll use Monit for this. It can be configured to alert us or to restart a process when something on our server goes awry.

Installing Monit

First we’ll SSH in to the VPS that’s running our application. We’re using Ubuntu 12.04 on our server so the commands used might be different if you’re running a different version or distro. The easiest way to install Monit here is to use apt-get.

          terminal
        
$ sudo apt-get update
$ sudo apt-get install monit

We now have Monit installed and running so let’s configure it. A configuration file will already have been set up at /etc/monit/monitrc. This file is well-commented and it’s worth taking the time to read through it. We’ll delete the comments in this file so that we can see what the default configuration options are.

          /etc/monit/monitrc
        
set daemon 120  # check services at 2-minute intervals
set logfile /var/log/monit.log
set idfile /var/lib/monit/id
set statefile /var/lib/monit/state
set eventqueue
  basedir /var/lib/monit/events # set the base directory where events will be stored
  slots 100                     # optionally limit the queue size
include /etc/monit/conf.d/*

The first line above tells Monit to run in the background and to perform a check every two minutes. We’ll change this to 30 seconds so that the application is checked more frequently. The next few lines configure various file paths and then we have an eventqueue setting that tells Monit to remember any alert messages that it can’t send over email. The last line tells Monit to include any configuration files that appear in the conf.d directory. We don’t have any files in this directory but if we want to configure Monit for various processes such as Nginx or Unicorn this is a good place to put them.

That’s all we need to do to this configuration file and next we’ll instruct Monit to monitor our Rails application. The manual is a good resource when editing the configuration and there’s a Configuration Examples page that is useful. This tells us that we can use the check process statement to do this, passing in a unique name and a pidfile. Monit will then check this process and we can tell it what to do if the process fails. We could set up this configuration directly on the server by adding files to the /etc/monit/conf.d directory but it’s generally better to avoid extensive configuration on the server and instead keep the configuration files within our Rails application, under source control.

Monitoring Nginx

It’s a good idea to create Capistrano recipes for setting up the server, like we did in episode 337. This way we can create templates for the configuration files which can be generated and copied to any server using Capistrano. As there are multiple configuration files for Monit we’ll make a new monit directory under /config/recipes/templates. Our first file will be for configuring Nginx. Note that this is an erb file so that we can add dynamic content. Our goal here is to monitor Nginx and restart it if it fails. We do this by calling check process, giving it a name and passing in a pidfile option with the path to Nginx’s pid. We could make this path dynamic with some erb if we wanted to configure it through Capistrano, but we won’t do that here. We also need to tell Monit how to start and stop the program.

          /config/recipes/templates/monit/nginx.erb
        
check process nginx with pidfile /var/run/nginx.pid
  start program = "/etc/init.d/nginx start"
  stop program = "/etc/init.d/nginx stop"

This is all the code that’s necessary to ensure that Nginx stays up and running. We need to copy this config file over to the server and we’ll do this through a Capistrano recipe. We’ll create a monit.rb file for this.

          /config/recipes/monit.rb
        
namespace :monit do
  desc "Install Monit"
  task :install do
    run "#{sudo} apt-get -y install monit"
  end
  after "deploy:install", "monit:install"

  desc "Setup all Monit configuration"
  task :setup do
    monit_config "nginx"
    syntax
    reload
  end
  after "deploy:setup", "monit:setup"
  
  task(:nginx, roles: :web) { monit_config "nginx" }
  task(:postgresql, roles: :db) { monit_config "postgresql" }
  task(:unicorn, roles: :app) { monit_config "unicorn" }

  %w[start stop restart syntax reload].each do |command|
    desc "Run Monit #{command} script"
    task command do
      run "#{sudo} service monit #{command}"
    end
  end
end

def monit_config(name, destination = nil)
  destination ||= "/etc/monit/conf.d/#{name}.conf"
  template "monit/#{name}.erb", "/tmp/monit_#{name}"
  run "#{sudo} mv /tmp/monit_#{name} #{destination}"
  run "#{sudo} chown root #{destination}"
  run "#{sudo} chmod 600 #{destination}"
end

Here we define several tasks: one to install Monit, in case we’re setting up a new server, and another to set up Monit by copying over the configuration files. The setup task calls a monit_config method which is defined at the bottom of the file. We pass it the configuration file’s name, which determines the template to render. The method then moves the generated file into the conf.d directory and sets its ownership and permissions. The setup task then runs two other tasks: syntax, to check the syntax of the configuration, and reload, to reload the config files. Both of these delegate to the service monit command with the task name passed in. If you’re unfamiliar with defining Capistrano tasks like this there’s more information in episode 337. To get this working we have to include the recipe in our deploy.rb file so we’ll add it to the list of recipes that we load in.

          /config/deploy.rb
        
load "config/recipes/monit"
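
The monit_config method leans on a template helper that isn’t defined in this recipe (it comes from the shared recipes covered in episode 337). As a rough idea of the rendering half of such a helper, here is a minimal sketch using Ruby’s ERB. The render_template method and the nginx_pid variable are assumptions made for illustration, and the sketch skips the upload step entirely. It also shows how the pid path could be made dynamic, as mentioned earlier.

```ruby
require "erb"

# Hypothetical stand-in for the rendering step of the `template` helper.
# It only turns an ERB string into text; uploading is left out.
def render_template(source, variables = {})
  ERB.new(source).result_with_hash(variables)
end

# Same content as nginx.erb, with the pid path supplied by a variable.
# `nginx_pid` is an assumed name, not one defined in the recipes above.
NGINX_TEMPLATE = <<~MONIT
  check process nginx with pidfile <%= nginx_pid %>
    start program = "/etc/init.d/nginx start"
    stop program = "/etc/init.d/nginx stop"
MONIT

puts render_template(NGINX_TEMPLATE, nginx_pid: "/var/run/nginx.pid")
```

Rendering with the default pid path produces the same three lines as the static nginx.erb file, so switching to a variable later wouldn’t change what ends up on the server.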

To get this file over to the server we just need to run the cap monit:setup task. This will copy over the nginx.conf file, check the syntax and then reload Monit so that Nginx is monitored.

Let’s try this out. Our application is currently up and running but if we SSH into our server and kill nginx then it will no longer work. Monit will detect that Nginx is down within 30 seconds and if we reload the page 30 seconds later the site will be up again.

There’s a lot more that we can do inside this Nginx config file. A common line to use when checking Nginx is one to check that the number of child processes is greater than 250 and to restart the server if so. This way if Nginx gets stuck spawning child processes we can restart it.

          /config/recipes/templates/monit/nginx.erb
        
check process nginx with pidfile /var/run/nginx.pid
  start program = "/etc/init.d/nginx start"
  stop program = "/etc/init.d/nginx stop"
  if children > 250 then restart

Whenever we have conditions that trigger a restart it’s a good idea to handle the situations where that condition is continually met to prevent Monit from continually trying to restart the server. We can do this by adding another line to this file.

          /config/recipes/templates/monit/nginx.erb
        
check process nginx with pidfile /var/run/nginx.pid
  start program = "/etc/init.d/nginx start"
  stop program = "/etc/init.d/nginx stop"  
  if children > 250 then restart
  if 5 restarts within 5 cycles then timeout

Monit will now stop monitoring this process if it is restarting too often. A cycle here is equivalent to a check so the duration of this is dependent on what we set in the monitrc file. Earlier in this episode we set this to 30 seconds so Monit will stop monitoring our server if it restarts five times within two-and-a-half minutes.
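
Since these windows are counted in cycles rather than seconds, it helps to do the arithmetic whenever the daemon interval changes. A small sketch, where window_seconds is a hypothetical helper name:

```ruby
# Length of one Monit cycle, matching `set daemon 30` in monitrc.
CYCLE_SECONDS = 30

# Wall-clock time covered by a "within N cycles" clause.
def window_seconds(cycles, cycle_seconds = CYCLE_SECONDS)
  cycles * cycle_seconds
end

puts window_seconds(5)       # 5 cycles at 30 seconds each: 150
puts window_seconds(5, 120)  # the same rule with the 2-minute default: 600
```

With the packaged default of 120 seconds the same rule would cover ten minutes, so it’s worth revisiting these thresholds if the interval is changed.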

Monitoring The Database

We’ll move on now to monitoring something else. Our application connects to a Postgres database so we’ll create a new file to monitor that. The code we need here is similar to the code for monitoring Nginx.

          /config/recipes/templates/monit/postgresql.erb
        
check process postgresql with pidfile <%= postgresql_pid %>
  start program = "/etc/init.d/postgresql start"
  stop program = "/etc/init.d/postgresql stop"
  if failed host localhost port 5432 protocol pgsql then restart
  if 5 restarts within 5 cycles then timeout

There are a couple of changes here. We’re using erb to pass in the path to the pid file (this is defined in the Postgres recipe file and includes the version number), which keeps the template flexible if we change the version we’re running. We also check to see that the database is responding on port 5432. (We could move the port number into the recipe file, too, to make it easier to change.) If Postgres doesn’t respond on that port we restart its process and again we stop monitoring if we have to restart it five times in five cycles.
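
To illustrate moving the port into a variable as well, here is a sketch that renders the same template with both values supplied. The postgresql_port variable and the postgresql_pid_for helper are assumptions, as is the pid path layout, which follows the convention Ubuntu’s packages use.

```ruby
require "erb"

# Hypothetical helper: builds the pid path from the PostgreSQL version.
# The directory layout is an assumption based on Ubuntu's packaging.
def postgresql_pid_for(version)
  "/var/run/postgresql/#{version}-main.pid"
end

# The template from above with the port pulled out into a variable.
POSTGRESQL_TEMPLATE = <<~MONIT
  check process postgresql with pidfile <%= postgresql_pid %>
    start program = "/etc/init.d/postgresql start"
    stop program = "/etc/init.d/postgresql stop"
    if failed host localhost port <%= postgresql_port %> protocol pgsql then restart
    if 5 restarts within 5 cycles then timeout
MONIT

def render_postgresql_config(version:, port: 5432)
  ERB.new(POSTGRESQL_TEMPLATE).result_with_hash(
    postgresql_pid: postgresql_pid_for(version),
    postgresql_port: port
  )
end

puts render_postgresql_config(version: "9.1")
```

Upgrading Postgres or moving it to another port then only means changing a value in the recipe file rather than editing the template.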

Now that we have a config file for Postgres we need to copy it to the server when we run the monit:setup task. We could just add this to the setup task with another call to monit_config but the database might be running on a different server from Nginx. We can handle this scenario with Capistrano roles so that each config file is copied by its own task, restricted to the relevant role.

          /config/recipes/monit.rb
        
namespace :monit do
  desc "Install Monit"
  task :install do
    run "#{sudo} apt-get -y install monit"
  end
  after "deploy:install", "monit:install"

  desc "Setup all Monit configuration"
  task :setup do
    nginx
    postgresql
    syntax
    reload
  end
  after "deploy:setup", "monit:setup"
  
  task(:nginx, roles: :web) { monit_config "nginx" }
  task(:postgresql, roles: :db) { monit_config "postgresql" }
 
  # Rest of file omitted
end

If we do have multiple servers set up in Capistrano this will copy over the correct Monit configs to each server.
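
To make the effect of the roles concrete, here is a plain Ruby simulation of which config would land on which host. It doesn’t use Capistrano at all, and the hostnames and the configs_by_host helper are hypothetical.

```ruby
# Hypothetical server layout; the hostnames are placeholders.
ROLES = {
  web: ["web1.example.com"],
  db:  ["db1.example.com"]
}

# The role each Monit config task is restricted to, as in the recipe above.
CONFIG_ROLES = { "nginx" => :web, "postgresql" => :db }

# Groups config names by the hosts that would receive them.
def configs_by_host(roles, config_roles)
  result = Hash.new { |hash, host| hash[host] = [] }
  config_roles.each do |config, role|
    roles.fetch(role, []).each { |host| result[host] << config }
  end
  result
end

configs_by_host(ROLES, CONFIG_ROLES).each do |host, configs|
  puts "#{host}: #{configs.join(', ')}"
end
```

If both roles pointed at the same host, that host would simply receive both files, which matches the single-server setup used in this episode.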

Unicorn

Next we’ll move on to monitoring Unicorn which is what runs our Rails app. We’ll create a new Monit config file containing the code to monitor the Unicorn process.

          /config/recipes/templates/monit/unicorn.erb
        
check process <%= application %>_unicorn with pidfile <%= unicorn_pid %>
  start program = "/etc/init.d/unicorn_<%= application %> start"
  stop program = "/etc/init.d/unicorn_<%= application %> stop"

There are a couple of things to note about this code. First we use the application name so that if we do have multiple applications running on the same server there won’t be any collision and they’ll each have a unique name. We’re keeping this monitoring simple as we’re more interested in monitoring the child processes which are more likely to leak memory and require a restart. The tricky part with this is that the child processes don’t have a pid file to point to but we can get around this issue by writing our own pidfile when a child process is spawned. We can do this in the Unicorn config file. In episode 373 we set up an after_fork block to handle zero-downtime deployment. We can use this to write a pidfile as well as it’s triggered whenever Unicorn spawns a worker.

          /config/recipes/templates/unicorn.rb.erb
        
after_fork do |server, worker|
  # Start up the database connection again in the worker
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection
  end
  # Write a pidfile per worker, e.g. unicorn.0.pid, matching the path Monit checks
  child_pid = server.config[:pid].sub(".pid", ".#{worker.nr}.pid")
  system("echo #{Process.pid} > #{child_pid}")
end

The last two lines here take the path to Unicorn’s pid file, insert the worker number into it, then write the worker’s process id to the file at that path. This means that we’re able to reference each child process through our Monit config and we’ll add the code to do this now.

          /config/recipes/templates/monit/unicorn.erb
        
<% unicorn_workers.times do |n| %>
  <% pid = unicorn_pid.sub(".pid", ".#{n}.pid") %>
  check process <%= application %>_unicorn_worker_<%= n %> with pidfile <%= pid %>
    start program = "/bin/true"
    stop program = "/usr/bin/test -s <%= pid %> && /bin/kill -QUIT `cat <%= pid %>`"
    if mem > 200.0 MB for 1 cycles then restart
    if cpu > 50% for 3 cycles then restart
    if 5 restarts within 5 cycles then timeout
<% end %>

This isn’t the prettiest code but it means that we can keep everything contained in one file. It works by looping through the number of Unicorn workers (this is a variable that’s set in the unicorn.rb recipe file) and checking each of these child workers. We point to a pid path that is generated in the same way as before, by taking the Unicorn master pid path and adding the worker number to it. To start the worker we run /bin/true, which does essentially nothing. We don’t want Monit to manage starting up the child processes as the master Unicorn process handles this automatically for us. We’re more interested in how it quits the child processes. This is done by checking that the pidfile exists and is not empty, then sending the QUIT signal to the pid it contains. We then check to see if the process is using more than a specified amount of memory or CPU and restart it if so. Finally we check to see if the process is restarting too often and if so stop monitoring it. The final step is to adjust our Capistrano recipe so that it has a Unicorn Monit setup task on the app role.

          /config/recipes/monit.rb
        
task :setup do
  nginx
  postgresql
  unicorn
  syntax
  reload
end
after "deploy:setup", "monit:setup"
  
task(:nginx, roles: :web) { monit_config "nginx" }
task(:postgresql, roles: :db) { monit_config "postgresql" }
task(:unicorn, roles: :app) { monit_config "unicorn" }
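
One thing to watch with the worker pidfiles is that the after_fork hook and the Monit template each derive the path on their own, so the two have to agree exactly. A sketch of a shared helper follows. The method names are hypothetical, and writing the file with File.write is a swapped-in alternative to shelling out to echo.

```ruby
require "tmpdir"

# Derives a worker's pidfile path from the master's, for example
# "/tmp/unicorn.pid" becomes "/tmp/unicorn.0.pid". Anchoring the match at the
# end of the string leaves any earlier ".pid" in the path untouched.
def worker_pid_path(master_pid_path, worker_number)
  master_pid_path.sub(/\.pid\z/, ".#{worker_number}.pid")
end

# What an after_fork hook could do instead of calling `echo` through system.
def write_worker_pid(master_pid_path, worker_number, pid = Process.pid)
  path = worker_pid_path(master_pid_path, worker_number)
  File.write(path, "#{pid}\n")
  path
end

# Demonstration in a throwaway directory.
Dir.mktmpdir do |dir|
  path = write_worker_pid(File.join(dir, "unicorn.pid"), 0)
  puts "#{File.basename(path)} contains #{File.read(path).strip}"
end
```

Inside the hook this would be called as write_worker_pid(server.config[:pid], worker.nr), and the Monit template could use the same substitution so that both sides stay in step.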

We can now deploy these changes. Since we’ve adjusted the Unicorn config we’ll run the Unicorn setup task to copy the new configuration file over and also restart Unicorn before running the monit:setup task.

          terminal
        
$ cap unicorn:setup unicorn:restart monit:setup

This will copy over the configuration files and restart Unicorn and Monit.

Monitoring Monit

If our Rails app acts up now Monit will spot this and will restart it automatically for us. That said we shouldn’t ignore Monit and have it blindly acting on its own. It’s a good idea to keep tabs on what it’s doing and there are several ways that we can do this. One way is to occasionally check the log file to see if any errors have occurred.

          terminal
        
$ sudo tail /var/log/monit.log

This is a fairly passive way to monitor errors so instead we’ll set up an email alert system so that we’re notified immediately when an error occurs. We could configure this through the monitrc file but instead we’ll keep the configuration changes in our application’s source code and add a new monitrc.erb file to our Monit templates.

          /config/recipes/templates/monit/monitrc.erb
        
set daemon 30

set logfile /var/log/monit.log
set idfile /var/lib/monit/id
set statefile /var/lib/monit/state

set eventqueue
    basedir /var/lib/monit/events
    slots 100

set mailserver smtp.gmail.com port 587
    username "foo@example.com" password "secret"
    using tlsv1
    with timeout 30 seconds

set alert ryan@railscasts.com

include /etc/monit/conf.d/*

Most of this is the same as the other monitrc file apart from a couple of changes related to email. We set the mailserver settings to use Gmail’s SMTP server. The username and password are hard-coded here but we could set them through environment variables and use erb to read them here. We also set the alert to define who the alerts are sent to. We’ll need to copy this file to every server and we’ll do this in the setup task.

          /config/recipes/monit.rb
        
task :setup do
  monit_config "monitrc", "/etc/monit/monitrc"
  nginx
  postgresql
  unicorn
  syntax
  reload
end
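
The SMTP username and password in monitrc.erb don’t have to stay hard-coded. Here is a minimal sketch of reading them from environment variables with erb. MONIT_SMTP_USER and MONIT_SMTP_PASSWORD are assumed names, and the fallbacks are the placeholder values from the template.

```ruby
require "erb"

# Mail section of monitrc.erb with the credentials read from the environment.
# The variable names are assumptions; the defaults are the placeholders
# used in the template earlier.
MAILSERVER_TEMPLATE = <<~MONIT
  set mailserver smtp.gmail.com port 587
      username "<%= ENV.fetch("MONIT_SMTP_USER", "foo@example.com") %>" password "<%= ENV.fetch("MONIT_SMTP_PASSWORD", "secret") %>"
      using tlsv1
      with timeout 30 seconds
MONIT

def render_mailserver
  ERB.new(MAILSERVER_TEMPLATE).result
end

puts render_mailserver
```

Assuming the template helper renders the file before uploading it, these variables would need to be set on the machine that runs Capistrano rather than on the server.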

In the setup task we pass two arguments to monit_config. The second is the destination, which defaults to a filename in the /etc/monit/conf.d directory based on the name. We don’t want that default here so we’ve passed in the second argument. We have control over when the alerts are sent and can trigger them on given conditions as we would with a restart. Below is an example of how we could do a performance check on the entire system to see the load average, memory and CPU usage.

          /config/recipes/templates/monit/monitrc.erb
        
check system blog_server 
  if loadavg(5min) > 2 for 2 cycles then alert
  if memory > 75% for 2 cycles then alert
  if cpu(user) > 75% for 2 cycles then alert
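
The “for 2 cycles” part of each rule means that a single spike won’t trigger an alert. The following sketch models that behaviour in plain Ruby, on the assumption that the threshold has to be exceeded in consecutive cycles. The alert? method and the sample values are made up for illustration.

```ruby
# Returns true once `threshold` has been exceeded in `cycles` consecutive
# samples, which is how "for 2 cycles" is interpreted in this sketch.
def alert?(samples, threshold:, cycles:)
  samples.each_cons(cycles).any? do |window|
    window.all? { |value| value > threshold }
  end
end

# Hypothetical 5-minute load averages, one reading per cycle.
spike     = [1.2, 2.4, 1.8]
sustained = [1.2, 2.4, 1.8, 2.1, 2.6]

puts alert?(spike, threshold: 2, cycles: 2)      # false: only one reading is above 2
puts alert?(sustained, threshold: 2, cycles: 2)  # true: 2.1 and 2.6 are consecutive
```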

Some processes, such as the Unicorn workers, may be noisy in sending alerts. For these cases we can restrict alerts to certain events. We can do this by calling alert, passing in an email address and telling it which events we want to be alerted on. We’ll tell Monit to only alert us about Unicorn changes if the pid changes two times within sixty seconds. This way we can be notified if our Unicorn servers are restarting frequently but not when minor things happen to the workers.
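
As a rough sketch of what those two extra lines for each worker could look like, the helper below builds them and converts the time window into cycles. The method names are hypothetical and the exact Monit statements are an assumption about its syntax, so it’s worth running the monit:syntax task before relying on them.

```ruby
# Converts a time window into whole Monit cycles, rounding up.
def cycles_for(seconds, cycle_seconds)
  (seconds.to_f / cycle_seconds).ceil
end

# Builds the lines to add inside a worker's check block. The statement
# wording is an assumption about Monit's syntax, not copied from the episode.
def pid_alert_lines(email, changes:, window_seconds:, cycle_seconds:)
  cycles = cycles_for(window_seconds, cycle_seconds)
  [
    "alert #{email} only on { pid }",
    "if changed pid #{changes} times within #{cycles} cycles then alert"
  ]
end

puts pid_alert_lines("ryan@railscasts.com", changes: 2, window_seconds: 60, cycle_seconds: 30)
```

With the 30-second interval set earlier, sixty seconds works out to two cycles, so the window would need adjusting if the daemon interval is changed.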

Another way to keep tabs on Monit is through a web interface. We can enable this in the monitrc file by adding some configuration like this.

          /config/recipes/templates/monit/monitrc.erb
        
set httpd port 2812
    allow admin:"secret"

This starts up the HTTPD server on the port that we specify and we can define the permissions for those users who are allowed to access the interface. Again we’ve hard-coded values here but we could use erb code to make this dynamic. We can try this by running monit:setup again to copy our monitrc file to the server. Once we’ve done so if we visit port 2812 on the server and enter the username and password we’ll be taken to the Monit Service Manager. This gives us details about the various processes that Monit is monitoring.

We can see further details by clicking on one of the listed processes.
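
The web interface can also be checked from a script, which is handy for confirming that the port is open after a deploy. Here is a minimal sketch using Ruby’s Net::HTTP with basic authentication. The method names are hypothetical, and the user and password defaults mirror the placeholder values in monitrc.erb.

```ruby
require "net/http"

# Builds an authenticated GET request for the Monit web interface.
def monit_request(path = "/", user: "admin", password: "secret")
  request = Net::HTTP::Get.new(path)
  request.basic_auth(user, password)
  request
end

# Returns true when the interface answers with a 2xx status, false when the
# request is rejected or the host can't be reached.
def monit_reachable?(host, port: 2812, **credentials)
  response = Net::HTTP.start(host, port, open_timeout: 5, read_timeout: 5) do |http|
    http.request(monit_request("/", **credentials))
  end
  response.is_a?(Net::HTTPSuccess)
rescue SystemCallError, SocketError, Timeout::Error
  false
end

# Usage: ruby monit_check.rb HOSTNAME
if $PROGRAM_NAME == __FILE__ && ARGV[0]
  puts monit_reachable?(ARGV[0]) ? "Monit is answering" : "Monit is not reachable"
end
```

As this sends the password over plain HTTP it’s best kept to a trusted network, and the same applies to visiting the interface in a browser.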

It’s also worth taking a look at M/Monit. This provides a slick interface for managing multiple Monit servers at once. We can handle alerts through this and there’s even an iPhone app available. This is a paid product, however, so you’ll need to read the licensing agreement before deciding whether or not to use it.

How does Monit compare to other monitoring solutions such as God or Bluepill? Both of these are written in Ruby and their configuration is much more flexible. They do require more memory to run, however, which may be reason enough for you to stick with Monit.