#278 Search with Sunspot
Sunspot is a solution for adding full-text searching to Ruby applications. It uses Solr in the background and has many great features. In this episode we’ll use it to add full-text searching to a Rails application, using the simple blogging app from previous episodes.
This application has a page that displays a number of articles and we want to implement the ability to search across them. Using SQL to do this can quickly become difficult and is often not the best approach. A dedicated full-text solution such as Sunspot is a much better way to implement this feature.
Installing Sunspot
Sunspot comes as a gem and is installed in the usual way by adding it to the Gemfile and running bundle.
source 'http://rubygems.org'

gem 'rails', '3.0.9'
gem 'sqlite3'
gem 'nifty-generators'
gem 'sunspot_rails'
Once the gem and its dependencies have installed we’ll need to generate Sunspot’s configuration file which we can do by running
$ rails g sunspot_rails:install
This command creates a YML file at /config/sunspot.yml. We don’t need to make any changes to the default settings in this file.
Sunspot embeds Solr inside the gem so there’s no need to install it separately. This means that it works straight out of the box which makes it far more convenient to use in development. To get it up and running we run
$ rake sunspot:solr:start
If you’re running OS X Lion and you haven’t installed a Java runtime you’ll be prompted to do so when you run this command. You may also see a deprecation warning but this can be safely ignored. The command will also create some more configuration files for advanced configuration. We won’t cover them here but there are details in the documentation on how to modify these.
Using Sunspot
Now that we have Sunspot installed we can use it in our Article model. To add full text searching we use the searchable method.
class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments

  searchable do
    text :name, :content
  end
end
This method takes a block and inside it we define the attributes that we want to search against so that Sunspot knows what data to index. We can use the text method to define the attributes that will have full-text searches run against them. For our articles we’ll do this for the name and content fields.
Sunspot automatically indexes any new records but not existing ones. We can tell Sunspot to reindex the existing records by running
$ rake sunspot:reindex
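To make the idea of indexing a little more concrete, here is a toy inverted index in plain Ruby. It is only a sketch of the general concept and not how Sunspot or Solr is implemented; the ToyIndex class and its add and search methods are hypothetical names invented for this illustration.

```ruby
# A toy inverted index: maps each lowercased word to the ids of the
# records that contain it. Purely illustrative; a real Solr index also
# handles stemming, relevance scoring and much more.
class ToyIndex
  def initialize
    @postings = Hash.new { |hash, word| hash[word] = [] }
  end

  # Index one or more text fields for the record with the given id.
  def add(id, *texts)
    texts.each do |text|
      text.downcase.scan(/\w+/).each do |word|
        @postings[word] << id unless @postings[word].include?(id)
      end
    end
  end

  # Return the ids of the records that contain the word.
  def search(word)
    @postings.fetch(word.downcase, [])
  end
end

INDEX = ToyIndex.new
INDEX.add(1, "Superman", "Faster than a speeding bullet")
INDEX.add(2, "Batman Returns", "The dark knight is faster than most")
```

A record only becomes searchable once add has been called for it, which is loosely why records that existed before Sunspot was installed need a reindex.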
All of the articles are now in our Solr database and can be searched so we’ll add a search field at the top of the index page.
<% title "Articles" %>

<%= form_tag articles_path, :method => :get do %>
  <p>
    <%= text_field_tag :search, params[:search] %>
    <%= submit_tag "Search", :name => nil %>
  </p>
<% end %>

<!-- rest of view omitted -->
This form is submitted to the index action using GET, so any search parameters added will be added to the query string. We’ll modify the controller next so that it fetches the articles using that search parameter. To perform a search with Sunspot we call search on the model and pass in a block. Inside the block we can call various methods to handle complex searches. We’ll use the fulltext method and pass it the search parameters from the form. Finally we’ll assign the result of all of this to @search. We can call results on this to get a list of the matching articles.
def index
  @search = Article.search do
    fulltext params[:search]
  end
  @articles = @search.results
end
We can test this now by reloading the articles page and searching for a keyword. When we do so we’ll get a list of matching articles returned.
The search returns a list of the articles that contain the search term whether it’s in the article’s name or its content.
There’s a lot more that we can do inside the searchable block in the Article model. For example we can use boost to weight the results so that matches in the article’s name are considered more important than those in the content.
class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments

  searchable do
    text :name, :boost => 5
    text :content
  end
end
This is important when we want to sort results by relevance. In this case articles whose name contains the search term will appear higher up in the results than articles where the search term only appears in the content.
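As a rough intuition for what a boost does, the sketch below ranks articles with a deliberately simplified score: the number of times the term appears in a field multiplied by that field’s boost. This is an assumption made purely for illustration and is not Solr’s actual relevance formula; toy_score, rank, BOOSTS and the sample data are all hypothetical.

```ruby
# Field boosts mirroring the searchable block: name counts five times
# as much as content. Content has an implicit boost of 1 here.
BOOSTS = { :name => 5, :content => 1 }

# Toy relevance score: occurrences of the term in each field times the
# field's boost. NOT Solr's real scoring, just an intuition aid.
def toy_score(article, term)
  BOOSTS.inject(0) do |total, (field, boost)|
    total + article[field].downcase.scan(term.downcase).size * boost
  end
end

# Sort articles so that the highest toy score comes first.
def rank(articles, term)
  articles.sort_by { |article| -toy_score(article, term) }
end

ARTICLES = [
  { :name => "Krypton",  :content => "Superman was born on this planet." },
  { :name => "Superman", :content => "A hero from another world." }
]
```

With these sample records a search for “superman” puts the article named “Superman” first, because a match in the name is worth five times a match in the content.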
The attributes listed in the searchable block don’t have to be actual database columns; we can use any method that we define in the model. We’ll create a publish_month method that will return a string containing the name of the month and the year when the article was published, then search against that method just as if it were a database column.
class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments

  searchable do
    text :name, :boost => 5
    text :content, :publish_month
  end

  def publish_month
    published_at.strftime("%B %Y")
  end
end
We’ll need to reindex the records by running rake sunspot:reindex again before we can search against this new field, but once we’ve done so we can search for articles based on their month name.
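The format string passed to strftime uses %B for the full month name and %Y for the four-digit year. The sketch below exercises the same method on a stand-in object. The nil guard is an assumption on our part, useful only if some articles have no published_at value; it is not part of the model shown above, and DemoArticle is a hypothetical name.

```ruby
# Stand-in for the Article model with just enough to try publish_month.
DemoArticle = Struct.new(:published_at) do
  # Returns e.g. "January 2011". The guard returns nil instead of
  # raising when published_at is missing (an assumption, see above).
  def publish_month
    published_at && published_at.strftime("%B %Y")
  end
end

PUBLISHED = DemoArticle.new(Time.utc(2011, 1, 15))
DRAFT = DemoArticle.new(nil)
```

Calling PUBLISHED.publish_month gives “January 2011”, while DRAFT.publish_month gives nil rather than raising an error.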
As an alternative to creating a method we can pass in a block and search against whatever the block returns. An article has many comments so we’ll add the ability to search for the comments’ content by using a block.
class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments

  searchable do
    text :name, :boost => 5
    text :content, :publish_month
    text :comments do
      comments.map(&:content)
    end
  end

  def publish_month
    published_at.strftime("%B %Y")
  end
end
The context inside the block is an instance of an Article so inside it we can get the comments for an article and map them to the content of each comment. Even though this returns an array Sunspot will handle this and index all of the comments so that they’re searchable.
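To see why comments resolves correctly inside such a block, it can help to evaluate a similar block against a plain object with instance_eval, which runs the block with self set to that object. This is a sketch of the general Ruby technique rather than a description of Sunspot’s internals; DemoComment, DemoPost, COMMENT_EXTRACTOR and extract are hypothetical names.

```ruby
# Minimal stand-ins for a comment and for a record that has comments.
DemoComment = Struct.new(:content)
DemoPost = Struct.new(:comments)

# The same kind of block that is passed to `text :comments do ... end`.
COMMENT_EXTRACTOR = proc { comments.map(&:content) }

# Run the block with `self` set to the record, so that a bare call to
# `comments` is sent to the record itself.
def extract(record, extractor)
  record.instance_eval(&extractor)
end

POST = DemoPost.new([DemoComment.new("Nice"), DemoComment.new("Great read")])
```

Here extract(POST, COMMENT_EXTRACTOR) returns an array with one string per comment, which is the shape of data the block above hands over for indexing.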
Searching Against Attributes
What if we want to add some search capabilities that go beyond simple full-text searching, maybe searching on a specific attribute? For this we can pass in the type of attribute we want to search, whether it’s a string, an integer, a float or even a timestamp. To add the published_at attribute to the search fields we can use the time method.
class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments

  searchable do
    text :name, :boost => 5
    text :content, :publish_month
    text :comments do
      comments.map(&:content)
    end
    time :published_at
  end

  def publish_month
    published_at.strftime("%B %Y")
  end
end
We can make use of this in the ArticlesController to restrict the searches to articles with a published_at date earlier than the current time. We use the with method to do this.
def index
  @search = Article.search do
    fulltext params[:search]
    with(:published_at).less_than(Time.zone.now)
  end
  @articles = @search.results
end
With this in place the search won’t return articles that haven’t yet been published. There is some great documentation on the attributes you can pass in on the Sunspot wiki page.
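Conceptually the restriction behaves like a strict “earlier than” comparison on each indexed record. The plain-Ruby analogue below runs in memory rather than in Solr and is only meant to show the condition being expressed; DatedArticle, published_before and the sample dates are hypothetical.

```ruby
# Minimal stand-in for an indexed article; not the real model.
DatedArticle = Struct.new(:name, :published_at)

# Keep only the articles published strictly before `now`, the same
# condition that with(:published_at).less_than(now) describes.
def published_before(articles, now)
  articles.select { |article| article.published_at < now }
end

NOW = Time.utc(2011, 8, 1)
DATED_ARTICLES = [
  DatedArticle.new("Already live", Time.utc(2011, 7, 1)),
  DatedArticle.new("Scheduled",    Time.utc(2011, 9, 1))
]
```

With NOW fixed at 1 August 2011 only the “Already live” article passes the filter; the one scheduled for September is left out.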
Faceted Searching
Faceted searching allows us to filter the search results based on certain attributes such as the month in which the article was published. Let’s say that we want to add a list of links showing the months for which there are published articles. When we click one of the links it will filter the list of articles so that only those published in that month are shown.
To do this we’ll first add a string attribute to the searchable block for our publish_month method.
class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments

  searchable do
    text :name, :boost => 5
    text :content, :publish_month
    text :comments do
      comments.map(&:content)
    end
    time :published_at
    string :publish_month
  end

  def publish_month
    published_at.strftime("%B %Y")
  end
end
We can turn this into a facet by calling facet in the search block in the ArticlesController.
def index
  @search = Article.search do
    fulltext params[:search]
    with(:published_at).less_than(Time.zone.now)
    facet(:publish_month)
  end
  @articles = @search.results
end
Now we can list those facets on the index page by adding the following code between the search box and list of articles.
<div id="facets">
  <h3>Published</h3>
  <ul>
    <% for row in @search.facet(:publish_month).rows %>
      <li>
        <% if params[:month].blank? %>
          <%= link_to row.value, :month => row.value %> (<%= row.count %>)
        <% else %>
          <strong><%= row.value %></strong> (<%= link_to "remove", :month => nil %>)
        <% end %>
      </li>
    <% end %>
  </ul>
</div>
In this code we loop through each of the publish_month facet items and display them. If we call .facet on our @search object and pass in the attribute that we want to list the facets by, in this case :publish_month, and then call .rows on that it will return every facet option for that attribute.
When we call row.value it returns the value for that attribute, e.g. “January 2011”. We can also call row.count to return the number of articles that match that value. If there’s a month parameter in the query string we’ll display the value along with a “remove” link that will remove the parameter. This gives us some nice functionality for selecting a given facet and passing it in through a month parameter.
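A facet row is essentially a distinct value paired with the number of matching records. The plain-Ruby sketch below builds rows of that shape from a list of month strings so the value and count pairing is easy to see. It is an in-memory approximation, not Sunspot’s implementation; ToyFacetRow and toy_facet are hypothetical names, and sorting by descending count is an assumption about how the list would typically be presented.

```ruby
# One row of a toy facet: a distinct value and how often it occurs.
class ToyFacetRow
  attr_reader :value, :count

  def initialize(value, count)
    @value = value
    @count = count
  end
end

# Count each distinct value and return rows with the most frequent
# value first (the ordering is an assumption, see above).
def toy_facet(values)
  counts = Hash.new(0)
  values.each { |value| counts[value] += 1 }
  rows = counts.map { |value, count| ToyFacetRow.new(value, count) }
  rows.sort_by { |row| -row.count }
end

MONTHS = [
  "January 2011", "February 2011", "January 2011",
  "March 2011", "January 2011", "February 2011"
]
```

Looping over toy_facet(MONTHS) and printing row.value and row.count mirrors what the view does with the rows of the real facet.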
Once we’ve reindexed the records and reloaded the page we’ll see a list of facets in a panel, each one of which shows a month and the number of articles published in that month. If we select a month we’ll see it as a month parameter in the query string but the articles aren’t filtered. To fix this we need to add another with call to the search block in the controller so that it filters by the month if the month parameter is present.
def index
  @search = Article.search do
    fulltext params[:search]
    with(:published_at).less_than(Time.zone.now)
    facet(:publish_month)
    with(:publish_month, params[:month]) if params[:month].present?
  end
  @articles = @search.results
end
Now when we select a month we’ll see the list correctly filtered by the articles that were published that month.
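The trailing if guard matters: without it a request with no month parameter would still try to filter on an empty value. The sketch below shows the same “only filter when a value was supplied” pattern in plain Ruby. Since present? comes from ActiveSupport, the month_selected? helper stands in for it here; that helper, MonthlyArticle and filter_by_month are hypothetical names.

```ruby
# Minimal stand-in for an indexed article; not the real model.
MonthlyArticle = Struct.new(:name, :publish_month)

# Plain-Ruby stand-in for ActiveSupport's present?: nil, empty and
# whitespace-only strings all count as "no month selected".
def month_selected?(month)
  !month.nil? && !month.strip.empty?
end

# Return every article when no month is selected, otherwise only the
# articles whose publish_month matches exactly.
def filter_by_month(articles, month)
  return articles unless month_selected?(month)
  articles.select { |article| article.publish_month == month }
end

MONTHLY_ARTICLES = [
  MonthlyArticle.new("Superman", "January 2011"),
  MonthlyArticle.new("Batman",   "February 2011"),
  MonthlyArticle.new("Krypton",  "January 2011")
]
```

Passing nil or an empty string returns the full list, while passing “January 2011” narrows it to the two articles from that month.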
Clicking the “remove” link will return us to the complete list. This works in conjunction with search results too. If we enter a search term the list will show the months that have articles that match.
Facets are a great feature to have alongside searching.
That’s it for this episode on Sunspot. It’s a great way to add full-text searching to Rails applications and has many extra features that we’ve not covered here. Be sure to take a look at the wiki for more information.