#399 Autocomplete Search Terms
In the example application below we have a list of products, each of which has a name, a price and a category. We also have a way to search these products by either their name or category. What we’d like to do is add auto-completion to the search field so that when a user starts typing, a list of suggestions is shown containing products that match the search term. Doing this presents some interesting problems, especially with regard to performance. Let’s get started.
There are a variety of ways that we can handle this on the client side but we’ll keep it simple and use jQueryUI, which makes it easy to add an auto-completion dropdown to a text field. We’ll step through this part fairly quickly as it was covered in more detail back in episode 102. To include jQueryUI in our application we just need to modify our application’s JavaScript manifest file and add a line there to include it.
//= require jquery
//= require jquery-ui
//= require jquery_ujs
//= require_tree .
We can add auto-complete behaviour to the search text field in the products CoffeeScript file. We’ll do this when the DOM loads, although you might want to change this behaviour if you’re using TurboLinks. All we need to do is call autocomplete on the text field. This takes several options and we’ll use source to specify where the auto-complete data comes from. This will be a URL in our application as there’s too much data to load it all on the client at once. We’ll use a search_suggestions path.
jQuery ->
  $('#search').autocomplete
    source: "/search_suggestions"
Our application will need to be able to respond to this path. It doesn’t currently so we’ll need a new controller. We’ll actually create a whole new resource with a model and a controller so that we have a convenient location to store our search suggestions and look them up quickly. There are some performance concerns with storing this kind of data in a SQL database but we’ll look at this later. We’ll give our new resource a term field, which will contain the search suggestion term that’s returned to the user, and a popularity field which will determine how the results are sorted.
$ rails g resource search_suggestion term popularity:integer
$ rake db:migrate
The /search_suggestions path now routes to the index action of our new SearchSuggestionsController. jQueryUI expects this action to return JSON, and an array of strings returned here will be shown to the user as the list of suggestions. To test that this is working we’ll return some dummy data.
class SearchSuggestionsController < ApplicationController
  def index
    render json: %w[foo bar]
  end
end
We’ve already added some CSS to make the auto-complete list look good so we can test our auto-complete behaviour.
Instead of displaying dummy data the list should show common keywords from the products that match the text that has been entered. We’ll write this behaviour in the SearchSuggestion model and have the controller call a method on the model.
class SearchSuggestionsController < ApplicationController
  def index
    render json: SearchSuggestion.terms_for(params[:term])
  end
end
This method should return an array of the terms that we want to suggest to the user but how should it do that? We need to look up a list of suggestions that start with the text that has been entered into the text box. Ideally these would already be in the database and we could look them up with a query, ordered by their popularity, and return up to, say, 10 results. The code to do that would look something like this:
def self.terms_for(prefix)
  suggestions = where("term like ?", "#{prefix}_%")
  suggestions.order("popularity desc").limit(10).pluck(:term)
end
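One detail of the LIKE pattern is worth a note: % and _ are wildcards, so a prefix typed by the user that contains either character will match more broadly than intended. The sketch below shows one way the pattern could be built with those characters escaped. It assumes that backslash is the LIKE escape character, which is the default in MySQL and PostgreSQL but needs an explicit ESCAPE clause in SQLite. The helper names escape_like and like_prefix_pattern are made up for this illustration; later versions of Rails ship a similar helper called sanitize_sql_like.

```ruby
# Hypothetical helper: prefixes the LIKE wildcards (and the escape
# character itself) with a backslash so they are matched literally.
def escape_like(text)
  text.gsub(/[\\%_]/) { |char| "\\" + char }
end

# Builds the same "prefix followed by at least one more character"
# pattern used in terms_for, but with the user's input escaped.
def like_prefix_pattern(prefix)
  "#{escape_like(prefix)}_%"
end

plain_pattern   = like_prefix_pattern("fo")
escaped_pattern = like_prefix_pattern("50%_off")
```

The trailing _% is left unescaped on purpose: it is what makes the query match only terms that are longer than the text entered so far.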
To get this to work we’d just need to fill up our table by indexing the products data. We’ll create a Rake task called search_suggestions:index to do this.
namespace :search_suggestions do
  desc "Generate search suggestions from products"
  task :index => :environment do
    SearchSuggestion.index_products
  end
end
This Rake task calls an index_products method on the SearchSuggestion model which we’ll need to write.
def self.index_products
  Product.find_each do |product|
    index_term(product.name)
    product.name.split.each { |t| index_term(t) }
    index_term(product.category)
  end
end

def self.index_term(term)
  where(term: term.downcase).first_or_initialize.tap do |suggestion|
    suggestion.increment! :popularity
  end
end
Here we loop through all the products, using find_each so that a batch find is used and all the products aren’t loaded into memory at once. We then call an index_term method on each name and also split each product’s name at the spaces and index each separate word in the name too. We also index each product’s category. There’s probably a more efficient way to do this but what we’ve got will work well here. In the index_term method we look for a SearchSuggestion with the term that’s passed in and use first_or_initialize to find the matching record or build a new one if none is found. We then increment its popularity, which also saves the record, so that terms that are found more often appear nearer the top of the list. Instead of using this as a metric for popularity we could measure the popularity of products and have more popular products show up at the top of the list, or keep track of the search terms that are used most often and order the list based on this. We’ll run this Rake task now to index our products.
$ rake search_suggestions:index
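The counting that index_products performs can be illustrated without a database. The sketch below applies the same steps (the full name, each word of the name, then the category) to a plain Hash standing in for the search_suggestions table. The sample products and the method name count_terms are invented for this example.

```ruby
# In-memory stand-in for the indexing logic: returns a Hash mapping
# each lower-cased term to the number of times it was indexed.
def count_terms(products)
  popularity = Hash.new(0)
  products.each do |product|
    terms = [product[:name]] + product[:name].split + [product[:category]]
    terms.each { |term| popularity[term.downcase] += 1 }
  end
  popularity
end

# Made-up sample data.
products = [
  { name: "Red Shirt",  category: "Clothing" },
  { name: "Blue Shirt", category: "Clothing" },
  { name: "Shirt",      category: "Clothing" }
]

popularity = count_terms(products)
```

With this data "shirt" ends up with a popularity of 4 rather than 3, because a one-word name is counted once as the full name and once more as its only word. The same thing happens in the model code above, which is harmless for ranking but worth knowing when reading the numbers.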
Now it’s the moment of truth. When we start typing in the search box we should see the matching suggestions and we do.
This has worked. We can see a list of search suggestions with the most popular terms at the top of the list.
Increasing Performance
Our search box now works but how well does it perform? The request to fetch matching search terms will be triggered frequently as users type search terms and we might be getting hundreds of searches made per second if our site gets busy. How can we measure this and how can we ensure that the results are returned as quickly as possible? One way is to use the Rack Mini-Profiler gem which we covered in episode 368. This is installed by adding it to the gemfile and running bundle.
gem 'rack-mini-profiler'
When we restart the server now each request will report the time it took to process, and this even works for AJAX requests. As we type into the search field each request’s time will be added to the list that’s shown on the page.
Each request only takes a few milliseconds, which isn’t bad, but let’s see if we can improve it. If we want to shave milliseconds off a request a good place to look is in the Rack middleware. Each request that comes in to our application goes through an entire middleware stack which can add overhead. To get around this we can apply the technique we demonstrated in episode 150. Instead of responding with a traditional Rails controller we can add some middleware near the top of the stack which will intercept requests for search suggestions. This way the request won’t go through the entire stack, although we will lose some functionality such as logging and cookies. We’ll put our custom middleware in an app/middleware directory.
class SearchSuggestions
  def initialize(app)
    @app = app
  end

  def call(env)
    if env["PATH_INFO"] == "/search_suggestions"
      request = Rack::Request.new(env)
      terms = SearchSuggestion.terms_for(request.params["term"])
      [200, {"Content-Type" => "application/json"}, [terms.to_json]]
    else
      @app.call(env)
    end
  end
end
Middleware can just be a plain Ruby class with an initialize method that takes a Rack application, which is what we’ve done here. In initialize we store the app in an instance variable for later use. We also have a call method which accepts a Rack environment and in here we intercept requests that match the search_suggestions path. Requests that don’t match are passed on to the application, so our middleware acts as a kind of endpoint that handles certain requests directly. For those requests that do match we return a 200 OK response, setting the content type to application/json and setting the body to the output from SearchSuggestion.terms_for for the term passed in. We don’t have access to the request params in the usual way so we build a Rack::Request object from the environment and use that to get the term param. Now we just need to add this middleware to our app in its configuration file. By using insert_before and passing in 0 as the first argument and the name of our middleware class as the second, our middleware will be added at the top of the stack.
config.middleware.insert_before 0, "SearchSuggestions"
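Because a Rack middleware is just an object that responds to call, it can be exercised without starting a server. The sketch below is a simplified, dependency-free variant of the middleware above, not the class used in the application: it parses the query string with URI.decode_www_form from the standard library instead of Rack::Request, and it takes a lookup callable in place of SearchSuggestion.terms_for. The class name SuggestionsEndpoint, the lookup argument and the sample terms are all assumptions made for illustration.

```ruby
require "json"
require "uri"

# Simplified stand-in for the SearchSuggestions middleware.
class SuggestionsEndpoint
  def initialize(app, lookup)
    @app = app        # the next Rack app in the stack
    @lookup = lookup  # callable that maps a prefix to an array of terms
  end

  def call(env)
    # Anything other than the suggestions path goes down the stack.
    return @app.call(env) unless env["PATH_INFO"] == "/search_suggestions"

    params = URI.decode_www_form(env["QUERY_STRING"].to_s).to_h
    terms = @lookup.call(params["term"].to_s)
    [200, { "Content-Type" => "application/json" }, [terms.to_json]]
  end
end

# A fake downstream app and a fake lookup, just to drive the middleware.
downstream = ->(env) { [404, { "Content-Type" => "text/plain" }, ["not found"]] }
lookup = ->(prefix) { %w[food foot].select { |term| term.start_with?(prefix) } }
endpoint = SuggestionsEndpoint.new(downstream, lookup)

status, headers, body = endpoint.call(
  "PATH_INFO" => "/search_suggestions", "QUERY_STRING" => "term=foo"
)
other_status, _headers, _body = endpoint.call(
  "PATH_INFO" => "/products", "QUERY_STRING" => ""
)
```

The first call is answered by the middleware itself, while the second falls through to the downstream app. This is the same split between intercepted and passed-on requests that the real middleware makes.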
After we restart the server again and try entering a search term now the suggestions should come back more quickly.
The search suggestions are now taking less than 2ms to return so we’ve decreased the response time quite substantially with our middleware. Keep in mind that we’re profiling our application in the development environment so these numbers may well be different in production. To get more accurate figures we could set up a staging environment with similar hardware to that we’ll be using in production. That said, a decrease in response time in development will generally be reflected in production, too.
The next performance concern that we’ll take a look at is the call to the database that returns the suggestions. This is where most of the time will be spent for these requests, and while the queries are currently fairly quick because the dataset is small, this might become an issue as it grows. One way to solve this is through caching. We could set up a Memcached store like we did in episode 380 and cache the results of the terms_for method, like this:
def self.terms_for(prefix)
  Rails.cache.fetch(["search-terms", prefix]) do
    suggestions = where("term like ?", "#{prefix}_%")
    suggestions.order("popularity desc").limit(10).pluck(:term)
  end
end
We use a combination of the string “search-terms” and the entered search phrase to create a unique key for each search term. The results are then cached using whatever store we specify. If we try this out now by searching for the same term more than once we now see response times of under a millisecond.
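What makes this work is the behaviour of fetch: the block only runs on a cache miss, and its return value is stored under the key for later calls. The sketch below mimics that behaviour with a Hash. TinyCache is a made-up class for illustration and is not part of Rails; a real cache store also handles expiry, serialization and so on.

```ruby
# Minimal in-memory model of a cache with a fetch method.
class TinyCache
  def initialize
    @store = {}
  end

  # Returns the cached value for key, or runs the block, stores its
  # result under key and returns it.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end

cache = TinyCache.new
lookups = 0  # counts how often the "expensive" block actually runs

results = 2.times.map do
  cache.fetch(["search-terms", "fo"]) do
    lookups += 1
    %w[food foot]  # stands in for the database query
  end
end
```

Both calls return the same list but the block runs only once, which is why repeating a search is so much quicker than making it for the first time.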
Using Redis To Improve Performance
This caching technique can be effective but if the user’s searching for a term that isn’t cached the response will take longer as not only does the database query need to be made, the results also need to be written to the cache store. At this point we might start looking for a different storage engine for the autocompletion to replace the SQL database. One such option is Redis, an advanced in-memory key-value store which is very fast and which has some features that will suit our needs perfectly, such as sorted sets. With it we can add members to a sorted set and assign a score to each one, which determines its position in the set. We can then use ZRANGE to fetch the members in score order later. Let’s say that we want to index the term “food”. We could make a separate set for each partial match of that word (“f”, “fo” and “foo”) and when a user starts typing the word “food” we can look up the set whose name matches what they’ve typed so far. If we index multiple terms that start with “foo” these would also be returned, sorted by their score. We can use ZREVRANGE to return them in reverse order of score so that the most popular are returned first. To add an item to a set we use ZADD but we can also use ZINCRBY to increment the score for an item that already exists (if the item isn’t in the set yet, ZINCRBY adds it with the increment as its score). This will be useful to us as a way to increase the popularity of a given term.
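The way the prefix sets behave can be tried out in plain Ruby before Redis is involved. The sketch below is only an in-memory model: the FakeSortedSets class is invented for illustration and implements just enough of ZINCRBY and ZREVRANGE to follow the idea. It says nothing about how Redis itself is implemented, and the sample terms are made up.

```ruby
# Toy model of a collection of named sorted sets.
class FakeSortedSets
  def initialize
    # set name => { member => score }
    @sets = Hash.new { |hash, key| hash[key] = Hash.new(0) }
  end

  # Like ZINCRBY: adds amount to the member's score, starting from 0
  # if the member is not in the set yet.
  def zincrby(key, amount, member)
    @sets[key][member] += amount
  end

  # Like ZREVRANGE: members from highest to lowest score; start and
  # stop are inclusive indexes. Equal scores are ordered by member name
  # in reverse, as Redis does.
  def zrevrange(key, start, stop)
    return [] unless @sets.key?(key)
    ranked = @sets[key].sort_by { |member, score| [score, member] }.reverse
    (ranked[start..stop] || []).map(&:first)
  end
end

sets = FakeSortedSets.new

# Index each term under every proper prefix; "foot" is indexed twice
# so it ends up with a higher score.
%w[food foot foot fork].each do |term|
  1.upto(term.length - 1) do |n|
    sets.zincrby("search-suggestions:#{term[0, n]}", 1, term)
  end
end

foo_matches = sets.zrevrange("search-suggestions:foo", 0, 9)
fo_matches  = sets.zrevrange("search-suggestions:fo", 0, 9)
```

Looking up the "foo" set returns "foot" ahead of "food" because it was indexed twice, and the shorter "fo" prefix also picks up "fork".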
To try this out we’ll need to install Redis. If you’re running OS X the easiest way to do this is to use Homebrew. Redis can then be installed by running brew install redis. We can then start it up by running this command.
$ redis-server /usr/local/etc/redis.conf
We can now set up our application to work with Redis. First we’ll add the redis gem to our gemfile then run bundle to install it.
gem 'redis'
Next we’ll create an initializer file where we’ll set up our Redis connection. We’ll store this in a global variable so that we can access it easily in the rest of our app.
$redis = Redis.new
Now we just need to configure our SearchSuggestion model so that it no longer uses ActiveRecord but Redis instead.
class SearchSuggestion
  def self.terms_for(prefix)
    $redis.zrevrange "search-suggestions:#{prefix.downcase}", 0, 9
  end

  def self.index_products
    Product.find_each do |product|
      index_term(product.name)
      product.name.split.each { |t| index_term(t) }
      index_term(product.category)
    end
  end

  def self.index_term(term)
    1.upto(term.length - 1) do |n|
      prefix = term[0, n]
      $redis.zincrby "search-suggestions:#{prefix.downcase}", 1, term.downcase
    end
  end
end
We fetch matching terms by calling $redis.zrevrange, passing in the name of a set and asking for the first ten items (indexes 0 to 9). The set’s name is made up from the string “search-suggestions” and the search term. In a production application we’d do some escaping here to make sure that the key is sanitized, but we won’t do that here. To do the indexing we make a separate set for each leading portion of the term, starting with the first letter then adding a letter each time and stopping one letter short of the full term. We call $redis.zincrby to increment the score for each term, using the same set name we used to read the terms before.
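As a rough idea of what that sanitizing could involve, the sketch below normalizes whatever the user typed before it becomes part of a key: it copes with a missing term, lower-cases the text, collapses whitespace and caps the length. The method name suggestion_key and the 50-character limit are assumptions for this example, and the right rules will depend on the application.

```ruby
# Hypothetical helper: builds the sorted-set key for a typed prefix.
def suggestion_key(prefix, max_length: 50)
  normalized = prefix.to_s          # nil (no term param) becomes ""
                     .downcase
                     .strip
                     .gsub(/\s+/, " ")  # collapse runs of whitespace
  "search-suggestions:#{normalized[0, max_length]}"
end

typed_key   = suggestion_key("  Fo   O ")
missing_key = suggestion_key(nil)
long_key    = suggestion_key("a" * 200)
```

Handling nil matters because terms_for as written calls downcase directly on the param, which would raise an error for a request that arrives without a term.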
We’ll need to run rake search_suggestions:index again to index our records in Redis, and once we’ve restarted our server we can try out our auto-completion. When we enter a search term now we get similar results to before and the speed is similar to what we were getting when we were using caching.
To learn more about auto-completion with Redis take a look at the Soulmate gem. The source code was a big help in building the Redis solution for this episode. You might even consider using this gem in your applications if it suits your needs.