#306 ElasticSearch Part 1
Add full text searching using ElasticSearch and Tire. Here I will show the steps involved in adding this search to an existing application. This is the first part in a two part series.
What are the advantages of using this over Sunspot and Solr?
@Thomas: I've used Sphinx, Solr and ElasticSearch in production and my favorite is ElasticSearch for a few reasons:
After working with different search engines for a while now, most of them require lots of time tweaking and configuring to fit your needs. The biggest advantage for ElasticSearch is the built-in functionality that usually requires lots of configuration. Less configuring means fewer opportunities to break and more time to spend concentrating on more important things like building your website! Also check out ElasticSearch's percolate queries...another cool feature that you may find useful.
@Daniel: How about using ElasticSearch with Heroku?
For something like Sunspot and Solr, you can use the WebSolr add-on, but are there any hosted ElasticSearch solutions out there that work nicely with Heroku?
@Thomas
Not that I know of. Maybe the WebSolr guys would be interested in providing an ElasticSearch solution?
Using ElasticSearch with Heroku is now possible. See the following article:
http://adventuresincoding.com/2012/05/using-elasticsearch-with-heroku
Thanks for summarizing for us @Daniel, I was going to ask the same question. I've been struggling with which engine to use for a production app. I watched this screencast with a bit of an attitude - oh crap another search engine. But after reading your reasoning I think I might give ES a go now, rather than the others. You've managed to turn me around.
No problem @Dom. Ryan gives a nice overview in the screencast but there are some awesome features that aren't covered like date histogram facets and percolate queries that are worth looking into. The date histogram facet can group a field's total by month, week, day etc. For example, if you have a website with items and they belong to users, you can group the user's items by month with the date histogram facet. And, from ES's website, percolate queries...
Also, the documentation for the tire gem is somewhat lacking and confusing in my opinion, so I had to do some extra research to find out how to add date boosting for queries and use different stemmers like KStem (KStem is less aggressive than snowball and the other stemmers, if you need stemming). It's really easy to customize your index settings to optimize for faster queries or faster indexing, set up custom analyzers and change your index schema.
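To make the date histogram facet mentioned above a bit more concrete, here is a minimal sketch of the kind of search body it takes, built as a plain Ruby hash rather than through tire. The helper name and the `user_id` and `created_at` fields are assumptions for illustration only, and the `facets` section applies to the ElasticSearch releases of that era (later releases use aggregations instead).

```ruby
require "json"

# Hypothetical helper: builds a search body that groups one user's items
# by month. The field names (user_id, created_at) are assumed, not taken
# from the episode.
def items_by_month_body(user_id)
  {
    query: { term: { user_id: user_id } },
    facets: {
      items_by_month: {
        date_histogram: { field: "created_at", interval: "month" }
      }
    }
  }
end

# Print the JSON that would be sent as the search request body.
puts JSON.pretty_generate(items_by_month_body(42))
```

Changing `interval` to `"week"` or `"day"` gives the other groupings described in the comment.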
How to use "More Like This" in ElasticSearch?
Thomas:
ElasticSearch is (near) real time and has built-in sharding and clustering support.
Ryan:
Thanks a lot for those screencasts. =D
Thanks Ryan, is it possible to release the second part next week, to have the entire series?
"Sphinx and Solr both index documents periodically (with a large delay or manual re-index by default) so it's more difficult to index documents near real time. ElasticSearch has a default delay of one second."
Sphinx has a feature called "delta indexing" which provides real-time updates to the index. So Sphinx doesn't have to rely on periodic updates.
@Jarron,
In Sphinx, delta indexing acts like a secondary index where you can index a smaller number of documents (such as new documents added today) and your site will search the main + delta index. You need to merge the delta index with the main index frequently (once a day or once a week at least) by using a cron job or other periodically running task. Also, the delta index doesn't imply real-time indexing either; you still have to periodically update the delta index as well. It's not real time. Last time I used Sphinx, about a year ago, they were experimenting with an update API where you can just update the document in the index, which would be real time.
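The main + delta arrangement described above can be pictured with a toy model. This is only a sketch in plain Ruby and has nothing to do with Sphinx's actual API; the class and method names are made up. It shows why a new document stays invisible until the delta index is rebuilt, and why the delta still has to be merged into the main index from time to time.

```ruby
# Toy model of a main index plus a delta index (hypothetical, not Sphinx code).
class MainPlusDelta
  def initialize
    @main = {}     # doc id => text, rebuilt rarely (e.g. nightly)
    @delta = {}    # doc id => text, rebuilt often (e.g. every few minutes)
    @pending = {}  # saved to the database but not indexed yet
  end

  # A new document is saved; no index knows about it yet.
  def write(id, text)
    @pending[id] = text
  end

  # The periodic job that rebuilds the small delta index.
  def reindex_delta
    @delta.merge!(@pending)
    @pending.clear
  end

  # The less frequent job that folds the delta into the main index.
  def merge_delta_into_main
    @main.merge!(@delta)
    @delta.clear
  end

  # Searches look at main + delta together.
  def search(term)
    @main.merge(@delta).select { |_id, text| text.include?(term) }.keys.sort
  end
end

index = MainPlusDelta.new
index.write(1, "elasticsearch intro")
p index.search("intro")   # prints [] because the delta has not been rebuilt
index.reindex_delta
p index.search("intro")   # prints [1]
```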
Ryan, thanks for all the screencasts. I found them very useful.
I have a question about tire/elasticsearch working with ActiveRecord. Your episode talked about using a filter to put constraints on the records. I was wondering if it was possible to combine tire/elasticsearch with scopes. E.g. in the show notes, instead of using a filter to filter out articles not yet published, would it be possible to use an ActiveRecord scope on Article?
Technically, it should be possible if you use the :load option. Using filters and not using :load is of course much preferable.
Hey all,
thanks to Ryan for the awesome screencasts. I've published a couple of refactorings and suggestions here:
Thanks Ryan!
I'm new to Rails, so sorry if this is a dumb question. I'm working on an application similar to this, except that every article will have a text document attached to it (pdf, doc, etc.). Does a search engine like ElasticSearch search the actual text of the files? If not, does anyone know of a tool that would help me?
Thanks!
Hey all !
I've been struggling for a few days, first installing ES as a service (the 0.19.6.deb wouldn't work on either LMDE or Ubuntu server 12.04), and then displaying results with pagination (I found the ES website quite outdated).
Anyway, I thought I should share my small findings:
First, for those of you who want to run ES as a service, you should check the service wrapper once you have compiled and installed ES. Then follow the documentation to run it as a non-root user.
Moreover, my queries didn't return more than 10 results at a time. I fixed this by adding a match_all:{} parameter in the tire.search function:
I also have a question: how can I retrieve the number of hits returned by a query when using pagination?
Thanks again Ryan for your amazing screencasts !
Update: the newly released version 0.19.7 of the deb package fixes the installation issue.
I'm answering my own question: just use the total attribute on the search result:
Thanks again for the great screencasts!
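For anyone following the pagination discussion above: ElasticSearch returns 10 hits by default, and the `from` and `size` request parameters control which slice comes back, while the raw response carries the overall hit count under `hits.total`. Below is a minimal sketch in plain Ruby, independent of tire. The helper names and the sample response are made up for illustration, and the flat integer `total` reflects the response shape of the ElasticSearch versions of that time.

```ruby
# Hypothetical helper: turn a 1-based page number into from/size parameters.
def page_params(page, per_page)
  page = [page.to_i, 1].max   # treat 0, nil or negative pages as page 1
  { from: (page - 1) * per_page, size: per_page }
end

# Hypothetical helper: number of pages, based on the total hit count
# found under "hits" => "total" in a parsed search response.
def total_pages(response, per_page)
  total = response.fetch("hits").fetch("total")
  (total.to_f / per_page).ceil
end

# Made-up response, trimmed to the part that matters here.
sample_response = { "hits" => { "total" => 45, "hits" => [] } }

puts page_params(3, 20)[:from]         # prints 40
puts total_pages(sample_response, 20)  # prints 3
```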
OK, EHM how can i start:
im newb, but:
I followed your instructions and after i started 'rake db:setup ' MY WHOLE WEBSITE BROKES APART!!! WHAT THE HELL HAPPENEND??? I CANT EVEN SIGN IN OR UP TO ME WEBSITE AND I GET "TEMPLATE IS MISSING" AND WHEN I LOOK IN PGADMIN ALL COLUMNS ARE GONE. YOU HAVE TO HELP ME OR I WILL GO INSANE I SWEAR
I am going to respond to your comment assuming that you are a true beginner. If I am saying something you already know or find the tone of my response offensive then I am sorry, I am only trying to help.
Before I continue to answer your comment, I ask you to please not type messages in all capitals; it appears as if you are shouting and comes across as quite rude. Also, saying things like "you must help me or I will go insane" doesn't inspire others to help you out; it's more likely that you'll get ignored. No matter how desperate you are, if you want someone's help the best thing to do is to be polite, clearly explain your problem and add as much relevant information as possible. You will find that this applies to communications on most tech resources (mailing lists, forums, etc.).
Now on to your problems:
I'm afraid you made a beginner's mistake: running rake db:setup will set up a fresh database, so that's what happened (that's a standard rake task in Rails). If you ran that command on your production environment, the only thing you can do is restore your latest database backup. If you don't have a backup there's really nothing that can be done.
As for the template missing error, it's hard to say without knowing your codebase. If you've been editing code on your production system (which you really shouldn't be doing; development should be done in its own place, for example on your own machine) you will have to debug the error and fix it.
I hope that my reply gives you some insight into your problems.
Props on keeping your cool. I do terrible in those situations.
http://stackoverflow.com/questions/12120414/ruined-database-through-rake-dbsetup-ror
This Railscast should prob. be updated to use the Flex gem.
It's a great great gem because it has model mapping & scopes like AR does.
I've been using ElasticSearch and it's going fine, but I'm wondering if it's possible to do 'operations' on the results. For example, how can I get the average or the sum of two values? I'm not sure if I can do this using Tire.
If anyone knows a solution, I'd really appreciate it.
Thanks!!
Wonderful screen cast. The instructions on the github page are sort of daunting but you make things crystal clear!
Hi, I am newbie in rails.
I am trying to use ElasticSearch. But when I run rake db:setup, I don't see my articles any more. I guess the data is not loaded from seeds.rb. How can I solve that problem?
This Railscast really should include a warning NOT to use rake db:setup if you already have a database full of rows that you don't want to lose! It's obvious for experienced devs but I can see how a beginner might misinterpret the instructions and cause himself big headaches.
I had a similar issue. I got it all stood up but had an index error.
I had to do something like:
rake environment tire:import CLASS='Article'
to re-index (or just index for the first time) everything in the db.
This worked for me
And if I did? Can I roll back somehow?
It would be nice if Ryan could do a screencast on the new elasticsearch-rails gem; tire seems to be getting retired nowadays.
Railscasts rocks!
Agree! It would be awesome! Been trying to setup Elasticsearch with the new Elasticsearch rails gem, but not finding enough/clear documentation on how to migrate from Tire. Railscasts are awesome!
I also agree!
agreed
Agree
Those trying to do this in modern-times:
I found that you need to make sure your multi_json gem is version 1.7.8 or earlier.
The Tire gem has been retired in favor of the official elasticsearch-rails gem. Is there any chance we can get an updated cast with that? The new gem has decent documentation, but very few examples of real-world usage.
Thanks for this video. You've saved me loads of time over the years. Hope all is well.
This episode has been updated to Rails 5 as a blog post: Full Text Search using ElasticSearch in Rails 5