Outstanding job, Ryan. I think this is your best Railscast yet! Great job showing simultaneously how simple it is to use Thinking Sphinx and how much power you can leverage by drilling down into it. I'll be watching this one again. :)
As always you cover something real applications actually need instead of some obscure and useless function.
Long time fan here. Thanks for the continued outstanding quality of your railscasts.
Ferret used to be the de facto choice. Are there any reasons not to use Ferret? Why did you choose Sphinx over other solutions?
Is it possible to perform searches across several models?
Thanks for the great screen casts. Keep them coming!
Is it possible to do result highlighting with Sphinx to show the search terms?
Crazy timing! Just implemented Thinking Sphinx into a major project in place of Solr.
@Swami, I haven't tried Ferret, so I can't comment too much about that. I have heard it has some stability issues, but that's all I know.
As for searching across multiple models, I believe it's possible with "ThinkingSphinx::Search.search", but I haven't tried it.
@andrej, I don't think Sphinx gives an easy way to highlight the search terms. However you could use the "highlight" helper method Rails provides.
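For anyone curious what that kind of term highlighting boils down to, here is a minimal plain-Ruby sketch. It is not the Rails helper itself: `highlight_terms` and the `highlight` CSS class are names chosen for illustration, and it assumes the text has already been HTML-escaped.

```ruby
# Hypothetical helper (not part of Rails or Thinking Sphinx): wraps every
# occurrence of each search term in a <strong class="highlight"> tag,
# ignoring case.
def highlight_terms(text, query)
  terms = query.to_s.split.uniq
  return text if terms.empty?

  # Longest terms first so "rails" is preferred over "rail" at the same spot.
  # Regexp.union escapes any special characters in the terms.
  union = Regexp.union(terms.sort_by { |term| -term.length })
  pattern = Regexp.new(union.source, Regexp::IGNORECASE)

  text.gsub(pattern) { |match| %(<strong class="highlight">#{match}</strong>) }
end

puts highlight_terms("Thinking Sphinx rocks", "sphinx")
```

Styling the `highlight` class with a background colour in CSS would then mark the matched terms in the results list.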
Thanks for the screencast Ryan, I just implemented TS last week.
Also, instead of the thinking_sphinx:index command, you can just type ts:in
ts:start
ts:stop
@MikeInAZ, thanks for pointing that out. I normally use auto-completion for rake tasks as you may have noticed in this episode. If anyone's interested in that, see my dotfiles:
http://github.com/ryanb/dotfiles
Thanks for the 'cast, Ryan.
I just converted a large application from Ferret to Sphinx last week, and ran into a whole bunch of problems. I wrote a long <a href="http://blog.lrdesign.com/2008/07/fixing-problems-with-sphinx-search/">blog post</a> about the trials and tribulations I went through to get it working, including the fixes.
Though we were using ultrasphinx instead of thinking_sphinx, some of the struggles will be relevant to everyone using sphinx, for example the fact that foxy fixtures generates IDs that are often too large for sphinx's indexing strategy.
Whoops, sorry about that, didn't realize your comments don't allow links. (suggestion: a preview button)
Here's the bare URL I mentioned in my previous comment:
http://blog.lrdesign.com/2008/07/fixing-problems-with-sphinx-search/
Outstanding screencast!
Just one question: how would you manage a more complicated search form?
Let's say that the user can choose whether to search only by title, author or comments.
Is there an easy way to filter which indexes to use?
@Swami: I just read this post (http://freelancing-gods.com/posts/thinking_sphinx_reborn)
Multi-model searching is (obviously) supported:
ThinkingSphinx::Search.search "help"
Haven't tried it though. Hope it helps!
@Evan, preview button is coming very soon. :)
@jblanche, you can use the :conditions hash to search by specific fields. I think that will do what you need.
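A rough sketch of how a form like that could feed the :conditions hash. The field list, the `search_conditions` name, and the shape of the params are all assumptions made for illustration; only the :conditions option comes from Thinking Sphinx.

```ruby
# Hypothetical whitelist of indexed fields the form is allowed to search on.
SEARCHABLE_FIELDS = %w[title author comments].freeze

# Builds a conditions hash from the submitted form values, keeping only
# whitelisted fields that the user actually filled in.
def search_conditions(params)
  SEARCHABLE_FIELDS.each_with_object({}) do |field, conditions|
    value = params[field].to_s.strip
    conditions[field.to_sym] = value unless value.empty?
  end
end

# In a controller this might be used like so (not run here):
#   @articles = Article.search :conditions => search_conditions(params[:search])
p search_conditions("title" => "sphinx", "author" => "  ")
```

Whitelisting the field names keeps users from passing arbitrary keys through to the search call.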
How about an episode showing how to use your dotfiles...looks nice in this episode :P
Hey, just a suggestion: what about putting the latest comment at the top of the page? That way we can read from the most recent to the oldest. =P
My English listening is really weak :-), can you please write somewhere what provides instant indexing?
@Rafael, thanks for the suggestion. Most blogs do the earlier comments first, so I think it's best to follow that convention. But I'll consider it.
@Xavier, are you referring to delta indexing? See the bottom of the Thinking Sphinx Usage page for more info on that.
http://ts.freelancing-gods.com/usage.html
@Ryan, I'd agree with Rafael's idea.
Maybe the comment form could also be moved up (and made collapsible to save space?). This might encourage users to post (even) more comments :)
First of all... thanks for the amazing episode again! Thanks for your help with all the episodes you are making.
There's one thing I don't understand. How do you manage this on a web host you don't maintain yourself? Do you have to restart it over SSH every time somebody writes an article or something else?
@Michael, it's best to set up a "cron" job to trigger the rake index task. If you aren't able to do this then you may be able to get your Rails app to spawn a background process to redo the index at a regular interval. I'm not exactly sure how that would work though.
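If cron isn't available, the background-process idea could look roughly like the sketch below. It is only an illustration in plain Ruby: `PeriodicTask` is a made-up class, not something Rails or Thinking Sphinx provides, and the rake command in the comment assumes the index task from the episode.

```ruby
require "thread"

# Hypothetical helper: runs the given block every `interval` seconds in a
# background thread until `stop` is called.
class PeriodicTask
  def initialize(interval, &task)
    @interval = interval
    @task     = task
    @running  = false
    @mutex    = Mutex.new
    @wakeup   = ConditionVariable.new
  end

  def start
    @mutex.synchronize { @running = true }
    @thread = Thread.new do
      while running?
        begin
          @task.call
        rescue StandardError => e
          warn "periodic task failed: #{e.message}"
        end
        # Sleep for the interval, but wake up early if stop is called.
        @mutex.synchronize { @wakeup.wait(@mutex, @interval) if @running }
      end
    end
    self
  end

  def stop
    @mutex.synchronize do
      @running = false
      @wakeup.signal
    end
    @thread.join if @thread
  end

  private

  def running?
    @mutex.synchronize { @running }
  end
end

# Possible use, reindexing every ten minutes (not run here):
#   PeriodicTask.new(600) { system("rake thinking_sphinx:index") }.start
```

Keep in mind that a server running several Rails processes would start one of these per process, so a cron job is still the simpler option where it is available.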
Well done. I really learnt a lot from this cast, as I have from the others. You made it very simple for me to understand without reading a single line about Thinking Sphinx. Thank you; I'm looking forward to more interesting topics from you.
Thanks for the podcast Ryan, I've been looking into the various search technologies for a while. As far as I can tell, one advantage to Ferret is it supports a "fuzzy" string match for search parameters. Any suggestions on how to handle this with Sphinx?
@Jeff, Sphinx has some "fuzzy" matching for phrases, where a certain number of words need to appear within a given distance of each other. See the "extended" match mode for this behavior.
But I don't think this is what you're looking for. If you want fuzzy searching on a given word, where similar words are also found, then I don't think Sphinx supports this.
You may want to look into Xapian. It has some powerful features in this area, including stemming, which can find variations of a given word (run, runs, running, etc.). It also supports spelling correction to find terms which are similar to a given word.
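For the phrase-level matching in Sphinx's "extended" mode mentioned above, the query syntax has a proximity operator ("hello world"~10) and a quorum operator ("a b c"/2). A small sketch of building such query strings follows; the helper names are made up for illustration and are not part of Thinking Sphinx.

```ruby
# Hypothetical helpers that build query strings in Sphinx's extended syntax.

# Strips double quotes and collapses whitespace so the phrase stays well-formed.
def clean_words(words)
  words.to_s.delete('"').split.join(" ")
end

# Proximity: the words must all appear within `distance` words of each other.
def proximity_query(words, distance)
  %("#{clean_words(words)}"~#{Integer(distance)})
end

# Quorum: at least `minimum` of the listed words must appear.
def quorum_query(words, minimum)
  %("#{clean_words(words)}"/#{Integer(minimum)})
end

# Possible use (not run here):
#   Article.search proximity_query("rails search", 5), :match_mode => :extended
puts proximity_query("hello world", 10)
puts quorum_query("the world is a wonderful place", 3)
```

Neither operator matches misspelled or similar-looking words; they only loosen how strictly the words of a phrase must line up.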
Has anybody had any luck installing Sphinx on a Windows computer?
If I use the install link above, it says that vendor/plugin/sphinx is removed again, and after that Sphinx won't start.
Hi, Ryan!
I've always liked all your episodes! I'm building a Rails programming team at my company and I'm showing them some of your episodes each day. Congratulations on the great work! :D
Best regards,
Felipe Giotto.
hi Ryan....is there any possibility that you can zip and publish this project on your site (and all future projects)...
Jeff: Sphinx has support for fuzzy matching, but it's not enabled by default.
If you're using Thinking Sphinx, you can enable this by adding allow_star: true for every environment in config/sphinx.yml (ie: in a similar format to database.yml).
@rob you can find the full code by clicking the "Full Source Code" link at the bottom of the code samples.
@andrej: Don't you guys use Javascript for anything? :) Thanks for the great episode Ryan!
If anyone does work out the best route for highlighting search terms, please post back. I intend on giving them a solid background colour rather than using the highlight visual effect.
Sean Hess, I'm new to JS and Rails. Are you suggesting an application.js script that takes the params and highlights the DOM elements that match?
Indexing is taking ages on my MBP...
p.s. thanks again, Ryan.
Is there an error in the video?
I have been having trouble getting the :field_weights parameter to behave. I can't find it mentioned in the thinking_sphinx API entries on the various search methods:
http://ts.freelancing-gods.com/rdoc/
The best I can find is this:
http://ts.freelancing-gods.com/rdoc/classes/ThinkingSphinx/Index/Builder.html
Where it says :field_weights is an argument to the set_property method.
This seems to imply that field weighting happens when the index is built, not when the search is performed. I'd rather be able to declare my relevancy preferences at search time (like the video shows), not at index time (like the API seems to say).
Am I missing something here?
Oops. Forgot to attach my email address to that message.
Great work, Ryan. Your generous contributions of work have saved me a great deal of aggravation ever since I discovered them.
Okay, I think I found the problem.
Using :match_mode => :boolean ignores all field weighting, but :match_mode => :extended honors it. This is a bit puzzling, since Extended seems to be just like Boolean, but with even more query features. Extended lets you use boolean ORs and NOTs without losing the relevancy rankings.
Here's where Sphinx says it does that:
http://sphinxsearch.com/doc.html#weighting
And here's a code sample:
--- In app/models/article.rb
define_index do
  ...
  # Declare default field weights, which can be overridden at search time
  set_property :field_weights => { :title => 3, :subtitle => 2 }
  ...
end
--- In app/controllers/article_controller.rb
...
# Grab a page-load of articles, using title-centric field weights.
@articles = Article.search query_string,
  :include => [:author, :comments], :match_mode => :extended,
  :field_weights => { :title => 10, :subtitle => 6 },
  :page => params[:page], :per_page => 20
I was thinking about the best way to run a daemon to reindex TS at a certain interval. Would it make sense to use the daemon gem in an initializer to kick off this process?




