Sunspot makes it easy to do full text searching through Solr. Here I show how to search on various attributes and add facets for filtering the search further.
I like how Mat Brown (author of Sunspot) put it:
In my unbiased opinion, Solr is better than Sphinx in every way, except Sphinx is faster at reindexing the entire data set, which you pretty much never need to do. Unless you use Sphinx.
I'm trying to build a search form capable of full text searching as well as narrowing down based on booleans using Sunspot. However, I'm not able to get it working and the filter doesn't happen. Any input would be appreciated:
@search = Project.search do
  fulltext params[:search]
  facet(:master_bedroom)
  facet(:dining_room)
  facet(:bath)
  with(:master_bedroom, params[:mb]) if params[:mb].present?
  with(:dining_room, params[:dr]) if params[:dr].present?
  with(:bath, params[:p_bath]) if params[:p_bath].present?
end
I have the fields in the model:
searchable do
  text :description
  boolean :dining_room
  boolean :bath
  boolean :master_bedroom
end
A minor update is required for sunspot_rails 1.3.3 (rails 3.2.6)
$ rake sunspot:solr:start
Note: This task has been moved to the sunspot_solr gem. To install, start and
stop a local Solr instance, please add sunspot_solr to your Gemfile:
group :development do
  gem 'sunspot_solr'
end
Good cast Ryan. Any reasons to prefer this over thinking_sphinx?
I was wondering the same thing. I have Thinking Sphinx on my production server, but I think it takes a lot of effort to always make sure it's running...
Thanks, good solution, exactly what I was looking for!
I think I misunderstood one thing:
Will this require reindexing of comments on every comment update?
Yes. One solution is to add a comment observer that updates the article model when needed.
Great cast. Think I like this solution better than thinking sphinx. Is it also better?
Yes, by a mile.
Solr does not depend on a DB, unlike Thinking Sphinx.
The auto indexing looks nice/simpler than sphinx. It'll be interesting to see how they compare in production. Thanks Ryan!
Thanks! Did anyone work with solr/thinking sphinx in combination with user-based content?
Example: I want to create a faceted search of boy-scout camping-sites that also has a facets like "places that I visited" and "places that my friends visited". Those facets should work like any other facet
Sunspot is greater than sphinx, thanks for this episode! :)
@Ryan
Thanks for this episode!
Please add link for thinking_sphinx episode to "Similar Episodes"
I've been using Sunspot for over a year now and it is very nice. However, one thing that seems to be avoided anywhere Sunspot is mentioned is that getting it set up in a production environment can be very tedious. It's actually a real PITA.
A good idea, if you have the money, is to use Websolr.
Thanks for the plug, Linus! We're web developers at Websolr, and also felt the same pain of setting up and monitoring Solr servers for our client projects. Hence the birth of Websolr.
For those interested in trying out Websolr, you can use the coupon RAILSCAST278 at signup for your first month free of our Silver plan. (Or $25 off any other.)
Ditto on this experience. I wish they'd somehow tune the version that comes with the gem so it could be used in production.
Exactly... I feel like these tutorials are somewhat incomplete seeing as how the most difficult part of the process is left out.
Not to sound ungrateful, but I would really love to see a bigger emphasis on Rails deployment in general, because IMHO it is by far Rails' biggest drawback.
Not everyone has the money for Heroku & Websolr!
The biggest obstacle with setting up Sunspot in a production environment was just setting it up with a process monitor, like monit. I recently submitted a patch for Sunspot so that you can set the directory for the PID file, and it was accepted, so there's really nothing left to stop you from using 'rake sunspot:solr:start' as a startup task.
See:Configuring Solr
When I try to search for words that contain special characters (for example "computer & laptops" or "computers"), the search doesn't work, and searching on plural words has also become a problem for me with Sunspot. Could anyone kindly help me solve this issue?
Thanks in advance.
instead of spamming this comment section, you should post or search on stackoverflow.com ...
A precaution for Windows users:
This didn't work for me; an argument error (0 for 1) was thrown ...
I had to work around it in the index action like this..
The second version is actually what worked for me on OS X. The Article.search method threw an error while trying to access @search.results
Yeah, thanks Ryan, great cast!
I too switched from Thinking Sphinx over to Sunspot, mainly because at the time I was developing an app needing fulltext search, Heroku did not support TS. Please note that Heroku does support TS now.
Also, I used Sunspot with Mongoid, but this required a monkeypatch to have access to variables such as current_page, total_pages, etc...those used in pagination. Here is a link to the approach I took http://techbot.me/2011/01/full-text-search-in-in-rails-with-sunspot-and-solr/.
And no rake tasks existed with the sunspot_mongoid gem, so I used Aaron Qian's version that can be found on Github, and it works sweet.
Anyway Ryan, I hope you can show us a little more sunspot but used with Mongoid...I'm sure you will have a better solution than the one I have just described!
I used a Sunspot with Mongoid like:
gem 'sunspot_mongoid'
I've only been using thinking_sphinx for two weeks, and now this looks more doable already :(
I guess I'll stick with Thinking Sphinx for now... until the next project.
I would be interested to see the differences or advantages of Sunspot with Solr to Sphinx with Thinking Sphinx.
I haven't used Sunspot/Solr but from this video it seems to be quite similar in features & implementation with TS.
I'd advise paying attention to ElasticSearch and Tire. It is also based on Lucene, and supports real-time indexing and easy scalability. But if you don't need those, then it's better to use Sunspot :)
Real-time indexing is available in recent versions of Lucene, and can be accessed in Solr if you are willing to roll up your sleeves and write some Java. We're beta testing our own flavor of it over at Websolr
From my experience, Sphinx is about 1000x faster than Solr when I have to index several million rows from MySQL. Sphinx can also do case-insensitive searching in different languages. I could not find Lucene collation files, nor was it clear how to create them.
Sphinx is faster for indexing, certainly, because Solr has a lot more overhead built in to that process. The 'client' software (Sunspot) has to fetch data from the database, format it into XML, then HTTP POST that XML back to Solr.
Another well timed Railscast. Had to look at Solr today as the last thing to implement on a project.
Thanks very much!
I think that the current Sunspot gem uses geohashing for spatial search which is inaccurate in certain scenarios.
There's some ongoing work to support the new, official Solr 3 spatial search APIs.
boosting a single attribute is simple
text :composition_name do
  composition.name
end
how can a solr search be combined with an activerecord find_by_sql call?
Thanks for the great screencast.
A side note: I went through your github notes on how this website is configured and they were extremely useful to me.
I think it would be great to have a screencast showing how to configure a linode VPS for rails.
Great tutorial/example, like so many before :)
Is there any way (like already available through Sunspot or other gem/plugin) to "colorize" the results?
Example:
i search for term "sunspot"
There is bunch of results (title, content...), but every word in the result list that contains "sunspot" is lets say green or something.
Thx!
You're looking for the highlight method:
Great, thanks for this!
For anyone interested in a much more advanced sunspot search, you can check out my demo here.
I was wondering how sunspot compares with picky? http://florianhanke.com/picky/
Thanks Ryan.
Just curious how would you test the search and/or fake it for other tests?
Try:
gem "sunspot_matchers"
gem "sunspot-rails-tester"
Great screencast!
I noticed it's really easy to implement this with the virtual attribute tagging tutorial (with various taggings with tags). Like so:
anyone know how to get this working with Kaminari?
Yeh I'm trying to figure out the same thing
I have this working with Kaminari, what is the problem you are having?
i get an undefined method:
undefined method page for #<Array:0x109881fa8>
@images = @search.results.page(params[:page]).per(24)
try the latest master from github, this changeset added generic(Kaminari AND WillPaginate) pagination support:
merge from pull #67
I <3 Kaminari w/ Sunspot.
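For anyone hitting the undefined method page error above: @search.results comes back as an already-fetched page of records, so Kaminari's page scope cannot be chained onto it. A rough sketch of asking Solr for the page inside the search block instead, using Sunspot's paginate option. paginated_search is a hypothetical wrapper name, and the Image model in the comment is an assumption based on the @images variable above:

```ruby
# Hypothetical wrapper: runs a fulltext search and lets Solr do the paging.
def paginated_search(model, query, page, per_page = 24)
  current_page = [page.to_i, 1].max   # params[:page] is nil or a string
  model.search do
    fulltext query
    paginate :page => current_page, :per_page => per_page
  end
end

# In a controller action (assumes an Image model with a searchable block):
#   @search = paginated_search(Image, params[:search], params[:page])
#   @images = @search.results
```

Whether the result set can then be handed straight to the Kaminari view helper depends on the Sunspot version having the generic pagination support mentioned above.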
Anyone faced with this problem?
When I do like this:
And then try to destroy article in model of which has line:
Comments are not deleted!
How on earth did you get this gem installed? I'm running into all sort of trouble on OSX Lion. I keep getting stuck on this nokogiri dependency where it complains about missing libiconv. I've followed the directions on the nokogiri site and this guy http://pinds.com/2011/08/06/rails-tip-of-the-day-rails-os-x-lion-rvm-nokogiri/comment-page-1/ but no such luck
Hey Ryan, I have been trying to see if there is a way to somehow integrate cancan's ability model into this Sunspot search. Any thoughts on how it might be done?
Hey Kevin, did you ever figure out a way to integrate cancan with Sunspot?
Any idea how to enable partial matching? Make it work like; "select * from posts where title like 'query%'"
You set this up on SOLR side by switching to the EdgeNGram filter
Maybe I'm crazy but the above snippet only worked for me when I camel cased the field type tags. As in:
rather than:
I'm trying to build a search form capable of full text searching as well as narrowing down based on booleans using Sunspot. However, i'm not able to get it working and the filter doesn't happen. Any input would be appreciated:
i have the fields in the model:
searchable do
text :description
boolean :dining_room
boolean :bath
boolean :master_bedroom
end
and I have the following for my view:
<%= form_tag projects_path, :method => :get do %>
  <%= text_field_tag :search, params[:search] %>
  <%= check_box_tag :bath, 'true' %>
  <%= submit_tag "Search", :name => nil %>
<% end %>
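One thing worth checking here: the form submits its check box as :bath, while the search block from this question reads params[:mb], params[:dr] and params[:p_bath], so none of the with filters would ever fire. A minimal sketch of keeping the two in sync, assuming the check boxes are named after the indexed fields. boolean_filters and BOOLEAN_FIELDS are hypothetical names, and a plain hash with symbol keys stands in for the Rails params object:

```ruby
# Hypothetical helper, not part of Sunspot: turns check box params into
# boolean filter values keyed by the indexed field name.
BOOLEAN_FIELDS = [:master_bedroom, :dining_room, :bath]

def boolean_filters(params)
  BOOLEAN_FIELDS.each_with_object({}) do |field, filters|
    value = params[field]
    next if value.nil? || value.to_s.empty?
    # check_box_tag :bath, 'true' submits the string "true" when ticked
    filters[field] = (value.to_s == 'true')
  end
end

# Sketch of the controller side (needs sunspot_rails, so left as a comment):
#   filters = boolean_filters(params)
#   @search = Project.search do
#     fulltext params[:search]
#     filters.each { |field, value| with(field, value) }
#   end
```

Unticked boxes are not submitted at all, so they simply add no filter rather than filtering on false.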
First, thanks for the awesome tutorial!
I am running into this error when I try to access @search.results in my controller: undefined method `results' for #<MetaSearch::Searches::Product:0x129d527d8>
it seems like the @search always has a full collection of all the products no matter what string I pass it.
Here is how I declared searchable in my product.rb file
searchable do
text :title
end
and here is what my index action looks like in the products_controller.rb
def index
@search = Product.search do
fulltext params[:search]
end
@products = @search.results
respond_to do |format|
format.html
format.xml { render :xml => @products }
end
end
any idea why I keep getting that error?
I am using Rails 3.1 by the way if that matters
OK, found the issue. Apparently the ActiveAdmin gem (covered in a previous episode) causes a conflict because it pulls the meta_search gem into Gemfile.lock, which makes Product.search use meta_search. A workaround is to call Sunspot directly:
Sunspot.search(Product) do
or to use the solr_search method on the model:
@search = Product.solr_search do
What is the way to use Sunspot/Solr with multiple fields?
It works fine with a simple form, as explained in the screencast.
For example, I'm looking for how to make a search engine like this:
<%= form_tag products_search_path, :method => :get do %>
  <%= label_tag "Location ?" %>
  <%= select_tag :address, "new york".html_safe %>
  <%= label_tag "Type ?" %>
  <%= text_field_tag :model, params[:type] %>
  <%#= select_tag :model, "testtest".html_safe %>
  <%= label_tag "Category ?" %>
  <%= text_field_tag :category, params[:category] %>
  <%= submit_tag "Search" %>
<% end %>
Thanks guys!
I have the same question
Me 2!
For others getting Errno::ECONNREFUSED: Connection refused - connect(2) errors when testing Solr-enabled models with Cucumber, put this in your env.rb file:
Sunspot::Rails::Server.new.start
at_exit do
Sunspot::Rails::Server.new.stop
end
This will start up your solr server when your tests start, and shut it down afterwards.
Anybody know how to configure sunspot/solr to imitate textmate's command-t like searches?
I managed to figure it out :)
When it comes to Autocomplete and Autosuggestion, what are people using?
I too would like to know what people are using for Autocomplete and Autosuggest.
Also, for those that can't find the wiki from the links, the new active link for Sunspot is:
https://github.com/sunspot/sunspot/wiki
I can't get sorting to work for a text field (as opposed to a string one). I get an exception:
Sunspot::UnrecognizedFieldError (No field configured for Article with name 'name')
If I reindex with
string :name
it becomes sortable, but I lose the ability to do nGram search on that field.
Did anyone have the same issue and find a solution?
Make sure you re-index and try using string again because I know for decimal you have to use float and for date you have to use time.
I thought 'facets', while nicely rubyish, was a confusing term until I mapped it (mentally for now) to 'filter_groups'. Always looking for good names to aid obviousity ;)
To everyone needing to implement Solr in production mode, I have a couple of hints and warnings:
1) Use the official Apache guide to setting up Tomcat/Solr. It was by far the easiest for me to get working. It is found here: http://wiki.apache.org/solr/SolrTomcat
2) Remember to set the path: in your sunspot.yml. In the case of the example from the guide, this is: path: /solr-example
3) If you're going to search for special characters, such as æ ø å (Danish characters), you need to add UTF-8 encoding to your tomcat config file, as described here: http://e-mats.org/2008/04/solving-utf-8-problems-with-solr-and-tomcat/
and here: http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config
I have installed an Apache Tomcat/Solr server on CentOS, and I am able to see the Solr page at this link:
http://localhost:8080/solr/
Reindexing the models is also done, and all the logs point to the Catalina log file.
How do I connect the Solr server to the Rails application?
I have included my sunspot.yml here:
sunspot.yml
development:
  solr:
    hostname: localhost
    port: 8982
    path: '/solr/'
    log_level: WARNING
    solr_home: /opt/solr
    auto_commit_after_request: false
test:
  solr:
    hostname: localhost
    port: 8981
    log_level: OFF
production:
  solr:
    hostname: localhost
    port: 8080
    path: '/solr/'
    log_level: WARNING
    pid_dir: '/var/run'
    auto_commit_after_request: false
Please help me get this resolved.
Thanks in advance.
Here's my newb question. I'd like to search the a related table on multiple attributes, so in the case of comments, search on the content as well as maybe the submitter. However, I'm not sure how to do the mapping to grab two attributes and concatenate them or join them in one array.
Okay, figured that out quickly enough. Just used << to append one array to the other.
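A small caution on the << approach: on arrays, << pushes the whole second array in as a single nested element, whereas + (or concat) joins the two lists into one flat list. A sketch in plain Ruby, with a Struct standing in for the Comment model and comment_search_terms as a hypothetical helper name:

```ruby
# Stand-in for an ActiveRecord Comment model (assumption for illustration).
Comment = Struct.new(:content, :submitter)

# Hypothetical helper: collects every value that should be searchable.
def comment_search_terms(comments)
  (comments.map(&:content) + comments.map(&:submitter)).compact
end

comments = [Comment.new('Great cast', 'alice'), Comment.new('Thanks', nil)]
terms = comment_search_terms(comments)   # flat list of strings, nils dropped

# Inside the model this could feed a single text field, e.g.:
#   text :comments do
#     comment_search_terms(comments)
#   end
```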
I've run into some difficulties with Sunspot (Solr). In development it runs fine. However, when I run it in production, the connection is refused.
My server is CentOS 6.0 with Nginx.
Do I need to install Tomcat for the Solr server? And since Solr is an Apache project, can it run alongside Nginx in production?
Many thanks.
Great Tutorial.
But I want to know how we can build a relevant search engine?
Ex: If we are searching for the keyword "beer" and there are no results for it, it should at least return relevant results such as "Alcohol", "Whisky".
Another example would be the keyword "mobile phones": it should return results such as "Nokia", "Samsung" and so on.
Solr (Sunspot) supports synonyms. It should be easy to set up "mobile phones" and "nokia" as synonyms. I assume conditional logic like returning alcohol or whiskey if beer is not found is a lot more complex. I don't know if Solr/Lucene can do that out of the box.
I've been using solr for a few weeks now and am pleasantly surprised how well it's coming together for our needs.
hmm, you could use wordnet in order to extract synonyms and hypernyms and add them to the search
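The fallback idea (search for "beer", and only widen to related words when nothing comes back) can also be done on the application side. A minimal sketch under stated assumptions: RELATED_TERMS is a hand-written, hypothetical lookup table, and the block passed to search_with_fallback stands in for whatever runs the real Solr query and returns a list of results:

```ruby
# Hypothetical, hand-maintained table of related search terms.
RELATED_TERMS = {
  'beer'          => ['alcohol', 'whisky'],
  'mobile phones' => ['nokia', 'samsung']
}

# Runs the given search block for the term; if nothing is found, runs it
# again for each related term and merges the results without duplicates.
def search_with_fallback(term, related = RELATED_TERMS)
  results = yield(term)
  return results unless results.empty?
  related.fetch(term.downcase, []).flat_map { |alt| yield(alt) }.uniq
end

# Usage with a toy in-memory index; a real app would call Sunspot in the block.
INDEX = { 'whisky' => ['Single malt'], 'alcohol' => ['Single malt', 'Cider'] }
found = search_with_fallback('beer') { |t| INDEX.fetch(t, []) }
```

This costs one extra query per related term, so it only makes sense when the first search really is empty.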
Awesome tutorial, thanks Ryan!
I am having trouble constructing a price facet for retrieving products. A query facet seems to be the way to go, but I have to pre-define all the price ranges in the block, which is not good enough as the prices vary considerably for different product types. Has anyone found a finer-grained solution to this problem?
Thanks.
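One way to avoid hard-coding the ranges is to derive them from the prices of the current product type and then build the query facet rows in a loop. A sketch, assuming whole-number prices (for example cents) and equal-width buckets. price_ranges is a hypothetical helper, and the minimum and maximum are assumed to come from something like ActiveRecord's minimum(:price) and maximum(:price):

```ruby
# Hypothetical helper: splits min..max into at most `buckets` inclusive,
# non-overlapping integer ranges of equal width.
def price_ranges(min, max, buckets = 5)
  raise ArgumentError, 'max must not be less than min' if max < min
  raise ArgumentError, 'buckets must be positive' unless buckets > 0
  step = ((max - min + 1) / buckets.to_f).ceil
  ranges = (0...buckets).map do |i|
    low  = min + i * step
    high = [low + step - 1, max].min
    low..high
  end
  # Drop buckets that start beyond max (happens when the span is tiny).
  ranges.select { |range| range.first <= range.last }
end

# Sketch of the search side (needs sunspot_rails, so left as a comment):
#   ranges = price_ranges(Product.minimum(:price), Product.maximum(:price))
#   Product.search do
#     facet(:price) do
#       ranges.each do |range|
#         row(range) { with(:price, range) }
#       end
#     end
#   end
```

Equal-width buckets are the simplest choice; if prices cluster heavily, the boundaries would need to come from the actual distribution instead.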
I am trying to get highlighting working with the railscast code.
I have set text :content,:stored => true. And tried
@search = Article.search do
fulltext params[:search] {minimum_match 1} do
hightlight :content
end
I read that ngrams causes trouble, so I've tried with and without, but no success. I can't tell if trouble is in my setup for solrconfig, schema, articles model or controller. Stopping and restarting solr with each change is key since it reads schema and solrconfig only when it starts.
Anyone have a working version to share or figured out steps to make it work?
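For comparison, a sketch of how the search block is usually shaped when highlighting is wanted: minimum_match and highlight both go inside the block passed to fulltext, and the method is spelled highlight. highlighted_search is a hypothetical wrapper name; the field still has to be declared with :stored => true as in the question, and this has not been checked against the asker's schema or solrconfig:

```ruby
# Hypothetical wrapper around the Sunspot search DSL.
def highlighted_search(model, query)
  model.search do
    fulltext query do
      minimum_match 1
      highlight :content
    end
  end
end

# In the controller and view (assumes an Article model indexed by Sunspot):
#   @search = highlighted_search(Article, params[:search])
#   @search.hits.each do |hit|
#     hit.highlights(:content).each do |highlight|
#       puts highlight.format { |word| "*#{word}*" }
#     end
#   end
```

Note that highlights hang off @search.hits rather than @search.results, so the view has to iterate over hits to get at them.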
Great tutorial. Any example on config Sunspot for polymorphic associations?
A minor update is required for sunspot_rails 1.3.3 (rails 3.2.6)
The rake task remains the same after this change
Unfortunately it seems to be impossible to set up Sunspot and Solr on Windows. Running rake sunspot:solr:run works fine, however, when trying to reindex or if on server development mode and perform the search method, you get the exception "No connection could be made because the target machine actively refused it. - connect(2)". I installed sunspot_rails and sunspot_solr today, so it should be the latest versions. Updated java as well today to 1.7.0_05. Now, here some people get it working by changing line 104 in server.rb from exec(Shellwords.shelljoin(command)) to system(Shellwords.shelljoin(command)), but this doesn't work for me either. I don't really see any possible solution left to try. Does anybody have an idea how to get it work on windows 7?
got it to work by downloading and manually installing the zip from github sunspot.
I can't turn on stemming:
If somebody searches for 'Caphs' it should find this too: 'Caph\'s'
Does anybody know how to do this?
How do I search by a single character using Solr/Sunspot?
Hi, I'm khamar. I have a problem with the Solr/Sunspot search engine configuration in production. Search performance in development is good, but in production it takes a lot of time to return query results. I have a server at Rackspace with 512 RAM. Can anyone please help me find where I went wrong, and please let me know the RAM required for 1 lakh (100,000) records in production.
Thanks in advance...
If you are using Paginate/kaminari with Sunspot use this code with in the index in the Controller.
Thanks for this comment, really helped out.
Working with a group of developers we had some issues after doing git pull on our local machines. Just wondering if we missed something in our .gitignore to make the workflow easier?
Thanks!
Error: too many boolean clauses
Hey,
I've read many articles and concluded that I am getting this error because of the 1024 limit in the solrconfig.xml file in the solr/config directory.
I have tried increasing it up to 16384
and also tried commenting it out,
but I get the same error.
Yes, I am using thousands of ids in the search, like
with(:id).all_of vids
where vids is the array of those ids.
Is there any other option to fix this issue, or can I use an IN kind of clause with the search method?
Kindly guide me, it's really urgent.
Thanks in advance,
best,
MK.
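Two things stand out here. all_of asks for every listed value to match, so for a list of ids any_of is probably what is wanted; and each id becomes one boolean clause, which is where the limit comes from. If raising the limit does not take effect, one workaround is to query in batches. A sketch: id_batches and MAX_CLAUSES are hypothetical names, 1024 is the value quoted in the post, and Video is a made-up model name:

```ruby
# Assumed clause limit, taken from the value quoted in the post.
MAX_CLAUSES = 1024

# Hypothetical helper: splits a list of ids into de-duplicated batches
# small enough to stay under the boolean clause limit.
def id_batches(ids, batch_size = MAX_CLAUSES)
  ids.uniq.each_slice(batch_size).to_a
end

# Sketch of the search side (needs sunspot_rails, so left as a comment).
# Results from separate batches are simply concatenated, so relevance
# ordering and pagination across batches are lost:
#   results = id_batches(vids).flat_map do |batch|
#     Video.search do
#       with(:id).any_of(batch)
#       paginate :page => 1, :per_page => batch.size
#     end.results
#   end
```

If the ids come from the database anyway, it may be simpler to index the attribute they were selected by and filter on that instead.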
Hi, I am using Solr search, and I want to give lower priority to some words. Say I search for "dance lessons": results for "dance" should come first and then "lessons", etc. Likewise, for "dance lessons in music", "dance" and "music" should be at the top of the results and then "lessons".
Awesome tutorial!
Great episode
How can I implement checkbox instead of facet link?