For what it's worth, if you want to use TorqueBox (I heart TorqueBox) but also want the hands-off style of infrastructure you get with Heroku, it might be worth checking out OpenShift from Red Hat.
It's very similar to Heroku but a little more flexible for your configuration needs. It also runs in the same Amazon data center as Heroku, so any third-party services or databases should be able to connect just as easily within the network.
https://github.com/openshift-quickstart/torquebox-quickstart
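Yeah, you just specify the dictionary as the first argument of your to_tsvector and to_tsquery calls: plainto_tsquery('english', 'my search terms') or plainto_tsquery('simple', 'search this stuff'). Same with to_tsvector('simple', field). (Use plainto_tsquery for plain phrases like these; to_tsquery expects operators such as & between the terms.)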
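To make the dictionary argument concrete, here is a minimal sketch. The articles table and its title and body columns are hypothetical names used only for illustration; the functions and the @@ match operator are standard Postgres full text search.

```sql
-- Hypothetical table "articles" with text columns "title" and "body".

-- 'english' stems words and drops English stop words before matching.
SELECT title
FROM articles
WHERE to_tsvector('english', body) @@ plainto_tsquery('english', 'my search terms');

-- 'simple' only lowercases tokens (no stemming), so matches are more literal.
SELECT title
FROM articles
WHERE to_tsvector('simple', body) @@ plainto_tsquery('simple', 'search this stuff');
```

Use the same dictionary on both sides of the @@ operator, otherwise the stemmed document and the query terms may not line up.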
Regarding ranking and performance: per the Postgres documentation, you can store the results of the to_tsvector calls in a dedicated tsvector column, complete with weighting, and then put your index on that column. Use a database trigger to keep the column up to date (it can contain as many search fields as you like, plus any other relevant data you'd like to throw in, such as author names).
Calculating rank against that column gives you a relevance rank across everything you're searching. I just implemented this approach on a huge site and the performance was lightning fast.
http://www.postgresql.org/docs/8.3/static/textsearch-features.html
See the "Triggers for Automatic Updates" section.
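Roughly, the setup described above could look like the sketch below. It is not the exact schema from my site: the table, column, function, trigger, and index names are all made up for illustration, and the weights are just one plausible choice. It assumes PL/pgSQL is available in your database.

```sql
-- Hypothetical table; search_doc holds the precomputed, weighted tsvector.
CREATE TABLE articles (
    id          serial PRIMARY KEY,
    title       text,
    body        text,
    author_name text,
    search_doc  tsvector
);

-- Trigger function: rebuild search_doc from the searchable fields.
-- Weight 'A' ranks highest, 'D' lowest; coalesce guards against NULLs.
CREATE FUNCTION articles_search_doc_update() RETURNS trigger AS $$
BEGIN
    NEW.search_doc :=
        setweight(to_tsvector('english', coalesce(NEW.title, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(NEW.author_name, '')), 'B') ||
        setweight(to_tsvector('english', coalesce(NEW.body, '')), 'D');
    RETURN NEW;
END
$$ LANGUAGE plpgsql;

-- Keep the column current on every insert and update.
CREATE TRIGGER articles_search_doc_trigger
    BEFORE INSERT OR UPDATE ON articles
    FOR EACH ROW EXECUTE PROCEDURE articles_search_doc_update();

-- Index the stored column rather than an expression.
CREATE INDEX articles_search_doc_idx ON articles USING gin (search_doc);

-- Search and rank against the single stored column.
SELECT id, title, ts_rank(search_doc, query) AS rank
FROM articles, plainto_tsquery('english', 'my search terms') AS query
WHERE search_doc @@ query
ORDER BY rank DESC
LIMIT 10;
```

Because the tsvector is computed once at write time, the query only has to hit the index and rank the matching rows, which is where the speed comes from.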