#243 Beanstalkd and Stalker
Dec 06, 2010 | 9 minutes | Plugins, Background Jobs
Beanstalk is a fast and easy way to queue background tasks. Stalker provides a nice wrapper interface for creating these jobs.
Excellent! Thank you!
I was using magent for my mongoDB application. Will try out this for sure.
Thank You.
I love mondays ;)
Thank you.
Hi Ryan, you've done several episodes regarding background processing. Have you ever considered doing a cast about Resque? I've been using it for a while, but the lack of extensive documentation leaves a lot of room for experimenting.
Personally, I'd love to find out more. Thanks. Oktav
Hm, it seems to me more convenient to use Delayed Job for these kinds of background tasks. Opinions? What would be the advantages of Beanstalkd over DJ?
Use beanstalkd with caution. Currently it lacks basic authentication and basic queue management (like pulling job state via job id). If there is a queue problem you have to flush the whole queue bin. There is also a lack of logging. So just be careful. That said, it's good, but just know what you're getting yourself into.
@Oktav, I'm planning on doing an episode on Resque some time in the future. Thanks for the suggestion.
@Nico, there is quite a bit of difference between Delayed Job and Beanstalkd and it's important to choose the right tool for the job.
Delayed Job is Rails specific and expects your Rails environment to be loaded for each worker. I prefer to keep the workers as light on memory as possible. CORRECTION: found out this isn't entirely true because it can be used in something like Sinatra with ActiveRecord or MongoMapper.
Delayed Job also polls for new jobs every 5 seconds by default. In this case I wanted it to respond as fast as possible with a new move and 5 seconds is quite a wait. It's also not an in-memory store so it's slower.
That said, Delayed Job is much more convenient and if it fits your requirements I definitely recommend it. It also has much better support for managing the jobs.
@Knodi, thanks for the warnings. In this case any pending moves or city names can be requeued so nothing is lost, however that is not the case for every scenario. I should have mentioned this in the episode.
Ryan, At about 1:16 you say "brew install" and type "gem install" - just if some people are confused. What you type is obviously correct, not what you say :)
Stalker adds logging at a layer above beanstalkd, so it can help mitigate one of the problems Knodi describes.
It seems like this will only connect to beanstalkd if it is running on the same host. How would you connect to a remote beanstalk queue?
Anyone have problems installing with 'brew install beanstalk' on Mac 10.5.8?
Hi Ryan, I have two questions:
1. Why do you like putting jobs inside the config? It looks a bit odd to me.
2. How would you compare it with delayed_job?
Thanks!
Sweet Jeebus in a minivan stay away from beanstalkd!
* no authentication
* no persistence
* very little queue management
* tiny user/support base
* minimal interop
Ruby users have at least 4 better options for queue management:
* DelayedJob
* Resque
* amqp/nanite
* xmpp4r
Please, please, you owe it to your children, and their children, to evaluate those other options first.
@Luke, you can certainly connect to any server running Beanstalkd, it's not necessary to do it all over the same server. However authentication is up to you.
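To make the remote case concrete, here is a minimal sketch of splitting a beanstalk:// address into host and port. The helper `beanstalk_address` and the host `queue.example.com` are made up for illustration and are not part of Stalker or Beanstalkd. As far as I know, Stalker picks up the server address from a `BEANSTALK_URL` environment variable, but treat that as an assumption and check the Stalker README. The default port 11300 is Beanstalkd's standard port.

```ruby
require 'uri'

# Hypothetical helper (not part of Stalker or Beanstalkd): splits a
# beanstalk:// URL into host and port, falling back to Beanstalkd's
# default port 11300 when the URL does not name one.
def beanstalk_address(url)
  uri = URI.parse(url)
  unless uri.scheme == 'beanstalk'
    raise ArgumentError, "expected a beanstalk:// URL, got #{url}"
  end
  [uri.host, uri.port || 11300]
end

# queue.example.com is a placeholder host name. BEANSTALK_URL is assumed
# to be the variable Stalker reads; verify against its README.
url = ENV.fetch('BEANSTALK_URL', 'beanstalk://queue.example.com:11300/')
host, port = beanstalk_address(url)
puts "beanstalkd at #{host}:#{port}"
```

Under that assumption, starting the worker with the variable set (and setting the same variable for the app that enqueues) would point both at the remote queue.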
@Sohan, I place the jobs in the config directory because it is very similar to deploy.rb for Capistrano. It requires an external dependency (stalk command) to run it and in a sense it is configuration for Stalker.
See my response to Nico above for how this compares to Delayed Job.
@Marquisdebad, Beanstalkd does have persistence with the "-b" option, however I prefer to use it in situations where persistence isn't necessary. That is, where you can requeue anything based on the database.
Also authentication isn't a problem if Beanstalkd is running on the same server as your app. Just block external traffic to that port.
Beanstalkd is lightweight and fast. It is event driven which means it responds instantly to the jobs and doesn't do constant polling. I find that to be its biggest advantage.
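The difference from polling can be sketched in plain Ruby. This is only an in-process analogy using Ruby's built-in `Queue`, not Beanstalkd or Stalker code, and the job name `move.calculate` is made up. The worker blocks on `pop` until a job arrives, much as a Beanstalkd worker blocks on the protocol's `reserve` command, so there is no sleep interval to wait out.

```ruby
# In-process analogy only: Ruby's Queue stands in for the job server.
jobs = Queue.new
processed = []

worker = Thread.new do
  # Queue#pop blocks until something is pushed, so the worker wakes up
  # as soon as a job arrives instead of checking on a timer.
  while (job = jobs.pop) != :stop
    processed << "processed #{job}"
  end
end

jobs << 'move.calculate'  # hypothetical job name
jobs << :stop             # sentinel that tells the worker to exit
worker.join

puts processed.inspect
```

A polling worker would instead loop over "check for jobs, then sleep N seconds", which is where the delay of up to 5 seconds mentioned for Delayed Job comes from.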
@Yorick, I just switched to GitHub authentication for commenting here which should take care of the spam problem.
@Marquisdebad It really depends on what you want from a queue system. I personally use beanstalkd to process incoming email on an app. It's not supposed to be as powerful as the other options you suggest; it's a simple and very fast queue system.
A couple of corrections for you:
* it does have persistence; launch the daemon with -b
* it may not have the size of userbase as DJ, but it does have some very high traffic users. I believe http://ravelry.com does for a start (and obviously Causes on Facebook, for which it was created).
I love it because it's incredibly simple :)
Comment while signed in from Github. Thanks Ryan!
Love the new Github sign in and thanks for your hard work.
Great cast Ryan.
@ryan bates
@robert
Thanks for correction re persistence.
Does anyone have any links on writing test cases with beanstalker?
Here's a blog post I wrote on writing RSpec test cases for Stalker
Testing Stalker (Beanstalkd) Jobs With RSpec In Rails
Please send over any feedback.
Strange thing. When I put a job into the queue while stalker is running, it repeats the same job over and over again and only stops if the job fails. Has anyone else seen this?
Found the error. Seems that
record.save :validate => false
does this.