Very nice Ryan, I'm especially looking forward to the follow-up episode with the changing URL and history maintenance etc.
I did just the same thing last week!
Except that I also used the history plugin for jQuery, which makes the back/forward buttons work too :)
http://www.balupton/projects/jquery_history/
I am moving snorby.org to jQuery and this was a huge help - very clean. Thank you.
And of course: THANK YOU very much for these ultra-high-quality screencasts. I am so glad that I have this very convenient source of know-how.
One question I have:
So far I am not very comfortable with the concept of Ruby symbols. Most of the time I know how to modify existing code, but I have not yet found a text that explains Ruby symbols sufficiently.
...
This text that I just found helps somewhat: http://glu.ttono.us/articles/2005/08/19/understanding-ruby-symbols
and comments 1 and 12 on the mentioned page indicate that there are Rails-specific aspects of Ruby symbols, but the article is not intended to cover Rails.
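For what it's worth, here is a tiny illustration of the core idea (nothing Rails-specific, just my own sketch): a symbol is an immutable name, and the same symbol is always the same object, whereas each string literal is a new object.

# A symbol is an immutable name; the same symbol is always the same object.
:name.object_id == :name.object_id     # => true
"name".object_id == "name".object_id   # => false (each string literal is a new object)

# In Rails code, symbols mostly show up as lightweight hash keys for options:
options = { :controller => "pages", :action => "index" }
options[:action]   # => "index"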
Excellent tutorial. I am smiling for the first time today. Because it makes sense now, and it works! Thank you!
@Thomas, as mentioned near the beginning of this episode, there are other caching solutions which may work better in this scenario. This is more about showing off another technique for your caching repertoire.
The most optimized approach here is page caching. However, that requires you to set up a sweeper on both the article and comment models and is a bit more complex.
The solution shown here works best when you have a heavy page that fetches a lot of related data which all relies on a single model, such as a user profile page.
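Roughly the pattern being discussed, as a sketch rather than the exact episode code (the model names are generic):

class Comment < ActiveRecord::Base
  # Touching bumps article.updated_at on every comment save/destroy,
  # which changes the article's cache_key.
  belongs_to :article, :touch => true
end

# In the view the fragment is keyed on the article itself, e.g.
#   <% cache @article do %> ... article and its comments ... <% end %>
# so a new updated_at means a new fragment; no sweeper or manual
# expiration is needed (stale fragments are simply never read again).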
While this seems neat, is there really a big benefit to it? You will be hitting the article table every time just to get the cache key, right?
So you're still hitting the database a lot. You do avoid the comment queries, but it seems silly to fetch the entire article model just to get the cache key. It seems a rather naive way of trying to scale up, particularly since the article is the one thing that is likely NOT to change, making it the perfect thing to cache in the first place.
Maybe you could provide some insight into why you would use this, as it seems to be a pretty short-sighted way of making the page scale. Why not just add the cache-key expiration in the first place? It's not as though it's hard or messy.
Very good cast and a good solution. I am sure that many developers forget about the data in their logs.
Alexandre, I ran into this problem about a year ago and fixed it somehow. Can you please provide us with your exact view code and the full params at submit?
Yes, how does Hpricot compare to scrAPI? And how do their speeds compare?
Thanks, comment 79, for saving me time :)
Hello Ryan, I have this solution in an application, but since I updated to Rails 2.3.3 the parameters are not passed as an array, but as a hash, like this:
Expected:
"task_ids"=>["1","2","3","4"]
Now I receive:
"task_ids"=>[{"1"=>nil}, {"2"=>nil}, {"3"=>nil}, {"4"=>nil}]
Do you know whether this is a bug, or whether the new default behavior is supposed to be like this?
Thanks.
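In case it helps while this gets sorted out, a small defensive workaround of my own (not an official fix) that recovers the ids from either shape:

raw = params[:task_ids] || []
# Old shape: ["1", "2", "3", "4"]; new shape: [{"1"=>nil}, {"2"=>nil}, ...]
task_ids = raw.map { |item| item.is_a?(Hash) ? item.keys.first : item }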
Hello Ryan,
When uploading a file, I also need to scan the uploaded file for viruses. Can you please tell me how to do that?
Thanks,
Madan Kumar Rajan
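Not something the episode covers, but one common approach is to shell out to ClamAV's clamscan after the upload is saved; it exits with status 0 when the file is clean and 1 when a virus is found. The method name and where you hook it in are just assumptions on my part:

# Hypothetical helper, e.g. on the upload model, using ClamAV's command-line scanner.
def virus_free?(path)
  system("clamscan", path)    # exit status: 0 = clean, 1 = virus found
  $?.exitstatus == 0
end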
Has anyone else received a NoMemoryError using Passenger? Whenever I access any action it takes a while to load and eventually displays a NoMemoryError. In Activity Monitor, one of the `ruby` processes skyrockets its memory usage from about 20MB to ~2GB! It stays at 2GB until a second request is made; when another is made, the process uses 20MB of memory for a second and then jumps back up to 2GB. The only error message is "failed to allocate memory". The development log is at http://pastie.org/573432
It seems like the entire request is being processed, except for actually rendering the page. The views are written in ERB, although I do require haml for other views.
Everything was working fine with Passenger last time I used it. I've restarted Passenger several times, and have restarted my computer. Everything works fine when I use script/server. Anyone else having this issue?
I'm trying to use workling and starling to expire_fragment() using a regexp. I'm using FileStore as my cache and running the expire fragment in an async workling worker. I've checked that the RAILS_ROOT is the same in my worker as it is in my rails server. I've checked that the Rails.cache.cache_path is the same in both. I puts'd the finished regexp to the console and it looks correct. But the cache fragments are not being destroyed. What am I doing wrong?
Code sample at http://gist.github.com/162963
I've been playing with scRUBYt and FireWatir lately, they've given me much joy. I'll be looking forward to your screencast on scRUBYt when you do get it to run. Salute!
Awesome work again Ryan!
Ryan, could you make a cast that combines delayed_job with multiple ajax actions (redirects or updates)?
My problem is the following: in the controller's index action I show all entries from the database, and background jobs are started within that action. Every entry listed shows a loading .gif. After a job finishes, that .gif should be replaced by an arrow image for success or an X image for failure.
Obviously I would have to make an Ajax call for every job, but unfortunately I haven't been able to achieve this so far, because it's tricky (or probably not possible) to call an action or an RJS file from outside the controller itself to replace the corresponding div id.
It would be nice to see how you would solve that!
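One way to approach it (my own sketch, not from the episode): have each job write its status to a column, expose a small status action, and poll it from the page. The model, column and action names here are hypothetical:

class EntriesController < ApplicationController
  # The page polls this per entry and swaps the loading .gif for the
  # arrow/X image depending on the returned value.
  def status
    entry = Entry.find(params[:id])
    render :text => entry.job_status   # e.g. "pending", "success" or "failed"
  end
end

# In the view, something like Prototype's periodically_call_remote per entry
# (or a jQuery setInterval hitting the status action) does the polling and
# replaces the contents of the corresponding div.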
@wysRd, @ryan
For namespaced rake tasks, it looks like call_rake "abc:def:ghi" should work.
It just gets fed into a command line call (system).
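For reference, a sketch of the kind of call_rake helper being referred to (the exact episode code may differ slightly); the task name, namespaced or not, is simply interpolated into the shell command:

def call_rake(task, options = {})
  options[:rails_env] ||= Rails.env
  args = options.map { |name, value| "#{name.to_s.upcase}='#{value}'" }
  system "rake #{task} #{args.join(' ')} --trace 2>&1 >> #{Rails.root}/log/rake.log &"
end

call_rake "abc:def:ghi"   # a namespaced task works the same as a plain one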
I have the latest version of scrapi installed; however, for some reason, when I try running the scrapitest.rb code I receive the following error:
NameError: uninitialized constant Scraper
at top level in scrapitest1.rb at line 5
Program exited with code #1 after 0.14 seconds.
Any ideas?
@Nakul thanks!!!
@elad A new version of FireQuark (compatible with Firefox 3.5) is here: http://www.quarkruby.com/assets/2009/8/4/firequark-3.5.2.xpi
I'm with @RORgasm on Hpricot. It uses CSS or XPath selectors and has great block handling for multiple elements. It behaves similarly to jQuery on the traversal end.
As always, thanks for the great screencast!
Great screencast.
I get a problem, though, when running Product.fetch_prices:
You have a nil object when you didn't expect it!
You might have expected an instance of ActiveRecord::Base.
The error occurred while evaluating nil.[]
Any clues?
Excellent screencast; scraping with Ruby and scrAPI seems like so much fun. Can't wait to try it out tomorrow! Big thanks!
Clarification:
It seems to work fine on the site, but it wasn't working when I tried to download from the RSS feed.
Hey Ryan,
This episode seems to freeze both audio and video around 3:12. Just thought you might want to know!
@Henning, thanks for the SelectorGadget link, just what I needed, because somehow FireQuark doesn't work on the latest Firefox, version 3.5.1.
Thanks for another great 'cast. I have been scraping with Mechanize/Nokogiri and like it (except that installing is painful). I used (and still occasionally use) Watir to scrape. As always, it is good to see a new tool.
I would love to see a 'cast where you navigate and scrape a site that includes JavaScript.
@Chris Barnes
Just installed Ruby 1.9.1, Passenger 2.2.4, and Rails 2.3.3 and have exactly the same problem!
script/server is running fine, Rails is running fine, and Passenger runs fine until I destroy, update, or create; then I get 500 errors. When using script/server on port 3000 there are no problems at all...
Has anyone solved this?
Hey Ryan, I would actually suggest taking a look at Hpricot... I've done a few applications that required quite a bit of scraping (legal, of course :) ) and found Hpricot to be a stable, good solution. The Hpricot API also uses the familiarity of CSS selectors for convenience... Unless I'm missing something, is there something scrAPI offers that Hpricot doesn't?
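For anyone curious, a minimal Hpricot sketch (the URL and selectors are made up):

require 'rubygems'
require 'hpricot'
require 'open-uri'

doc = Hpricot(open("http://example.com/products"))
# CSS-style selectors; (doc/"...") and doc.search are equivalent.
(doc/"div.product").each do |product|
  name  = product.at("h2").inner_text
  price = product.at(".price").inner_text
  puts "#{name}: #{price}"
end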
If you do not want to replace Firebug with FireQuark, you can use the http://www.selectorgadget.com/ bookmarklet to interactively build a unique CSS selector for any element on a page. This also works in Safari.
For me, Nokogiri (http://github.com/tenderlove/nokogiri/tree/master) does the job pretty well.
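The equivalent in Nokogiri is just as short (again, the URL and selector are made up):

require 'rubygems'
require 'nokogiri'
require 'open-uri'

doc = Nokogiri::HTML(open("http://example.com/products"))
doc.css("div.product h2").each { |node| puts node.text }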
@Ivan: You need to install the daemons gem:
sudo gem install daemons
Back again:
it's a 64-bit problem.
This guy has a quick-and-dirty fix for it. I did not use it; I will wait until the gem is improved to no longer bundle tidy/tidylib.dll and tidy/tidylib.so, as they are hopefully in the middle of removing tidy/ altogether.
http://anti.teamidiot.de/nusse/2009/05/scrapi_libtidyso_fail/
Regardless, it's quite a nice gem, and another good episode.
Hi,
Firstly, thanks! I look forward to each week's episode.
I'm not sure what's going wrong, but this is the output of scrapitest.rb:
/usr/lib/ruby/gems/1.8/gems/scrapi-1.2.0/lib/scraper/reader.rb:216:in `parse_page': Scraper::Reader::HTMLParseError: Unable to load /usr/lib/ruby/gems/1.8/gems/scrapi-1.2.0/lib/scraper/../tidy/libtidy.dylib (Scraper::Reader::HTMLParseError)
from /usr/lib/ruby/gems/1.8/gems/scrapi-1.2.0/lib/scraper/base.rb:865:in `document'
from /usr/lib/ruby/gems/1.8/gems/scrapi-1.2.0/lib/scraper/base.rb:749:in `scrape'
from /usr/lib/ruby/gems/1.8/gems/scrapi-1.2.0/lib/scraper/base.rb:347:in `scrape'
from scrapitest.rb:10
gem list scrapi gives: 1.2.0
I will try to fix it and post my solution here.
I like this episode; it inspires me a lot. Really, thanks!
I would have to agree with elad. You must have a crystal ball for Rails developers. Thanks for the episode.
Cool! Thanks.
You are actually a mind reader!!! Thanks.
Any suggestions on how to allow permalinks with periods (".") in them? Assuming product is routed using "map.resources :products", anything after a period will be treated as a format.
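One approach I've seen (an assumption on my part, not from the episode) is to give the route an :id requirement that allows dots, so the router stops splitting a format off the permalink:

# config/routes.rb: allow anything except a slash in the id, so
# "my-product-v1.2" is no longer split into id + format.
# Caveat: explicit formats like .xml will then be swallowed into the id.
map.resources :products, :requirements => { :id => /[^\/]+/ }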
The main drawbacks of some of the above suggestions are:
1) Links to pages that shouldn't be indexed will have a negative impact on you when Google does its link-matrix magic, PageRank (and no, rel="nofollow" does not help you!).
2) Things like hidden links and JavaScript inserts might cause problems for the screen readers used by the blind.
3) Cluttering your HTML with a lot of content you instantly hide and keep hidden for the entire stay on the page just isn't clean: KISS, bandwidth, rendering (and, some would even argue, security through obscurity).
Keep up the nice work Ryan, you make Mondays a bit nicer!
Ryan,
Thanks for covering Delayed Job.
I spent a few evenings trying to get workling/starling up and running (based on an earlier RailsCast), but ultimately passed on completing the work.
Delayed Job fit nicely into my existing setup and was easy to deploy. Taken care of in a couple of hours.
Great tutorial as always. A quick question regarding second submittals: if a user gets the error message (which I've added) but then needs to resubmit their order, I assume they go to the new action again. But doesn't this create another Order record in the DB? Should we redirect them to the EDIT action instead, so we can reuse that same row in the orders table?
Or am I missing something?
Thanks for the great screencast!
I recently started using configatron for this kind of configuration:
http://github.com/markbates/configatron/tree/master
Highly recommended!
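For anyone who hasn't seen it, configatron's usage is roughly this (the setting names below are made up):

require 'configatron'

# e.g. in config/initializers/configatron.rb
configatron.mailer.from = "noreply@example.com"
configatron.api.key     = "abc123"

# anywhere else in the app:
configatron.mailer.from   # => "noreply@example.com"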
This screencast really saved my day after a recent spam-comment attack on my blog.
I just put the code into my blog and will see how well Akismet can stop the spam comments.
A big thank you to you, Ryan!
Hi guys, nice screencast, but I prefer job_fu: it's cleaner and easier.
http://github.com/jnstq/job_fu/tree/master
cheers!
I'm unable to download the screencasts either. It seems that the media server is down.
Great introduction to Cucumber! I already knew the framework, and noticed that you point out its power very clearly. A great starting point for your audience.
@Ryan: Nice video.
@all: I am a Rails evangelist and am currently using it to build my personal blog. I want to display friends' blogging activity on the user home page. Are there any recommended tutorials or pointers for accomplishing that? I thought of using the cache but suspect there could be a better way.
Thanks for this helpful railscast!
If you encounter "EOFError (end of file reached)" on the consumer, and <Mongrel::HttpParserError: Invalid HTTP format, parsing fails.> on the REST provider, you might need a leading slash on the ActiveResource prefix. In this RailsCast, it shows:
self.prefix = 'foo/bar/'
Try:
self.prefix = '/foo/bar/'
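For context, roughly where that setting lives (the class name and site URL here are made up):

class Task < ActiveResource::Base
  self.site   = "http://example.com"
  self.prefix = "/foo/bar/"   # note the leading slash
end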