Great work Ryan,
It would be great if you could come up with a screencast on making screencasts. Until then, could you share the tools you use to make your screencasts?
Thanks
The DES3 algorithm is not supported under jruby_openssl (v. 0.6), so I cannot use PayPal transactions...
Do you know if I can use a different cipher algorithm under JRuby?
Ryan, you are a star. I've been fiddling about with the :layout => false option for a while trying to achieve this, but it fails when using content_for. This solution is ideal... Thanks!!
Hi Ryan,
You've got a real spam problem with your comments. Why don't you add a CAPTCHA or a simple question field?
Thanks for the casts. I've built quite a few hardworking apps with no testing whatsoever. I'm not sure if I should feel proud or ashamed. I do know that they work, and every time I look at testing it seems like an unnecessary hindrance. I guess the only way to find out what I'm missing is to get stuck in and see how much it benefits me.
Regards,
Kevin.
Great. You are showing how powerful Mechanize is and a "first touch" of it in a clear way.
Thanks for the screen casts. The quality is top-notch.
The only way I can populate the roles is if I do this:
<%= f.select :roles, (controller.authorization_engine.roles + (@user.roles || [])).uniq, {}, {:multiple => true} %>
but I can't save it... :(
Just did a tutorial on password resets in Authlogic: http://github.com/rejeep/authlogic-password-reset-tutorial
For anyone struggling with Mechanize's memory usage like I was, you can limit the maximum number of pages it retains in memory by setting the agent's "max_history", for example:
agent = WWW::Mechanize.new
agent.max_history = 20
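To picture why capping the history helps, here is a toy sketch in plain Ruby. BoundedHistory is a hypothetical class made up for illustration; it is not Mechanize's implementation, only a model of "keep the last N pages and drop the oldest".

```ruby
# Toy model of a capped page history. BoundedHistory is a hypothetical
# class for illustration; it is not Mechanize's implementation.
class BoundedHistory
  attr_reader :pages

  def initialize(max_size = nil)
    @max_size = max_size   # nil means "keep everything"
    @pages = []
  end

  # Record a page and evict the oldest entries beyond the cap.
  def push(page)
    @pages << page
    @pages.shift while @max_size && @pages.size > @max_size
    self
  end
end

unbounded = BoundedHistory.new
capped    = BoundedHistory.new(20)

100.times do |i|
  unbounded.push("page #{i}")
  capped.push("page #{i}")
end

puts unbounded.pages.size   # grows with every request
puts capped.pages.size      # never exceeds the cap
```

With no cap every fetched page stays referenced, so it cannot be garbage collected; with a cap only the most recent pages are retained.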
Hi, I'm not getting the roles in the sign-up page...
It might be something very obvious, can someone point me in the right direction?
Thanks!
Great job!
Thanks. That's cool
Wow, lots of spam despite your Rails captcha...
Anyway, I found a way to hack around my problem:
agent.cookie_jar.jar['mydomain'] = {'/' => {'PHPSESSID' => WWW::Mechanize::Cookie.new('PHPSESSID', previous_session_id)}}
However, #jar is not documented... I wonder if it will stay ok with upgrades...
Thanks Ryan, great one !
I'd like to ask for some help...
Actually I need to forge a cookie. My rails application is a kind of proxy between the user and another webapp. I need to preserve the session of the end-webapp, through the entire user session on MY Rails app.
Hence, I'm creating a new agent object for each new request, and I need to re-create the cookie with the previous session ID.
I'm struggling with Mechanize::CookieJar and stuff, but no luck yet...
Any ideas?
Chinese readers can see this post: http://www.blogjava.net/fl1429/archive/2009/08/25/292522.html
Rookie question: I've followed the tutorial through to the mid-way point where Ryan says "Let's take a look at our code...", refreshes the UI and we see 3 tasks below the Project. I refresh my project screen and my code shows a "NoMethodError" in the Controller with an undefined method for the tasks that were added. Not sure what's up since I've pretty much followed the demo.
I've created a Tasks model, but I don't know what it should look like since it's not covered in the demo. Could this be the issue? Or did I miss scaffolding something? Let me know what to post that might be helpful to debug. I'm using Rails 2.3.4 and am new to this...
Can a cascading select be implemented using auto_complete?
I mean, I have country and state fields, and I want the auto_complete options for the state field to be based on the value of the country field.
Is it possible?
Thanks in advance
Excellent, thank you, only four hours ago I came up with an idea that needed exactly this.
I've been using Mechanize to scrape web content for a while and it's extremely convenient.
What I noticed though is if you keep the agent alive for multiple requests (like looping through pagination) it starts consuming more and more memory. My guess is Mechanize agent is storing the pages previously loaded even if you clear the variable holding said page.
Anyone know how to deal with this? Can you clear the 'cache' so to speak?
I have been experimenting with lockdown on my current project and liked it a lot: configuration is all in one file, links are automatically hidden if the user doesn't have permission, and you don't need to add any code to the controllers. However, I found it quite difficult to set up permissions based on the current user, and I also got annoyed by the requirement to restart the server for permissions to take effect. It is a good system, but I think I will give declarative authorization a go, as it looks far more intuitive. It would be handy to have the links automatically hidden as with lockdown though, so I might have a go at implementing that in a fork.
know we have them in
How would you get access to the Nokogiri object of a given web page once you are "authenticated", to scrape non-form / non-link data from the page?
I'm trying to use the history command but I'm getting:
NoMethodError: private method `split' called for #<Array:0x10122ea28>
This commit from your dotfiles does work somewhat better, but it lists everything in my .irb_history, so maybe it doesn't contain exit lines to split at?
http://github.com/ryanb/dotfiles/commit/78c149fb7e9ac1f2d89ed3a7518aee293b63b747
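The NoMethodError above is what you would expect when Array#split is missing: it comes from ActiveSupport rather than core Ruby, so a plain irb session may not have it. Here is a minimal sketch of the same idea using only core Array methods. The method name entries_since_last_exit and the sample data are made up for illustration; in irb you would pass Readline::HISTORY.to_a instead.

```ruby
# Return the history entries recorded after the most recent "exit" line.
# Uses only core Array methods, so it does not rely on Array#split.
def entries_since_last_exit(entries)
  last_exit = entries.rindex("exit")
  last_exit ? entries[(last_exit + 1)..-1] : entries.dup
end

# Sample data standing in for Readline::HISTORY.to_a
sample = ["x = 1", "exit", "y = 2", "puts y", "hist"]
puts entries_since_last_exit(sample).join("\n")
```

If the history contains no "exit" line at all, the sketch simply returns everything, which would match the behaviour described above for a .irb_history without exit lines.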
After installing the thinking_sphinx plugin I proceeded to install riddle, because it is required by that plugin, and found that I could not install riddle because of the following error.
WARNING: RubyGems 1.2+ index not found for:
RubyGems will revert to legacy indexes degrading performance.
ERROR: While executing gem ... (NoMethodError)
undefined method `gems' for #<Array:0x11ad58fc>
I have tried updating gems
installing new gems
I reinstalled rubygems and ruby and my OS (OSX) with no luck.
Don't know what to do, and now I'm back to rails 1.2.3
How can I do this
<%= f.collection_select :role, User::ROLES, :to_s, :titleize %>
in formtastic? Especially the :humanize or :titleize part.
Thanks!
Awesome screencasts, thanks a lot.
Could someone explain how the command:
ruby script/generate migration add_price
takes 20 seconds to execute, when all it creates is:
class AddPrice < ActiveRecord::Migration
  def self.up
  end
  def self.down
  end
end
Not complaining, I just want to understand WTF it is doing!
Maybe it is a stupid question, but would this not be possible to do with Webrat?
@Andi
I ran into the same namespace issue with my own attachments join table. I just set up my asset association like this and it seems to work fine:
has_many :attachments, :class_name => '::Attachment', :as => :from, :dependent => :destroy
The key here being the class name.
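For anyone wondering why the leading :: matters, here is a plain-Ruby sketch of one way such a clash can arise. All names are hypothetical (Uploader stands in for a plugin namespace), and this is an assumption about the mechanism rather than a description of Paperclip's internals.

```ruby
# Hypothetical names throughout: Uploader stands in for a plugin namespace
# that defines its own Attachment class.
class Attachment; end            # the application's own model

module Uploader
  class Attachment; end          # the plugin's unrelated class
end

class StaticPage
  include Uploader               # plugin code mixed into the model

  def self.relative_lookup
    Attachment                   # found through the included module first
  end

  def self.absolute_lookup
    ::Attachment                 # leading :: forces the top-level constant
  end
end

puts StaticPage.relative_lookup  # Uploader::Attachment
puts StaticPage.absolute_lookup  # Attachment
```

The bare constant is resolved through the included module before Ruby reaches the top level, so the model picks up the plugin's class; '::Attachment' skips that search.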
Running the above example I get the following error:
undefined method `text' for nil:NilClass (NoMethodError)
If I just do the following:
'puts doc' then I get the following text which does not include the title and only seems to display the commented out code in source html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!--[if lt IE 7]>
<link href="http://i2.walmartimages.com/css/global_ie6.css" rel="stylesheet" type="text/css">
<![endif]--><!--[if IE 7]>
<link href="http://i2.walmartimages.com/css/global_ie7.css" rel="stylesheet" type="text/css">
<![endif]--><!--[if lt IE 7]>
<link href="http://i2.walmartimages.com/css/pagination_ie6.css" rel="stylesheet" type="text/css">
<![endif]--><!--[if IE 7]>
<link href="http://i2.walmartimages.com/css/pagination_ie.css" rel="stylesheet" type="text/css">
<![endif]--><!-- start /include/static/kill_frames.jsp --><!-- end /include/static/kill_frames.jsp --><!--[if lt IE 7]>
<iframe id="overlay" src="/overlay/overlay_iframe_default_src.jsp?bv_enabled=false" name="overlay" frameborder="0" scrolling="no"></iframe>
<![endif]--><!--[if IE 7]>
<iframe id="overlay" src="/overlay/overlay_iframe_default_src.jsp?bv_enabled=false" name="overlay" frameborder="0" scrolling="no" allowTransparency="yes"></iframe>
<![endif]--><!-- Start: Module G0040: Primary Navigation --><!-- Site Header start --><!--[if lt IE 7]>
<iframe id="dropmenuiframe" src="/blank.html" style="z-index:20;display:none;position:absolute"></iframe>
<![endif]--><!--[if IE 7]>
...
You can use \d instead of 0-9 to match digits in regular expressions. \D matches non-digit characters.
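A quick sketch of both shortcuts in irb-style Ruby; the sample string is made up for illustration.

```ruby
price_text = "Price: $1,299.95"

digits_only = price_text.gsub(/\D/, "")   # drop every non-digit character
first_run   = price_text[/\d+/]           # first run of consecutive digits

puts digits_only   # 129995
puts first_run     # 1
```

Note that \d+ stops at the comma, so the first run here is just "1".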
I put this at the bottom of my ~/.irbrc file to quickly access this command history:
# Array#split is an ActiveSupport extension, so this needs it loaded (e.g. in script/console)
def hist
  puts Readline::HISTORY.entries.split("exit").last[0..-2].join("\n")
end
HTH,
Chip
Need to invoice? http://invoicethat.com
Ryan does it again. This is exactly what I need for an app I'm working on now.
Is there any way to deal with captchas with mechanize? It seems like more and more pages, particularly any type of form submission, have a captcha.
Because someone is sure to bring up the black hat type stuff that can be done with mechanize, I assure you my intentions are purely white hat.
I'll just leave this here as well (already commented in #190).
If anyone needs to scrape AJAX-heavy sites and HTML parsers just don't cut it, you might want to take a look at the HtmlUnit library. Sadly, it's only available for Java, but it's the only library capable of running JavaScript that I found.
Most of the time you wouldn't need this, but if a site uses a lot of ajax and some obfuscated javascript, and changes a lot, it might be the only way.
@Sam, @Godfrey
I felt the same way! I couldn't help but laugh -- I worked with Mechanize on Python a while back and it just did not seem that easy :)
@Ryan
You have a real talent for presenting this stuff. I really appreciate the time you put into it!
Sometimes you need to scrape AJAX-heavy sites and scraping using traditional methods is not an option.
I would like to mention HtmlUnit here. It's a Java tool for website testing, and it implements a GUI-less browser with pretty good JavaScript support. If anyone runs into a problem where they need to scrape an AJAX-heavy site and they can't manage with approaches like those mentioned in this railscast, I would recommend they take a look at HtmlUnit. The way I use it is with crontab once a day: I fetch IDs/URLs (which change often), write them to a file or DB, and use nokogiri to actually scrape the data.
I must note, though, that HtmlUnit isn't really fast, so avoid it when you can.
Didn't realize layout names can be treated like methods. goodness!
Excellent tutorial, but I've got a problem: I have followed all the steps of your tutorial but it still refreshes the page... Can you please help me? I'm stuck with this.
@Sam
That was exactly my reaction. Reminds me of how I felt when I first watched the "create a blog in 15 minutes" screencast :)
That irb copy function is pretty neat, thanks for the tip!
It was very cool when it all came together and you proved how easy it was to simply add products to your site and almost instantly update the prices. It put a smile on my face.
Looks awesome and I think this will save me a lot of time. Does this work with JavaScript at all, or only straight websites? Some stupid websites send the form through JavaScript only; if it could handle that it would be amazing...
@Steve, nope, not a real wish list so don't get me anything from it. ;)
@Chris, oops, it's up there now.
Hi Ryan,
Love your work! Small thing though: you promised to stick the line for getting the console history in the show notes, and I don't see it ...
Sweet! Thanks again for your consistent screen casts!
Is that your real wish-list?
I've been wanting to use mechanize for a while! Great to see a screencast on it!
Thanks ryanb!
If you need authorization please see episode 188...
Hi Ryan.
I've created a plugin to seed data according to the environment being loaded.
I've been using it to seed testing, staging and production databases with different kinds of data without having a big seeds.rb.
Check it out!
http://github.com/franciscotufro/environmental_seeder
We use 'attachment_fu' in one of our projects, and I have a problem. There is a class named 'Attachment' that is associated with different models, e.g. StaticPage has_many attachments.
Now I want to add paperclip too, and I've run into a name conflict.
StaticPage.attachments no longer works now that paperclip is installed, and it brings up errors like:
undefined method `quoted_table_name' for Paperclip::Attachment:Class
StaticPage doesn't use paperclip at all; the mere presence of Paperclip causes these errors.
I wonder if there is a way to rename/refactor the 'Attachment' model to something like 'BinaryAttachment' throughout the entire project. Of course a RegEx search-and-replace through the project would work, but I'm not a RegEx specialist, so I'm looking for a simpler way to do this.
Does anyone know about a refactoring tool like XCode has to rename Classes? Or how would you solve this?
It's probably not a great idea to store large objects in the session, but it seems wasteful to run a before_filter lookup on session[:user_id] every time if session[:user] is never altered and is only used to verify things such as the user level. Thanks for the Rcast!
Just a slight comment on the regular expression used in the screencast: /[0-9\.]+/. First, a . is not a special character inside a character class ([…]), so you can drop the backslash. Also, there's a shortcut in regular expressions for the character class [0-9], namely \d, usable inside or outside of a character class. Thus, the expression can be simplified to /[\d.]+/. Just an FYI.
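To check the simplification, here is a small sketch that runs both patterns over a few sample strings. The strings are made up for illustration and are not taken from the screencast.

```ruby
original   = /[0-9\.]+/
simplified = /[\d.]+/

samples = ["$19.99", "Was 1,250.00 now 999", "no digits here", "v2.3.4"]

samples.each do |text|
  same = text.scan(original) == text.scan(simplified)
  puts "#{text.inspect}: #{text.scan(simplified).inspect} (same: #{same})"
end
```

Both patterns match the same runs of digits and dots for these inputs, so the shorter form is a drop-in replacement here.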
Thanks for another great episode!
I've been able to implement the search form on my model's index.html.erb file