#168 Feed Parsing
Jun 29, 2009 | 10 minutes | Active Record, Plugins
Learn two different techniques for parsing an RSS feed using Feedzirra in this episode!
I've been using Feedzirra for some time now, and it's been really great. But long or complex URLs on Ruby Enterprise Edition have caused problems recently.
Any idea how to combine multiple feeds into one master feed?
Look into Yahoo Pipes
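If you would rather stay in Ruby than rely on an external service, one option is to merge the entries of each parsed feed yourself and sort them by publication date. This is only a minimal sketch: the `Entry` struct stands in for parsed feed entries, and `merge_feeds` is a hypothetical helper, not part of Feedzirra.

```ruby
# Stand-in for a parsed feed entry; hypothetical, for illustration only.
Entry = Struct.new(:id, :title, :published)

# Merge entries from several feeds into one list, newest first,
# dropping duplicates that share the same id.
def merge_feeds(*entry_lists)
  entry_lists.flatten
             .uniq { |entry| entry.id }
             .sort_by { |entry| entry.published }
             .reverse
end

feed_a = [Entry.new("a1", "First post",  Time.utc(2009, 6, 1)),
          Entry.new("a2", "Third post",  Time.utc(2009, 6, 20))]
feed_b = [Entry.new("b1", "Second post", Time.utc(2009, 6, 10)),
          Entry.new("a1", "First post",  Time.utc(2009, 6, 1))] # duplicate of a1

master = merge_feeds(feed_a, feed_b)
puts master.map(&:title).inspect
```

In a real app the entry lists would come from the `entries` of each fetched feed rather than from hand-built structs.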
It is starting to become kinda creepy … it's as if you were reading my mind. Always coming with stuff I need just in time!
Love your screencasts!
Thank you Ryan!
That's exactly what I needed!
So far I've used Nokogiri. Is Feedzirra's XML processing performance really better than Nokogiri's and Hpricot's?
I second the notion that Ryan has ESP. His screencasts always seem to be right on target with what I am working on. We joke that he has bugged our office space :D.
Thanks Ryan another great episode.
Great! Thanks for this Ryan
ps - I think feedzirra is supposed to rhyme with Godzilla... it's a joke
Anybody dealing with RSS parsing for a lot of feeds who doesn't want the hassle of polling should check out http://superfeedr.com
Let me know what you think! (You included Ryan ;))
This is awesome. I had to do this a couple of weeks ago. I had the same moving parts as you, but not as clear and concise! Thanks for showing me the right way to do this!
I agree with Nils. When I first saw the title I had deja vu!
I'm getting errors installing it on a Windows box because of curl. Is there a way to install curl on windows?
@ryan, what problems were you having with updating? I can't seem to get it to work either. It fetches and parses fine. Is anyone else getting this?
I found that this plugin works very well for any XML, but there is no documentation for that use. Just do the following:
- freeze the feedzirra gem
- in the parser directory clone the rss and rss_entry files, rename them and change the mappings to match your XML structure
- locate feedzirra.rb and add references to your new parser files
- in feed.rb update @feed_classes array with your new parser
In general you should be able to use the example files as reference. Also make sure to update the specs.
Is there a way to install curl on Windows? Does Feedzirra work on Windows or not?
Can you give guidelines for how you would unit-test this code?
For example, can you use a local file URL to ensure it properly parses and stuffs in the database various test data?
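One way to approach this is to keep the "store the entries" step separate from the network fetch, so a test can feed it fixture data directly. The sketch below is an assumption about how that could look, not the episode's code: `FakeEntry` and `EntryStore` are hypothetical stand-ins for a parsed entry and the database-backed model.

```ruby
# Stand-ins for illustration: FakeEntry mimics a parsed feed entry and
# EntryStore mimics the database-backed model. Both are hypothetical.
FakeEntry = Struct.new(:id, :title, :url)

class EntryStore
  attr_reader :rows

  def initialize
    @rows = {}
  end

  # Store each entry once, keyed by its guid.
  def add_entries(entries)
    entries.each do |entry|
      next if @rows.key?(entry.id)
      @rows[entry.id] = { name: entry.title, url: entry.url }
    end
  end
end

store = EntryStore.new
fixture = [FakeEntry.new("guid-1", "Hello", "http://example.com/1"),
           FakeEntry.new("guid-2", "World", "http://example.com/2")]

store.add_entries(fixture)
store.add_entries(fixture) # importing the same data twice adds nothing

puts store.rows.size
```

In the real app the fixture entries could come from a saved XML file instead of structs, assuming the gem's `Feedzirra::Feed.parse` accepts an XML string as I believe it does.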
Feedzirra needs information from the stored feed entries to get the update working. You can read more on this here: http://groups.google.com/group/feedzirra/browse_thread/thread/6eb16d9a6d4d168e#
Does anyone have a code sample for a daemonized version of this? I'm not quite sure how to get the initial statement to run only once within daemon_generator's while clause.
I used the code as is, but it will never terminate since it will never complete the while($running) loop because of the update loop.
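A possible way around this is to do the initial fetch once before the daemon's loop and then perform a single update per pass, instead of nesting an endless update loop inside it. The sketch below injects the fetch, update, and "still running" checks as callables so it runs without Feedzirra or a daemon library; `run_feed_loop` and its parameter names are hypothetical.

```ruby
# fetch, update and keep_running are injected callables so the sketch runs
# without Feedzirra or a daemon library; all three names are hypothetical.
def run_feed_loop(fetch:, update:, keep_running:, delay: 0)
  feed = fetch.call            # initial fetch happens exactly once
  while keep_running.call      # re-checked on every pass, so the loop can stop
    sleep delay
    feed = update.call(feed)   # one update per pass, no inner endless loop
  end
  feed
end

fetches = 0
updates = 0
passes_left = 3

run_feed_loop(
  fetch:        -> { fetches += 1; :feed },
  update:       ->(feed) { updates += 1; feed },
  keep_running: -> { (passes_left -= 1) >= 0 }
)

puts "fetches=#{fetches} updates=#{updates}"
```

In a daemon script, `keep_running` would presumably be something like `-> { $running }`, with the fetch and update callables wrapping the corresponding Feedzirra calls.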
You might want to look at vtd-xml, the latest and most advanced xml processing model, far better than DOM or SAX
Might be a bit late to post here, but I can't get this to work at all...
I followed step by step but get this error:
NoMethodError: undefined method `update_from_feed' for #<Class:0x103804fb8>
I don't really see where I am going wrong, as I have that method defined in the model ...
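Without seeing the model this is only a guess, but one common cause of this error is defining the method without the `self.` prefix, which makes it an instance method rather than a class method. A minimal sketch of the difference, using a plain Ruby class rather than a real ActiveRecord model:

```ruby
class FeedEntry
  # Instance method: callable only on FeedEntry.new, not on FeedEntry itself.
  def update_from_feed_instance(url)
    "instance: #{url}"
  end

  # Class method: the `self.` prefix is what makes FeedEntry.update_from_feed work.
  def self.update_from_feed(url)
    "class: #{url}"
  end
end

puts FeedEntry.respond_to?(:update_from_feed)
puts FeedEntry.respond_to?(:update_from_feed_instance)
```

If the method is already defined with `self.`, restarting the console so the model is reloaded is another thing worth trying.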
I'm a bit of a newbie to all this, and after a few hours of experimenting today I couldn't for the life of me work out how to run Feedzirra from cron with Whenever. Any ideas?
How to parse custom feed attributes? There is a way to do that for entries using
Feedzirra::Feed.add_common_feed_entry_element("wfw:commentRss", :as => :comment_rss)
But I want the same feature for the feed object, not just its entries. Is there a way to do that? Like say
Feedzirra::Feed.add_common_feed_element("geo:lat", :as => :latitudes)
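I am not certain Feedzirra offers a feed-level counterpart to the entry helper. A fallback that does not depend on the gem is to read the element straight from the feed XML with Ruby's REXML. This swaps in a different technique from the one asked about, and the sample XML below is made up for illustration.

```ruby
require 'rexml/document'

# Made-up sample feed with a feed-level geo:lat element.
xml = <<~XML
  <rss version="2.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
    <channel>
      <title>Example feed</title>
      <geo:lat>51.5</geo:lat>
    </channel>
  </rss>
XML

# Prefix-to-namespace mapping used by the XPath query.
GEO_NS = { "geo" => "http://www.w3.org/2003/01/geo/wgs84_pos#" }

doc = REXML::Document.new(xml)
lat_node = REXML::XPath.first(doc, "/rss/channel/geo:lat", GEO_NS)
latitude = lat_node && lat_node.text.to_f

puts latitude
```

The cost is parsing the XML a second time, so this is better suited to occasional lookups than to high-volume feed processing.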
I have been trying to follow along with this example in a Rails 3.0.8 site I'm currently working on. I successfully got the feedzirra gem installed and am able to manipulate feeds in the Rails console. But I can't seem to get the view to render the information stored in the DB.
I'm new to Rails, so this may not be any fault of the instructions provided here. In fact, from the console an attempt at a mock view works as expected:
runs the db query and delivers the list of entry names. I have run
from the console and fired up WEBrick, but everything I've tried in the view results in the same sorry condition: nothing to see. Am I missing something about how WEBrick is working? How the db is called? Any insights will be appreciated.
Very frustrated... I want to use this, and it is now Feedjira. These issues are all over the internet with no solutions. Please help, thanks.
$ gem install feedjira
Temporarily enhancing PATH to include DevKit...
Building native extensions. This could take a while...
ERROR: Error installing feedjira:
ERROR: Failed to build gem native extension.
checking for curl-config... no
checking for main() in -lcurl... no
*** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of necessary
libraries and/or headers. Check the mkmf.log file for more details. You may
need configuration options.
Provided configuration options:
extconf.rb:18:in `': Can't find libcurl or curl/curl.h (RuntimeError)
Try passing --with-curl-dir or --with-curl-lib and --with-curl-include
options to extconf.
extconf failed, exit code 1
Gem files will remain installed in c:/RailsInstaller/Ruby2.0.0/lib/ruby/gems/2.0.0/gems/curb-0.8.6 for inspection.
Results logged to c:/RailsInstaller/Ruby2.0.0/lib/ruby/gems/2.0.0/extensions/x86-mingw32/2.0.0/curb-0.8.6/gem_make.
You need to install curl and make sure curl.h is available to your compiler path.
Generally, your Rails application is probably not the best place to consume RSS feeds. Most of it needs to happen asynchronously to prevent your app from becoming slow. It's also a mess to deal with multiple formats and breaking feeds.
For this reason, APIs like Superfeedr exist and we recently introduced a Rails Engine which makes all this very simple and elegant.
Does anyone know of any updated places to learn about Feedjira? I am running into errors, but can't find a resource to learn about how to set it up correctly in 2015.
Very beginner question: you enter the URL through the Ruby console. If you auto-update it every 15 minutes, don't you need to store the URL somewhere in a variable?
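Yes, the URL has to live somewhere once you are no longer typing it in by hand. A continuously running method can simply hold it as an argument, but a scheduled job needs it in code, configuration, or the database. A minimal sketch, assuming a constant list of URLs and a hypothetical `update_all` helper that takes the per-feed update step as a callable:

```ruby
# Feed URLs kept in one place; the URLs here are placeholders.
FEED_URLS = [
  "http://example.com/a.xml",
  "http://example.com/b.xml"
].freeze

# Run the given updater once for every stored URL and collect the results.
def update_all(urls, updater)
  urls.map { |url| updater.call(url) }
end

# The lambda stands in for the real per-feed update call.
results = update_all(FEED_URLS, ->(url) { "updated #{url}" })
puts results
```

With more than a handful of feeds, a database table of URLs would probably be easier to maintain than a constant.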