#89 Page Caching (revised)

Feb 02, 2012 | 9 minutes | Performance, Caching

Page caching is an efficient way to cache full content to be served by the front-end web server. Learn how to deal with pagination, expiration with sweepers, and user-specific content in this episode.


Download:
source code: Project Files in Zip (91.4 KB)
mp4: Full Size H.264 Video (21.1 MB)
m4v: Smaller H.264 Video (11 MB)
webm: Full Size VP8 Video (13.9 MB)
ogv: Full Size Theora Video (24.5 MB)

Nico Ritsche over 13 years ago

I'm wondering why the cache sweepers are tied to the controller.
Some time ago I was writing about this here: http://www.railstoolkit.com/posts/rails-cache-sweeper-confusion

Francesc Pla over 13 years ago

I agree. I think it would be more practical to be able to attach the sweeper to a model, since you may want to expire the same pages or fragments from several controllers, rake tasks, and so on. I suppose the problem is that the views are tied to the controllers, and only accessible from them rather than from the models, but it still feels a little weird.

Patrick Slevin over 13 years ago

Ran into this exact same issue a few weeks ago, where I had ActiveAdmin updating records but the Sweeper wouldn't observe the model on normal CRUD. I attempted to have the sweeper get called as an observer but I still couldn't figure my way around it, so I fell back to using Rack::Cache (and updated with suggestions from Ryan's screencast earlier in the week).

pctj101 over 13 years ago

From what I can tell, if you do a massive update over all records in a table, you may not want to run the sweeper on every record save. So you only activate the sweeper where you know you need it, for example in a certain controller.

However, if your more typical use is CRUD in ActiveAdmin, well then... perhaps doing a model observer would work (never tried).

Stefan Wintermeyer almost 13 years ago

I agree with your point of solving this in a model instead of a sweeper in the controller. It took me a couple of hours to solve this problem the IMO proper way.

Add the following to your model in a Rails 3.2 app (here the Class is Company):

```ruby
after_create   :expire_cache
after_update   :expire_cache
before_destroy :expire_cache

def expire_cache
  ActionController::Base.expire_page(Rails.application.routes.url_helpers.company_path(self))
  ActionController::Base.expire_page(Rails.application.routes.url_helpers.companies_path)
end
```

krismeister over 12 years ago

Perhaps model changes are logistically too far away from the templates, whereas a controller knows intimately how it needs the views prepared.

Kristian Gerardsson over 13 years ago

Great episode, but not really useful in my current project. I would love an episode about fragment caching instead!

marckohlbrugge over 13 years ago

There's this one, but it's over 4 years old:
http://railscasts.com/episodes/90-fragment-caching

Would love to see a 'revised' version of that one. I'm guessing it's coming up though.

marckohlbrugge over 13 years ago

Our wishes came true. And so soon!

http://railscasts.com/episodes/90-fragment-caching-revised

Thanks Ryan!

Xavier Noria over 13 years ago

Just a comment about the call to rm_rf.

Doing a recursive rm is not atomic and weird things may happen if at the same time you get requests that are generating files again in the same tree.

In order to safely expire an entire directory it is better to move it. Moving a directory is atomic. After moving you can rm_rf safely.
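A minimal sketch of the move-then-delete approach described above, in plain Ruby. The helper name `expire_cache_dir` and the `.expired-…` suffix are made up for illustration, and the sketch assumes the renamed directory stays on the same filesystem so that the rename is a single atomic step:

```ruby
require "fileutils"
require "securerandom"

# Hypothetical helper (not part of Rails): expire a whole cache directory
# by renaming it out of the way first, then deleting the renamed copy.
def expire_cache_dir(dir)
  dir = dir.to_s.chomp("/")
  return false unless File.directory?(dir)

  # A unique sibling name keeps the rename on the same filesystem and
  # avoids collisions with another expiry running at the same time.
  doomed = "#{dir}.expired-#{Process.pid}-#{SecureRandom.hex(4)}"

  # One rename call: requests that regenerate pages after this point
  # write into a fresh tree instead of the one being deleted.
  File.rename(dir, doomed)

  # The slow recursive delete now only touches the renamed copy.
  FileUtils.rm_rf(doomed)
  true
rescue Errno::ENOENT
  # Another process moved or removed the directory first.
  false
end
```

It could be called wherever the whole directory is currently removed, for example `expire_cache_dir(Rails.root.join("public", "products"))`, where the path is only an example.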

Ryan Taylor Long over 13 years ago

I am curious why one may want to use this type of caching over HTTP Caching (with Rack::Cache, for example)...

Xavier Noria over 13 years ago

@Ryan in HTTP caching you can only fix expiration once.

So, if you set a TTL of 30 minutes for a page, you're done. There is no way to tell the users and intermediate proxy caches that fetched it that they need to revalidate right now because it is stale. They won't do it until the 30 minutes have passed.

So, the key of this approach vs HTTP headers is that you have total control over the expiration of the cache.

pctj101 over 13 years ago

What's the best way to limit caching to only one format?

For example, I only want to cache JSON responses.

Alex Aguilar about 13 years ago

small typo in transcription (ASCIIcast)

/app/sweepers/products_sweeper.rb

should be singular

/app/sweepers/product_sweeper.rb

Akash Kamboj about 13 years ago

I really liked the page caching, because page serving is 100 times faster when using nginx compared to fragment caching and action caching. But I have some doubts, and I would really appreciate if you can answer these.

1. I have a large site with a lot of content, such as most read and recent content, on almost every page. If I use page caching, do I delete the whole page cache as shown in this tutorial?
2. If I use the above technique, I guess it will work fine for users, but when Google comes to crawl the site, won't it take too long to read the pages, because they might not be in the cache at that time?
3. If I use delayed jobs to generate the cached pages in the background, would that be a reasonable solution?

Any help will be really appreciated.

Hagen Volpers about 13 years ago

Hi,

no. 2 shouldn't be an issue as long as content is generated via the page itself - the one creating the content will also regenerate the cache :)

Regards,
Hagen

Hagen Volpers about 13 years ago

Hi,

Kind of off-topic, but I'm trying to figure out how to use the route manually without using pagination. Is it the pagination that figures out whether to add the page number as a GET parameter or as a resource?

```ruby
[...]
get "/posts/:id/page/:page", to: 'posts#show'
resources :posts
[...]
```

I wasn't able to figure out how to make the link_to helper use the new route (in this episode it looks like pagination does the "magic"?!).
I don't like hard-wired links, so perhaps someone can help me out. I already tried post_url(@post, page: 2), but that leads to the expected /posts/1?page=2 ...

Regards,
Hagen

Hagen Volpers about 13 years ago

Hi,

I missed the as option to create the helper:

```ruby
get "/posts/:id/page/:page", to: 'posts#show', as: :post_page
resources :posts
```

So calling post_page_path(@post, 2) now generates the link /posts/1/page/2 as expected.

But would be great to know how pagination handles that (or if I missed something else in the cast).

Regards,
Hagen

Michael Cook almost 13 years ago

One thing that bothers me here is the need to change the URL parameters to be part of the path; Google (Webmaster Tools/URL Parameters) won't pick them up when they are.

Is there any way around this so that we can keep them as parameters?

Ryan, I notice that you still have them as parameters. Does that mean you're not caching the Episodes index pages?

Maximilian Tagher over 11 years ago

Just a warning to anyone using page caching for more than HTML pages (e.g. JSON responses): Rails will default to saving the cached file with the HTML extension, so requests hitting the cache will get the wrong format. To get around this, just append the format you're using to your request: curl railsapp.com/method.json. Rails will then save the file with the correct extension.
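The naming behaviour described above can be illustrated with a small plain-Ruby sketch. This is a simplified approximation for illustration, not Rails' actual implementation; `cache_file_for` is a made-up name, and the sketch assumes the cached file name is derived from the request path only, with a default extension added when the last path segment has none:

```ruby
require "uri"

# Simplified illustration only; NOT Rails' actual implementation.
# cache_file_for is a hypothetical name used for this example.
def cache_file_for(url, default_extension = ".html")
  # Only the path is used; the query string is not part of it.
  path = URI.parse(url).path
  name = (path.empty? || path == "/") ? "/index" : path.chomp("/")
  # Add the default extension only when the last segment has no extension.
  name += default_extension unless File.basename(name).include?(".")
  name
end

cache_file_for("/method")          # => "/method.html"
cache_file_for("/method.json")     # => "/method.json"
cache_file_for("/posts/1?page=2")  # => "/posts/1.html"
cache_file_for("/posts/1/page/2")  # => "/posts/1/page/2.html"
```

Under these assumptions, a request without an explicit format lands in an .html file, and two URLs that differ only in their query string map to the same file, which is why the page number is moved into the path when page caching is used.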