Django HTTP headers: Controlling caching on cn.ubuntu.com

Also published on design.canonical.com.

It’s becoming more and more important for websites to carefully consider how their resources are cached in users’ browsers. Get the caching wrong, and you either end up with a woefully slow experience for the user, or a very strange looking website as users are left with stale CSS files and images.

Or often both.

For our China site, we’ve decided that the HTML pages should be cached for 5 minutes, and that the CSS and JavaScript can be cached for a year, because we change the URL every time we update them.
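To illustrate why changing the URL makes year-long caching safe, here is a minimal sketch of one way to version a static URL. It assumes the version is a short hash of the file contents appended as a `?v=` parameter; the `versioned_url` helper is hypothetical and not necessarily how cn.ubuntu.com generates its URLs.

```python
import hashlib


def versioned_url(path, content, length=7):
    """Append a short hash of `content` to `path` as a ?v= parameter.

    Hypothetical helper: any change to the file contents produces a
    different URL, so a cached copy of the old URL is never reused.
    """
    digest = hashlib.sha1(content).hexdigest()[:length]
    return "{}?v={}".format(path, digest)


# Two versions of the same stylesheet get two different URLs
old_url = versioned_url("/static/css/styles.css", b"body { color: black; }")
new_url = versioned_url("/static/css/styles.css", b"body { color: navy; }")
```

Because the browser treats `old_url` and `new_url` as separate resources, an updated file is fetched straight away even though the old one is still cached.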

Caching headers in Django

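Telling the browser how long to cache a resource is done with one of two headers: Cache-Control, which carries a max-age in seconds, and the older Expires, which carries an absolute expiry date.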
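As a rough illustration of what the Cache-Control and Expires headers contain, the sketch below builds both for a given lifetime using only the standard library. The `caching_headers` function is my own name for illustration, not a Django API.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def caching_headers(max_age, now=None):
    """Return Cache-Control and Expires values for `max_age` seconds."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age)
    return {
        # Relative lifetime, in seconds
        "Cache-Control": "max-age={}".format(max_age),
        # Absolute expiry time as an HTTP date (always GMT)
        "Expires": format_datetime(expires, usegmt=True),
    }


# A page generated at 22:48:35 GMT and cached for 5 minutes
generated_at = datetime(2016, 2, 12, 22, 48, 35, tzinfo=timezone.utc)
example = caching_headers(300, now=generated_at)
```

Both headers describe the same 5-minute window: one as a duration, the other as a point in time.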

Controlling these headers in Django is less simple than you might think. If you’re happy to use the cache framework then it will take care of these headers for you, but as we have a separate Squid cache in front of our application, that was a more heavyweight solution than we needed.

Modifying HTML responses using View classes

In our case, all of our HTML pages are served with an extended version of the TemplateView class:

from django.views.generic.base import TemplateView

class OurTemplateView(TemplateView):
    # Set up our custom template data
    pass

To add headers, we need to modify the HttpResponse, which we can intercept by overriding the render_to_response method.

Django also provides patch_response_headers, a handy helper function that generates our caching headers and attaches them to the response:

from django.utils.cache import patch_response_headers
from django.views.generic.base import TemplateView


class OurTemplateView(TemplateView):
    def render_to_response(self, context, **response_kwargs):
        # Get response from parent TemplateView class
        response = super(OurTemplateView, self).render_to_response(
            context, **response_kwargs
        )

        # Add Cache-Control and Expires headers (300 seconds = 5 minutes)
        patch_response_headers(response, cache_timeout=300)

        # Return response
        return response

And now we can see our extra caching headers in the HTTP response:

$ curl -I cn.ubuntu.com
...
Date: Fri, 12 Feb 2016 22:48:38 GMT
Expires: Fri, 12 Feb 2016 22:53:35 GMT
Last-Modified: Fri, 12 Feb 2016 22:48:35 GMT
Cache-Control: max-age=300

Browsers and proxies will now cache the HTML pages for 5 minutes.
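To make the 5-minute window concrete, here is a simplified sketch of how a cache might decide whether a stored response is still fresh. It assumes freshness is just Date plus max-age; real browsers and proxies follow the fuller HTTP caching rules (the Age header, other Cache-Control directives and so on), and `is_fresh` is a hypothetical helper.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers, now):
    """Simplified check: fresh while now < Date + max-age."""
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match is None:
        # No max-age directive: treat the response as not cacheable here
        return False
    served_at = parsedate_to_datetime(headers["Date"])
    return now < served_at + timedelta(seconds=int(match.group(1)))


# Header values taken from the curl output above
response_headers = {
    "Date": "Fri, 12 Feb 2016 22:48:38 GMT",
    "Cache-Control": "max-age=300",
}
after_two_minutes = datetime(2016, 2, 12, 22, 50, 38, tzinfo=timezone.utc)
after_ten_minutes = datetime(2016, 2, 12, 22, 58, 38, tzinfo=timezone.utc)

fresh_early = is_fresh(response_headers, after_two_minutes)  # True
fresh_late = is_fresh(response_headers, after_ten_minutes)   # False
```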

Controlling caching for static files

Django recommends serving static files separately from the rest of your application.

However, for simplicity and dev-prod parity we’ve been using DJ-Static to serve static files with the Django WSGI app, as introduced by Kenneth Reitz. This was also, at the time we implemented it, the method recommended by Heroku for managing static files in Django.

As it turns out, though, DJ-Static doesn’t offer any control over caching headers, and Heroku now recommends using WhiteNoise instead.

Serving static files with WhiteNoise is pretty simple (as it was with DJ-Static):

# myapp/settings.py
STATIC_ROOT = 'static'
STATIC_URL = '/static/'

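# myapp/wsgi.py
# (DjangoWhiteNoise was removed in WhiteNoise 4.0, which uses
# whitenoise.middleware.WhiteNoiseMiddleware in settings instead)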
from django.core.wsgi import get_wsgi_application
from whitenoise.django import DjangoWhiteNoise

application = DjangoWhiteNoise(get_wsgi_application())

WhiteNoise will add a Cache-Control header, although it doesn’t support setting the older Expires header. By default, it sets max-age to 0, which means no caching:

$ curl -I localhost:8000/static/css/styles.css?v=d5d2934
...
Cache-Control: public, max-age=0

We wanted our static files to be cached for a year, so we set the WHITENOISE_MAX_AGE setting in settings.py:

# myapp/settings.py
WHITENOISE_MAX_AGE = 31557600

This will set the max-age in the Cache-Control header to achieve the browser caching we’re looking for:

$ curl -I http://cn.ubuntu.com/static/css/styles.css?v=d5d2934
...
Cache-Control: public, max-age=31557600
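The value 31557600 is 365.25 days expressed in seconds. A quick check of the arithmetic, with constant names that are mine rather than anything from WhiteNoise or Django:

```python
SECONDS_PER_DAY = 24 * 60 * 60

# HTML pages: 5 minutes
FIVE_MINUTES = 5 * 60

# Static files: one year, counted as 365.25 days to allow for leap years
ONE_YEAR = int(365.25 * SECONDS_PER_DAY)
```

Spelling the value out like this in settings.py would arguably be easier to read than the bare number.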

Now we have control

Browser caching is an invaluable tool for performance, so it helps to understand how to control the user’s cache from Django.

Hopefully I’ve demonstrated some ways that this can be achieved, which we’ve just implemented on cn.ubuntu.com.

By @nottrobin