Building a Queue-Backed Feed Reader, Part 2

Wed Apr 15 23:02:46 -0700 2009

In Part 1, we built QFeedReader, a simple and scalable web-based RSS reader backed by DJ.

But using the app as it stood after step 2, you’ll find that the user experience (or UX, as the cool kids call it) could use some improvement. Since the work is queued in the background, the user has to hit their browser’s refresh button to find out whether the feed they asked to have updated has been fetched yet. This is ok, but we can do better.

Step 3: UX Improvements with Ajax Polling

One technique to handle this is the one used by Campfire, Github, and countless other sites: send a periodic ajax poll to ask whether updated information is available yet. When that information becomes available, render it.

We’ll use this same technique in QFeedReader - and even borrow the Github spinner (yoink!).

In the third and final step of the app, clicking any of the individual refresh links sends a refresh request, then displays a spinner and begins polling for updates. Clicking the “refresh all” link is particularly illustrative: each spinner begins polling individually, and the results are entered into the page as they come in. How quickly they complete will be based on the speed of the feeds being fetched, the number of feeds, and the number of workers (that’s rake jobs:work) that you’re running.

If you don’t already have a checkout of the code, grab one and follow along:

git clone git://github.com/adamwiggins/qfeedreader.git

Switch to the master branch if you had a previous checkout:

cd qfeedreader
git checkout master

Digging In

There are three parts to the ajax polling process: the view, which uses HTML attributes; the javascript, which uses Prototype for Ajax polling; and the controller, which uses an HTTP conditional GET header (If-Modified-Since) to find out when the fetch job is complete.

The view, app/views/feeds/_feed.html.erb, creates a link to the refresh function like this:

link_to_function "refresh", "refresh_feed(this)",
                 :feed_id => feed.id, :last_modified => feed.updated_at.httpdate

This is a standard javascript function link, but check out the second line. We pass two custom HTML attributes: the ActiveRecord ID of the feed, and the timestamp at which the feed was last updated as of the time the page was rendered. These attributes will be available in the DOM, to be accessed by the javascript.

The link shown above calls a javascript function from public/javascripts/application.js:

function refresh_feed(link) {
  new Ajax.Request('/feeds/' + link.getAttribute('feed_id') + '/refresh', { method: 'post' })
  spin_and_wait(link)
}

It sends an HTTP request to ask that the feed be refreshed, just like clicking the link in our non-ajax version. (Server-side, this throws a fetch request onto the DJ queue.) Afterward, it calls spin_and_wait(), to display the spinner and poll for the results.

function spin_and_wait(link) {
  link.addClassName('refreshing')
  poll_for_update(link.getAttribute('feed_id'), link.getAttribute('last_modified'), link)
}

The polling loop happens via tail recursion, calling itself again every time it gets a “not yet” response from the server. The server decides whether content is ready or not via an If-Modified-Since header sent by the ajax request. If-Modified-Since is a validator for conditional GET. This header tells the server: don’t send me new content unless something has changed since the version of the content that I already have.

function poll_for_update(feed_id, last_modified, link) {
  setTimeout(function() {
    new Ajax.Request('/feeds/' + feed_id, {
      method: 'get',
      requestHeaders: { 'If-Modified-Since': last_modified },
      onComplete: ...
    }) },
    1000
  )
}

Imagine that you want to buy the latest edition of a newspaper. Once an hour you go to the newsstand and say, “I’ve got the paper for Monday. Is there a new one yet?” The newsstand operator either says “Nope, not yet - check back in an hour”; or, “Yeah, Tuesday’s edition is out, here it is.” In this analogy, the newsstand is your Rails server, the newspaper buyer (you) is the web browser, and If-Modified-Since is “I’ve got the paper for Monday.”

* A "newspaper" is an old-fashioned type of blog. Ask your parents about it.

This call hits the feeds controller, here:

def show
  @feed = Feed.find(params[:id])

  if stale?(:last_modified => @feed.updated_at)
    render :partial => 'feed', :locals => { :feed => @feed }
  else
    response['Cache-Control'] = 'public, max-age=1'
  end
end

stale? is an ActionController convenience method (pointed out to me by Ryan Tomayko). It compares the If-Modified-Since header against the data we provide it, in this case the updated_at timestamp of the feed record. If the feed has been updated since the time in that header (in other words, the browser’s copy is stale), stale? returns true, and we render the feed partial and return a 200 - all done.

If the feed hasn’t changed, Rails sends a 304 Not Modified status code, with no body. We additionally include a Cache-Control header, which is a message to the client saying “There’s no data now, and there won’t be for at least another second. But you can check back with me after that.”
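To make the comparison concrete, here’s a rough sketch of the decision a conditional GET boils down to, written in javascript so it can be run on its own. This is not how Rails implements stale? - conditionalGetStatus and its arguments are names invented for illustration, and it only handles the If-Modified-Since case.

```javascript
// Hypothetical helper (not part of Rails or QFeedReader): pick the status
// code for a conditional GET.
//   ifModifiedSince - raw header value sent by the client, or undefined
//   updatedAt       - Date the record was last changed
function conditionalGetStatus(ifModifiedSince, updatedAt) {
  if (!ifModifiedSince) return 200;           // plain GET: always send content

  var since = Date.parse(ifModifiedSince);
  if (isNaN(since)) return 200;               // unparseable header: ignore it

  // HTTP dates only have one-second resolution, so drop the milliseconds
  // before comparing.
  var lastModified = Math.floor(updatedAt.getTime() / 1000) * 1000;

  // Changed after the client's copy: 200 with content. Otherwise: 304.
  return lastModified > since ? 200 : 304;
}

// The client echoes back the timestamp it was given when the page rendered.
var updatedAt = new Date(Date.UTC(2009, 3, 16, 6, 2, 46));
var header = updatedAt.toUTCString();         // an HTTP-date string

// Nothing has changed since the page was rendered.
console.log(conditionalGetStatus(header, updatedAt));
// The record was touched five seconds later.
console.log(conditionalGetStatus(header, new Date(updatedAt.getTime() + 5000)));
```

Whichever branch is taken, all that reaches the browser is a status code, plus the body for a 200 or the Cache-Control header for a 304.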

This is interpreted on the client-side by this portion of the javascript code:

      onComplete: function(transport) {
        if (transport.status == 304) {
          poll_for_update(feed_id, last_modified, link);
        } else if (transport.status == 200) {
          $('feed_' + feed_id).innerHTML = transport.responseText
        } else {
          link.innerHTML = 'error'
        }
      }

If the server returns a 304, it will queue up another poll request to be sent in 1000ms. But if it returns a 200, that means we have new content, that the body of the response will be that content, and that we can stop our tail-recursion loop. The results are rendered into the page using good ol’ innerHTML replacement, which will include the updated timestamp, and a refresh link sans spinner.
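If you’d like to poke at the loop without a browser or a Rails server, here’s a minimal sketch of the same idea with the transport and the timer passed in as arguments. pollUntilModified, sendRequest, schedule, and the fake server below are all hypothetical stand-ins (for Ajax.Request and setTimeout), not code from QFeedReader.

```javascript
// Sketch of the polling loop with its dependencies injected.
//   sendRequest(headers, callback) - stand-in for Ajax.Request; calls
//                                    callback(status, body) when done
//   schedule(fn, delayMs)          - stand-in for setTimeout
function pollUntilModified(sendRequest, schedule, lastModified, handlers) {
  schedule(function () {
    sendRequest({ 'If-Modified-Since': lastModified }, function (status, body) {
      if (status === 304) {
        // "Not yet": queue up another poll.
        pollUntilModified(sendRequest, schedule, lastModified, handlers);
      } else if (status === 200) {
        handlers.onUpdate(body);    // new content arrived, stop polling
      } else {
        handlers.onError(status);   // anything else also ends the loop
      }
    });
  }, 1000);
}

// A fake server that answers 304 twice, then 200 with a fragment.
var responses = [[304, ''], [304, ''], [200, '<li>new entry</li>']];
var polls = 0;
var result = null;

function fakeRequest(headers, callback) {
  polls += 1;
  var next = responses.shift();
  callback(next[0], next[1]);
}

// Runs the callback right away so the demo doesn't wait a second per poll.
function runImmediately(fn, delayMs) { fn(); }

pollUntilModified(fakeRequest, runImmediately, 'Thu, 16 Apr 2009 06:02:46 GMT', {
  onUpdate: function (body) { result = body; },
  onError: function (status) { result = 'error ' + status; }
});

console.log(polls, result);
```

In the browser you’d pass setTimeout as the scheduler, so each new poll is started from a timer callback rather than from inside the previous call, and the call stack doesn’t grow no matter how many 304s come back.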

Watching the Action with Firebug

A great way to see how this works is to watch the Firebug console (making sure you have “Show XmlHttpRequests” under the Options menu checked). Clicking a refresh link on one of the feeds produces output like this:

The steps shown in this screenshot are:

POST /feeds/1/refresh puts the fetch job onto the queue and returns immediately.
GET /feeds/1 includes the If-Modified-Since header. Since the job hasn’t finished executing yet, it returns a 304 Not Modified with an empty body.
By the second call to GET /feeds/1, the DJ rake jobs:work process has finished fetching the feed. A 200 is returned, and if you click the triangle just left of this call, you’ll see that the body contains the new HTML fragment for the feed.

I encourage you to try it out for yourself - it’s much more instructive to watch the results come in in real time. Your results may vary: a quick fetch might return a 200 on the very first call, whereas a slower one (perhaps because of a slow feed server, or an overloaded worker process) might take three, four, or five requests before it gets the results.

Conclusion

A bit of ajax can take our scalable, queue-based app and turn it into something with a slick user experience. You don’t have to sacrifice scalability for usability - have your cake and eat it too. But it does require a little more work than the queue-only solution from step 2.

If you want to play with the big boys, you need to know how to use queueing. What apps do you have today that you could port to using queues?

a tornado of razorblades