Replace Cron with Clockwork

Wed Jun 30 20:18:02 -0700 2010

If your app needs to poll a remote API once an hour, or send out an email report every evening, what tool do you reach for? Probably cron. Triggering events at a given wall clock time is what cron is for, but it works better at the system layer (e.g. rotating logs on a server) than at the app layer (e.g. sending out a daily report to your app’s users). I’ve described all the ways cron could be improved for app clock events in a previous post.

My wishlist for an app-focused cron replacement, described in that post, can be fulfilled with a little hackery using a few available Ruby libraries (rufus-scheduler and resque-scheduler). But both of these libraries have weaknesses, so I decided to write my own, following their example of the lockless, single-process scheduler pattern.

The result is Clockwork.

Using Clockwork

First, the syntax for scheduling events:

every 1.hour, 'apis.poll'
every 1.day,  'reports.email', :at => '00:00'

A time period and a job name are the only required parameters. For daily jobs, the optional :at parameter sets the hour and minute at which the job runs.
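To give a feel for what an 'HH:MM' value like '00:00' means, here is a rough sketch in plain Ruby of how such a string could be parsed and compared against the wall clock. The helper names parse_at and due_at? are hypothetical, made up for this illustration, and this is not how Clockwork itself is implemented.

```ruby
# Hypothetical helper: turn an 'HH:MM' string into [hour, minute].
def parse_at(at)
  match = /\A(\d{1,2}):(\d{2})\z/.match(at)
  raise ArgumentError, "expected 'HH:MM', got #{at.inspect}" unless match

  hour = match[1].to_i
  minute = match[2].to_i
  raise ArgumentError, "time out of range: #{at.inspect}" unless hour < 24 && minute < 60

  [hour, minute]
end

# Hypothetical helper: does `time` fall within the minute named by `at`?
def due_at?(time, at)
  hour, minute = parse_at(at)
  time.hour == hour && time.min == minute
end
```

With these, due_at?(Time.now, '00:00') would be true only during the first minute after midnight, local time.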

The job name is passed to your queueing system to enqueue a job, to be worked in one of your background job workers. (An important part of the lockless scheduler process pattern is that it never does any work itself, only queues up jobs for the workers to handle.) In order to make Clockwork queueing system-agnostic, the second bit of code you need is a small handler block that declares how to enqueue a job.

For example, if you’re using my favorite combo, Beanstalk+Stalker, your handler block will look like this:

require 'stalker'
handler { |job| Stalker.enqueue(job) }

Put these two segments together into a file named clock.rb:

require 'stalker'
handler { |job| Stalker.enqueue(job) }

every 1.hour, 'apis.poll'
every 1.day,  'reports.email', :at => '00:00'
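To make the pattern concrete, here is a toy version of a single-process scheduler in plain Ruby, with no gems required. It is only a sketch of the idea: the ToyClock class, its tick method, and the plain array standing in for a queue are all assumptions invented for this example, not Clockwork's actual internals. It also takes periods as plain seconds rather than 1.hour, and ignores :at.

```ruby
# Toy illustration of the scheduler pattern; NOT Clockwork's implementation.
class ToyClock
  Event = Struct.new(:period, :job, :last_run)

  def initialize
    @events = []
    @handler = nil
  end

  # Declare how a job name gets enqueued.
  def handler(&block)
    @handler = block
  end

  # Register a job to fire every `period` seconds.
  def every(period, job)
    @events << Event.new(period, job, nil)
  end

  # Enqueue every event that is due at `now`. The clock does no other work.
  def tick(now = Time.now)
    @events.each do |event|
      next unless event.last_run.nil? || now - event.last_run >= event.period

      @handler.call(event.job)
      event.last_run = now
    end
  end
end

queue = [] # stand-in for a real queueing system
clock = ToyClock.new
clock.handler { |job| queue << job }
clock.every(3600, 'apis.poll')
clock.every(86_400, 'reports.email')

start = Time.at(0)
clock.tick(start)        # first tick: both events fire
clock.tick(start + 1800) # half an hour later: nothing is due
clock.tick(start + 3600) # an hour later: only the hourly job is due
puts queue.inspect
```

A real clock process would call something like tick in a loop with a short sleep in between. The point to notice is that the handler is the only place the scheduler touches the queueing system.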

Running the Clock Process

To run the clock, install the clockwork gem (gem install clockwork, or specify it in your Gemfile), then start it with the clockwork binary:

$ clockwork clock.rb
[2010-06-28 11:27:42 -0700] Starting clock for 2 events: [ apis.poll reports.email ]

Or with Bundler: bundle exec clockwork clock.rb

More details about the use and operation of Clockwork can be found in the readme.

A Sample Application

To illustrate what Clockwork would look like in a full application, I’ve written a sample app which fetches the Dow Jones index from Google Finance once every three minutes. The clock process enqueues the fetch job. The worker works the job, pulling down the index from the remote API, and storing the result in the database. The web app pulls from the database, showing the user all historic data points.

I wrote the same app with two web framework / database / queue combos, so pick the one that suits your style:

In both cases, the app has three processes: the web process (serving web requests to the user), the clock process (enqueuing jobs periodically), and the worker process (working the job to fetch data from the remote API and store it in the database).

I can’t overemphasize the importance of the clock process being separate from your worker process. The clock is not horizontally scalable (and doesn’t need to be), but your worker processes are fully parallelizable. In a real app, you’d run two, four, ten, or a hundred workers. You will only ever have one clock. The clock process can and must stay lightweight, doing no more than queueing jobs when the appropriate wall clock time is reached.
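The one-clock, many-workers split can be sketched with nothing but Ruby's standard library. This is a simplification under loud assumptions: threads and an in-process Queue stand in for what would really be separate OS processes talking through a queueing system, and the method name run_clock_and_workers and the job name 'index.fetch' are hypothetical.

```ruby
require 'thread' # Queue; already loaded by default on modern Rubies

# Hypothetical demo: one clock thread enqueues `ticks` jobs,
# and `worker_count` worker threads drain them in parallel.
def run_clock_and_workers(worker_count, ticks)
  jobs    = Queue.new # stand-in for a real job queue
  results = Queue.new

  # Workers: start as many as you like, they all pull from the same queue.
  workers = Array.new(worker_count) do |id|
    Thread.new do
      while (job = jobs.pop) != :shutdown
        results << "worker #{id} ran #{job}"
      end
    end
  end

  # The clock: exactly one, and all it does is enqueue job names.
  clock = Thread.new { ticks.times { jobs << 'index.fetch' } }
  clock.join

  # One shutdown marker per worker so every thread exits its loop.
  worker_count.times { jobs << :shutdown }
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

puts run_clock_and_workers(4, 3)
```

Raising the worker count changes how many jobs can be in flight at once. Adding a second clock would only enqueue every job twice.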

Conclusion

Replacing a tried-and-true tool like cron is not something to be undertaken lightly. However, after years of dissatisfaction with cron as a tool for app-level scheduling, I truly believe it’s time to try something different. I’ve been using Clockwork in a number of my own personal and work apps, and I’ve been very pleased with the results so far. Give it a try and tell me what you think.