Ruby Test Framework Roundup and Musings

bdd

Wed Jan 16 13:44:00 -0800 2008

Last week’s icanhasruby had a series of presentations themed around test setups. The main lesson I took away from this is that a single best-practice solution to test/behavior-driven development has not yet been found. But I get the sense that the community is zeroing in on some core concepts that may one day be as ubiquitous as MVC or the HTTP request/response cycle. Even more interesting is that this seems to be happening in a completely decentralized way. I’m not sure where the Rails core team stands, but, given that they are continuing to put work behind Test::Unit (which, as near as I can tell, has been unmaintained since 2003), they don’t seem to be participating much in this quiet BDD revolution. But part of the beauty of Rails and Ruby is that they don’t need to.

Some Frameworks

RSpec pioneered BDD in the land of Ruby, and remains both the most mature option and the one to beat. (That’s why it’s available by default for Heroku apps.) Most people like the plain-English descriptions of individual specs. But many of those same people dislike the magic-heavy syntax of the DSL. user.should have(1).apps seems nifty at first, but once the novelty wears off, you might find yourself pining for the days of assert_equal 1, user.apps.size.

I like the idea of a rich selection of matchers, but I find that I just can’t seem to remember them. I’ll say this for the assert / Test::Unit approach: once I had written two or three tests with it, I never looked at the docs again. I’ve been using RSpec on and off for close to a year now, and I still have to look up matcher syntax with surprising frequency.

There are some benefits to the matcher syntax beyond reading more like English, however: specifying your desired results in this format gives the test framework more information about what went wrong, which means it can give clearer output. Generally, I find that when a spec breaks, I’m much more likely to be able to tell what went wrong from the error than I am from an assert failure. When an assert fails, I generally ignore the results and just go to the line number of the failure. From there I try to figure out what might have been wrong. RSpec’s clearer messages mean that I’m more likely to make a diagnosis from the test output itself, which strikes me as a lot more agile.
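To make the point about failure output concrete, here is a minimal plain-Ruby sketch. It is not RSpec's implementation: `bare_assert`, `HaveSize`, `have_size`, and `expect_that` are hypothetical names invented for this illustration. It only shows that when the expected result is handed over as an object rather than a bare boolean, the failure message can say what was expected and what was actually found.

```ruby
# Hypothetical names throughout; this is an illustration, not RSpec's API.

# A bare boolean assertion: on failure it knows only that something was false.
def bare_assert(condition)
  raise "Assertion failed" unless condition
end

# A tiny matcher object: it keeps the expected value and can describe a mismatch.
class HaveSize
  def initialize(expected)
    @expected = expected
  end

  def matches?(collection)
    @actual = collection.size
    @actual == @expected
  end

  def failure_message
    "expected collection of size #{@expected}, got size #{@actual}"
  end
end

def have_size(expected)
  HaveSize.new(expected)
end

# Runs the matcher and raises with the matcher's own description on failure.
def expect_that(value, matcher)
  raise matcher.failure_message unless matcher.matches?(value)
  true
end

apps = ["blog", "wiki"]

begin
  bare_assert(apps.size == 1)
rescue RuntimeError => e
  puts e.message   # prints "Assertion failed"
end

begin
  expect_that(apps, have_size(1))
rescue RuntimeError => e
  puts e.message   # prints "expected collection of size 1, got size 2"
end
```

The first message sends you to the line number; the second often lets you diagnose the problem without opening the file.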

If you do prefer asserts, there’s the relative newcomer Shoulda. It offers contexts and plain-English descriptions, but sticks with good ol’ asserts for specifying expected results. It seems to be well-supported and gaining traction quickly.

There’s also test-spec, which provides a compatibility layer between RSpec and Test::Unit. You can use this to mix Test::Unit tests with context-wrapped, plain-English specs, as well as a simple should-style DSL. Personally I like to avoid mixing different coding styles, but this might work well for transitioning a large and complex battery of tests over an extended period of time.

Browser-Side Testing

One of the most interesting presentations was JSpec, an RSpec-alike for JavaScript. One can hardly even call this a framework, since it’s just a single 100-line JavaScript file which sends its output to the Firebug console; but often, the best tools are the simplest ones. I liked what I saw here quite a bit:

jspec.describe("Math", function() {
  it("calculate square roots", function() {
    (Math.sqrt(4)).should("==", 2)
  })
})

How about full-stack integration testing? There’s Selenium, which is about as full-stack as you can get: the tests run in Firefox, clicking links and checking rendered results based on recorded scripts. That’s great, and you can even launch it from rake, but it’s so heavy-weight that I tend to shy away from it.

An intermediate solution is Webrat. Using a Mechanize-style scripting language, you can specify a full user story, as played out in the browser. For example:

def test_sign_up
  visits "/"
  clicks_link "Sign up"
  fills_in "Email", :with => "good@example.com"
  select "Free account"
  clicks_button "Register"
end

The only thing this won’t test is your JavaScript, which may be significant if your site is Ajax-heavy.

Sample Data

Mocks and stubs have their own area of theoretical debate. There’s the question of the best library - for example, RSpec's built-in mocks versus Mocha. But there’s also the question of when to use mocks and stubs versus building up real object trees and letting them behave normally. Too little mocking and stubbing means you end up with every single spec being an integration test. Too much, though, and you’re not testing the real behavior of your code, and you’re creating a lot of overhead in maintaining the mocks.
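Here is a rough sketch of that trade-off, using a hand-rolled stand-in rather than any mocking library (so none of this is RSpec's or Mocha's API). `Signup`, `FakeMailer`, and `deliver_welcome` are hypothetical names made up for the example.

```ruby
# Hypothetical classes for illustration; no mocking library is used here.

# The code under test: registers a user and sends a welcome email
# through whatever mailer object it is given.
class Signup
  def initialize(mailer)
    @mailer = mailer
  end

  def register(email)
    return false unless email.include?("@")
    @mailer.deliver_welcome(email)
    true
  end
end

# A hand-rolled stand-in for the real mailer. It records calls instead of
# sending mail, so the spec exercises Signup alone.
class FakeMailer
  attr_reader :delivered

  def initialize
    @delivered = []
  end

  def deliver_welcome(email)
    @delivered << email
  end
end

mailer = FakeMailer.new
signup = Signup.new(mailer)

signup.register("good@example.com")
signup.register("not-an-email")

p mailer.delivered   # prints ["good@example.com"]
```

The spec is fast and isolated, but if the real mailer ever renames deliver_welcome, this still passes while production breaks, and FakeMailer becomes one more thing to keep in sync. Swap in the real mailer and you have the opposite problem: the spec is now an integration test.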

That brings us into the realm of fixtures, which have historically been a significant point of pain for Rails developers. I was in the midst of some serious fixture woes when I attended the fixture scenarios talk at RailsConf last year, and it convinced me that this was a good way to go. However, this component doesn’t seem to have taken off in popularity as expected. I assume this is because fixtures are something that people seem to want to avoid in general. When to use fixtures vs mocks vs stubs vs just building the object manually in the spec setup is not well-defined in my brain at all, and I suspect I’m not the only one who has this problem.

And that highlights an important fact of this whole exploration of the BDD space that’s currently taking place. The problem is not really a technical one; it’s about methodology. Rails showed us how to encode a methodology into a framework. Now Rubyists are trying to do the same thing with BDD. We’ll keep trying these frameworks on for size until we find one that feels right for the most common scenarios of application development.

Summary

Most of the points being debated here reflect the central question of BDD: rigidity. You want your app to have some rigidity, so that when a developer makes any sort of significant change to the implementation or the technical design without updating the specs, running the test battery fails loudly. This prevents things from changing unintentionally or through unintended side-effects.

On the other hand, too much rigidity is the very antithesis of agility. If doing something simple like renaming a field means a developer has to update not only the database schema and the code, but also the specs, the fixtures, the mock objects… well, that developer might be disinclined to make the change at all. Codebases need to be supple enough that developers are never demotivated from making worthwhile changes.

As I warned in the beginning, BDD/TDD in Rails is nowhere near a resolved question. I hate to be a two-handed professor, so let me summarize with some simplified recommendations by situation.

  • If you're new to testing and/or just overwhelmed and confused by the amount of activity in this area, RSpec is probably your best bet. Install the two plugins (rspec and rspec_on_rails), run the rspec generator, and then generate some specs with the rspec_model generator.
  • If you're working on an existing project and/or on a large team and/or in a corporate environment, you'll probably need to stick to the standard vanilla Rails testing based on Test::Unit. In all honesty it works just fine, and is certainly far better than writing no tests at all. In other words, don't be afraid to write Test::Unit tests just because there's so much going on with the development of new test frameworks.
  • If you're really bothered by the should syntax magic of RSpec, use Shoulda.