Brittle Tests

Posted by Tammer Saleh

Sep 06

A friend and I were talking about the scaffold mailer tests that are generated by rails, and it got me on a bit of a rant. Here’s what you start out with:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
class MailerTest < Test::Unit::TestCase
  # ...stuff...
  def setup
    # ...more stuff...
    @expected = TMail::Mail.new
    @expected.set_content_type "text", "plain", { "charset" => CHARSET }
    @expected.mime_version = '1.0'
  end

  def test_invitation
    @expected.subject = 'Some subject'
    @expected.body    = read_fixture('invitation')

    assert_equal @expected.encoded, Mailer.create_invitation(args).encoded
  end

  private
  # ...even more stuff for reading the fixture and doing the encoding...
end

completely pointless image

Basically, you’re generating the email, and then comparing it to an exact copy in your fixtures directory. Not only will this fail whenever you’ve got anything interesting like the current time in body of your mail, but it’s also just a super brittle way of doing things.

Would you write a controller test that compared the rendered view to a pre-generated version in your fixtures directory. Any changes made by the designer, any typos found, anything at all would have to also be made to that fixture.

A much better way of testing emails is like such:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
class MailerTest < Test::Unit::TestCase
  def setup
    # ActionMailer settings
  end

  def test_invitation
    user = User.find_first
    email = Mailer.create_invitation(user)

    assert_equal "admin@politiquotes.com", email.from.first
    assert email.to.include?(user.email)
    assert_match(/welcome to politiquotes/i, email.subject)
    assert_match(/please follow the link below/i, email.body)
    assert_match('http://politiquotes.com/tour', email.body)
  end
end

another pointless image

Test only those things which are necessary. You should not be testing that there are two spaces in between each sentence, but you should be testing that you have included the tour url.

This is a great rule to follow in all testing scenarios. Don’t test that the rendered view has a div with class “person” (unless your rjs depends on it), but do test that the rendered form will post to /users. Don’t test that your flash has a :notice key with “Permission denied! Please login as an administrator and try again!”, but do test that any flash key matches against /denied/. Always strive to make your tests more focused and less brittle. I guarantee, your coworkers will thank you for it.

Update: Looks like I’m not the first person to notice the ActionMailer default tests. In fact, I guess this argument is made in the Agile Web Development with Rails book (page 420-421).



Comments on this post

sandofsky

Sep 06

sandofsky said,

Thank you. I couldn’t have said it better myself.

Anyone care to define 1:1 coverage? It seems rather silly to me.

Matt Jankowski

Sep 06

Matt Jankowski said,

Interesting. I actually had a similar conversation recently, and I came to the opposite conclusion. Specifically…

Basically, you’re generating the email, and then comparing it to an exact copy in your fixtures directory.

Yes – and this is what’s good about the @expected method. The task your code is performing is to generate an email—and that fixture is the standard by which your code should be measured. Most clients actually are really particular about having a character-perfect solution for the sample email copy/template which they provided you.

Not only will this fail whenever you’ve got anything interesting like the current time in body of your mail, but it’s also just a super brittle way of doing things.

If you have dynamic elements like current time which are not predictable ahead of running your tests, just mock out Time.now to return a value that’s in your fixture.

Would you write a controller test that compared the rendered view to a pre-generated version in your fixtures directory.

The reason we don’t do that is not because it’s not a good or because it wouldn’t be helpful. I mean think about it—the designers provided HTML and CSS is the “perfect output” which your controller action is tasked with generating. What’s the downside to being able to compare this in a character by character way? It would be awesome. Think about how stupid it is when you get burned by a spacing issue or a line break issue or something weak like that in your HTML. In theory, a designer has provided HTML code which solves that problem and it would be great if you could test your output against that.

We typically DON’T do that not because it wouldn’t be useful to us—but because the rate of change for view files in a rapidly developing application is fast enough that the value having these tests in place would provide is outweighed by the time it would take to create and maintain them.

Any changes made by the designer, any typos found, anything at all would have to also be made to that fixture.

Yeah, you’d do that first. And then you’d call it “test driven development”.

Test only those things which are necessary. You should not be testing that there are two spaces in between each sentence, but you should be testing that you have included the tour url.

The client’s marketing department all have english degrees, and they absolutely want two spaces between each sentence. If you want to have opinions on what the right or wrong way to write and care about email copy is, you should go into branding and charge people for those opinions.

...

Point being, email notifier templates change relatively infrequently and have higher “precision” demands than views do. I say it’s worth getting them right.

Tammer Saleh

Sep 06

Tammer Saleh said,

Well, now this is just getting weird. That’s the exact same stance my friend had.

But I maintain that in the projects I’ve worked on, email copy changes just as much as views do. And I haven’t yet had to deal with clients who were as demanding about the content of their emails as they are about the look of their pages.

I do agree that being able to compare your page or email against the designer’s version would give you a gold standard seal of approval, but I just don’t think it’s worth the wasted development time. Having to find and repair broken tests because you changed one character in an email just leads to developer frustration.

Jay Levitt

Sep 06

Jay Levitt said,

We’re having this same discussion over on rspec-users as well. (Not that it doesn’t pop up everywhere every three months, but still.)

I’m in Tammer’s camp (and we’re making lanyards tomorrow). I think of TDD/BDD as N-Version programming, where N=2, or maybe N=3 with integration tests or RBehave.

Using a different vocabulary and mindset to write the test/spec code, you are verifying that the same result is computed each way. But if all your test does is restate the code verbatim, you have lost the independence you need for N-version programming.

On rspec-users, this came up in the context of

it “should validate_presence_of digits” do PhoneNumber.expects(:validates_presence_of).with(:digits) load ”#{RAILS_ROOT}/app/models/phone_number.rb” end

To me, that is equivalent to:

it “should validate_presence_of digits” do my_model.line(7).should_read “validates_presence_of :digits” end

which is itself just as useful as:

it “shouldn’t change unless I change the spec too” do MD5.new(my_model).should == “0xDEADBEEF0FFD2FFE4…” end

Test for the things you want to test, not for the existence of the code that you already wrote. If you are concerned about two spaces after period (which, btw, is wrong in a proportionately-spaced font, even though I still keep doing it because I grew up with typewriters), then test for that. If you’re concerned about transitions between the client-provided HTML in the view and the code-generated output, maybe test for the transition. But testing the whole thing verbatim is just a time-waster.

Craig

Sep 07

Craig said,

The “pointless photo” in the middle of your article still has the iStockPhoto branding on it – the copyright owner might not love that!

Tammer Saleh

Sep 07

Tammer Saleh said,

Jay: Absolutely. And I’m almost positive I’ve actually seen the MD5 example used somewhere.

Craig: I’m afraid I’m now going to have to delete your comment and “disappear” you from this site. I’ve replaced the image with something that hopefully isn’t copyrighted.

Matt Jankowski

Sep 07

Matt Jankowski said,

Tammer, this is indeed getting really weird—you have the same exact stance that the person I was talking to did!

Having to find and repair broken tests because you changed one character in an email just leads to developer frustration.

See…that’s just backwards. The correct way to state that is “having to find and repair broken code because you changed one character in the test just leads to developer frustration”. But then you wouldn’t agree with it anymore, because it would lead to code quality, not developer frustration.

When are you ever going in and mucking around in your mailer classes just for kicks? You start with some requested change to the copy, update the fixture, watch that isolated test fail, make the change, watch it pass, commit.

Jay – I take your larger point, but…that specific example of testing that a class calls a particular method (or has a line of code, however you do it) is not the same as testing that the class behaves the right way (ie, that you should not be able to save the record without some digits). Similarly, testing that a mailer calls the ‘subject’ method would be pretty worthless, but testing that the correct subject winds up in the mail is not.

Craig – if iStockPhoto had a more comprehensive test/copyright suite which trawled the entire internet and identified possibly infringing images by doing an exact byte by byte comparison – that would be a non issue and Tammer wouldn’t have to disappear you.

jare care

Sep 07

jare care said,

While we’re on the subject of brittle tests,

this

1
2
3
4
5
6
7
8
9
10
11
12
13
  def  test_should_create_a_new_user_object_on_POST_to_create
  post :create, :user => { :email => 'asdf@asdf.com',
                                        :password => 'password',
                                        :first_name => 'first',
                                        :last_name => 'last' }

    is_existing_user = User.find :first,
      :conditions => ['email = ? and password = ? and first_name = ? and last_name = ?',
      'asdf@asdf.com', 'password', 'first', 'last']
    assert is_existing_user
    assert_response :redirect
    assert_redirected_to users_path
  end

or this

1
2
3
4
5
6
7
8
9
10
11
12
  def  test_should_create_a_new_user_object_on_POST_to_create
  old_count = User.count

  post :create, :user => { :email => 'asdf@asdf.com',
                                        :password => 'password',
                                        :first_name => 'first',
                                        :last_name => 'last' }

    assert_equal old_count + 1, User.count
    assert_response :redirect
    assert_redirected_to users_path
  end
Jon Yurek

Sep 10

Jon Yurek said,

Yeah, you’d do that first. And then you’d call it “test driven development”.

Zing!

@jare care: I’d say #2, as the title doesn’t seem to imply how they get created, just that they do, and #1 ties you to a particular DB structure.

Gabe

Sep 27

Gabe said,

When are you ever going in and mucking around in your mailer classes just for kicks? You start with some requested change to the copy, update the fixture, watch that isolated test fail, make the change, watch it pass, commit.

Personally this strikes me as a huge waste of time. I like the mental shift of TDD, but realistically I still only want to test application behavior.

Copy changes happen, all the time, and I don’t want to double up my work just to get a fuzzy green TDD feeling. The test should verify that the content contains the critical data that the email is required to deliver.

Having to update a test when no behavior is changing is just busy-work. Why should we test that some text in a template stays the same in between the template and the output? If there is some transformation going on then by all means test that, but why make a brittle test that verifies something that never breaks?

matt jankowski

Sep 27

matt jankowski said,

The test should verify that the content contains the critical data that the email is required to deliver.

What if the entire copy of the email is considered the critical data that it must deliver?


Sorry, comments are closed for this article.

© 2000 - 2009 by thoughtbot, inc.
written by a bushel of tiny robots