Massaging Data
Tuesday, August 22nd, 2006
Update: It seems that this behavior is no longer exhibited in the current version of Rails (1.1.6), as each test is wrapped in a transaction. So this entire post is pretty much moot now.
Rails’ tests (unit, functional, and integration) use a database with read-only values. Anything which modifies the database is not saved. This allows the tests to run in isolation from each other, which is a Good Thing.
In the real world, however, I often find myself needing to write methods which go and massage a bunch of data in the database. A simple example would be a nightly cron job which creates invoices for accounts whose billing cycle are due.
In Rails this is done by creating a static method like Account.create_invoices, and invoking it from the crontab via “script/runner -e production Account.create_invoices”. Since it is a method, it can be tested in a unit or integration test. (I think the latter is more appropriate, since a high-level method like this often touches a number of models.)
But this method’s main “output” is not a return value, but rather an adjustment (or lots of them) to the database. For example, I may want to go find every account with an open billing item, create an invoice in the invoices table, and then mark the item closed. What I really want to test when this method is done is that there are some number of new invoices in the invoices table, that the billing items have been marked closed, and that the account has no open billing items. But you can’t check most of this output, because it exists as changes to the database, not direct return values.
My impression is that the Rails way views this as a Bad Thing. Every method should return its results, rather than going in and massaging a bunch of data in the database. This makes it more orthogonal and easier to test.
I agree with this philosophy in theory, I’m not sure that this is realistic for real-world applications. I’m like to think that that’s because I’m still trapped in the SQL paradigm, so maybe someone can enlighten me on the pure Rails way to do this.
Here’s a concrete example:
class Account < ActiveRecord::Base
has_many :invoices
has_many :billable_items
def open_billable_items
items = []
billable_items.each do |item|
items << item if item.open?
end
return items
end
def create_invoice
open_billable_items.each do |item|
item.close # creates the invoice and marks the item closed in the db
end
end
def self.create_all_invoices
Account.find(:all).each do |account|
account.create_invoice
end
end
end
class BillingTest < ActionController::IntegrationTest
fixtures :accounts
def test_create_all_invoices
assert_equal 5, accounts(:first).open_billable_items.length
assert_equal 0, Invoice.count
Account.create_all_invoices
assert_equal 0, accounts(:first).open_billable_items.length
assert_equal 5, Invoice.count
end
end
Maybe the right thing to do here is not to test at such a high level, but instead test only BillableItem.close by having it return the created invoice. This bothers me though. It needs to work at the high level, so why can’t I test that?
And this is a very simple example. In reality, the nightly cron job may be touching dozens of tables and thousands or even millions of rows. Returning all affected rows as a result doesn’t make much sense, and may be completely impossible due to memory limitations. (The whole reason we use a database is so that we can operate on large sets of data without having to instantiate every record at once!)