Jamie Gaskins

Ruby/Rails developer, coffee addict

Perpetuity Object Declarations

Published Feb 27, 2012

So far, in Perpetuity, this is what I've got setup for Mapper declarations:

class ArticleMapper < Perpetuity::Mapper
  attribute :title, String
  attribute :body, String
  attribute :author, User

  id { title.gsub('\\W+', '-').downcase } # SEO-friendly URLs

What ends up happening is that it sees that it can serialise the title and body attributes of the Article model because they're String objects. But it knows it can't serialise the author attribute because it's a User instance, so it saves the id of the author object (it must already be persisted or Perpetuity throws an error … for now) in its place. When loading the association from the database, it calls UserMapper.find(model.author) and places the result into the Article instance.

This is not the cleanest idea, since we would have an actual value there before the association is loaded … and it would be wrong. I've thrown around several ideas for this:

Load all associations when the object is retrieved from the database

This has the obvious pro of never having to worry about associations in user code, but we end up retrieving extra data from the database which we may never use. Clearly, a more efficient way of handling it would be to do lazy loading, but then we end up moving away from the Data Mapper pattern and start implementing the Active Record pattern by telling the object how to work with the database.

Granted, this is done through metaprogramming (we would inject the code into the object at load time) so the written code is pure Data Mapper, but it still feels wrong. We do currently apply a little magic to the object to assign it an id when it's been persisted into/loaded from the database, but I'm trying to limit it as much as humanly possible.

Force a user to load all associations manually

This is the way it is currently implemented and seems to be the most true to the Data Mapper pattern. In order to get associated objects, a user would have to call something like so:

article = ArticleMapper.find(params[:id])
ArticleMapper.load_association!(article, :author)

This code executes a database query and assigns the proper User object to the author attribute of the article. The disadvantage of this is that we must be mindful of every single database query, but the advantage is that … we end up being mindful of every single database query. :-)

How this is an advantage is that we don't allow queries to get away from us. ActiveRecord-style associations mean that we can link up objects without caring about the consequences. I've caught myself doing this, only to look at the logs and notice that I've executed dozens of queries when I only meant to execute 3 or 4 simply because I was treating associations as data instead of separate database rows.

The verdict

If you haven't already figured this out, I'm strongly leaning toward manual loading of associations, but I want to do something a little cleaner. Calling two class methods in two lines is ugly (I understand that everything being a class method on the mapper class is a code smell in itself, as well, but I'm working on that), so I think something more like this is in order:

article = ArticleMapper.find(params[:id], associations: [:author])

That should help keep things clean while still forcing you to think about each query you're doing. We could also expand this to the Mapper#retrieve method so that we don't end up doing N+1 queries. Both MongoDB (currently the only supported DB) and SQL (would like to get this one in, but I think I'll work on that later) support selecting on inclusion of values in lists, so we could optimise from N+1 down to 2 queries.

For example, if we're displaying a list of blog articles that will be shown with comment information within the list, we could do something like this:

articles = ArticleMapper.retrieve(published: true, associations: [:comments])

Instead of iterating over each article retrieved and loading their respective associations (N queries for comments + 1 original comment query), it gets all the articles for the specified criteria and then retrieves all comments whose article ids are in that list of articles — two queries, just as with retrieving a single article and its comments. Just as importantly, it only requires a single line of clean code.