Creating Valid Records with Populator and Faker

Posted by Trevor in Ruby/Rails on September 27, 2008

I'm an avid Railscasts watcher, and I was quite happy with one of the recent episodes, Populating a Database. The episode shows how to efficiently load a database with lots of records for demos, design, and/or acceptance testing, but it doesn't cover making sure the records you create are valid. I love speedy rake tasks just as much as the next guy, but I found that creating invalid records led to problems pretty quickly. Luckily, this is very easily fixed.

Here's an example populate task. It's pretty self-explanatory, so I won't go into much detail. The highlight is the random(model) bit, which lets you grab an existing record of any model. In this case, I use it to pull a random User and create a Post through that user's posts association. Because every record goes through ActiveRecord's create, validations run as usual, and you can still use the cool stuff in Faker and Populator without following the technique described in the Railscasts episode.

 
# lib/tasks/populate.rake
namespace :db do
  task :populate => :environment do
    require 'populator' # http://populator.rubyforge.org/
    require 'faker' # http://faker.rubyforge.org/rdoc/
 
    [User, Post].each(&:delete_all)
    Rake::Task['db:fixtures:load'].invoke
 
    def random(model)
      ids = ActiveRecord::Base.connection.select_all("SELECT id FROM #{model.to_s.tableize}")
      model.find(ids[rand(ids.length)]["id"].to_i) unless ids.blank?
    end
 
    puts 'creating users...'
    10.times { |i|
      User.create(:login => Faker::Lorem.words(1).to_s, :email => Faker::Internet.email, :password => 'monkey')
    }
 
    puts 'creating posts...'
    40.times { |i|
      random(User).posts.create(:body => Faker::Lorem.words, :created_at => Populator.value_in_range(10.years.ago..Time.now).to_s)
    }    
 
  end
end
 

Of course there are other ways to achieve this random(model) business without quite as much overhead. For example, you could load all of the users with one query and then choose a random one like this:

 
@users = User.all
 
40.times { |i|
  @users.rand.posts.create(:body => Faker::Lorem.words, :created_at => Populator.value_in_range(10.years.ago..Time.now).to_s)
}
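
On Ruby 1.9 and later, Array#sample is the built-in way to pick a random element, so the same load-once idea works without the ActiveSupport rand extension. Here's a minimal plain-Ruby sketch of it. FakeUser and its posts array are hypothetical stand-ins for the User model and its association, so this runs without a database:

```ruby
# Plain-Ruby stand-in for the "load once, pick many" approach.
# FakeUser is hypothetical; in the real task these would be the
# User records returned by User.all.
FakeUser = Struct.new(:login, :posts)

# One "query" up front...
users = %w[alice bob carol].map { |login| FakeUser.new(login, []) }

# ...then a cheap in-memory pick for each of the 40 posts.
40.times do |i|
  users.sample.posts << "post #{i}"
end

total_posts = users.map { |u| u.posts.size }.inject(0) { |sum, n| sum + n }
puts "created #{total_posts} posts across #{users.size} users"
```

In the rake task, that would mean swapping @users.rand for @users.sample and leaving everything else alone.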
 

But I'm not a big believer in premature or unnecessary optimization. In any case, long-running Rake tasks make a good excuse to take a walk while it's still nice outside here in Chicago :)

7 Comments

 Chris Eppstein

pluralize.downcase should be tableize

 DrMark

I am sure that you have seen it, but FactoryGirl makes this virtually painless. You can also build tests for all of your factories to ensure they are valid.

Best of luck

 Trevor

@Chris, thanks – I’ve updated this post to reflect that.

@DrMark, I’ve been meaning to check out Factory Girl. I’m heading over to the site now to take a closer look. For others who might be looking for the same thing, here’s a link:

http://giantrobots.thoughtbot.com/2008/6/6/waiting-for-a-factory-girl

 Mario Zigliotto

Long rake tasks also make a great excuse for getting more coffee :)

 Casper

Shouldn’t you be checking to make sure the task is only run in the test environment? Looks like a pretty efficient way to accidentally blast your dev or production db :-/

 Trevor

@Mario – of course, that’s what I’m taking a walk to get :)

@Casper – yeah, I guess you could raise an error if your RAILS_ENV is production, but I consider this to be the kind of thing that you’d never run in production by accident. It’s like running “rake db:drop” – anybody with access to the production machine can do it, if you know what I mean. Perhaps you could put this file into your project but exclude it from your repo (e.g. with .gitignore), and then it would never even make it into your production app.
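
For anyone who does want the guard, it could look something like the sketch below. This is only an illustration: ensure_safe_environment! and SAFE_ENVIRONMENTS are hypothetical names, not part of Rails, Populator, or Faker, and the environment is passed in as a plain argument so the snippet runs on its own.

```ruby
# Hypothetical helper: refuse to run destructive tasks outside an allow-list.
SAFE_ENVIRONMENTS = %w[development test].freeze

def ensure_safe_environment!(env)
  # Accept strings or symbols; anything not on the list is rejected.
  return if SAFE_ENVIRONMENTS.include?(env.to_s)
  raise "db:populate refuses to run in the #{env.inspect} environment"
end

ensure_safe_environment!('development') # passes silently

begin
  ensure_safe_environment!('production')
rescue RuntimeError => e
  puts e.message
end
```

Inside the populate task you would call it first thing, before delete_all, passing in RAILS_ENV. An allow-list rather than a check for "production" means a staging environment is refused too, unless you add it on purpose.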
