Long running rake tasks on heroku

Heroku limits how long a process can run, which means that long-running rake tasks will be killed before they complete. The recommended solution is to run any long-running process in the background as a delayed job, which means writing some wrapper code around your rake task. After doing this a few too many times I went looking for a generic solution.

The solution I arrived at lets you queue up any rake task by prefixing the task name with delay: e.g.

$ rake db:seed
becomes 
$ rake delay:db:seed
rake db:seed will now be run in the background using one of your delayed_job dynos.
The code for this was surprisingly simple. First, a job class to hold the rake task:
module DelayedTask
  class PerformableTask < Struct.new(:task)
    def perform
      system "rake #{task}"
    end
  end
end

We then need to define a delayed rake task for each of the existing rake tasks

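Rake::Task.tasks.each do |existing_task|
  # The block variable is not called `task` so it can't be confused with rake's `task` method
  task "delay:#{existing_task.name}" do
    Rake::Task["environment"].invoke
    Delayed::Job.enqueue DelayedTask::PerformableTask.new(existing_task.name)
    puts "Enqueued job: rake #{existing_task.name}"
  end
end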
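
To try the idea without a Heroku app or a delayed_job worker to hand, here is a rough, self-contained sketch. It is an illustration rather than the gem's code. InMemoryQueue is a hypothetical stand-in for Delayed::Job, the db:seed task is a dummy, and the job invokes the task in-process instead of shelling out with system.

```ruby
require 'rake'

# Hypothetical stand-in for Delayed::Job: keeps jobs in memory so the
# sketch runs without a database or a worker dyno.
module InMemoryQueue
  JOBS = []

  def self.enqueue(job)
    JOBS << job
  end

  # Plays the part of a worker: performs queued jobs in order.
  def self.work_off
    JOBS.shift.perform until JOBS.empty?
  end
end

# Same shape as DelayedTask::PerformableTask, but invokes the task
# in-process rather than running `rake <task>` in a subshell.
PerformableTask = Struct.new(:task_name) do
  def perform
    Rake::Task[task_name].invoke
  end
end

SEEDED = []

# Dummy task standing in for a real long-running one.
Rake::Task.define_task('db:seed') { SEEDED << 'seeded' }

# Give every task that exists so far a delay: twin that only enqueues.
Rake::Task.tasks.each do |existing_task|
  Rake::Task.define_task("delay:#{existing_task.name}") do
    InMemoryQueue.enqueue(PerformableTask.new(existing_task.name))
  end
end

Rake::Task['delay:db:seed'].invoke # queues the job, nothing has run yet
```

Calling InMemoryQueue.work_off afterwards plays the part of the worker and actually runs db:seed.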

I'm now using this on several projects so I've packaged it up into a gem. You can find it at https://github.com/opsb/delayed_task.

Introducing Butler-IO

Some of those IO tasks can simply be the height of tedium. Why not ask the butler to take care of it? He's rather clever, don't you know; he even understands a thing or two about that Apache VFS.

Butler will fetch

  • byte []
  • String
  • String in utf8
  • InputStream
  • File

from

  • VFS locations - allows loading from any of: local files, http, https, ftp, sftp, temporary files, zip, jar, tar, gzip, bzip2, res, ram, mime. Take a look at some examples.
  • InputStreams
  • Files
  • File in same package as a Class

Installation

Just add this dependency to your Maven pom.xml:

<dependency>
   <groupId>uk.co.opsb</groupId>
   <artifactId>butler-io</artifactId>
   <version>0.3</version>
</dependency>

Usage

First let's call for the butler:

import static uk.co.opsb.butler.ButlerIO.*;

Now let's put him to task

Fetching text

String fromClasspath          = textFrom( "res:articles/steve_jobs.txt" );
String fromUtf8File           = utf8From( "file:///path/to/steve_jobs.txt" );
String fromUtf8FileOnWindows  = utf8From( "file:///c:/path/to/steve_jobs.txt" );
String fromInputStream        = textFrom( inputStream );
String overHttpsUsingVfs      = textFrom( "https://username:password@domain_name.com/article.txt" );
String fromFtpZipUsingVfs     = textFrom( "zip:ftp://username:password@domain_name.com/file.txt.zip" );
String fromFileNextToClass    = textFrom( "name_of_file_in_same_package_as", YourClass.class );

Fetching bytes

byte [] fromClasspath          = bytesFrom( "res:articles/steve_jobs.txt" );
byte [] fromUtf8File           = bytesFrom( "file:///path/to/steve_jobs.txt" );
byte [] fromUtf8FileOnWindows  = bytesFrom( "file:///c:/path/to/steve_jobs.txt" );
byte [] fromInputStream        = bytesFrom( inputStream );
byte [] overHttpsUsingVfs      = bytesFrom( "https://domain_name.com/article.txt" );
byte [] fromSftpGzipUsingVfs   = bytesFrom( "gz:sftp://username:password@domain_name.com/file.txt.gz" );   
byte [] fromFileNextToClass    = bytesFrom( "name_of_file_in_same_package_as", YourClass.class );

Fetching properties

Properties fromClasspath          = propertiesFrom( "res:articles/steve_jobs.txt" );
Properties fromUtf8File           = propertiesFrom( "file:///path/to/steve_jobs.txt" );
Properties fromUtf8FileOnWindows  = propertiesFrom( "file:///c:/path/to/steve_jobs.txt" );
Properties fromInputStream        = propertiesFrom( inputStream );
Properties overHttpsUsingVfs      = propertiesFrom( "https://username:password@domain_name.com/article.txt" );
Properties fromJarUsingVfs        = propertiesFrom( "jar://username:password@domain_name.com/outer.jar!inner/file.txt" );   
Properties fromFileNextToClass    = propertiesFrom( "name_of_file_in_same_package_as", YourClass.class );

Opening an InputStream

InputStream fromClasspath          = inputStreamFrom( "res:articles/steve_jobs.txt" );
InputStream fromUtf8File           = inputStreamFrom( "file:///path/to/steve_jobs.txt" );
InputStream fromUtf8FileOnWindows  = inputStreamFrom( "file:///c:/path/to/steve_jobs.txt" );
InputStream overHttpsUsingVfs      = inputStreamFrom( "https://username:password@domain_name.com/article.txt" );
InputStream fromFtpZipUsingVfs     = inputStreamFrom( "zip:ftp://username:password@domain_name.com/file.txt.zip" );   
InputStream fromFileNextToClass    = inputStreamFrom( "name_of_file_in_same_package_as", YourClass.class );

Getting a reference to a File

File fromClasspath          = fileFrom( "res:path/to/file" );
File fromFileNextToClass    = fileFrom( "file_name", YourClass.class );

Aliases

I often need to fetch articles and reports from the same places. I don't know about you but I rather like my butler to show a little initiative.

#Inside a file at {classpath}/butler_aliases.properties
# remember to escape any colons you use before the equals
articles\:=res://path/to/articles
reports\:=res://path/to/reports

Now when I ask for articles and reports he'll know just what to do

String article = textFrom( "articles:steve_jobs.txt" ); // => res:path/to/articles/steve_jobs.txt
String report  = textFrom( "reports:q4_figures.txt" ); // => res:path/to/reports/q4_figures.txt

Marvellous. He can do better than that though. How about we use a convention?

^(\\w*)\:=res:uk/co/opsb/%s/

String article = textFrom( "articles:steve_jobs.txt" ); // => res:uk/co/opsb/articles/steve_jobs.txt
String report  = textFrom( "reports:q4_figures.txt" ); // => res:uk/co/opsb/reports/q4_figures.txt

What a clever chap. He's used the regex to capture articles/reports and then String.format to merge them in.
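
For the curious, the capture-and-merge trick can be sketched in a few lines of Ruby. This is a guess at the general shape, not Butler's actual Java implementation: expand_alias, ALIAS_PATTERN and ALIAS_TEMPLATE are hypothetical names, and the template is simply the convention from the example above.

```ruby
# Hypothetical convention, mirroring ^(\w*)\:=res:uk/co/opsb/%s/
ALIAS_PATTERN = /\A(\w*):/
ALIAS_TEMPLATE = 'res:uk/co/opsb/%s/'

def expand_alias(location)
  match = ALIAS_PATTERN.match(location)
  return location unless match

  # Merge the captured alias name into the template, then append the rest.
  format(ALIAS_TEMPLATE, match[1]) + match.post_match
end

article = expand_alias('articles:steve_jobs.txt')
report  = expand_alias('reports:q4_figures.txt')

puts article
puts report
```

Note that this naive version would also rewrite genuine schemes such as res: or file:, so a real implementation would need some way to leave those alone.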

Fancy a tinker? Fork it at http://github.com/opsb/butler-io

Follow table links in cucumber

On a page you'll quite often have a table like the following:
Book | Author |
Harry Potter and half blood prince | J.K. Rowling | Delete
The Cuckoo's Egg: Tracking a Spy Through the Maze of Computer Espionage | Cliff Stoll | Delete
In your cucumber steps you want to say
Given I am on the books page
When I follow the "Delete" link for "Harry Potter and half blood prince"
Then ...
Out of the box webrat doesn't have a step that will allow you to do this. Let's create one that will do the job
When /^I follow the "([^\"]*)" link for "([^\"]*)"$/ do |link, cell_value|
  within "//*[.//text()='#{cell_value}' and .//a[text()='#{link}']]" do |scope|
    scope.click_link link
  end
end
This works great. It even works outside tables: the step will work so long as the text and link have a common parent in the DOM. Just one problem: for this step to work webrat needs to understand XPath selectors. Here's a little monkey patch that will get it working in webrat 0.6.0.
#lib/webrat_extensions.rb
module Webrat
  class Scope
    protected
      def scoped_dom
        begin
          @scope.dom.css(@selector).first
        rescue Nokogiri::CSS::SyntaxError, Nokogiri::XML::XPath::SyntaxError => e
          begin
            @scope.dom.xpath(@selector).first
          rescue Nokogiri::XML::XPath::SyntaxError
            raise e
          end
        end
      end
  end
end
Once you've added the patch our new webrat step will work. As a bonus you also get to use xpath selectors anywhere you use css selectors.

rcov for cucumber and shoulda

There seems to be a lot of bad info out there about this. It's really quite simple: you just have to make use of the built-in tasks provided by cucumber and rcov. The following task definitions will generate coverage reports for cucumber features and rails tests in coverage.features and coverage.tests respectively. You'll also get the overviews for the same reports at the command line when you run the tasks.

require 'cucumber/rake/task'
require 'rcov/rcovtask'

namespace :rcov do
  
  rcov_opts = ['-T','--exclude /Library/Ruby/Site/*,.rip/*,gems/*,rcov*,features/step_definitions/webrat_steps.rb']
  
  desc 'Measures cucumber coverage'
  Cucumber::Rake::Task.new(:features) do |t|    
    t.rcov = true
    # Build a new array so the shared rcov_opts isn't mutated for the :tests task
    t.rcov_opts = rcov_opts + ['-o coverage.features']
  end
  
  desc 'Measures shoulda coverage'  
  Rcov::RcovTask.new(:tests) do |t|
    t.libs << 'test'
    t.test_files = FileList['test/unit/*_test.rb','test/functional/*_test.rb','test/unit/helpers/*_test.rb']
    t.rcov_opts = rcov_opts
    t.output_dir = "coverage.tests"
  end

  desc 'Measures all coverage'  
  task :all do
    ["features", "tests"].each{ |task| Rake::Task["rcov:#{task}"].invoke }
  end
end

Validates uniqueness of multiple columns

We came across a case where we wanted to validate that the combination of first and last names for a new person was unique. Add the following snippet
#config/initializers/validators.rb
ActiveRecord::Base.class_eval do
  def self.validates_uniqueness_of_combined(*attr_names)
    options = attr_names.extract_options!.symbolize_keys
    attr_names = attr_names.flatten
    
    send(validation_method(options[:on] || :save), options) do |record|
      sql        = attr_names.map{ |attr_name| "UPPER(#{attr_name}) = ?"}.join(" AND ")
      values     = attr_names.map{ |a| record.send(a) }.map{ |v| v && v.upcase }
      conditions = [sql, *values]

      db_record = record.class.find(:first, :conditions => conditions)
      if db_record && db_record != record
        default_message = "#{attr_names.map{ |attr_name| record.send attr_name }.join(' ')} has already been added"
        # options went through symbolize_keys above, so the key is :message, not "message"
        record.errors.add_to_base(options[:message] || default_message)
      end
    end
  end
end
to your initializers, and then you can write
class Person < ActiveRecord::Base
  validates_uniqueness_of_combined :first_name, :last_name
end
Note that this does a case-insensitive match on the column values. If you want a case-sensitive match you should replace
      sql    = attr_names.map{ |attr_name| "UPPER(#{attr_name}) = ?"}.join(" AND ")
      values = attr_names.map{ |a| record.send(a) }.map{ |v| v && v.upcase }
with
      sql    = attr_names.map{ |attr_name| "#{attr_name} = ?"}.join(" AND ")
      values = attr_names.map{ |a| record.send(a) }
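
To see what the two variants actually hand to the finder, here is a small sketch that builds the conditions array outside of Rails. combined_conditions and the Person struct are hypothetical helpers made up for this illustration; only the SQL fragments and the upcasing come from the validator above.

```ruby
# Plain struct standing in for an ActiveRecord model.
Person = Struct.new(:first_name, :last_name)

# Builds the [sql, *values] array the validator passes as :conditions.
def combined_conditions(record, attr_names, case_sensitive: false)
  column_sql = case_sensitive ? '%s = ?' : 'UPPER(%s) = ?'
  sql = attr_names.map { |attr_name| format(column_sql, attr_name) }.join(' AND ')
  values = attr_names.map { |attr_name| record.send(attr_name) }
  # Upcase the values too, so both sides of the comparison match.
  values = values.map { |v| v && v.upcase } unless case_sensitive
  [sql, *values]
end

jim = Person.new('Jim', 'Smith')
insensitive = combined_conditions(jim, [:first_name, :last_name])
sensitive   = combined_conditions(jim, [:first_name, :last_name], case_sensitive: true)

p insensitive
p sensitive
```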

Put your team back together with some promiscuous pairing

It's happened so many times. The big release is coming up, we realise there isn't enough time to get all the necessary features in, the business demands that we drop quality to hit the deadline. We reluctantly comply and... we hit the deadline. The business and more importantly our customers are delighted. Some big ticket customers that were going to cancel contracts with us don't. Kinda difficult to argue with that.

Problems

Of course there was a cost and now we, the development team, were feeling it, bad. The project is full of broken windows, there are buildings on fire and crack dealers on the corner.
  • inconsistent and poor quality
  • conflicting views
  • ownership - people were protective of code
  • unhappy developers
  • poor team mentality
  • low velocity

The Solution

One night I came across a paper that I'd read once before, Promiscuous Pairing and Beginner's Mind. While I found it interesting the first time, I didn't pursue it. As I reread it I started to wonder if it might provide us with a way to put the team back together and bring the project back to a state of pride. So how does it work? Promiscuous pairing: pair up and, after an hour and a half, the developer in the driving seat of each pair moves on to the next story. The back-seat developer moves into the driving seat and is joined by a new developer, and so on.

Result

The effect was noticeable the first day. After a week we had a team with
  • Higher energy
  • Higher velocity
  • Higher quality
  • Happy developers
  • More fun

Why

  • Every time a switch takes place the developer staying on a story must explain the story to the new developer. This is what creates the "beginner's mind"; it ensures that focus is always on the requirements of the story.
  • Juniors received continuous guidance, allowing them to produce the same quality of code as seniors.
  • Don't get stuck in rabbit holes - those oh so clever little ideas that turn into sprawling epics get nipped in the bud.
  • All design decisions are discussed and agreed upon, leading to consensus across the team.
  • Discussing all problems in English encourages ubiquitous language and semantic match with the business.
  • Remain focused - you rarely get stuck because the other person always has ideas, and if you do get completely stuck, every hour and a half you get a fresh pair of eyes on the problem.
  • Each team member has different strengths, rotation means all strengths are applied to all tasks
  • Knowledge spread - every key shortcut, refactoring trick, test technique, domain modelling principle shared between all team members.
  • Risk reduction - all members of team are up to speed with all technologies and features in project so don't need to worry about holidays/illness/turnover.
  • Stress reduction - no one is responsible for delivering a feature, focus is on how best to move story/bug along in next hour and a half.
  • No resentment about features being implemented poorly
  • Team mentality - everyone did everything

Getting started with ruby DBI and mysql

DBI is a database API based on Perl's DBI. It's great for those occasions where you want to interact with a database in a script, or perhaps when you have a really lightweight app that doesn't need an ORM framework.

Install

OK, it took quite a bit of searching around but this is the magic recipe that will get everything you need installed on Ubuntu.

Mysql on ubuntu

sudo apt-get install mysql-client
sudo apt-get install libmysqlclient15-dev

Gems

sudo gem install dbi
sudo gem install mysql
sudo gem install dbd-mysql

Usage

First of all let's load up all the dependencies
require 'dbi'
require 'mysql'
require 'dbd-mysql'
Now let's make a connection to a db, obviously replacing the schema, hostname etc. with your own.
dbh = DBI.connect('DBI:Mysql:schema:hostname', 'username', 'password')
For the examples let's assume that we have a table, people, that contains
id name
1 jim
2 paul

Select

DBI provides two select methods, select_all and select_one. Each will return rows or a row that contain values that can be indexed using the name of a column or the index of the column. DBI will map the values in your columns to ruby classes automatically.
row = dbh.select_one("SELECT * FROM people;")
puts row[:id]          # 1
puts row[:id].class    # Fixnum
puts row[0]            # 1
puts row[0].class      # Fixnum
puts row[:name]        # jim
puts row[:name].class  # String

rows = dbh.select_all("SELECT * FROM people;")
puts rows[0][:id]    # 1
puts rows[0][:name]  # jim
puts rows[1][:id]    # 2
puts rows[1][:name]  # paul

Insert

For operations that update the db you can use the method 'do'. Note that we've used the ? to indicate where our values will go in the query and then supplied them as arguments at the end; this is to avoid SQL injection attacks.
dbh.do("INSERT INTO people (id, name) VALUES (?,?)", nil, 'bob')
id name
1 jim
2 paul
3 bob
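
If you're wondering why the ? placeholders matter, the following sketch contrasts them with plain string interpolation. It doesn't touch a database at all; the malicious name is invented for the illustration, and the second half only shows the shape of the arguments you would hand to dbh.do.

```ruby
# Input a user might supply; the quote and trailing SQL are the attack.
name = "bob'); DROP TABLE people; --"

# Unsafe: the input becomes part of the statement text itself.
interpolated = "INSERT INTO people (id, name) VALUES (NULL, '#{name}')"

# Safer: the statement text is fixed and the values travel separately,
# which is the shape dbh.do expects (statement first, then bind values).
statement = 'INSERT INTO people (id, name) VALUES (?,?)'
bind_values = [nil, name]

puts interpolated
puts statement
```

In the interpolated version the DROP TABLE ends up inside the SQL text, whereas with placeholders the statement stays exactly as written and the name is passed along as data.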

Update

We can use 'do' for updates as well
dbh.do("UPDATE people SET name=? WHERE id=?", "mark", 3)
id name
1 jim
2 paul
3 mark

Delete

You get the idea... So there you go, a whirlwind tour to get you up and running. For more in-depth instructions I recommend this tutorial. The ruby DBI homepage has extra information including the rdocs. Finally, this github version of the project also has some good information in its readme, particularly regarding different db drivers.

Pimp my git

Now that I'm starting to use git more regularly I've started looking for ways to make git even better. It turns out that git's really easy to customise.

Aliases

Having recently heard that "git stage" is going to be added as an alias for "git add", I learned from another git fan that you can add your own git aliases to the .git/config file. I particularly like this idea as some of the commands can be a little bit esoteric. To get you started, try adding this entry to your .git/config file.
[alias]
  stage = add
  unstage = reset HEAD
Now you can add content to the staging area using
git stage new_file
and then remove it again using
git unstage new_file

Pretty log

This one's an alias, just add it in the same way as stage/unstage
plog = log --pretty=tformat:'%h %Cblue%cr%Creset %cn %Cgreen%s%Creset'
I find this format much easier to read: the information is colour coded and you can fit lots more commits on screen. This makes the graph version of log awesome, take a look
git plog --graph
Win!

Colour coded status

Add the following to .git/config and staged files will be shown in green, unstaged in red.
[color]
  ui = auto

Command line prompt

The git-prompt project will let you customise your prompt to include all sorts of git information. The defaults are a bit full on but you can tame it to your taste easily enough. I'm sure I'll keep adding to this list, so let me know any great git tricks you've got.