Exporting a number of revisions from git

I'm writing a book about RSpec and need to deliver code samples by chapter/section to the publisher. This is a quick and dirty way of doing so: I used the section name as each check-in message, collected the revision of each of those check-ins, and plugged the revisions into the following shell script. Note that the ../working and ../delivery directories (relative to the GIT_SRC directory) get clobbered and refilled during this process!

#!/bin/bash
# package-src.sh

GIT_SRC=~/Projects/book/src
DEST_TAR_GZ=project
REVS="48c7ed8 7c19f89 30e195f 4f75557 9cf5aea"

# Bail out if the source directory is missing so the rm -rf below cannot run elsewhere
cd "$GIT_SRC" || exit 1
rm -rf ../working
mkdir -p ../delivery
rm -rf ../delivery/*.tar ../delivery/*.tar.gz
git clone . ../working
cd ../working

for REV in $REVS; do
  git checkout "$REV"
  # Name the tar after the commit subject (the section name), with spaces as underscores
  FILENAME=$(git log -n1 --format=%s | tr ' ' _)
  tar cf "../delivery/$FILENAME.tar" --exclude=.git .
done

cd ../delivery
tar czf ${DEST_TAR_GZ}.tar.gz *.tar
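
If cloning and checking out each revision feels heavy, git archive can write a tar straight from a revision without touching a working tree (and without the .git directory). This is a minimal sketch of that alternative, not what the script above does; subject_to_filename and export_revs are hypothetical helper names, and the destination directory is assumed to exist already:

```shell
#!/bin/bash
# Sketch only: both functions are hypothetical helpers, not part of package-src.sh.

# Turn a commit subject such as "Chapter 1 Intro" into "Chapter_1_Intro".
# Slashes are replaced too, so the result is safe to use as a file name.
subject_to_filename() {
  printf '%s' "$1" | tr ' /' '__'
}

# Write one tar per revision into the directory given as $1.
# Remaining arguments are the revisions to export.
export_revs() {
  local dest=$1
  shift
  local rev subject name
  for rev in "$@"; do
    subject=$(git log -n1 --format=%s "$rev") || return 1
    name=$(subject_to_filename "$subject")
    git archive --format=tar -o "$dest/$name.tar" "$rev" || return 1
  done
}
```

Run from inside the repository, usage would look something like: export_revs ../delivery $REVS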

gitignore

I create a lot of projects and the first thing I do is often

$ cat > .gitignore
::sound of furious typing::
::Ctrl+D::

This has become untenable with all of the work I do with Eclipse and Java/Scala/Groovy. Fortunately GitHub publishes a repo with a list of all the well-known ignore patterns for various languages, tools, and frameworks.
I hacked together a gitignore script that allows me to rely on this repo and do things like:

$ gitignore Eclipse Grails

This nets me a .gitignore that looks like:

$ cat .gitignore
*.bak
*.iws
*.launch
⋮
local.properties
tmp/**
tmp/**/*

I wrote this because I couldn't find anything that already did it, though I am certain someone out there has already scratched this itch.
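
For the curious, here is a rough sketch of how a script like this could work. It is not the script described above: it assumes a local clone of the GitHub templates repo at a hypothetical GITIGNORE_REPO path, with files named like Eclipse.gitignore somewhere beneath it, and gitignore_patterns is a made-up function name.

```shell
#!/bin/bash
# Sketch only: GITIGNORE_REPO and gitignore_patterns are assumptions for illustration.
GITIGNORE_REPO=${GITIGNORE_REPO:-$HOME/.gitignore-templates}

# Print the merged, sorted, de-duplicated patterns for each named template.
gitignore_patterns() {
  local name file
  local files=()
  [ "$#" -gt 0 ] || { echo "usage: gitignore_patterns NAME..." >&2; return 1; }
  for name in "$@"; do
    # A template may sit at the top level or in a subdirectory of the clone.
    file=$(find "$GITIGNORE_REPO" -name "$name.gitignore" | head -n 1)
    if [ -z "$file" ]; then
      echo "gitignore: no template named $name" >&2
      return 1
    fi
    files+=("$file")
  done
  # Drop comments and blank lines, then sort bytewise and remove duplicates.
  cat "${files[@]}" | grep -v -e '^#' -e '^[[:space:]]*$' | LC_ALL=C sort -u
}
```

With that in place, something like gitignore_patterns Eclipse Grails > .gitignore would produce a file along the lines of the one shown above.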


Nokogiri, or why I hate XML

As usual Nokogiri was impossible to install. I had to step back to my old MBP because my MBP retina needed to go in for a panel replacement. Updated MacVim, needed the excellent Command-T plugin. Command-T relies on Nokogiri which instantly means the authors of Command-T hate the human race; regardless this plugin is required for proper vim operation.

Tried all manner of... (yes these numbers are current if you brew install the requisite libraries as of January 15th 2013ish):

$ gem install nokogiri -- \
  --with-xml2-dir=/usr/local/Cellar/libxml2/2.8.0 \
  --with-xslt-dir=/usr/local/Cellar/libxslt/1.1.26 \
  --with-iconv-dir=/usr/local/Cellar/libiconv/1.14

But what was actually missing was gcc-4.2, which no longer ships as part of Xcode. So once I did:

$ brew tap homebrew/dupes
$ brew install apple-gcc42
$ sudo ln -s /usr/local/bin/gcc-4.2 /usr/bin/gcc-4.2

And ran my gem install described above:

Building native extensions.  This could take a while...
Successfully installed nokogiri-1.5.6
1 gem installed

Victory!

Thanks Carina C. Zora for making vim usable on my substitute MBP for a few days!


Testing ActiveRecord Migrations

A field went into production with no restrictions, and I later needed to add some, which meant updating the pre-existing data to comply. I wrote a migration to handle the conversion and wanted to test all the cases. Here's how I solved the problem, using mocks (Mocha specifically, but RSpec's mocks would also work) instead of trying to cram legacy data into the database.

Here's the migration:

# db/migrate/20130113181838_update_public_name_values.rb
class UpdatePublicNameValues < ActiveRecord::Migration
  def up
    User.select("id, public_name").each do |user|
      next if user.public_name.nil?
      original_public_name = user.public_name
      # Assign a new string instead of mutating in place so the change is tracked
      user.public_name = original_public_name.gsub(/[^A-Za-z0-9_]/, '_').slice(0, 12)
      begin
        if original_public_name != user.public_name
          user.save!
        end
      rescue
        puts "Unable to update #{user.id} (#{original_public_name} => #{user.public_name})"
      end
    end
  end

  def down
    puts "irreversible"
    # a more responsible migration would make this reversible
  end
end

And the spec (no, I did not TDD this; I wrote the migration first and the tests second, tsk tsk):

# spec/db/migrate/update_public_name_values_spec.rb
require "spec_helper"
require "#{Rails.root}/db/migrate/20130113181838_update_public_name_values"

describe UpdatePublicNameValues do
  describe "#up" do
    let(:special_char) { User.new(public_name:'Bits & Bytes') }
    let(:long_name) { User.new(public_name:'This is a very long name') }
    let(:normal_name) { User.new(public_name:'SomeUser') }
    let(:error_name) { User.new(public_name:'Causes Error') }
    let(:nil_name) { User.new(public_name:nil) }
    before do
      User.expects(:select).with("id, public_name")
        .returns([special_char, long_name, normal_name, error_name, nil_name])
      special_char.expects(:save!)
      long_name.expects(:save!)
      normal_name.expects(:save!).never
      error_name.expects(:save!).raises(RuntimeError)
      nil_name.expects(:save!).never
    end
    it "performs the appropriate actions" do
      UpdatePublicNameValues.new.up
      special_char.public_name.should == "Bits___Bytes"
      long_name.public_name.should == "This_is_a_ve"
    end
  end
end

I also ran across this post which shows how to verify structural changes during migrations.


Randomly sample lines from large files (in Perl!)

As I work more and more with large text files, I often need to sample them to produce test data. Here's a simple Perl program that does so; I cobbled it together from this StackOverflow answer, with a touch of Getopt::Long and POD documentation.

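#!/usr/bin/env perl
use strict;
use warnings;
use Getopt::Long;
use Pod::Usage;

my $sample = 0.01;
my $help = 0;
# Show the usage text and exit non-zero on unrecognized options
GetOptions("sample=f" => \$sample,
           "help"     => \$help) or pod2usage(2);

pod2usage(1) if $help;

# Keep each line independently with probability $sample
while (<>) {
  print if rand() < $sample;
}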

__END__

=head1 NAME

random-sample - randomly sample lines from file(s) to STDOUT

=head1 SYNOPSIS

random-sample [options] [file...]

=head1 OPTIONS

=over 8

=item B<-s (--sample)>

Sets the sample rate; when unspecified defaults to 0.01 (1%).

=item B<-h (--help)>

Prints this help message and exits.

=back

=head1 DESCRIPTION

B<random-sample> reads each specified file line by line and randomly outputs lines at the given sample rate (default of 1%).

When provided with no file arguments, it reads from STDIN.

Sampling is random, so the output is not guaranteed to contain exactly the percentage specified, though statistically the larger the file, the closer it will get.

=cut

I'm a bit rusty with Perl, but it is the language I grew up on, and with its recent 25th anniversary it just felt good to "write once, read never" yet again.
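
For comparison, the same per-line coin flip can be sketched as a shell function around awk. This is an illustration under assumptions, not a port of the Perl program: sample_lines is a hypothetical name, the rate is a positional argument rather than a --sample option, and awk's srand() seeds from the clock, so in some awk implementations two runs within the same second may select the same lines.

```shell
#!/bin/bash
# Sketch only: sample_lines is a hypothetical helper name.

# Usage: sample_lines [rate] [file...]
# rate defaults to 0.01 (1%); with no files, input is read from STDIN.
sample_lines() {
  local rate=${1:-0.01}
  [ "$#" -gt 0 ] && shift
  # rand() returns a value in [0, 1), so each line is kept with probability rate.
  awk -v rate="$rate" 'BEGIN { srand() } rand() < rate' "$@"
}
```

Something like sample_lines 0.05 big.log > sample.log would keep roughly 5% of the lines.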