Archive for the ‘Computers’ Category

Project level .pryrc

Building off my previous ~/.pryrc, I wanted to automatically load up my core project Ruby file, spec_helper.rb, and fire off some initialization routines whenever I start a new Pry session in my project's directory.

Because the contents of the local directory's .pryrc are evaluated before the :before_session hook from ~/.pryrc, the timing is a bit more delicate. I get around this by defining a custom function named _pry_before_session (you could name it anything you want, really) and having the ~/.pryrc's before_session hook execute it if it exists.

So my project's .pryrc:

#~/Projects/ActiveAvro/.pryrc
def _pry_before_session
  require 'active_avro'
  require 'spec_helper'
  ActiveAvroHelper.initialize
end

And my updated ~/.pryrc looks like this:

#~/.pryrc
require 'interactive_editor'

Pry.config.editor = "mate"
# add the current directory's lib, spec, and test directories to the load path if they exist
before_session = Proc.new do |out, target, _pry_|
  dir = Dir.pwd
  # Dir.exist? replaces Dir.exists?, which was removed in Ruby 3.2
  %w(lib spec test).map { |d| File.join(dir, d) }.each { |p| $: << p if Dir.exist?(p) && !$:.include?(p) }
  # if a local .pryrc defines a _pry_before_session function, execute it now
  send(:_pry_before_session) rescue nil
end
Pry.hooks[:before_session] = before_session
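
One caveat with the `send(...) rescue nil` line: it also swallows any error raised inside the project hook itself, so a typo in the local .pryrc fails silently. A minimal sketch of a stricter alternative follows; `run_optional_hook` is a hypothetical helper name, not something Pry provides.

```ruby
# Hypothetical helper (not part of Pry): run an optional project hook.
# Returns true if the hook ran, false if no such method is defined.
# Errors raised inside the hook are not rescued, so they stay visible.
def run_optional_hook(name, receiver = self)
  # The second argument makes respond_to? see private methods too;
  # methods defined with a bare `def` at the top level are private.
  return false unless receiver.respond_to?(name, true)

  receiver.send(name)
  true
end
```

Assuming the hook runs with the top-level object as self, `run_optional_hook(:_pry_before_session)` should be able to take the place of the `send` line.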

Pry and .pryrc – add ./lib and ./spec to $LOAD_PATH

I use Pry a lot. If I had to develop without it, I'd spend a hell of a lot more time not writing Ruby code. Often I use Pry in a Rails project as part of rails console, but I also use it with my non-Rails Ruby projects. I wanted to get around having to append commonly used directories to the $LOAD_PATH each time I fire up a Pry session.

Ruby code you place in your ~/.pryrc is executed when your Pry session begins.

Here are the relevant lines from my ~/.pryrc that add the appropriate paths to $LOAD_PATH during startup.

#~/.pryrc
⋮
# add the current directory's lib, spec, and test directories to the load path if they exist
before_session = Proc.new do |out, target, _pry_|
  dir = Dir.pwd
  # Dir.exist? replaces Dir.exists?, which was removed in Ruby 3.2
  %w(lib spec test).map { |d| File.join(dir, d) }.each { |p| $: << p if Dir.exist?(p) && !$:.include?(p) }
end
Pry.hooks = { :before_session => before_session }
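
To try the path logic outside of Pry, it can be pulled into a plain method. This is only a sketch: `add_project_dirs` and its parameters are hypothetical names, and it takes the load path as an argument so it can be exercised without touching the real $LOAD_PATH.

```ruby
# Hypothetical helper: append base/lib, base/spec and base/test to load_path,
# skipping directories that do not exist or are already listed.
def add_project_dirs(base, load_path = $LOAD_PATH, names = %w[lib spec test])
  names.each do |name|
    path = File.join(base, name)
    load_path << path if Dir.exist?(path) && !load_path.include?(path)
  end
  load_path
end
```

Calling `add_project_dirs(Dir.pwd)` more than once leaves a single entry per directory, which is the same guarantee the one-liner gives.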

vim [TAB] function definition file not found (in zsh)

Recently I ran into

_arguments:448: _vim_files: function definition file not found

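when attempting to tab-complete a file name while invoking vim. The fix was easy, at least for me: rm ~/.zcompdump, which deletes zsh's cached completion dump so that compinit rebuilds it the next time it runs. Found the answer here. How it got that way in the first place is another mystery.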


Ruby 1.9.3 and Thrift (0.5.0-0.8.0+?)

This post basically sums up the bug with the Ruby Thrift bindings where the exception message is "Incompatible character encodings: ASCII-8BIT and UTF-8". This problem is a bit of a bitch to hunt down, but once you find it, it's relatively easy to fix.

While I've got a fork with a pull request, I'm fairly certain that the Apache Software Foundation has other... means of accepting patches, so this pull request will be largely irrelevant.

Until the problem is fixed and propagated to the thrift gem, you can monkey patch this issue yourself:

# encoding: utf-8
module Thrift
  UTF8_ENCODING = "utf-8"
  class BinaryProtocol
    def write_string(str)
      # send the length in bytes, not characters, so multibyte strings are sized correctly
      write_i32(str.bytesize)
      trans.write(str)
    end
  end

  class HTTPClientTransport < BaseTransport
    def write(buf)
      @outbuf << buf.force_encoding(UTF8_ENCODING)
    end
  end
  class FramedTransport < BaseTransport
    def write(buf,sz=nil)
      buf.force_encoding(UTF8_ENCODING)
      return @transport.write(buf) unless @write

      @wbuf << (sz ? buf[0...sz] : buf)
    end
    def flush
      return @transport.flush unless @write
      # the frame header is a byte count; length would count characters on a UTF-8 string
      out = [@wbuf.bytesize].pack('N')
      out.force_encoding(UTF8_ENCODING)
      out << @wbuf
      @transport.write(out)
      @transport.flush
      @wbuf = ''
    end
  end
  class BufferedTransport < BaseTransport
    def write(buf)
      @wbuf << buf.force_encoding(UTF8_ENCODING)
    end

    def flush
      if @wbuf != ''
        @wbuf.force_encoding(UTF8_ENCODING)
        @transport.write(@wbuf)
        @wbuf = ''
      end

      @transport.flush
    end
  end
end

While I can't vouch for the production worthiness of the above code, I can say it at least gets me past an aggravating hurdle.
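
For context, the error itself can be reproduced without Thrift. The sketch below is plain Ruby (1.9 or later) and only illustrates the mechanism: appending a UTF-8 string to a binary (ASCII-8BIT) buffer raises when both contain non-ASCII bytes, and `length` and `bytesize` disagree for multibyte text. `try_append` is a hypothetical helper written for this demonstration.

```ruby
# Two raw bytes tagged as binary (ASCII-8BIT), as produced by Array#pack.
binary = [0xC3, 0xA9].pack('C*')
# One character that occupies two bytes in UTF-8.
utf8 = "\u00e9"

# Hypothetical helper: append to a copy and report whether Ruby allowed it.
def try_append(buffer, str)
  buffer.dup << str
  :ok
rescue Encoding::CompatibilityError
  :incompatible
end

puts try_append(binary, utf8)                                      # incompatible
puts try_append(binary.dup.force_encoding(Encoding::UTF_8), utf8)  # ok
puts utf8.length                                                   # 1
puts utf8.bytesize                                                 # 2
```

The last two lines are why the patched write_string sends `str.bytesize`: the receiver needs the number of bytes on the wire, not the number of characters.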


Manning Hadoop in Action Chapter 1 Example

It took me a little while of digging to find the baseline source code for the chapter 1 example in Manning's Hadoop in Action (2010).

You can find the WordCount.java here. Here's the 1.0.0 version I used:

/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable>{

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString(), " \t\n\r\f,.:;?![]'*");
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken().toLowerCase());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      if (sum > 4) {
        // set the total before writing it; the reverse order emits a stale value
        result.set(sum);
        context.write(key, result);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    // Caution: IntSumReducer drops sums of 4 or less, so as a combiner it can discard partial counts.
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Compilation with Java 6 (1.6) is a bit more involved. I wrote a simple shell script; the important thing here is to get the classpath flag correct. Java veterans will of course have no problem with this, but it took me a few minutes to sort out.

#!/bin/bash
rm -rf output/
# older javac versions fail if the -d target directory does not exist yet
mkdir -p classes
javac -classpath "../share/hadoop/lib/commons-cli-1.2.jar:../share/hadoop/hadoop-core-1.0.0.jar" -d classes src/WordCount.java
jar -cvf wordcount.jar -C classes/ .
../bin/hadoop jar wordcount.jar org.apache.hadoop.examples.WordCount input output