It is a truth universally acknowledged that a Rails app in possession of a good server must be in want of a Java process

The Ruby on Rails portion of Gander must, from time to time, create Java processes to do some of the backend work that keeps the email flowing.[1] There are several easy ways to start a child process from Ruby: backticks (``), system, and methods in the Process and IO modules. These are great, but they start to cause problems when you've got them scattered across a codebase. In particular, writing good tests around such code becomes an enormous headache.
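For reference, here is a minimal sketch of those built-in options side by side. It assumes a Unix-like shell for the backtick call, and it launches a second Ruby interpreter (found via RbConfig.ruby) as a stand-in for the java command so that it runs anywhere Ruby does; the variable names are only for illustration.

```ruby
require 'rbconfig'

ruby = RbConfig.ruby # path to the running interpreter; a stand-in for `java`

# Backticks: run the command through the shell and return its stdout.
backtick_output = `"#{ruby}" -e "print 1 + 1"`

# system: block until the child exits; returns true on a zero exit status.
system_ok = system(ruby, '-e', 'exit 0')

# Process.spawn: return the pid immediately; waiting is up to the caller.
pid = Process.spawn(ruby, '-e', 'exit 0')
Process.wait(pid)
spawn_exit_code = $?.exitstatus

# IO.popen: run the child with a pipe attached and read from it.
popen_output = IO.popen([ruby, '-e', 'print 6 * 7']) { |io| io.read }
```

Each one has a different return value and a different way of handing back output, which is part of why tests get awkward when all four are sprinkled through the code.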

There are several things we want to do when we test code that creates a Java process. We want to be able to write a test that verifies that the right arguments are passed to the process at the appropriate time. We want to write a test that uses the output of that process (usually data written to stdout) without actually creating the process, so that we can decouple testing the Ruby code from the Java code. Finally, we want to write integration tests that create new Java processes and then wait for them to finish, so that we can verify that our Rails app has the correct behavior while the process is running and after it's finished its work (e.g., populating a database). None of the built-in methods for creating new processes allowed us to do all of these things easily, so we made what is, if I do say so myself, a pretty slick abstraction layer to help us.

I've simplified the code slightly to remove some of the app-specific details, but here's the idea:

class Java
  # Runs lib/<service>.jar in a child JVM. Detached runs return the waiter
  # thread from Process.detach; attached runs return the process output.
  def Java.run(detach, service, *arguments, options)
    # Translate the options hash into command-line flags.
    options.each_pair do |key, value|
      unless value.nil? or value == false
        arguments << "--#{key}"
        if value == true
          # boolean flag: the bare "--key" is enough
        elsif value.is_a? String
          arguments << value
        else
          raise "The value for #{key} is neither a string nor boolean; value: #{value}"
        end
      end
    end
    dir = "#{Rails.root}/lib/"
    child_args = [{'RAILS_ENV' => Rails.env},
                  'java',
                  *(CONFIG['java_opts']),
                  '-jar', "#{service}.jar",
                  *(arguments)]
    if detach
      pid = Process.spawn(*child_args, :chdir => dir, [:out, :err] => ['java.log', 'a'])
      Process.detach(pid)
    else
      # Fold the child's stderr into its stdout so io.read captures both.
      IO.popen([*child_args, {:chdir => dir, :err => [:child, :out]}]) do |io|
        io.read
      end
    end
  end
end
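To make the option handling concrete, here is a small standalone sketch of the same hash-to-flags translation. The options_to_flags helper is hypothetical (it is not part of our class), and the option names are made up for the example.

```ruby
# Hypothetical helper that mirrors the option loop in Java.run.
def options_to_flags(options)
  options.each_with_object([]) do |(key, value), flags|
    next if value.nil? || value == false # unset options produce no flag
    flags << "--#{key}"
    next if value == true                # boolean flags carry no value
    unless value.is_a?(String)
      raise ArgumentError, "The value for #{key} is neither a string nor boolean; value: #{value}"
    end
    flags << value
  end
end

flags = options_to_flags(:user => '1234', :verbose => true, :dry_run => false, :since => nil)
# flags is ["--user", "1234", "--verbose"]
```

Note that only strings and booleans are accepted, so a numeric id has to be converted with to_s before it is passed in.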

This gives us a single point to call into for all our Java needs, one that we can easily stub out during tests to do whatever we want. Because run returns the thread that Process.detach creates for each process it detaches from, we can also join that thread during integration tests if we want to ensure that we wait around for the entire thing to finish. Some examples:

If we're writing a unit test for something that can spawn a Java process, we might see something like this (using RSpec mocks):

Java.stub(:run)
Java.should_receive(:run).with(true, 'sync', [], :user => 1234)
# or
Java.stub(:run)
Java.should_receive(:run).twice

That lets us check that we're passing the right arguments to the new process or calling it the right number of times, without actually running it. If we're writing a test that depends on the output of some Java code, we might do:

Java.should_receive(:run).with(any_args).and_return("9876")

That way we can test that the Ruby code does the right thing in response to any particular output from the Java code, without actually creating a new process. Finally, in integration tests we want to let the code actually create a full Java process and launch it. Most of the time the Ruby code will detach from the Java process so that it can run in the background and, for example, sync a bunch of data into our database. In our tests we want to wait for the process to finish so that we can verify that our Rails code works properly with the data it created. All we have to do is capture the thread that run returns and then join it:

thread = nil
Java.stub(:run) do |detach, service, *args, options|
  # found this method name by putting a binding.pry here and ls Java
  thread = Java.obfuscated_by_rspec_mocks__run(detach, service, *args, options)
end

# ... start the test, eventually calling something that invokes Java ...

thread.join

# ... the rest of the test happens after the Java code exits

With some help from the always-excellent pry we can grab the waiter thread for the Java process from the internals of our Ruby code (which probably throws it away) and join it at the appropriate moment.
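The reason thread.join works is that Process.detach hands back a thread that waits on the child. Here is a minimal sketch of that behavior outside of Rails and RSpec. It uses a short-lived Ruby child (via RbConfig.ruby) as a stand-in for the Java process, and the exit code of 3 is arbitrary.

```ruby
require 'rbconfig'

# A short-lived child standing in for the background Java process.
pid = Process.spawn(RbConfig.ruby, '-e', 'sleep 0.2; exit 3')

# Process.detach returns a thread that reaps the child when it exits.
waiter = Process.detach(pid)
waiter_pid = waiter.pid # the waiter thread remembers which pid it watches

waiter.join             # blocks until the child has exited
status = waiter.value   # a Process::Status for the finished child
exit_code = status.exitstatus
```

So the return value of run gives a test everything it needs: join to wait, pid to identify the process, and value to check how it exited.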


1: Why? Well, all the good libraries to work with things like PST files and ActiveSync protocols are in Java. Re-implementing all of that in Ruby is a non-trivial amount of work, and things like parsing large files and syncing a lot of mail need to happen in a background worker process anyway, so why not?
