LOAD ALL THE RUBIES!!!

Dangerous Cucumber Loading Issue

I recently discovered a potentially dangerous issue with how cucumber loads ruby files. The standard cucumber project expects a features directory in which to place your .feature files. Standard practice is to place supporting ruby files in features/support and to place step definitions in features/step_definitions, but that’s just convention. Cucumber recursively loads all .rb files under features/ (ensuring only that features/support/env.rb is loaded first). This is all well and good, so long as you actually have a features directory (and it’s in the root of your project). So what happens if you name your features directory something else or your features directory is in a sub-directory? If you remember to tell cucumber which directory to load from, everything is fine (`cucumber alternative_folder/`). If you forget, however, cucumber will dutifully load (recursively) every ruby file it can find!

format_hard_drive.rb

This is still not much of a problem, assuming all your ruby files are just plain old ruby classes or benign ruby scripts. The problem really rears it’s head if, say, you have a utility_scripts directory full of potentially dangerous scripts for setting up project directories, manipulating test databases, or pretty much anything. (Who else has an `rm -rf /` ruby script?)

Cucumber - Load All The Things!!!

Lesson

Name your features directory features.

Fetch and Iterate

Tonight I stumbled across a blog post from the future! Or something like that. On 4/17 at 11:30PM I read a blog post that was published on 4/18 at 1:54. Time zones = time travel. Anyway…

It was about finding a cleaner, more succinct, syntax for a standard pattern:
fetch_data, process_data, rinse and repeat until no more data!

Quick, go read the original post. Don’t worry, it’s short.

{elevator music}

Okay, so how do we improve this? Well, in some languages (like JavaScript, Ruby, PHP, et. al), an assignment expression actually returns a value. And since the expression returns a value, the expression itself can be used as the looping condition, like so:

while(data = fetch()){
  process(data);
}​

Notice the single equals sign? We are doing assignment here, not an equality check. We invoke fetch() and the result is returned and assigned to data. Now the value of data itself is used as the while condition. So the data is continually fetched and processed until fetch() returns a falsy value. (In JavaScript, that would be one of false, undefined, null, 0, "", or NaN)

This is a very common pattern in PHP, especially when fetching/querying data from a database. It’s a much less common pattern in JavaScript, but still quite valid. As is with most useful things, this approach can be misused. It can easily be confused for a typo (missing an equals sign?) so it should be used with care. For instance, ensuring that your fetch() function and data variable are named clearly should make it less likely to be misunderstood. When used appropriately, it can certainly increase the clarity of your code.

JRuby on MSYS | MinGW

For many Windows users, like myself, the easiest way to get up and running with Ruby is to install JRuby. If you’re like me, then you may also be a Git user. Now this is just a hunch, but I would wager that if you’re a git user and interested in ruby, then there is a high probability that you’re also a fan of proper *nix shells. If all of the above hold true, keep reading.

As a Windows user without access to a proper command line shell (and too tired of fighting with cygwin), I was delighted to have a bash shell at my command after installing msysgit. The subsystem beneath msysgit is MSYS | MinGW. According to their site, MinGW (“Minimalistic GNU for Windows”) is a collection of freely available and freely distributable Windows specific header files and import libraries combined with GNU toolsets that allow one to produce native Windows programs that do not rely on any 3rd-party C runtime DLLs.[1] In addition to MinGW, MSYS is a collection of GNU utilities such as bash, make, gawk and grep to allow building of applications and programs which depend on traditionally UNIX tools to be present. It is intended to supplement MinGW and the deficiencies of the cmd shell. [2] In simplest terms, MSYS | MinGW is a lightweight Cygwin. It is ‘lightweight’ because it doesn’t provide *nix system calls or a POSIX emulation layer. However, if you’re looking for the standard *nix toolsets on Windows, MSYS | MinGW is a great utility.

NoClassDefFoundError: org/jruby/Main

With MinGW installed along with msysgit, I’ve returned to the bash shell as my primary shell on Windows. However, because MinGW is not quite *nix, nor is it really Windows, the standard JRuby installation doesn’t work out of the box. After running the JRuby installer, you pop open a bash shell and run jruby -v to verify the jruby/ruby version. Or you try to run irb or jirb to get a ruby console. Or you try to install a gem via gem install. Up pops an giant error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/jruby/Main
Caused by: java.lang.ClassNotFoundException: org.jruby.Main
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
Could not find the main class: org.jruby.Main.  Program will exit.

Of Shells and Executables

So what’s the problem? The JRuby installer for Windows includes quite a bit of stuff in the bin directory. First and foremost is jruby.exe. This is the real executable for Windows. You’ll also see jruby.bat. This is just a wrapper which calls jruby.exe. (I assume this is an backwards-compatibility artifact from before the jruby.exe launcher existed.) You’ll also notice an extension-less jruby shell script. When you execute any of these jruby commands (jruby, irb, jirb, gem, etc) from a Windows command prompt, it will fire off jruby.exe or jruby.bat because those are the file extensions it is configured to look for. However, the MinGW bash shell prefers the jruby shell script and executes that first. Near the top of this shell script, you’ll find a block of code that determines the OS the shell is running under.

# ----- Identify OS we are running under ---------------
case "`uname`" in
  CYGWIN*) cygwin=true;;
  Darwin) darwin=true;;
esac

The shell script assumes we’re either *nix, CYGWIN or Darwin (Mac). This is understandable as the Windows Command Prompt will not attempt to execute this script. However, now that we’re using a proper bash shell with MinGW, we need to tell JRuby to expect MinGW.

The Fix(es)

The simplest fix is to delete the shell script. This way, when MinGW searches for an executable, the first it finds is jruby.exe. Alternatively, you can add the following line to the case statement in the jruby script:

MINGW*) jruby.exe "$@"; exit $?;;

This line simply checks if running on MinGW and, if so, executes jruby.exe passing along any parameters. The shell script returns with the same exit code as jruby.exe. Now the case statement should look like:

# ----- Identify OS we are running under ---------------
case "`uname`" in
  CYGWIN*) cygwin=true;;
  Darwin) darwin=true;;
  MINGW*) jruby.exe "$@"; exit $?;;
esac

OSS and GitHub to the Rescue

Thanks to JRuby being an Open Source project, and GitHub for having awesome collaboration tools, this patch was submitted and accepted to the JRuby project on GitHub. Future installations of JRuby should work on MinGW out of the box.