Please Be Kind With My Path, Callee!

In almost every ruby project, we end up manipulating paths in some way. If not in the application or the library itself, then at least in the tests, for isolating fixtures for example. Paths are everywhere, all the time.

In my opinion, ruby itself provides rather poor tools for manipulating paths: File.join, File.dirname, File.extname, etc. There is also Pathname, but it belongs to the standard library, not to the core. Besides needing some refactoring and enhancement, Pathname is not a very idiomatic way to capture paths, as far as I know. In the community, the idiomatic way to capture a path seems to be a String, period.

The situation is a bit more complicated, though. Things could hopefully be slightly improved, provided that we agree on how paths are to be recognized. My proposal can be summarized as follows:

If you write a library that expects a path argument in its public API, please implement the logic below:

  def callee(arg)
    path   = arg.path    if arg.respond_to?(:path)
    path ||= arg.to_path if arg.respond_to?(:to_path)
    path ||= arg.to_str  if arg.respond_to?(:to_str)
    raise "Invalid path `#{arg}`" unless path

    # ... do something with path
  end

The code above recognizes most instances of current path manipulation libraries, notably Pathname under both ruby 1.8.x and 1.9.x. The rest of this post explains why I ended up using that particular code, but it is in fact a request for an agreement! Thanks, callees!
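
To make the proposal concrete, here is a minimal sketch that wraps the same three checks in a helper and feeds it a few kinds of arguments. The helper name recognize_path and the Upload struct are hypothetical, and raising ArgumentError rather than a plain RuntimeError is an assumption of this sketch, not part of the proposal.

```ruby
require 'pathname'

# Hypothetical helper wrapping the three checks from the proposal above.
def recognize_path(arg)
  path   = arg.path    if arg.respond_to?(:path)
  path ||= arg.to_path if arg.respond_to?(:to_path)
  path ||= arg.to_str  if arg.respond_to?(:to_str)
  # ArgumentError is an assumption of this sketch; the proposal uses a plain raise.
  raise ArgumentError, "Invalid path `#{arg.inspect}`" unless path
  path
end

# Hypothetical stand-in for any object exposing :path (File and Tempfile do too).
Upload = Struct.new(:path)

recognize_path('foo/bar.txt')                # recognized through :to_str
recognize_path(Pathname.new('foo/bar.txt'))  # recognized through :to_path on ruby 1.9+
recognize_path(Upload.new('foo/bar.txt'))    # recognized through :path
```

All three calls hand back the same String, while something like recognize_path(42) is rejected because an Integer answers to none of the three methods.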

A caller/callee contract...

To bring some order here, we must first distinguish between building paths and using paths. For instance:

  def callee(path)
    # I'm the callee, I will *use* the path
    ...
  end

  def caller
    # I'm the caller, I *build* a path
    path = ...
    callee(path)
  end

At first glance, Pathname and the like are mostly tools for building paths. In the example above, the way to build the path is the secret of the caller. The latter may use whatever tool it wants for the job, provided it passes a path argument recognized by the callee. In other words, caller and callee must agree on how the path argument captures a Path at the logical level. When both are in the same program or library, it is a matter of internal developer conventions. However, as passing paths across gem boundaries is very common, a broader agreement should be found.

Most callees expect a path to be passed as a String. This is not ideal, for at least two reasons:

  • It is not necessarily friendly. If the caller works with Pathname, it must take great care to always call :to_s before passing paths to the outside world.
  • It tends to restrict APIs in situations where String must be logically distinguished from Path (see the Citrus example in the next section).

Nicer conventions are in use here and there, but no real agreement seems to exist, at least not one that I'm aware of. Let's look at different examples I've found recently.

Different conventions in use

In Tilt for instance, paths seem to be those instances that respond to :to_str. In complex method signatures, for example, it recognizes a path as follows:

  def build_template(*args)
    path = args.find{ |arg| arg.respond_to?(:to_str) }
    ...
  end

Note that a later commit (not yet in the current release, 1.3.3 as of today) lets Tilt's master recognize paths via :path as well:

  def build_template(*args)
    path = args.find{ |arg| arg.respond_to?(:to_str) or arg.respond_to?(:path) }
    ...
  end
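
To see what a :to_str-based lookup accepts and what it misses, the snippet below probes a String and a Pathname. It is only a sketch for ruby 1.9 or later, where Pathname does not define :to_str; find_path is a hypothetical name that mimics the first excerpt, not Tilt's actual code.

```ruby
require 'pathname'

# Hypothetical mimic of the :to_str-based lookup shown above.
def find_path(*args)
  args.find { |arg| arg.respond_to?(:to_str) }
end

options = { :default_encoding => 'utf-8' }

find_path('index.erb', options)                # => "index.erb"
find_path(Pathname.new('index.erb'), options)  # => nil on ruby 1.9+
```

With such a check, a caller working with Pathname has to convert with :to_s before the call, otherwise its path is silently overlooked.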

In ruby 1.9.x, Pathname itself seems to use a different convention:

  RUBY_VERSION
  # => "1.9.2" 

  Pathname.new('foo').path
  # => NoMethodError: undefined method `path' for #<Pathname:foo>

  Pathname.new('foo').to_path
  # => "foo" 
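
The :to_path convention is not specific to Pathname: on ruby 1.9 and later, core File methods convert their path argument through :to_path when it responds to it. Here is a small sketch of that behaviour, where the Location class is hypothetical and defines nothing path-related except :to_path.

```ruby
require 'tempfile'

# Hypothetical class: the only path-related method it defines is :to_path.
class Location
  def initialize(name)
    @name = name
  end

  def to_path
    @name
  end
end

Tempfile.create('note') do |file|
  file.write('hello')
  file.flush

  location = Location.new(file.path)
  File.exist?(location)  # core File methods call :to_path on their argument
  File.read(location)    # reads the file content through the same conversion
end
```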

In Citrus, Michael Jackson's parsing expressions library, paths are expected to respond to :to_path. More precisely, I made a contribution that allows passing a path instead of the text to parse itself. At that time, I thought that :to_path was standard in ruby 1.9 because of Pathname, so I coded something like this:

  def parse(parsable)
    if parsable.respond_to?(:to_path)
      # a path was given: read the text to parse from that file
      @source = File.read(parsable.to_path)
    else
      @source = parsable.to_s
    end
  end
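
Seen from the caller's side, the String versus Path distinction looks like the sketch below. The source_of helper is hypothetical and is not Citrus's actual API; it only reproduces the branching above as a standalone method, and the temporary file stands in for a grammar input on disk.

```ruby
require 'pathname'
require 'tempfile'

# Hypothetical helper, not Citrus's actual API: anything responding to
# :to_path is read from disk, everything else is taken as the text itself.
def source_of(parsable)
  if parsable.respond_to?(:to_path)
    File.read(parsable.to_path)
  else
    parsable.to_s
  end
end

Tempfile.create('input') do |file|
  file.write('1 + 2')
  file.flush

  source_of('1 + 2')                  # the String is the text to parse
  source_of(Pathname.new(file.path))  # the Pathname points at the text to parse
end
```

Had paths been recognized as Strings here, there would be no way to tell a text to parse from the name of a file containing it.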

Benoit Daloze's path manipulation library, which I use almost every day, implements both :path and :to_path. Unfortunately, it does not work out of the box with the current release of Tilt:

  Path('foo').to_str
  # => NoMethodError: undefined method `to_str' for #<Path foo>  

More recently, in a pull request to Sinatra, I naturally ended up with the following logic:

  def path_as_string(arg)
    path   = arg.path    if arg.respond_to?(:path)
    path ||= arg.to_path if arg.respond_to?(:to_path)
    path ||= arg.to_s    # even more permissive than :to_str here
    path  
  end

... which triggered this request for comments, and my proposal above.
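
The difference between falling back on :to_s and on :to_str is easy to miss, so here is a small comparison. Both helper names are hypothetical; strict_path follows the proposal at the top of this post (minus the raise), permissive_path follows the Sinatra logic above.

```ruby
# Hypothetical helper names; both reuse the fallback chains shown in this post.
def strict_path(arg)
  path   = arg.path    if arg.respond_to?(:path)
  path ||= arg.to_path if arg.respond_to?(:to_path)
  path ||= arg.to_str  if arg.respond_to?(:to_str)
  path
end

def permissive_path(arg)
  path   = arg.path    if arg.respond_to?(:path)
  path ||= arg.to_path if arg.respond_to?(:to_path)
  path ||= arg.to_s
  path
end

strict_path(:views)      # => nil, a Symbol has no :to_str
permissive_path(:views)  # => "views"
strict_path(nil)         # => nil
permissive_path(nil)     # => "", because nil.to_s is the empty string
```

Since every object responds to :to_s, the permissive variant never fails, which is convenient for Symbols but also turns nil into an empty path. That is why the proposal sticks to :to_str and raises otherwise.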

Conclusion

What do you think? Do we agree on this standard way to recognize paths? If not, why, and what do you propose instead?

In the long run, I would also be in favor of having a Path class inside ruby core itself. Not in the standard library, in the core. We use paths everywhere, all the time. I would vote for Benoit Daloze's Path abstraction, but that's not really important ;-)