-- title: Closed algebraic data types for Ruby -- author: pesco -- date: 2009-12-11 13:37 -- lang: en Today, a little detour off the path to world domination. Or maybe not; who knows. I use Ruby[http://www.rubylang.org] at work and I guess it's pointless to deny that I regularly cringe at it for its /rude/ failure at being just like Haskell. Even though, I have to admit that it is rather flexible: Algebraic data types in Haskell look like this: data Expr = Lam String Expr | App Expr Expr | Var String It says that there are three kinds of expressions, called lambdas, applications and variables, consisting of the given data fields (e.g. a variable name and a body for lambdas, and so on). Operations on such a type are defined by giving cases for each kind of argument. Off the shelf, there are no algebraic data types in Ruby. Google didn't find anything, so I thought about it. With the right plumbing, it's possible to do the following: class Expr ctor :lam, :var => String, :body => Expr ctor :app, :op => Expr, :arg => Expr ctor :var, :name => String end Not much longer and only slightly uglier! \o/ And with type checks, too. The result is a class 'Expr' with three singleton constructor methods, similar to the usual 'new'. Each takes a single hash as its argument that assigns values to the relevant fields. All fields must be provided and their types must match the definition. For example: id = Expr.lam(:var=>"x", :body=>Expr.var(:name=>"x")) # (\x.x) The 'ctor' method defines accessors to the constructor "tag" and each field. A typical method definition on 'Expr's would look like this: class Expr def eval(env={}) if is_lam? then self elsif is_app? then f = op.eval(env) if f.is_lam? x = arg.eval(env) f.body.subst(f.var,x).eval else raise "operator is not a function" end elsif is_var? then env[name] || self end end end Obviously, this is in contrast to the "object-oriented" way of distributing operations on different kinds of things over respective (sub-) class definitions. Aside from being a matter of taste, I prefer the above in this case, because the definitions of evaluation for different kinds of expressions have to fit together. Their interaction is easier to view when the definition is in one place instead of scattered across multiple classes. To the plumbing: Everything happens in 'ctor', which is singleton method of 'Class', so it can be used in a class definition (similar to convenience methods like 'attr_reader' and such). The routine looks like this: class Class def ctor(ctorname, argtypes) # add singleton constructor method to self self_ = (class << self; self; end) self_.module_eval do define_method(ctorname) do |argvalues| x = allocate # check argument types # [...] # fill instance variables of x x.instance_eval do @ctor = ctorname @args = argvalues end x end # constructor end # singleton methods of our ADT # inspection methods attr_reader :ctor attr_reader :args # define accessors and convenience methods on self # [...] end end The full source code along with a complete lambda evaluator is in the file attached below. The evaluator also supports destructive in-place updates, effectively yielding graph reduction. Note: The copyright notice states "isc license"[http://opensource.org/licenses/isc-license.txt], meaning do whatever you want as long as you leave the notice intact. **Attachment:** 'adt.rb'[/log/2009/dec/adt.rb] **PS.** Happy Birthday, you know who you are.