Sunday, September 30, 2007

Mixin and State, Any Problems?


Last week we were discussing Ruby with a relative newbie who was wondering
how Mixins and State work together.
I was sure that the discussion would be brief: I would explain that all mixed-in methods have full
access to their receiver's state and that of course a Mixin can define initialize too.
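
To make that claim concrete, here is a minimal sketch. The names Greeter and Person are made up for illustration and do not come from the discussion; the point is only that a mixin's initialize and its other methods share the receiver's instance variables.

```ruby
# Hypothetical names (Greeter, Person), used only for illustration.
module Greeter
  # A mixin may define initialize; it runs with the new object as self.
  def initialize(name)
    @name = name
  end

  # Mixed-in methods read the receiver's instance variables directly.
  def greet
    "Hello, #{@name}"
  end
end

class Person
  include Greeter

  # The class's own methods see the very same @name.
  def rename(new_name)
    @name = new_name
    self
  end
end

person = Person.new("Ada")
puts person.greet                  # prints "Hello, Ada"
puts person.rename("Grace").greet  # prints "Hello, Grace"
```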

And What About Class State?
... he said. Well, er, that is only a minor complication, I said. What exactly would you like to achieve?

We came up with a very simple but useful example: an Instance Tracker. It should allow us to easily
collect all instances of a class. The collected instances should then be accessible via a class method.

We decided to have some fun with TDD and specified the Tracker as follows:
  require 'test/unit'

  class TestInstanceTracker < Test::Unit::TestCase
    def test_01_basic
      a = Class::new {
        include Tracker
      }
      a1 = a.new
      assert_equal [a1], a.instances
      a2 = a.new
      assert_equal [a1, a2], a.instances
    end # def test_01_basic
  end # class TestInstanceTracker < Test::Unit::TestCase


Ok, it did not take me long to figure it out. First I had to ensure that every new instance was added
to the instance collection, thus
  module Tracker
    def initialize
      self.class.instances << self
    end

and then I had to make sure that self.class.instances worked, therefore...
    def self.included into_a_class
      into_a_class.instance_variable_set "@instances", []
      class << into_a_class; attr_reader :instances end
    end
  end

Needless to say, that code passed our test suite (all two assertions of it) brilliantly.
Even if I had not read Kent Beck's eXtreme Programming, I would have known that the test suite was, er, incomplete.

As a matter of fact, my colleague came up with this in no time:
  def test_02_subclass
    a = Class::new {
      include Tracker
    }
    b = Class::new a
    a1 = a.new
    assert_equal [a1], a.instances
    b1 = b.new
    assert_equal [a1], a.instances
    assert_equal [b1], b.instances
    a2 = a.new
    assert_equal [a1, a2], a.instances
    assert_equal [b1], b.instances
  end # def test_02_subclass

Yielding this nice little result

1) Error:
test_02_subclass(TestInstanceTracker):
NoMethodError: undefined method `<<' for nil:NilClass
./tracker2.rb:4:in `initialize'
test2.rb:24:in `new'
test2.rb:24:in `test_02_subclass'
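
The nil in this error has a simple cause: a class-level instance variable belongs to exactly one class object, while the reader method defined on the class is inherited by its subclasses. The following minimal sketch shows the effect in isolation; Parent and Child are hypothetical names, not part of the Tracker code.

```ruby
# Hypothetical classes that isolate the failure mode.
class Parent
  @instances = []            # belongs to the Parent class object only
  class << self
    attr_reader :instances   # singleton method, inherited by subclasses
  end
end

class Child < Parent; end

p Parent.instances                # => []
p Child.respond_to?(:instances)   # => true
p Child.instances                 # => nil (Child's own @instances was never set)
```

So the subclass can call instances, but gets nil back, and `nil << self` raises the NoMethodError seen above.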

And Subclasses?

Ok, so he wants subclasses. Yes, sure, so would I. Let us see...
As we are already telling the class we are included into to do some stuff, why not just tell it to include us
into any subclass that is created from it?

Such things can be done with the help of the class callback inherited.
  def self.included into_a_class
    into_a_class.instance_variable_set "@instances", []
    class << into_a_class
      attr_reader :instances
      def inherited by_class
        by_class.send :include, Tracker
      end
    end
  end # def self.included into_a_class

This again made the tests pass, which means that we had to come up with new tests.
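
For readers who have not met the inherited callback before, here is a stripped-down sketch of it without any mixin. It is a variation for illustration only, not the code above: the hook initializes the subclass's list directly instead of re-including a module, and the names Tracked and Special are hypothetical.

```ruby
# Hypothetical classes; the hook gives every subclass its own list.
class Tracked
  @instances = []
  class << self
    attr_reader :instances

    # Ruby calls this on the parent each time a subclass is created.
    def inherited(subclass)
      super   # keep any other inherited hooks working
      subclass.instance_variable_set(:@instances, [])
    end
  end

  def initialize
    self.class.instances << self
  end
end

class Special < Tracked; end

t = Tracked.new
s = Special.new
p Tracked.instances == [t]   # => true
p Special.instances == [s]   # => true
```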

And Module Inclusion?

He cannot be serious. What does he want now?
Ah yes, well, the following:
  def test_03_includes
    m = Module::new {
      include Tracker
    }
    a = Class::new {
      include m
    }
    b = Class::new a
    a1 = a.new
    assert_equal [a1], a.instances
    b1 = b.new
    assert_equal [a1], a.instances
    assert_equal [b1], b.instances
    a2 = a.new
    assert_equal [a1, a2], a.instances
    assert_equal [b1], b.instances
  end # def test_03_includes

Actually it is pretty simple, when we are included into a class we do what we did so far:

  • Define the class instance variable

  • Define the class level accessor to the class instance variable

  • Tell the class to include us into all its subclasses via the self.inherited callback


When we are included into a module, however, we have to do all of the above again, that is, in
  def self.included into_module
    case into_module
    when Module
      #(1)

we would have to put the exact code of the method at #(1) again.
Infinite code has some serious drawbacks though, very long development time being the first that springs to mind.
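
Two facts about the included hook explain why the hook has to be passed along at all, and why the order of the case branches matters. The sketch below uses hypothetical names (Probe, Helper, Widget, Gadget) and only logs what it sees.

```ruby
# Hypothetical module that records who included it and as what.
module Probe
  LOG = []

  def self.included(base)
    super
    LOG << case base
           when Class  then :class    # must come first: every Class is also a Module
           when Module then :module
           end
  end
end

Helper = Module.new { include Probe }   # calls Probe.included(Helper)

class Widget
  include Helper                        # calls Helper.included, NOT Probe.included
end

class Gadget
  include Probe                         # calls Probe.included(Gadget)
end

p Probe::LOG                  # => [:module, :class]
p Class.new.is_a?(Module)     # => true
```

Widget never shows up in the log: including Helper does not trigger Probe's hook, so a tracker that wants to reach indirect includers has to install a hook on the intermediate module itself.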

Ruby, however, lets us define such a strange recursion without any problem:
  module Tracker
    SelfIncludedProc = proc {
      | into_a_module |
      case into_a_module
      when Class
        into_a_module.instance_variable_set "@instances", []
        class << into_a_module
          attr_reader :instances
          def inherited by_class
            by_class.send :include, Tracker
          end
        end
      when Module
        class << into_a_module; self end.send :define_method, :included, &SelfIncludedProc
      end # case into_a_module
    }

    class << self; self end.send :define_method, :included, &SelfIncludedProc

    def initialize
      self.class.instances << self
    end
  end

And that passes our tests again.
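
The self-installing hook is easier to see in isolation. The sketch below is not the Tracker code: it only logs names, it uses hypothetical identifiers (INCLUSION_LOG, Root, Middle, Leaf, hook), and it assumes a Ruby version that provides define_singleton_method, which stands in for the `class << x; self end.send :define_method` idiom.

```ruby
# Hypothetical names; the hook only records who included whom.
INCLUSION_LOG = []

module Root; end

# The proc refers to itself, so it can reinstall itself on each includer.
hook = proc do |includer|
  INCLUSION_LOG << includer.name
  # Modules pass the hook on; classes are the end of the chain.
  includer.define_singleton_method(:included, &hook) unless includer.is_a?(Class)
end

Root.define_singleton_method(:included, &hook)

module Middle
  include Root     # runs the hook with Middle, which then gets the hook too
end

class Leaf
  include Middle   # runs Middle's copy of the hook with Leaf
end

p INCLUSION_LOG    # => ["Middle", "Leaf"]
```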

But there is one important test that does not pass:

This code just does not seem right

I thought about it a little and then the scales fell from my eyes. Of course it is not right: I was doing two
things in the same place, coding and metacoding, so all I need to do is tear these two parts apart.

The first and relatively uninteresting part is the application code that creates a class instance variable, its accessor and the automatic addition of new instances.

The second part is simply a kind of a Module that allows us to define class level code for
all subclasses and even via indirect inclusion.

That's what I want. I just have no idea what to call such a beast, so I called it RecModule.
Great. I will accept all naming suggestions very cheerfully.



A Home Made Module Class

Let us have a look at the code:


class RecModule < Module
  def recursive_included &blk
    # Define a block that serves as the included hook
    inclusion = proc { | by_module |
      super by_module
      # Run the user's block for classes only
      blk.call by_module if Class === by_module
      # Install the same hook on the includer, so it is passed down the chain
      class << by_module; self end.
        send :define_method, :included, &inclusion
    }
    # Define included in the base module
    class << self; self end.
      send :define_method, :included, &inclusion
  end
end # class RecModule

Yep and?


Ok, it will make more sense once we use it... but please note the repeating pattern: included is defined, by means of a self-referring block, on each of our including modules.


M1 = RecModule.new {
  recursive_included do |by_class|
    class << by_class
      attr_accessor :x
    end
  end
}

M2 = RecModule.new {
  include M1
}
class A
  include M2
end
B = Class::new A

A.x = 46
B.x = 42
p A.x # --> 46
p B.x # --> 42

I have implemented recursive_included so that it only executes its block for including
classes, but other similar tricks could have been used.


And Does It Really Work?


The best way to find out is to try it, so I will just reimplement the InstanceTracker with the new idioms...



InstanceTracker = RecModule.new {
  recursive_included do |by_class|
    class << by_class
      def instances
        @__instances__ ||= []
      end
    end
  end

  def initialize *args, &blk
    super
    self.class.instances << self
  end
}

M2 = RecModule.new {
  include InstanceTracker
}
class A
  include M2
end
B = Class::new A

A.new
B.new
A.new
p A.instances # --> [#<A:0x2b66028>, #<A:0x2b65fd8>]
p B.instances # --> [#<B:0x2b66014>]

Please note that M2 could even be a regular Module. I think that using a RecModule reflects the intent of the code better, but that may be a matter of taste. For those who prefer to use plain modules down the inclusion chain, that works perfectly too.
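
Here is a sketch of that plain-Module variant. RecModule and InstanceTracker are repeated in condensed form so that the example stands alone; singleton_class replaces the `class << x; self end` idiom, which assumes a Ruby version that provides it. Plain, Account and Savings are hypothetical names.

```ruby
# Condensed copy of RecModule from above, so the sketch runs on its own.
class RecModule < Module
  def recursive_included(&blk)
    inclusion = proc { |by_module|
      super(by_module)
      blk.call(by_module) if Class === by_module
      by_module.singleton_class.send(:define_method, :included, &inclusion)
    }
    singleton_class.send(:define_method, :included, &inclusion)
  end
end

InstanceTracker = RecModule.new {
  recursive_included do |by_class|
    class << by_class
      def instances
        @__instances__ ||= []
      end
    end
  end

  def initialize(*args, &blk)
    super
    self.class.instances << self
  end
}

# Hypothetical intermediate module: a plain Module, not a RecModule.
Plain = Module.new { include InstanceTracker }

class Account
  include Plain
end

class Savings < Account; end

Account.new
Savings.new
Account.new
p Account.instances.size   # => 2
p Savings.instances.size   # => 1
```

The hook still reaches Account because InstanceTracker installs it on Plain at the moment Plain includes it, whatever class Plain is an instance of.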


I am not sure that this is the most I could have done to abstract this functionality, but 14 lines of code for implementing such a basic InstanceTracker feels about right (10 lines would be better, sigh).


There is, however, one thing that makes me quite nervous: I feel proud and pleased with myself about it...
Hopefully some gentle reader will tell me why I should not...