Kon’nichiwa (Good afternoon), RubyWorld attendees. I am sure many of you feel like I do, in that we wish we could be experiencing this conference in person. Unfortunately, that is not possible this year due to the global pandemic, but I hope I will be able to see you in person next year.

First, I would like to express my appreciation to the RubyPrize executive committee. I was deeply honored last year to be selected as a RubyPrize Final Nominee, and even more honored this year to be selected as the RubyPrize winner. I only hope I can continue to be worthy of this recognition.

My name is Jeremy Evans. I became a Ruby committer last year on Endoh-san’s recommendation. One of the big changes I focused on last year was implementing the separation of positional and keyword arguments, which is fully complete in Ruby 3. Other than that, since becoming a committer, I have mostly focused on triaging and fixing bugs that have been filed in Ruby’s bug tracker.

One of the privileges of being selected as the RubyPrize winner is the ability to give a presentation to the attendees of RubyWorld Conference.

GitHub: jeremyevans

Twitter: @jeremyevans0

In my presentation, I will be discussing some changes I have worked on this year to improve Ruby’s object model. These improvements were all the result of fixing bugs that had been filed in the bug tracker, many of them years old. You can currently test these improvements in Ruby 3 preview 2, and you will be able to benefit from them on the 25th when Ruby 3 is released.

Object Model

Improvements in

Ruby 3

RubyWorld 2020

One of the most significant improvements to the object model in Ruby 3 is that calling include on a Module affects modules that have already included the receiver.

Module#include

Affects Modules

Including Receiver

Let me show an example of what that means.

module EachString
  def each_string
    each do |*x|
      yield(*x.map(&:to_s))
    end
  end
end

Enumerable.include EachString

{a: 1}.each_string do |x|
  p x
end

Here you have a module named EachString that defines an each_string method that yields values as strings.

module EachString
  def each_string
    each do |*x|
      yield(*x.map(&:to_s))
    end
  end
end

Enumerable.include EachString

{a: 1}.each_string do |x|
  p x
end

For convenience, you want to include this module in Enumerable, so the each_string method is available to all classes that include Enumerable.

module EachString
  def each_string
    each do |*x|
      yield(*x.map(&:to_s))
    end
  end
end

Enumerable.include EachString

{a: 1}.each_string do |x|
  p x
end

This should allow you to call each_string on a hash, since the Hash class includes Enumerable.

module EachString
  def each_string
    each do |*x|
      yield(*x.map(&:to_s))
    end
  end
end

Enumerable.include EachString

{a: 1}.each_string do |x|
  p x
end

And in Ruby 3, that works correctly, with each value being yielded as a string. This does show that Hash#each yields a single array argument with the key and value, and not the key and value as separate arguments.

module EachString
  def each_string
    each do |*x|
      yield(*x.map(&:to_s))
    end
  end
end

Enumerable.include EachString

{a: 1}.each_string do |x|
  p x
end
# prints "[:a, 1]"

Kanashi ka na (Alas), in Ruby 2.7, this raises a NoMethodError. This is because before Ruby 3, including EachString in Enumerable does not affect classes that have already included Enumerable. Starting in Ruby 3, when you include EachString in Enumerable, Ruby looks at all classes that have already included Enumerable, and adds the EachString module at the appropriate point in the ancestor chain in those classes.

module EachString
  def each_string
    each do |*x|
      yield(*x.map(&:to_s))
    end
  end
end

Enumerable.include EachString

{a: 1}.each_string do |x|
  p x
end
# NoMethodError!
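
The same behavior can be checked without modifying Enumerable itself. The following is a minimal sketch, assuming Ruby 3.0 or later; the names Walkable, Trail, and Describable are hypothetical and not from the talk. It compares the ancestor chain of a class before and after a module is included late into a module the class already includes.

```ruby
# Hypothetical names for illustration; assumes Ruby 3.0 or later.
module Walkable; end

class Trail
  include Walkable          # Trail includes Walkable first
end

module Describable
  def describe
    "a #{self.class.name.downcase}"
  end
end

before = Trail.ancestors.include?(Describable)

# Late include: Walkable gains Describable after Trail already included Walkable.
Walkable.include Describable

after = Trail.ancestors.include?(Describable)

p before                    # false
p after                     # true on Ruby 3.0+
p Trail.ancestors.take(3)   # Describable lands right after Walkable
p Trail.new.describe        # "a trail"
```

On Ruby 2.7 the second check would be expected to stay false, which is the NoMethodError case described above.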

The historical reason that Enumerable.include did not affect the Hash class was that originally Ruby did not keep a reference in Enumerable that it was included in Hash. So there was no way to easily find the classes that had already included Enumerable.

Charlie Somerville implemented an optimization for method cache invalidation in Ruby 2.1 that added this tracking, so that when Enumerable was included in another class or module, Enumerable kept a reference to that class or module. That list of references is what Ruby 3 uses to find all of the ancestor chains where the EachString module needs to be added.

module EachString
  def each_string
    each do |*x|
      yield(*x.map(&:to_s))
    end
  end
end

Enumerable.include EachString

{a: 1}.each_string do |x|
  p x
end

At the same time I implemented the Module#include fix, I also worked on the exact same fix for Module#prepend.

Module#prepend

Affects Modules

Including Receiver

Unfortunately, I was not able to commit that fix at the same time. Not because the fix itself was wrong, but because the fix exposed that there were many issues with internal classes called origin iclasses, which Module#prepend uses.

Module#prepend

Affects Modules

Including Receiver

failure!

One issue was that when a module with an origin iclass was included into a class, the origin pointer in one of the created iclasses was set incorrectly. Here is an example showing the problem.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2

Here we create a class named A with methods m and m3.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2

We also have a module M, which prepends an empty module, and also has a method m that calls super and a method m2 that calls m3. We only prepend the empty module to force module M to have an origin iclass.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2

Then we have a subclass of A named B, which includes module M and then aliases method m to m3.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2

Next, we call the m2 method on an instance of B.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2

It should call this m2 method, which calls m3.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2

We aliased the m method to m3 in class B, so this should call the m method.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2

The method it should call is this one in module M. This m method just calls super, and since this method is defined in M, and M is included in B, it should call the method in the next ancestor.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2

That’s this method, in A, the superclass of B.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2

And that is what happens in Ruby 3. The method returns the string M.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2
# => "M"

Kanashi ka na (Alas), in Ruby 2.7, this raises another NoMethodError.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2
# NoMethodError

This is because in the m method of module M, the super call here cannot find a super method, because the origin pointers in the iclass were not set correctly.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2
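
The path that super has to follow here can be made visible through the ancestor chain. This is a minimal sketch, assuming Ruby 3.0 or later; Shim, Parent, Mixin, and Child are hypothetical names, with Shim standing in for the anonymous Module.new on the slide so that it prints readably.

```ruby
# Hypothetical names; Shim stands in for the anonymous Module.new in the slides.
module Shim; end

class Parent
  def m; "M"; end
  def m3; "M3"; end
end

module Mixin
  prepend Shim              # forces Mixin to have an origin iclass
  def m; super; end
  def m2; m3; end
end

class Child < Parent
  include Mixin
  alias_method :m3, :m      # m3 in Child now refers to Mixin#m
end

p Child.ancestors.take(4)   # [Child, Shim, Mixin, Parent]
p Child.new.m2              # "M" on Ruby 3.0+, as described in the talk
```

The super call inside Mixin#m has to continue from Mixin to Parent in that chain, which is the step that fails on Ruby 2.7.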

Fixing the origin iclass pointers was no small feat. The correct origin pointer cannot be set when the iclasses are created, because the origin iclass doesn’t exist at that point. So we needed to add a stack to keep track of the iclasses, so that when the origin iclass was created, we could set the origin pointer in the original iclass to point to it.

Making that change caused nondeterministic failures in garbage collection, so the garbage collector needed to be modified to treat origin iclasses differently. It also caused virtual machine assertion failures, so the virtual machine needed to be modified to also treat origin iclasses differently. All of this to fix a seemingly small issue, but this was a necessary step.

class A
  def m; "M"; end
  def m3; "M3"; end
end
module M
  prepend Module.new
  def m; super end
  def m2; m3; end
end
class B < A
  include M
  alias_method :m3, :m
end
B.new.m2

The next bug fix was making this code work. This code does not even use prepend, but the approach used to fix it also resolved other issues, which in turn allowed Module#prepend to affect existing classes that include the receiver.

module M; end

class C
  include M
end

module R
  refine M do
    def refined_method
      :rm
    end
  end
end
using R

class A
  include M
end

C.new.refined_method

We start by defining a module M and a class C that includes it.

module M; end

class C
  include M
end

module R
  refine M do
    def refined_method
      :rm
    end
  end
end
using R

class A
  include M
end

C.new.refined_method

We then define a module R that refines M to add a method, and then uses that refinement.

module M; end

class C
  include M
end

module R
  refine M do
    def refined_method
      :rm
    end
  end
end
using R

class A
  include M
end

C.new.refined_method

We then create another class named A that also includes M.

module M; end

class C
  include M
end

module R
  refine M do
    def refined_method
      :rm
    end
  end
end
using R

class A
  include M
end

C.new.refined_method

Then you try to call the method added by refinement R on an instance of C. This should work because C includes M and R refines M.

module M; end

class C
  include M
end

module R
  refine M do
    def refined_method
      :rm
    end
  end
end
using R

class A
  include M
end

C.new.refined_method

And in Ruby 3, it does work, returning the expected value.

module M; end

class C
  include M
end

module R
  refine M do
    def refined_method
      :rm
    end
  end
end
using R

class A
  include M
end

C.new.refined_method
# => :rm

Kanashi ka na (Alas), in Ruby 2.7, you get yet another NoMethodError. Internally what happens is the refined method moves to a separate internal iclass that is in the ancestor chain of A but not in the ancestor chain of C.

module M; end

class C
  include M
end

module R
  refine M do
    def refined_method
      :rm
    end
  end
end
using R

class A
  include M
end

C.new.refined_method
# NoMethodError
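
A variation on this scenario is sketched below, assuming Ruby 3.0 or later. The names Taggable, Note, Memo, TagRefinement, and RefinedScope are hypothetical. It uses Module#using inside a module body rather than main.using, so the refinement's scope is explicit, and it checks both a class that included the module before the refinement existed and one that included it afterwards.

```ruby
# Hypothetical names for illustration; assumes Ruby 3.0 or later.
module Taggable; end

class Note
  include Taggable          # includes Taggable before the refinement exists
end

module TagRefinement
  refine Taggable do
    def tag; :tagged; end
  end
end

class Memo
  include Taggable          # includes Taggable after the refinement exists
end

module RefinedScope
  using TagRefinement       # active until the end of this module body
  EARLY = Note.new.tag
  LATE  = Memo.new.tag
end

# Outside the module body the refinement is not active.
outside =
  begin
    Note.new.tag
  rescue NoMethodError
    :no_method
  end

p RefinedScope::EARLY       # :tagged on Ruby 3.0+
p RefinedScope::LATE        # :tagged
p outside                   # :no_method
```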

Fixing this case in Ruby 3 was accomplished by creating origin iclasses for all modules that were included, prepended, or refined. Unfortunately, that heavy-handed approach bloated memory in some large applications due to the additional iclass objects created. Thankfully, Alan Wu was able to optimize this and only create origin iclasses for modules that are prepended, by looking at all classes that had already included the module and modifying them at the same time. This change to automatically create origin iclasses exposed multiple additional issues that needed to be fixed.

module M; end

class C
  include M
end

module R
  refine M do
    def refined_method
      :rm
    end
  end
end
using R

class A
  include M
end

C.new.refined_method

One bug it exposed was that duping or cloning a class that had any modules prepended resulted in an incorrect ancestor chain.

class C
  def b; 2 end
  prepend Module.new
end

C2 = C.dup

class C
  def b; 1; end
end

C2.new.b

Here we have a class named C that defines a method named b and prepends an empty module. Again, the module itself doesn’t matter; what matters is that prepending it forces the class to have an origin iclass.

class C
  def b; 2 end
  prepend Module.new
end

C2 = C.dup

class C
  def b; 1; end
end

C2.new.b

We then create a copy of class C and name it C2.

class C
  def b; 2 end
  prepend Module.new
end

C2 = C.dup

class C
  def b; 1; end
end

C2.new.b

We then override the b method on the original class to return a different result.

class C
  def b; 2 end
  prepend Module.new
end

C2 = C.dup

class C
  def b; 1; end
end

C2.new.b

Finally, we call the b method on an instance of C2.

class C
  def b; 2 end
  prepend Module.new
end

C2 = C.dup

class C
  def b; 1; end
end

C2.new.b

This should return the value 2, since at the time we copied C, the method returned 2. And in Ruby 3, that is what you get.

class C
  def b; 2 end
  prepend Module.new
end

C2 = C.dup

class C
  def b; 1; end
end

C2.new.b
# => 2

Kanashi ka na (Alas), in Ruby 2.7, this method returns 1. That is because the ancestor chain for C2 includes the origin iclass of C, instead of a separate origin iclass for C2. Fixing this required significant changes to Module#initialize_copy, making the duped class use a separate copy of all iclasses between the receiver and the receiver’s origin iclass.

In addition to this issue, the switch to automatically create origin iclasses for modules also uncovered method cache invalidation bugs for modules with origin iclasses, and those needed to be fixed as well.

class C
  def b; 2 end
  prepend Module.new
end

C2 = C.dup

class C
  def b; 1; end
end

C2.new.b
# => 1
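
The independence of the copy can also be checked from the outside. This is a minimal sketch, assuming Ruby 3.0 or later, with hypothetical names (Audit, Ledger, LedgerCopy) and a named module in place of Module.new so the ancestor chain is readable.

```ruby
# Hypothetical names for illustration; assumes Ruby 3.0 or later.
module Audit; end

class Ledger
  prepend Audit             # forces Ledger to have an origin iclass
  def total; 2; end
end

LedgerCopy = Ledger.dup     # copy taken while total returns 2

class Ledger
  def total; 1; end         # redefine on the original only
end

p LedgerCopy.ancestors.take(2)            # [Audit, LedgerCopy]
p LedgerCopy.ancestors.include?(Ledger)   # false
p LedgerCopy.new.total                    # 2 on Ruby 3.0+
p Ledger.new.total                        # 1
```

The public ancestor list does not expose origin iclasses, so the method result is the observable sign of whether the copy shares the original's origin iclass.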

Thankfully, after fixing those issues, I was able to make Module#prepend affect modules already including the receiver, without any tests breaking.

Module#prepend

Affects Modules

Including Receiver

success!
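
What this looks like from user code can be sketched as follows, assuming Ruby 3.0 or later. Greeting, Robot, and Loud are hypothetical names, not from the talk.

```ruby
# Hypothetical names for illustration; assumes Ruby 3.0 or later.
module Greeting
  def hello; "hello"; end
end

class Robot
  include Greeting          # Robot includes Greeting first
end

module Loud
  def hello; super.upcase; end
end

# Late prepend: Greeting gains Loud after Robot already included Greeting.
Greeting.prepend Loud

p Robot.ancestors.take(3)   # [Robot, Loud, Greeting]
p Robot.new.hello           # "HELLO"
```

Loud sits in front of Greeting in Robot's ancestor chain, so its hello runs first and reaches Greeting#hello through super.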

Another bug in the object model that I was able to fix in Ruby 3 was a case where Module#include ends up operating as Module#prepend, inserting a module before the current class instead of after the current class in the ancestor chain. I think this was the oldest bug in the bug tracker related to the object model, filed by Endoh-san just before the release of Ruby 2.0.

Module#include

operates like

Module#prepend

Here is an example showing the problem.

module P; end
module M; end
class C; end

M.prepend P
C.prepend P

C.ancestors[0, 3]
# => [P, C, Object]

C.include M

C.ancestors[0, 3]

We create modules P and M, and class C.

module P; end
module M; end
class C; end

M.prepend P
C.prepend P

C.ancestors[0, 3]
# => [P, C, Object]

C.include M

C.ancestors[0, 3]

We prepend module P to both module M and class C.

module P; end
module M; end
class C; end

M.prepend P
C.prepend P

C.ancestors[0, 3]
# => [P, C, Object]

C.include M

C.ancestors[0, 3]

If we check at this point, the ancestor chain for C starts with P, then C, then Object. This makes sense because C prepends P, and the superclass of C is Object.

module P; end
module M; end
class C; end

M.prepend P
C.prepend P

C.ancestors[0, 3]
# => [P, C, Object]

C.include M

C.ancestors[0, 3]

We then include module M in class C.

module P; end
module M; end
class C; end

M.prepend P
C.prepend P

C.ancestors[0, 3]
# => [P, C, Object]

C.include M

C.ancestors[0, 3]

Then we check what the ancestor chain of C starts with.

module P; end
module M; end
class C; end

M.prepend P
C.prepend P

C.ancestors[0, 3]
# => [P, C, Object]

C.include M

C.ancestors[0, 3]

In Ruby 3, this works as we expect, putting M after C, since C included M and did not prepend M.

module P; end
module M; end
class C; end

M.prepend P
C.prepend P

C.ancestors[0, 3]
# => [P, C, Object]

C.include M

C.ancestors[0, 3]
# => [P, C, M]

Kanashi ka na (Alas), in Ruby 2.7, the module M is inserted before class C instead of after class C in the ancestor chain. This happens because module M prepends module P, and so it first needs to include module P in the ancestor chain before including module M. However, since module P is already in the ancestor chain, it just inserts module M directly after it, even though it is before class C. In Ruby 3, Module#include always waits until after the class before inserting modules in the ancestor chain.

module P; end
module M; end
class C; end

M.prepend P
C.prepend P

C.ancestors[0, 3]
# => [P, C, Object]

C.include M

C.ancestors[0, 3]
# => [P, M, C]
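
The ordering rule can be expressed as a small check on the ancestor chain. This is a sketch assuming Ruby 3.0 or later; the helper after_class? and the names Prefix, Extra, and Host are hypothetical.

```ruby
# Hypothetical helper and names; assumes Ruby 3.0 or later.
# Returns true when mod appears after klass in klass's ancestor chain,
# which is where an included (not prepended) module belongs.
def after_class?(klass, mod)
  chain = klass.ancestors
  chain.index(mod) > chain.index(klass)
end

module Prefix; end
module Extra; end
class Host; end

Extra.prepend Prefix
Host.prepend Prefix
Host.include Extra

p Host.ancestors.take(3)       # [Prefix, Host, Extra] on Ruby 3.0+
p after_class?(Host, Extra)    # true on Ruby 3.0+
p after_class?(Host, Prefix)   # false, Prefix was prepended
```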

The final bug fix I would like to discuss is related to an interaction between super_method and aliases.

class C0
  def m1; [:C0_m1] end
  def m2; [:C0_m2] end
end
class C1 < C0
  def m1; [:C1_m1] + super end
  alias m2 m1
end
class C2 < C1
  def m2; [:C2_m2] + super end
end

o = C2.new
o.m2
# => [:C2_m2, :C1_m1, :C0_m1]

o.method(:m2).super_method.super_method.call

Here we set up three classes: C0, its subclass C1, and C1’s subclass C2. We also define m1 or m2 methods that will show which methods are called.

class C0
  def m1; [:C0_m1] end
  def m2; [:C0_m2] end
end
class C1 < C0
  def m1; [:C1_m1] + super end
  alias m2 m1
end
class C2 < C1
  def m2; [:C2_m2] + super end
end

o = C2.new
o.m2
# => [:C2_m2, :C1_m1, :C0_m1]

o.method(:m2).super_method.super_method.call

Unlike the other methods, in C1, the m1 method is aliased as m2.

class C0
  def m1; [:C0_m1] end
  def m2; [:C0_m2] end
end
class C1 < C0
  def m1; [:C1_m1] + super end
  alias m2 m1
end
class C2 < C1
  def m2; [:C2_m2] + super end
end

o = C2.new
o.m2
# => [:C2_m2, :C1_m1, :C0_m1]

o.method(:m2).super_method.super_method.call

If we call the m2 method on an instance of C2, we get the expected result, C2_m2, then C1_m1 because C1 aliased m1 to m2, then C0_m1 because the aliased method called super.

class C0
  def m1; [:C0_m1] end
  def m2; [:C0_m2] end
end
class C1 < C0
  def m1; [:C1_m1] + super end
  alias m2 m1
end
class C2 < C1
  def m2; [:C2_m2] + super end
end

o = C2.new
o.m2
# => [:C2_m2, :C1_m1, :C0_m1]

o.method(:m2).super_method.super_method.call

If we call the method method on the C2 instance to get a Method object, and then call super_method twice, we should get the Method in C0 that will be called.

class C0
  def m1; [:C0_m1] end
  def m2; [:C0_m2] end
end
class C1 < C0
  def m1; [:C1_m1] + super end
  alias m2 m1
end
class C2 < C1
  def m2; [:C2_m2] + super end
end

o = C2.new
o.m2
# => [:C2_m2, :C1_m1, :C0_m1]

o.method(:m2).super_method.super_method.call

In Ruby 3, if we call that Method, we get the expected result, with C0_m1 being returned.

class C0
  def m1; [:C0_m1] end
  def m2; [:C0_m2] end
end
class C1 < C0
  def m1; [:C1_m1] + super end
  alias m2 m1
end
class C2 < C1
  def m2; [:C2_m2] + super end
end

o = C2.new
o.m2
# => [:C2_m2, :C1_m1, :C0_m1]

o.method(:m2).super_method.super_method.call
# => [:C0_m1]

Kanashi ka na (Alas), in Ruby 2.7, the method returns C0_m2, since it used the Method’s called id instead of the method’s original id when looking for the super method. Thankfully, fixing this was straightforward, and it did not cause any additional bugs.

class C0
  def m1; [:C0_m1] end
  def m2; [:C0_m2] end
end
class C1 < C0
  def m1; [:C1_m1] + super end
  alias m2 m1
end
class C2 < C1
  def m2; [:C2_m2] + super end
end

o = C2.new
o.m2
# => [:C2_m2, :C1_m1, :C0_m1]

o.method(:m2).super_method.super_method.call
# => [:C0_m2]
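
The difference between the called id and the original id can be inspected with Method#name and Method#original_name while walking the super_method chain. This is a minimal sketch, assuming Ruby 3.0 or later; Base0, Mid1, and Top2 are hypothetical names that mirror C0, C1, and C2.

```ruby
# Hypothetical names mirroring C0, C1, C2; assumes Ruby 3.0 or later.
class Base0
  def m1; [:base_m1]; end
  def m2; [:base_m2]; end
end

class Mid1 < Base0
  def m1; [:mid_m1] + super; end
  alias m2 m1               # called as m2, but the original id is m1
end

class Top2 < Mid1
  def m2; [:top_m2] + super; end
end

# Collect every Method object reachable through super_method.
meth = Top2.new.method(:m2)
chain = []
while meth
  chain << meth
  meth = meth.super_method  # nil once there is no further super method
end

chain.each { |m| p [m.owner, m.name, m.original_name] }
p chain.last.call           # [:base_m1] on Ruby 3.0+
```

For the aliased method in Mid1, name and original_name differ, and the super lookup needs to continue with the original one.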

That concludes my presentation. I hope you had fun learning about the object model improvements in Ruby 3. Thank all of you for listening to me.

Photo Credits

Thank You: rawpixel.com