Speeding Up Tests in Applications Using Sequel, Roda, and rack-unreloader (2023-02-14)

This post gives an overview of the changes I've made to reduce the overhead of running individual tests in my largest production application.

My largest production application isn't a huge application. It's currently around 150 models in separate files and 650 routes in 50 route files (one file per top-level routing branch). It's been in continuous development since 2003, originally written in spaghetti PHP before being converted to Rails and ActiveRecord in 2005. In 2008, it switched from ActiveRecord to Sequel, and in 2014 it switched from Rails to Roda.

As a baseline, without any of the changes described in this blog post, the application can load all models in about 9.5 seconds, run a single model test in about 10.6 seconds, and run a single integration test in about 13.2 seconds. These times and all times in this article are using a computer with a Xeon E5-2670 CPU, originally released in 2012.

Using Sequel Caching Plugins

About half of the time loading the models is spent by Sequel's pg_auto_constraint_validations plugin, which does multiple queries for each model class in order to automatically turn many PostgreSQL constraint violations into validation failures. The pg_auto_constraint_validations plugin supports a :cache_file option so that the query results can be cached. Using the :cache_file plugin option drops model loading time to about 4.8 seconds.

Sequel's index_caching extension allows caching indexes for a given Sequel::Database. This is especially useful when using Sequel's auto_validations plugin, which does a query per model for indexes in order to automatically setup uniqueness validations. Use of this extension drops model loading time to about 3.7 seconds.

Sequel's schema_caching extension allows caching table schema entries for a given Sequel::Database. This can skip a query per model class to get the schema for the model's table. Use of this extension drops model loading time to about 2.5 seconds.

Finally, Sequel has a static_cache_cache plugin, for caching values of all rows for models using the static_cache plugin. This application has about 50 models using the static_cache plugin. Use of this plugin drops loading time to about 2.0 seconds.
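
Wired together, the caching setup looks roughly like the following sketch (the cache file names, the cache directory, and the DB constant are placeholders, not the application's actual configuration):

# Load cached schema and index metadata if the cache files exist, avoiding
# per-model queries when the model classes are loaded.
DB.extension :schema_caching, :index_caching
DB.load_schema_cache('cache/schema.cache') if File.file?('cache/schema.cache')
DB.load_index_cache('cache/index.cache') if File.file?('cache/index.cache')

# Cache the constraint metadata used by pg_auto_constraint_validations and
# the row values loaded by static_cache.
Sequel::Model.plugin :pg_auto_constraint_validations,
  cache_file: 'cache/pg_auto_constraint_validations.cache'
Sequel::Model.plugin :static_cache_cache, 'cache/static_cache.cache'

# ... load the model classes ...

# In a rake task run after schema or model changes, dump updated caches:
# DB.dump_schema_cache('cache/schema.cache')
# DB.dump_index_cache('cache/index.cache')
# Sequel::Model.dump_pg_auto_constraint_validations_cache
# Sequel::Model.dump_static_cache_cache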

With these caching plugins, the application can run a single model test in about 3.1 seconds and a single integration test in about 5.6 seconds. That's slow enough that you can really feel it when running a single test, which may execute in under a tenth of a second. It's slow enough to pull you out of flow state. So you need more than query result caching to make running individual tests fast.

Enabling Autoloading of Sequel Model Files and Routes

One way to reduce the amount of time for test setup is simply to load less code. Part of the 3.1 seconds when running a single model test is spent loading every model file in the application, even though you may only need a single model loaded to run the individual test. In general with Ruby, you use require to load code, which immediately loads the related file. However, Ruby also supports autoload, which will not load the file until the related constant is referenced. Historically, I've not been a fan of autoload, mostly because my production applications run with limited file system access, originally using chroot and now using OpenBSD's unveil support. Ruby's autoload does not work well with limited file system access, especially when using chroot. However, for faster tests, sometimes sacrifices have to be made.
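
As a generic illustration of the difference (not code from this application):

# require loads the file immediately, even if Account is never referenced
# in this process:
require_relative 'models/account'

# autoload only registers the file against the constant, so models/account.rb
# is not read until the Account constant is first referenced:
autoload :Account, File.expand_path('models/account.rb', __dir__)
Account  # models/account.rb is loaded here, on first reference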

rack-unreloader

Unfortunately, switching to an autoload-based approach took some work, as all of my production applications use rack-unreloader to handle loading and reloading code. While zeitwerk is a much more popular choice these days, rack-unreloader was developed about 5 years earlier and still has a few advantages for web applications using Roda and Sequel:

The main disadvantage of rack-unreloader is you have to use APIs specific to rack-unreloader to load files, since it does no monkey patching. This is unlike zeitwerk, which monkey patches require, sets up a TracePoint, and doesn't require zeitwerk-specific APIs to load files (though it may require zeitwerk-specific APIs for configuration). Another disadvantage for rack-unreloader is you have to write more code for loading files, since rack-unreloader doesn't assume file structure layouts map to class names. However, please take my ideas regarding these advantages and disadvantages with a grain of salt, as I don't have experience using zeitwerk in production.

Anyway, rack-unreloader historically only supported requiring files and reloading them for changes; it did not support autoloading. I had to make some changes to rack-unreloader to support autoloading. After the changes, rack-unreloader can be used in 4 possible modes:

  • requiring files and reloading them when they change (the historical behavior)
  • requiring files without reloading
  • autoloading files, reloading them on changes after they have been loaded
  • autoloading files without reloading
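
A rough sketch of what an autoload-based rack-unreloader setup can look like (the directory layout, the RACK_ENV check, and the block mapping file names to constant names are assumptions, not this application's actual loader):

require 'rack/unreloader'

dev = ENV['RACK_ENV'] == 'development'
Unreloader = Rack::Unreloader.new(reload: dev, autoload: true){App}

# With autoloading enabled, these files are not loaded until the constant
# returned by the block is first referenced.
Unreloader.autoload('models'){|f| File.basename(f, '.rb').split('_').map(&:capitalize).join}
Unreloader.autoload('app.rb'){'App'}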

Roda Autoloading

In most cases, using rack-unreloader's autoloading support worked fine, since most files being autoloaded defined a single constant that Ruby could use a standard autoload for. However, Roda's support for splitting up the routing tree using separate route files per branch using the hash_branches plugin cannot use autoload, because there isn't a separate constant referenced per routing branch. To work around this issue, I added an autoload_hash_branches plugin to Roda that builds on top of the hash_branches plugin and delays loading the route file until there is a request for that routing branch. I also added an autoload_named_routes plugin to Roda that builds on top of the named_routes plugin and operates similarly. Additionally, I updated roda-sequel-stack to use autoloading, to allow users to easily use the same approach I'm using in my production applications.
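
A sketch of what using the plugin can look like (the branch name and route file path are examples, not this application's routes):

class App < Roda
  plugin :hash_branches
  plugin :autoload_hash_branches

  # Delay loading the albums route file until the first request whose
  # initial path segment is "albums":
  autoload_hash_branch('albums', File.expand_path('routes/albums.rb', __dir__))

  # Or register every .rb file in a directory, using each file's basename
  # as the branch segment:
  # autoload_hash_branch_dir('routes')

  route do |r|
    r.hash_branches
  end
end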

Speedup from Autoloading

Switching the rack-unreloader configuration to use autoload and using the Roda autoload_hash_branches plugin reduced test overhead when running individual model tests from 3.1 seconds to 1.6 seconds and reduced test overhead when running individual web tests from 5.6 seconds to 3.2 seconds. So it definitely helped, but the end result is that the overhead still remained high enough to pull you out of flow state.

From some basic profiling of test startup, the vast majority of the remaining time was taken up by requiring the libraries used. The model test overhead was substantially smaller than the web test overhead because the model tests generally only relied on Sequel and minitest. The web tests also relied on Roda, Capybara, and dependencies of Capybara such as Nokogiri and rack-test. The only way to get further speedups would be to have the libraries the tests use already loaded, so you don't have to pay the startup cost for them.

Preloading Ruby Libraries with a Client/Server Approach

It turns out, it's actually not too hard to preload Ruby libraries using a client/server approach with file descriptor passing, about 25 lines for the client and a little over 100 for the server.

Client (originally named fr for "fast ruby"):

#!/usr/local/bin/ruby --disable-gems

require 'socket'
frs_path = ENV['FRS_PATH'] || File.join(ENV["HOME"], '.frs_socket')
s = UNIXSocket.new(frs_path)
pid = s.readline("\0", chomp: true).to_i
raise "Invalid frs worker pid" unless pid > 1
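# Pass this process's stdin, stdout, and stderr to the worker using
# Unix socket file descriptor passing.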
s.send_io($stdin)
s.send_io($stdout)
s.send_io($stderr)

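# Send the current directory, then the environment as KEY=VALUE entries
# (ending with an empty entry), then the arguments, all NUL-terminated.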
s.write(Dir.pwd)
s.write("\0")

ENV.each do |k, v|
  s.write(k)
  s.write("=")
  s.write(v)
  s.write("\0")
end
s.write("\0")

ARGV.each do |arg|
  s.write(arg)
  s.write("\0")
end

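# Signal that all input has been sent, then wait for the worker to finish
# before exiting.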
s.shutdown(Socket::SHUT_WR)
s.read
s.close

At a basic level, the client:

  • Connects to the server's Unix socket
  • Reads the pid of the worker process the server forked for this connection
  • Passes its stdin, stdout, and stderr file descriptors to the worker
  • Sends its current directory, environment variables, and arguments, each NUL-terminated
  • Waits for the worker to finish before exiting

Server (originally named frs for "fast ruby server"):

#!/usr/local/bin/ruby

frs_path = ENV['FRS_PATH'] || File.join(ENV["HOME"], '.frs_socket')
require 'socket'
debug = ENV.delete('DEBUG')

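# If a server is already listening on the socket, connect to it as a
# client and ask it to shut down before replacing it.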
if File.socket?(frs_path)
  begin
    s = UNIXSocket.new(frs_path)
    print "Shutting down existing frs server at #{frs_path}..." if debug
    pid = s.readline("\0", chomp: true).to_i
    raise "Invalid frs worker pid" unless pid > 1
    s.send_io($stdin)
    s.send_io($stdout)
    s.send_io($stderr)
    s.write("close")
    s.shutdown(Socket::SHUT_WR)
    s.read
    s.close
  rescue => e
    puts "#{e.class} #{e.message}" if s && debug
  else
    puts "Success!" if debug
  end
  s = nil
  File.delete(frs_path)
end

exit if ARGV == ["close"]

ARGV.map{|f| require f}
puts $LOADED_FEATURES if debug == 'log'

# Prevent TOCTOU on server socket creation
umask = File.umask(077)
server = UNIXServer.new(frs_path)
File.umask(umask)
system('chmod', '600', frs_path)

Process.daemon unless ENV['FRS_NO_DAEMON']

queue = Queue.new

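# Reap one finished worker process for each worker pid pushed onto the queue.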
Thread.new do
  Process.wait while queue.pop
end

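# Fork a worker process per client connection; the worker takes over the
# client's stdio and handles the client's arguments.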
while s = server.accept
  queue.push(fork do
    s.write($$.to_s)
    s.write("\0")

    $stdin.reopen(s.recv_io(IO))
    $stdout.reopen(s.recv_io(IO))
    $stderr.reopen(s.recv_io(IO))

    cleanup = proc do
      s.shutdown(Socket::SHUT_WR)
      s.close
      puts $LOADED_FEATURES if ENV['DEBUG'] == 'log'
    end

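    # The first NUL-terminated field is the client's working directory;
    # a replacement server sends 'close' here to shut this server down.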
    dir = s.readline("\0", chomp: true)
    if dir == 'close'
      Process.kill(:KILL, Process.ppid)
      cleanup.call
      Process.exit
    end

    Dir.chdir(dir)

    env = {}
    while line = s.readline("\0", chomp: true)
      break if line.empty?
      k, v = line.split("=", 2)
      env[k] = v
    end
    ENV.replace(env)

    files = []
    args = []
    while !s.eof?
      arg = s.readline("\0", chomp: true)
      (File.file?(arg) ? files : args) << arg 
    end
    ARGV.replace(args)
    files.each{|f| require File.expand_path(f)}

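    # Dispatch on the remaining arguments: 'm' or a file.rb:LINE argument
    # runs a single test via the m gem, 'irb' starts an IRB shell, and
    # otherwise any required test files run via minitest's autorun hook.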
    if (m = ARGV.first == 'm') || args.first&.match?(/\.rb:\d+\z/)
      ARGV.shift if m
      require 'm'
      M.define_singleton_method(:exit!) do |res|
        cleanup.call
        super(res)
      end
      M.run(ARGV)
    elsif ARGV.first == 'irb'
      ARGV.shift
      at_exit(&cleanup)
      require 'irb'
      IRB.start(__FILE__)
    elsif !files.empty?
      if defined?(Minitest) && Minitest.class_variable_get(:@@installed_at_exit)
        Minitest.after_run(&cleanup)
      else
        at_exit(&cleanup)
      end
    else
      $stderr.puts "No files given!"
      $stderr.puts "ARGV: #{ARGV.inspect}"
      cleanup.call
      exit(1)
    end
  end)
end

At a basic level, the server:

  • Shuts down existing server if it is running
  • Creates the Unix socket
  • Requires all arguments
  • Daemonizes
  • Accepts clients from the Unix socket, forking per connection
  • Passes client the worker pid as an indication it is ready
  • Receives the stdin, stdout, stderr file descriptors from the client and uses those
  • Changes to the directory passed by the client
  • Replaces the server's environment variables with the client's environment variables
  • Handles arguments provided by the client
  • Closes the socket connection

The argument handling was tailored to my needs:

  • Treats arguments that are valid files as files to require
  • If first remaining argument is m or something like path/to/file.rb:1234, use the m gem to run a single minitest test
  • If first remaining argument is irb, open an IRB shell
  • If any files were required, and Minitest is defined and set to autorun, assume it will run tests
  • If no files were required, print an error message

If you are familiar with how a Rails library named Spring works, this client/server approach may sound familiar. After getting the client/server approach working, while developing this blog post, I looked at Spring's implementation, and it uses a similar approach. It doesn't change the directory, but it does use Unix socket file descriptor passing to pass stdin, stdout, and stderr from the client to the server. It passes the arguments and environment from the client to the server as well, though it uses JSON instead of the NUL-termination approach. Spring is also Rails specific and tries to keep only a single worker process in memory, closing other clients when a new client connects.

Using the client/server approach dramatically speeds up running individual tests, as long as the necessary libraries are already loaded. For my largest production application, the overhead from running individual model tests is reduced from 1.6 seconds to 0.5 seconds, and the overhead from running individual web tests is reduced from 3.2 seconds to 0.6 seconds. From the programmer's perspective, you get test output almost immediately, so you are not pulled out of flow state.

Another great part of this design is that the client and server are generic and not application-specific. All of my production applications are developed using a small group of libraries, primarily Sequel, Roda, Rodauth, and Forme, and all of their tests are based on minitest, rack-test, and Capybara. I can use the same server process to speed up running individual tests for all of my production applications.

There were definitely issues with the original proof-of-concept implementation. The server worker processes needed better error handling, and the client did not use the same exit status as the worker process. But it worked well enough for speeding up individual tests. Since I think this client/server approach for library preloading may be useful for other Ruby programmers, I fixed the issues, added tests, code coverage checking, and CI, and released it as by.

In the test environment, individual tests are run through the by client instead of directly with ruby, so they start with the libraries the server has already loaded.

To make starting the server simple, I'm running it using by-server /path/to/by-require.rb. The by-require.rb then has all of the necessary requires:

(<<END).split.each{|f| require f}
sequel
roda
rodauth
...
END

(<<END).split.each{|f| require "sequel/extensions/#{f}"}
pg_json
pg_json_ops
...
END

(<<END).split.each{|f| require "sequel/plugins/#{f}"}
auto_validations
pg_auto_constraint_validations
...
END

(<<END).split.each{|f| require "roda/plugins/#{f}"}
render
route_csrf
...
END

(<<END).split.each{|f| require "rodauth/features/#{f}"}
base
login
...
END

Not all applications use all of the files required by the server process, but that is OK. In the worst case, by loads a library that the application uses without requiring it, which is an application bug that by would hide. However, by is only used for speeding up individual tests. It's not used when running the full test suite (which is done with the default rake task), and running the full test suite is always done before committing. The full test suite is parallelized for all of my large production applications and takes less than 100 seconds even for the largest application.