This post gives an overview of the changes I've made to reduce the overhead of running individual tests in my largest production application.
My largest production application isn't a huge application. It's currently around 150 models in separate files and 650 routes in 50 route files (one file per top-level routing branch). It's been in continuous development since 2003, originally written in spaghetti PHP before being converted to Rails and ActiveRecord in 2005. In 2008, it switched from ActiveRecord to Sequel, and in 2014 it switched from Rails to Roda.
As a baseline, without any of the changes described in this blog post, the application can load all models in about 9.5 seconds, run a single model test in about 10.6 seconds, and run a single integration test in about 13.2 seconds. These times and all times in this article are using a computer with a Xeon E5-2670 CPU, originally released in 2012.
About half of the time loading the models is spent by Sequel's pg_auto_constraint_validations plugin, which does multiple queries for each model class in order to automatically turn many PostgreSQL constraint violations into validation failures. The pg_auto_constraint_validations plugin supports a :cache_file option so that the query results can be cached. Using the :cache_file plugin option drops model loading time to about 4.8 seconds.
Sequel's index_caching extension allows caching indexes for a given Sequel::Database. This is especially useful when using Sequel's auto_validations plugin, which does a query per model for indexes in order to automatically set up uniqueness validations. Use of this extension drops model loading time to about 3.7 seconds.
Sequel's schema_caching extension allows caching table schema entries for a given Sequel::Database. This can skip a query per model class to get the schema for the model's table. Use of this extension drops model loading time to about 2.5 seconds.
Finally, Sequel has a static_cache_cache plugin, for caching values of all rows for models using the static_cache plugin. This application has about 50 models using the static_cache plugin. Use of this plugin drops loading time to about 2.0 seconds.
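All four caches follow the same general pattern: run the expensive queries once, save the results to a file, and read the file on later startups instead of querying again. Here is a minimal sketch of that pattern using only the standard library. The cached_result helper and the sample schema data are hypothetical, and this is not a description of how Sequel implements its cache files.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of Sequel): return the value stored in +path+
# if the file exists, otherwise compute it with the block and save it.
def cached_result(path)
  return Marshal.load(File.binread(path)) if File.file?(path)

  value = yield
  File.binwrite(path, Marshal.dump(value))
  value
end

Dir.mktmpdir do |dir|
  cache_path = File.join(dir, 'schema.cache')
  queries = 0

  # Stand-in for an expensive per-model database query.
  load_schema = lambda do
    queries += 1
    { albums: [[:id, { type: :integer }], [:name, { type: :string }]] }
  end

  first  = cached_result(cache_path, &load_schema) # runs the "query"
  second = cached_result(cache_path, &load_schema) # served from the file

  puts "queries run: #{queries}"
  puts "same data: #{first == second}"
end
```

The tradeoff is the usual one for caches: the file has to be regenerated whenever the underlying schema, indexes, or constraints change.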
With these caching plugins, the application can run a single model test in about 3.1 seconds and a single integration test in about 5.6 seconds. That's slow enough that you can really feel it when running a single test, which may execute in under a tenth of a second. It's slow enough to pull you out of flow state. So you need more than query result caching to make running individual tests fast.
One way to reduce the amount of time for test setup is simply to load less code. Part of the 3.1 seconds when running a single model test is spent loading every model file in the application, even if you may only need a single model loaded to run the individual test. In general with Ruby, you use require to load code, which immediately loads the related file. However, Ruby also supports autoload, which will not load the file until the related constant is referenced. Historically, I've not been a fan of autoload, mostly because my production applications run with limited file system access, originally using chroot and now using OpenBSD's unveil support. Ruby's autoload does not work well with limited file system access, especially when using chroot. However, for faster tests, sometimes sacrifices have to be made.
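To make the difference between require and autoload concrete, here is a small self-contained sketch. LazyAlbum and the temporary file are made up for the example. The file sets a global when it runs, which shows that registering the autoload loads nothing and that the first reference to the constant triggers the load.

```ruby
require 'tmpdir'
require 'fileutils'

dir = Dir.mktmpdir
path = File.join(dir, 'lazy_album.rb')

# The file records that it ran, so we can see exactly when Ruby loads it.
File.write(path, <<~'RUBY')
  $lazy_album_loaded = true
  class LazyAlbum
    def self.table_name
      :albums
    end
  end
RUBY

$lazy_album_loaded = false
Object.autoload(:LazyAlbum, path) # registers the path; nothing is loaded yet
$loaded_before_reference = $lazy_album_loaded

LazyAlbum.table_name # first reference to the constant loads the file
$loaded_after_reference = $lazy_album_loaded

puts "loaded before first reference: #{$loaded_before_reference}"
puts "loaded after first reference:  #{$loaded_after_reference}"

FileUtils.remove_entry(dir)
```

With require in place of autoload, the file would be loaded at the point of the call, whether or not the test ever touched LazyAlbum.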
Unfortunately, switching to an autoload-based approach took some work, as all of my production applications use rack-unreloader to handle loading and reloading code. While zeitwerk is a much more popular choice these days, rack-unreloader was developed about 5 years earlier and still has a few advantages for web applications using Roda and Sequel:
The main disadvantage of rack-unreloader is you have to use APIs specific to rack-unreloader to load files, since it does no monkey patching. This is unlike zeitwerk, which monkey patches require, sets up a TracePoint, and doesn't require zeitwerk-specific APIs to load files (though it may require zeitwerk-specific APIs for configuration). Another disadvantage of rack-unreloader is you have to write more code for loading files, since rack-unreloader doesn't assume file structure layouts map to class names. However, please take my ideas regarding these advantages and disadvantages with a grain of salt, as I don't have experience using zeitwerk in production.
Anyway, rack-unreloader historically only supported requiring files and reloading them for changes; it did not support autoloading. I had to make some changes to rack-unreloader to support autoloading. After the changes, rack-unreloader can be used in 4 possible modes:
- require without reloading (production/full test mode)
- require with reloading (development mode)
- autoload without reloading (individual test mode)
- autoload with reloading (development mode with faster startup)
In most cases, using rack-unreloader's autoloading support worked fine, since most files being autoloaded defined a single constant that Ruby could use a standard autoload for. However, Roda's hash_branches plugin, which splits the routing tree into a separate route file per branch, cannot use autoload, because there isn't a separate constant referenced per routing branch. To work around this issue, I added an autoload_hash_branches plugin to Roda that builds on top of the hash_branches plugin and delays loading the route file until there is a request for that routing branch. I also added an autoload_named_routes plugin to Roda that builds on top of the named_routes plugin and operates similarly. Additionally, I updated roda-sequel-stack to use autoloading, to allow users to easily use the same approach I'm using in my production applications.
Switching the rack-unreloader configuration to use autoload and using the Roda autoload_hash_branches plugin reduced test overhead when running individual model tests from 3.1 seconds to 1.6 seconds and reduced test overhead when running individual web tests from 5.6 seconds to 3.2 seconds. So it definitely helped, but the end result is that the overhead still remained high enough to pull you out of flow state.
From some basic profiling of test startup, the vast majority of the remaining time was taken up by requiring the libraries used. The model test overhead was substantially smaller than the web test overhead because the model tests generally only relied on Sequel and minitest. The web tests also relied on Roda, Capybara, and dependencies of Capybara such as Nokogiri and rack-test. The only way to get further speedups would be to have the libraries the tests use already loaded, so you don't have to pay the startup cost for them.
It turns out, it's actually not too hard to preload Ruby libraries using a client/server approach with file descriptor passing, about 25 lines for the client and a little over 100 for the server.
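File descriptor passing is the piece that lets an already-running process write to the terminal of the process that connected to it. As a minimal sketch of the mechanism on its own (Unix only), the following uses UNIXSocket.pair and fork in place of a real server. The fd_passing_demo method is a made-up name, and a pipe stands in for the client's stdout.

```ruby
require 'socket'

# Returns what the forked "worker" wrote into a pipe whose write end it only
# received over the Unix socket (the pipe is created after the fork).
def fd_passing_demo
  client, server = UNIXSocket.pair

  pid = fork do
    client.close
    out = server.recv_io(IO) # receive the descriptor as an IO object
    out.puts "hello from worker #{Process.pid}"
    out.close
    server.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end

  server.close
  reader, writer = IO.pipe # created after fork, so the child never inherited it
  client.send_io(writer)   # pass the write end to the worker
  writer.close             # the worker now holds the only write end
  Process.wait(pid)
  output = reader.read
  reader.close
  client.close
  output
end

puts fd_passing_demo
```

In the real client and server, the descriptors passed this way are $stdin, $stdout, and $stderr, and the worker reopens its own standard streams onto them.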
The client (named fr for "fast ruby"):
#!/usr/local/bin/ruby --disable-gems
require 'socket'
frs_path = ENV['FRS_PATH'] || File.join(ENV["HOME"], '.frs_socket')
s = UNIXSocket.new(frs_path)
pid = s.readline("\0", chomp: true).to_i
raise "Invalid frs worker pid" unless pid > 1
s.send_io($stdin)
s.send_io($stdout)
s.send_io($stderr)
s.write(Dir.pwd)
s.write("\0")
ENV.each do |k, v|
  s.write(k)
  s.write("=")
  s.write(v)
  s.write("\0")
end
s.write("\0")
ARGV.each do |arg|
  s.write(arg)
  s.write("\0")
end
s.shutdown(Socket::SHUT_WR)
s.read
s.close
At a basic level, the client:
The server (named frs for "fast ruby server"):
#!/usr/local/bin/ruby
frs_path = ENV['FRS_PATH'] || File.join(ENV["HOME"], '.frs_socket')
require 'socket'
debug = ENV.delete('DEBUG')
# If a server is already listening on the socket, ask it to shut down
if File.socket?(frs_path)
  begin
    s = UNIXSocket.new(frs_path)
    print "Shutting down existing frs server at #{frs_path}..." if debug
    pid = s.readline("\0", chomp: true).to_i
    raise "Invalid frs worker pid" unless pid > 1
    s.send_io($stdin)
    s.send_io($stdout)
    s.send_io($stderr)
    s.write("close")
    s.shutdown(Socket::SHUT_WR)
    s.read
    s.close
  rescue => e
    puts "#{e.class} #{e.message}" if s && debug
  else
    puts "Success!" if debug
  end
  s = nil
  File.delete(frs_path)
end
exit if ARGV == ["close"]
# Preload the libraries given on the command line
ARGV.map{|f| require f}
puts $LOADED_FEATURES if debug == 'log'
# Prevent TOCTOU on server socket creation
umask = File.umask(077)
server = UNIXServer.new(frs_path)
File.umask(umask)
system('chmod', '600', frs_path)
Process.daemon unless ENV['FRS_NO_DAEMON']
# Reap worker processes in a separate thread
queue = Queue.new
Thread.new do
  Process.wait while queue.pop
end
while s = server.accept
  queue.push(fork do
    s.write($$.to_s)
    s.write("\0")
    $stdin.reopen(s.recv_io(IO))
    $stdout.reopen(s.recv_io(IO))
    $stderr.reopen(s.recv_io(IO))
    cleanup = proc do
      s.shutdown(Socket::SHUT_WR)
      s.close
      puts $LOADED_FEATURES if ENV['DEBUG'] == 'log'
    end
    dir = s.readline("\0", chomp: true)
    if dir == 'close'
      Process.kill(:KILL, Process.ppid)
      cleanup.call
      Process.exit
    end
    Dir.chdir(dir)
    env = {}
    while line = s.readline("\0", chomp: true)
      break if line.empty?
      # Limit the split so values containing "=" or empty values survive
      k, v = line.split("=", 2)
      env[k] = v
    end
    ENV.replace(env)
    files = []
    args = []
    while !s.eof?
      arg = s.readline("\0", chomp: true)
      (File.file?(arg) ? files : args) << arg
    end
    ARGV.replace(args)
    files.each{|f| require File.expand_path(f)}
    if (m = ARGV.first == 'm') || args.first&.match?(/\.rb:\d+\z/)
      ARGV.shift if m
      require 'm'
      M.define_singleton_method(:exit!) do |res|
        cleanup.call
        super(res)
      end
      M.run(ARGV)
    elsif ARGV.first == 'irb'
      ARGV.shift
      at_exit(&cleanup)
      require 'irb'
      IRB.start(__FILE__)
    elsif !files.empty?
      if defined?(Minitest) && Minitest.class_variable_get(:@@installed_at_exit)
        Minitest.after_run(&cleanup)
      else
        at_exit(&cleanup)
      end
    else
      $stderr.puts "No files given!"
      $stderr.puts "ARGV: #{ARGV.inspect}"
      cleanup.call
      exit(1)
    end
  end)
end
At a basic level, the server:
The argument handling was tailored to my needs:
- Arguments that are existing files are loaded using require
- If the first argument is m or something like path/to/file.rb:1234, use the m gem to run a single minitest test
- If the first argument is irb, open an IRB shell
- If Minitest is defined and set to autorun, assume it will run tests
If you are familiar with how a Rails library named Spring works, this client/server approach may sound familiar. After getting the client/server approach working, while developing this blog post, I looked at Spring's implementation and it uses a similar approach. It doesn't change the directory, but it does use Unix socket file descriptor passing to pass stdin, stdout, and stderr from the client to the server. It passes the arguments and environment from the client to the server as well, though it uses JSON instead of the NUL-termination approach. Spring is also Rails specific and tries to keep only a single worker process in memory, closing other clients when a new client connects.
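The NUL-terminated framing used between the client and server can be sketched as a round trip: the directory, then KEY=VALUE pairs, then an empty field to end the environment, then the arguments. The encode_request and decode_request helpers are hypothetical names for this example, and a pipe stands in for the Unix socket. Splitting each pair on the first "=" only is an assumption made here so that values containing "=" survive.

```ruby
# Encode the working directory, environment, and arguments as NUL-terminated
# fields, in the same order the client writes them.
def encode_request(dir, env, args)
  out = +""
  out << dir << "\0"
  env.each { |k, v| out << k << "=" << v << "\0" }
  out << "\0" # an empty field ends the environment section
  args.each { |arg| out << arg << "\0" }
  out
end

# Read the fields back from an IO, mirroring the server's parsing loop.
def decode_request(io)
  dir = io.readline("\0", chomp: true)
  env = {}
  until (line = io.readline("\0", chomp: true)).empty?
    k, v = line.split("=", 2) # split on the first "=" only
    env[k] = v
  end
  args = []
  args << io.readline("\0", chomp: true) until io.eof?
  [dir, env, args]
end

reader, writer = IO.pipe
writer.write(encode_request("/srv/app",
                            { "RACK_ENV" => "test", "OPTS" => "a=b" },
                            ["test/model_test.rb", "-v"]))
writer.close

dir, env, args = decode_request(reader)
reader.close
p dir, env, args
```

NUL works as a delimiter because it cannot appear in a path, an environment variable, or a command line argument, so no escaping is needed.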
Using the client/server approach dramatically speeds up running individual tests, as long as the necessary libraries are already loaded. For my largest production application, the overhead from running individual model tests is reduced from 1.6 seconds to 0.5 seconds, and the overhead from running individual web tests is reduced from 3.2 seconds to 0.6 seconds. From the programmer's perspective, you get test output almost immediately, so you are not pulled out of flow state.
Another great part of this design is that the client and server are generic and not application-specific. All of my production applications are developed using a small group of libraries, primarily Sequel, Roda, Rodauth, and Forme, and all of their tests are based on minitest, rack-test, and Capybara. I can use the same server process to speed up running individual tests for all of my production applications.
There are definitely issues with the original, proof-of-concept implementation. The server worker processes needed better error handling, and the client did not use the same exit status as the worker process. But it worked well enough for speeding up individual tests. Since I think this client/server approach for library preloading may be useful for other Ruby programmers, I fixed the issues, added tests, coverage, and CI tests and released it as by.
In the test environment:
- ruby -e '' takes about 0.3 seconds.
- ruby --disable-gems -e '' takes about 0.08 seconds.
- by -e '' takes about 0.15 seconds (if by is set up to disable gems), so it's twice as fast as plain ruby, because the client avoids loading rubygems (the server already has rubygems loaded).
- by-server loading all libraries needed for the production applications starts in about 3 seconds.
To make starting the server simple, I'm running it using by-server /path/to/by-require.rb. The by-require.rb file then has all of the necessary requires:
(<<END).split.each{|f| require f}
sequel
roda
rodauth
...
END
(<<END).split.each{|f| require "sequel/extensions/#{f}"}
pg_json
pg_json_ops
...
END
(<<END).split.each{|f| require "sequel/plugins/#{f}"}
auto_validations
pg_auto_constraint_validations
...
END
(<<END).split.each{|f| require "roda/plugins/#{f}"}
render
route_csrf
...
END
(<<END).split.each{|f| require "rodauth/features/#{f}"}
base
login
...
END
Not all applications use all of the files required by the server process, but that is OK. Worst case scenario, let's say by is loading a library that the application is using without requiring. This is a bug in the application that by would hide. However, by is only used for speeding up individual tests. It's not used when running the full test suite (which is done with the default rake task), and running the full test suite is always done before committing. The full test suite is parallelized for all of my large production applications and takes less than 100 seconds even for the largest application.