A random walk through profiling RSpec tests

In the past couple of weeks I have been spending more time looking at the performance of the pages I work in through the use of rack-mini-profiler and New Relic.
I’ve noticed that as I have more visibility of the impact of code on performance, I write more performant code.

I started wondering if some of the principles I learned about profiling code (specially database queries) could also be applied to profiling tests. I started tinkering with different approaches to profiling RSpec tests and I wanted to share some of the things I learned.

When profiling code I am looking for the following things:

  • Understanding how much time something takes to execute. The first question I ask my self often is “Was it slow? How slow?”
  • Understanding which parts of the code actually took the longest to execute. This is about understanding “Why was it slow?”
  • Bonus: Getting some tips on how to fix the main bottlenecks of performance. Most of the performance issues are pretty common therefore in some cases profiling tools can provide hints at how to fix them. So, “How can I make it less slow?”

How slow?

Whenever you run tests in RSpec it tells how much it took to run them:

› bundle exec rspec spec/models/user_spec.rb

Finished in 0.09569 seconds (files took 1.8 seconds to load)
2 examples, 0 failures

The time RSpec reports is good enough for most uses I’ve had. But the time UNIX tool gives a more reliable account of the amount of time a command takes to run since it includes boot time (i.e. the total time indicated in the time command is higher than the total time indicated by just running the rspec command:

› time bundle exec rspec spec/models/user_spec.rb

Finished in 0.09569 seconds (files took 1.8 seconds to load)
2 examples, 0 failures

bundle exec rspec spec/models/user_spec.rb  1.78s user 0.47s system 74% cpu 3.009 total

In good ol’ Pareto fashion you can see the slowest examples by running RSpec with the -—profile option:

› bundle exec rspec spec/something_spec.rb --profile

Top 10 slowest examples (0.70737 seconds, 93.5% of total time):
  something sleeps for 0.15 seconds (example 3)
    0.15385 seconds ./spec/something_spec.rb:13
  something sleeps for 0.1 seconds (example 1)
    0.10804 seconds ./spec/something_spec.rb:4
  something sleeps for 0.05 seconds (example 8)
    0.05714 seconds ./spec/something_spec.rb:38
  something sleeps for 0.05 seconds (example 11)
    0.05676 seconds ./spec/something_spec.rb:53
  something sleeps for 0.05 seconds (example 10)
    0.05636 seconds ./spec/something_spec.rb:48
  something sleeps for 0.05 seconds (example 5)
    0.05618 seconds ./spec/something_spec.rb:23
  something sleeps for 0.05 seconds (example 6)
    0.05594 seconds ./spec/something_spec.rb:28
  something sleeps for 0.05 seconds (example 9)
    0.05584 seconds ./spec/something_spec.rb:43
  something sleeps for 0.05 seconds (example 7)
    0.05412 seconds ./spec/something_spec.rb:33
  something sleeps for 0.05 seconds (example 4)
    0.05314 seconds ./spec/something_spec.rb:18

Finished in 0.7564 seconds (files took 0.89807 seconds to load)
11 examples, 0 failures

Why was it slow?

Profilers can help us to identify which parts of code take the longest to run. Without that knowledge it is hard to know what to fix. In regards to profiling tests, often we might have some assumptions on what takes the longest (e.g. database access, boot time) but occasionally what we find with proper profiling might surprise us (e.g. memory leaks, incorrect assumptions about how gems work, unexpected HTTP requests).

One of the tools I have been using for this purpose is stackprof. Below is an example on how to configure stackprof to run with a RSpec test.

You can see the result of the profiling by running the commands below:

stackprof works by periodically checking the call stack and verifying the methods that are there. Above, the “Total” column indicates the amount of times a method was found to be in the stack, the “Samples” column indicates the amount of times a method was found to be in the top of the stack. The methods with the highest number of samples are the ones that take the longest to run.

Therefore, looking at the stackprof report above we can see that ActiveRecord::ConnectionAdapters::PostgreSQL::DatabaseStatements#execute takes the longest to run (it has 20.8% of the samples). And that there are a couple of other ActiveRecord/ActiveModel related methods in the list of methods with the biggest amount of samples. That information can lead us to conclude that most of the time is being spent in database related accesses.

The downside of this report would be that you actually don’t get a grasp of the user defined methods that lead to these calls of ActiveRecord related code (e.g. ActiveRecord::ConnectionAdapters::PostgreSQL::DatabaseStatements#execute). Although stackprof allows you to follow the stack up (with the --method flag, you might be too far down the stack to extract much value from it:

You can generate a flame graph by running the commands below and opening the resulting file URI on a browser (Chrome or Safari).

The flame graph is interactive so you can drag and drop, zoom, filter, navigation into the areas of most interest.


The flame graph made the profile easier to comprehend. The flame graph displays the amount of samples on the x axis and the stack on the y axis. We can easily see that the widest bars are the ones that took the longest to run and they correspond to ActiveRecord. We can also see the relationship between the multiple methods along the y axis and conclude that the calls to ActiveModel and ActiveSupport originated from ActiveRecord.

I also gave a try to rblineprof a line-profiler for Ruby. The nice part about rblineprof is that, when used with rblineprof-report it provides feedback on how much time each line took to run at each layer of the stack (from user defined methods to gem methods).

Below is an example on how to configure it to work within an RSpec test.

And below is the output of the profiler (after running the tests, you will have to open the profiler dump). rblineprof’s output is very verbose, if you were to scroll down you would see a long output with details of how much time each called method took to run. You will see also the lines with the longest times of execution marked in red or yellow.


I find stackprof and rblineprof to be complementary tools. The nature of both profilers is different – stackprof is a sampling profiler while rblineprof is an instrumentation profiler. I find it useful to use stackprof to have a generic idea of what the hotspots are and then use rblineprof if I need to have more details about each line of code.

How can I make it less slow?

In my opinion a basic profiler tells you how much time code took to run, a great profiler is able to give a detailed breakdown of which parts took the longest. An awesome profiler would be able to tell how to fix issues with slow code.

For example, New Relic a popular application monitoring tool is able to suggest adding database indexes when it notices that a SQL query lead to a full table scan.

Naturally this is domain specific knowledge, but conceptually a similar tool could exist for tests where the usual suspects are database access and network requests. I don’t know if such a tool exists for tests but I just found the idea interesting.

Closing words

It is not uncommon to hear people talking about long test suites and that left me wondering if part of the solution also involved better tooling. “Most performance issues are visibility issues” so part of the problem could be solved using the tools mentioned above so we could see the direct impact of the tests we write.

Finding a way to give more visibility around test suites would probably involve answering the following questions:

  • How do you show how the test suite is evolving over time? It would be interesting to show the global test suite time but also the evolution of the different test files.
  • How do you show that a test execution time is past the "acceptable" time?
  • Which tools do you provide for “surgical” intervention on a certain test? I think tools like stackprof and rblineprof belong here. Naturally, there should be a low friction method to use them (e.g. adding them to a spec_helper and using them through a flag).

I stumbled on this topic while trying to profile why a particular change I made did not make a test run faster. This post is already quite long but I hope to share that experience at a longer post.

I stumbled on a great deal of great content that helped me understand profiling in Ruby better.