Tracking queue metrics with Sidekiq

Sidekiq is a critical component of many Rails applications, so its metrics should be collected appropriately.
For example, imagine a Sidekiq job that sends password reset e-mails to customers. If that job’s queue has a latency of one hour, you want to know about it, because it has a significant impact on the customer’s experience.

Sidekiq has built-in support for tracking the most important queue metrics. Some metrics are published to statsd in Sidekiq Pro. You can also use Sidekiq’s public API or Sidekiq middleware to instrument metrics manually. With these tools you can then publish the relevant metrics to an aggregator like Datadog or Prometheus.

Below you will find how to track some important metrics in Sidekiq.

Queue latency

The Sidekiq API lets you find out how long (in seconds) the oldest job has been sitting in a queue.

> Sidekiq::Queue.new("queue_name").latency
25

You need to instrument this metric manually to make it available in a metrics aggregator. You can do that by having a separate process continuously poll the queue latency.

# Example for Datadog: report the latency as a gauge.
# Sleep between iterations so the loop doesn't hammer Redis.
loop do
  latency = Sidekiq::Queue.new("queue_name").latency
  $statsd_client.gauge("sidekiq.queue.latency.queue_name", latency)
  sleep 10
end

Consume rate

You can track the consume rate through Sidekiq’s statsd integration (a Sidekiq Pro feature).

In Sidekiq’s case, the consume rate is the rate at which jobs are executed. The relevant metric is jobs.MyWorker.count.

Note that Sidekiq’s built-in metric is organized by worker class instead of by queue. A single queue (e.g. default) might be shared by several worker classes. In those cases you need to consider the consume rate across all worker classes when assessing issues with queue latency.

If you’d like to have a single metric for a queue’s consume rate you can use Sidekiq Server Middleware.

class Middleware::Sidekiq::Server::QueueConsumeRate
  def call(worker, job, queue)
    # Count the job as consumed, then let the rest of the chain run it.
    $statsd_client.increment("consumed_jobs.#{queue}.count")
    yield
  end
end
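Middleware only takes effect once it’s registered with Sidekiq. A sketch of wiring the middleware above into the server chain, typically from an initializer (the middleware class name is this post’s example, not a Sidekiq built-in):

```ruby
# config/initializers/sidekiq.rb (sketch)
Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    chain.add Middleware::Sidekiq::Server::QueueConsumeRate
  end
end
```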

Produce rate

You can instrument your application logic whenever you schedule a job within Sidekiq. This approach gets more cumbersome to manage as more worker classes start using the same queue or as a job starts being enqueued in multiple codepaths.

def my_method
  # multiple lines of application logic
  $statsd_client.increment("produced_jobs.queue_name.count")
  MyWorker.perform_async(argument)
end

Alternatively, you can use Sidekiq Client Middleware to track a queue’s produce rate.

class Middleware::Sidekiq::Client::QueueProduceRate
  def call(worker_class, job, queue, redis_pool)
    $statsd_client.increment("produced_jobs.#{queue}.count")
    yield
  end
end
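As with server middleware, client middleware must be registered before it runs. Jobs can be enqueued both from client processes (e.g. your Rails app) and from inside the Sidekiq server (a worker scheduling another job), so the client chain is registered in both places. A sketch:

```ruby
# config/initializers/sidekiq.rb (sketch)
Sidekiq.configure_client do |config|
  config.client_middleware do |chain|
    chain.add Middleware::Sidekiq::Client::QueueProduceRate
  end
end

Sidekiq.configure_server do |config|
  config.client_middleware do |chain|
    chain.add Middleware::Sidekiq::Client::QueueProduceRate
  end
end
```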

Processing latency

You can track the processing latency for a worker through Sidekiq’s statsd integration. The relevant metric is jobs.MyWorker.perform. Like the count metric, it is organized by worker class instead of by queue.

To get a single metric for the whole queue you need to use Sidekiq Server Middleware.

class Middleware::Sidekiq::Server::QueueProcessingLatency
  def call(worker, job, queue, &block)
    # Run the job inside the statsd timer so the elapsed time
    # is recorded under the queue's key.
    $statsd_client.time("processing_latency.#{queue}", &block)
  end
end
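This middleware relies on the statsd client’s time helper executing the block, recording the elapsed time under the given key, and returning the block’s result (so the worker’s return value passes through unchanged). A minimal stand-in illustrating those semantics — FakeStatsd is hypothetical, for illustration only; a real client such as dogstatsd-ruby reports the timing over UDP instead of storing it in memory:

```ruby
require "benchmark"

# FakeStatsd is a hypothetical stand-in mimicking the `time` helper of a
# dogstatsd-style client: run the block, measure elapsed time, record it
# under the key (in milliseconds), and return the block's result.
class FakeStatsd
  attr_reader :timings

  def initialize
    @timings = Hash.new { |h, k| h[k] = [] }
  end

  def time(key)
    result = nil
    elapsed = Benchmark.realtime { result = yield }
    @timings[key] << (elapsed * 1000).round # milliseconds
    result
  end
end

statsd = FakeStatsd.new
value = statsd.time("processing_latency.default") { sleep 0.01; :done }
# value is :done; the elapsed ~10ms is recorded under the key
```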

You can find the other posts in my series on queues here.

If you sign up for my newsletter, I’ll send my posts straight to your e-mail inbox.