Don’t change the signature of Sidekiq jobs running in production

This is part of my Sidekiq “reliability list”.

You should not change the signature of a Sidekiq job which is running in production. By signature I mean: the arguments and the class name of the job.

Changing the signature of a Sidekiq job causes issues during deployment since multiple versions of the codebase are active simultaneously. These multiple versions have conflicting implementations of the same Sidekiq jobs which leads to exceptions. Changing the signature of a Sidekiq job breaks backward and forward compatibility.

To explain why this is a problem, we need to walk through how Sidekiq works at a high level.

How Sidekiq works

This is a very basic overview of some key aspects of Sidekiq.

Let’s imagine you have the job below.

class HardWorker
  include Sidekiq::Worker
  def perform(name)
    # do something
  end
end

You can enqueue the job using:

HardWorker.perform_async('bob')

When you enqueue a job, Sidekiq stores a JSON payload with its details (i.e. job name, arguments) into a Redis list.

For the job above, we can imagine the following JSON string being stored in Redis:

{
  "queue": "default",
  "class": "HardWorker",
  "args": ["bob"]
}

This information is used by the worker thread to:

  • create an instance of the job class
klass  = constantize(job_hash["class"])
worker = klass.new
  • execute the job
worker.perform(*job_hash["args"])

Note that if an exception is raised while creating the instance or executing the job, the job is retried.
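The round trip described in this section can be sketched in plain Ruby. This is a simplified model, not Sidekiq's actual implementation: it leaves out Redis and `Sidekiq::Worker` so it runs without the gem, and it uses `Object.const_get` as a stand-in for `constantize`.

```ruby
require "json"

# The job class from the example above, minus Sidekiq::Worker so the
# sketch runs without the gem. Only #perform matters here.
class HardWorker
  def perform(name)
    "working hard for #{name}"
  end
end

# Client side: roughly what perform_async would push onto the Redis list.
payload = JSON.generate(
  "queue" => "default",
  "class" => "HardWorker",
  "args"  => ["bob"]
)

# Server side: parse the payload, resolve the class by name, run the job.
job_hash = JSON.parse(payload)
klass    = Object.const_get(job_hash["class"]) # stand-in for constantize
worker   = klass.new
result   = worker.perform(*job_hash["args"])

puts result
```

The point to notice is that the only link between the client and the server is the class name and the argument list stored in the payload.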

How can changing the Sidekiq job signature cause issues?

With that basic understanding of how Sidekiq works we can go over a few scenarios of issues you can observe when you change the signature of a job.

Adding/Removing Arguments to/from a Sidekiq job

Let’s imagine you add an argument to the job we previously created.

class HardWorker
  include Sidekiq::Worker
  def perform(name, count)
    # do something
  end
end

During a deploy, instances of our application run both versions of the code at once: the old version, where the job takes a single argument, and the new version, where it takes two. This is true both for our Sidekiq clients – the ones enqueueing – and for our Sidekiq servers – the ones dequeueing and executing.

This causes issues in the Sidekiq servers whenever they are attempting to execute a job that was enqueued by a different version of the code. This is an issue because you can’t control which servers will execute which payload.

# raises ArgumentError if the server runs the old code, which only accepts a single argument: HardWorker#perform(name)
worker.perform('bob', 5)

# raises ArgumentError if the server runs the new code, which requires two arguments: HardWorker#perform(name, count)
worker.perform('bob')

The issue with changing arguments is that even when the new version of the code is fully deployed, it is still unable to process jobs enqueued by the previous version.

Removing arguments from a Sidekiq job runs into the same issue in the opposite direction: jobs enqueued by the old version carry more arguments than the new version accepts.

Rolling back from these changes is also troublesome because the older version won’t be able to execute jobs enqueued by the new version.
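Both failure directions described in this section can be reproduced without Sidekiq. In the sketch below, `HardWorkerV1` and `HardWorkerV2` are hypothetical stand-ins for the old and new releases of `HardWorker` (in a real deploy both would share the same class name), and `execute` is an illustrative helper, not a Sidekiq API.

```ruby
# Hypothetical stand-in for HardWorker as defined in the old release.
class HardWorkerV1
  def perform(name)
    "v1 ran for #{name}"
  end
end

# Hypothetical stand-in for HardWorker as defined in the new release.
class HardWorkerV2
  def perform(name, count)
    "v2 ran for #{name} x#{count}"
  end
end

# Illustrative helper: run a job the way a server would, and report an
# arity mismatch instead of letting the exception propagate.
def execute(klass, args)
  klass.new.perform(*args)
rescue ArgumentError => e
  "ArgumentError: #{e.message}"
end

old_args = ["bob"]    # args enqueued by a client on the old release
new_args = ["bob", 5] # args enqueued by a client on the new release

results = {
  old_server_old_job: execute(HardWorkerV1, old_args),
  old_server_new_job: execute(HardWorkerV1, new_args), # fails
  new_server_old_job: execute(HardWorkerV2, old_args), # fails
  new_server_new_job: execute(HardWorkerV2, new_args)
}

results.each { |scenario, outcome| puts "#{scenario}: #{outcome}" }
```

Only the two matching combinations succeed, which is why neither completing the deploy nor rolling it back clears the jobs that were enqueued by the other version.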

Renaming a Sidekiq job

Let’s imagine you attempt to rename the job we initially created.

class VeryHardWorker
  include Sidekiq::Worker
  def perform(name)
    # do something
  end
end

This causes issues because the Sidekiq servers running the new version of the code can’t create an instance of a class they don’t know (i.e. HardWorker). This means that any unprocessed HardWorker jobs won’t be processed when the new version of the code is fully deployed.
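The failed lookup can be illustrated in plain Ruby. This sketch assumes the class-resolution step behaves like `Object.const_get` and models a server on the new release, where only `VeryHardWorker` is defined; `Sidekiq::Worker` is left out so the code runs without the gem.

```ruby
# The new release only defines the renamed class.
class VeryHardWorker
  def perform(name)
    "very hard work for #{name}"
  end
end

# A job that was enqueued by the old release, before the rename.
job_hash = { "class" => "HardWorker", "args" => ["bob"] }

outcome =
  begin
    klass = Object.const_get(job_hash["class"]) # stand-in for constantize
    klass.new.perform(*job_hash["args"])
  rescue NameError => e
    # The constant no longer exists, so the job cannot even be instantiated.
    "NameError: #{e.message}"
  end

puts outcome
```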

How to prevent these issues?

Adding/Removing Arguments to/from a Sidekiq job

My preferred solution is to create a new class for the new worker without changing the previous implementation. This means keeping two simultaneous implementations of the worker in the codebase.

Using the previous example, the new version of the application enqueues VeryHardWorker, while the older version keeps enqueueing HardWorker. Both versions of the application are able to process HardWorker. The older version can’t process VeryHardWorker, but those jobs fail and are retried until a server running the new version eventually picks them up.

class HardWorker
  include Sidekiq::Worker
  def perform(name)
    # do something
  end
end

class VeryHardWorker
  include Sidekiq::Worker
  def perform(name, count)
    # do something
  end
end

Remove the old implementation (i.e. HardWorker) only once jobs for the older worker are no longer being enqueued and the remaining ones have been processed – at a future (different) deploy.
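To see why this approach survives a rolling deploy, here is a rough simulation. It is a sketch under assumptions: `OLD_RELEASE` and `NEW_RELEASE` are hypothetical lookup tables modelling which job classes each release knows about, `process` is an illustrative helper, and a plain array stands in for the Redis queue and Sidekiq's retry mechanism.

```ruby
class HardWorker
  def perform(name)
    "HardWorker(#{name})"
  end
end

class VeryHardWorker
  def perform(name, count)
    "VeryHardWorker(#{name}, #{count})"
  end
end

# Hypothetical model of a release: the job classes its code defines.
OLD_RELEASE = { "HardWorker" => HardWorker }.freeze
NEW_RELEASE = { "HardWorker" => HardWorker, "VeryHardWorker" => VeryHardWorker }.freeze

# Returns the job's result, or nil when this release does not know the class
# (the point where a real server would raise and the job would be retried).
def process(release, job)
  klass = release[job["class"]]
  return nil unless klass

  klass.new.perform(*job["args"])
end

queue = [
  { "class" => "HardWorker",     "args" => ["bob"] },    # from the old release
  { "class" => "VeryHardWorker", "args" => ["bob", 5] }  # from the new release
]

done    = []
pending = []

# Mid-deploy: a server on the old release happens to pick up both jobs.
queue.each do |job|
  result = process(OLD_RELEASE, job)
  if result
    done << result
  else
    pending << job # left for a later retry
  end
end

# Deploy finished: a server on the new release drains what is left.
pending.each { |job| done << process(NEW_RELEASE, job) }

puts done
```

No job is lost in either direction: the old release never sees a changed signature for `HardWorker`, and the new release can still run every `HardWorker` job that was already queued.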

Renaming a Sidekiq job

The Sidekiq FAQ has a solution for this:

class MyNewWorker
  ...
end
# XXX Delete this alias in a few weeks when old jobs are safely gone
MyOldWorker = MyNewWorker

Alternatively you can use the same solution as the one presented for adding/removing arguments.
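The alias works because a Ruby constant is just a name pointing at a class object, so the old name resolves to the new class. A minimal check, assuming class resolution behaves like `Object.const_get` and leaving out `Sidekiq::Worker` so it runs without the gem:

```ruby
class MyNewWorker
  def perform(name)
    "#{self.class.name} handled #{name}"
  end
end

# Keep the old constant around so payloads that still say "MyOldWorker" resolve.
MyOldWorker = MyNewWorker

# A job that was enqueued before the rename.
old_job = { "class" => "MyOldWorker", "args" => ["bob"] }

klass  = Object.const_get(old_job["class"]) # stand-in for constantize
result = klass.new.perform(*old_job["args"])

puts result
```

Old payloads are executed by the new class, and the alias can be deleted once no payloads with the old name remain.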

Conclusion

The signature of a Sidekiq job is an interface between the Sidekiq client (who enqueues) and the Sidekiq server (who dequeues and executes). When changing that interface you need to ensure compatibility with users of the interface’s older version.

These principles around interface compatibility can be seen in other places too: in API schemas, message bus schemas and database schemas to name a few. The changes you deploy should be compatible with the code that is already running.

