Rails's Swappable Migration Backend for Schema Changes at Scale
This post explores Rails’s swappable migration backend, a little-known feature that lets applications customize how migrations run. At Shopify, we relied on monkey patches and a brittle SQL parser to make Rails migrations work with our Schema Migrations Service. We developed the swappable backend feature to more simply adapt Rails’s migration runner to our needs. We’ll cover why and how we built this, and how Shopify uses it to power database migrations at scale.
At Shopify, we run hundreds of database migrations across many Rails applications every week. Each migration needs to be vetted for safety and executed in a way that doesn’t cause downtime for our merchants. For years, we relied on bespoke tooling and LHMs to perform online schema changes at scale. In 2021, Shopify’s database team began designing a new, centralized system for running schema migrations, the Schema Migrations Service. One of their goals was to enable developers to use vanilla Rails migrations to perform schema changes safely and with zero downtime.
Our database team built the schema migrations gem to solve this problem, but the implementation wasn’t simple. The gem relied on monkey patches and a complicated RACC parser to safety check migrations and submit them to the Schema Migrations Service. Shopify’s Rails Infrastructure team took the opportunity to build something into the framework that would help us address our schema migration needs more elegantly. We built the swappable migration backend (available as of Rails 7.1) to give applications flexibility over how their migrations execute. Let’s dive into how Shopify uses this feature to power safe database migrations at scale.
Why Production Migrations Require a Different Approach
When you run bin/rails db:migrate in development, Rails executes your migration methods directly
against the database. Each call to create_table, add_column, or add_index immediately
translates to SQL that modifies your schema. This works great for local development, but at
Shopify’s scale, we can’t afford to run schema changes this way in production.
LHM (Large Hadron Migrator) is a tool for performing online schema migrations: it changes a table without locking it, so the system stays up while the migration runs. We used LHMs for many years to perform schema changes without downtime, but this also meant that we couldn’t use Rails’s native migration API.
Shopify’s database team decided to build a Schema Migrations Service to allow developers to return to using vanilla Rails migrations, while ensuring that schema changes were still performed online behind the scenes. The idea was also to improve the developer experience around migrations by:
- Requiring migrations to pass safety checks before execution (e.g. blocking column-change operations, ensuring a migration only operated on a single table, etc.).
- Submitting migrations to a centralized manager to more easily orchestrate schema changes across multiple database shards, with better testing and retry behaviour.
- Providing developers with more insight into which migrations were running, their progress, etc. from a comprehensive UI.
The schema migrations gem built by our DB team handled safety checking migrations and submitting them to the centralized manager. The initial implementation, however, relied heavily on monkey patches to existing migration codepaths in Rails. Rather than executing migration SQL, the gem patched Rails to capture any SQL statements. It relied on a RACC parser to extract schema change operations from the SQL, safety check them, and then transform them into a JSON DDL (Data Definition Language) to be sent to the manager.
The Rails Infrastructure team realized that this was a great opportunity to make Rails’s migration execution more flexible, so that we could meet Shopify’s schema migration needs without monkey patching a bunch of code or maintaining a complicated RACC parser.
Building Rails’s Swappable Migration Strategy
When we started this project in early 2022, we explored several approaches that would allow us to
move away from monkey patching Rails in the gem. One idea was to use static analysis, and try to
parse migration files without running them. Another was to propose schema definition objects for every
migration operation, where Rails would expose Ruby representations of schema changes (like
AddColumnDefinition, CreateTableDefinition, etc.) that could be translated into any format:
SQL, JSON, or otherwise.
The Rails Core team had concerns about the complexity that schema definitions would introduce to Active Record, and we soon pivoted to a simpler approach: the strategy pattern. Instead of fundamentally changing how migrations represent schema changes, we’d introduce an intermediary object between migrations and the connection adapter that could customize execution behavior. This was a cleaner abstraction that solved our problem without requiring massive changes to Active Record’s internals.
In June 2022, we opened a pull request to Rails
proposing this “execution strategy” pattern for migrations. The PR introduced a strategy object
between the Migration class and the connection adapter. Instead of migrations directly delegating
schema statement commands to the connection via method_missing, they would delegate to a strategy
object that could be swapped out.
For example, suppose you call a method like create_table in a migration. Rails routes
that call through a migration strategy object, which, by default, is
ActiveRecord::Migration::DefaultStrategy:
module ActiveRecord
  class Migration
    class DefaultStrategy < ExecutionStrategy
      private
        def method_missing(method, ...)
          connection.send(method, ...)
        end

        def respond_to_missing?(method, include_private = false)
          connection.respond_to?(method, include_private) || super
        end

        def connection
          migration.connection
        end
    end
  end
end
The default strategy sends migration methods to the connection, which executes SQL against your database.
This is how migrations worked before, so most Rails developers are unaware that there’s now a strategy
object working behind the scenes! However, the migration strategy class
can be configured to customize how migrations are executed. As of Rails 7.1, you
can set config.active_record.migration_strategy in your environment configuration (for example, in
config/environments/production.rb). Pass it either a class object or a string with the class name:
# lib/custom_migration_strategy.rb
class CustomMigrationStrategy < ActiveRecord::Migration::DefaultStrategy
  def drop_table(*)
    raise "Dropping tables is not supported!"
  end
end

# config/environments/production.rb
Rails.application.configure do
  config.active_record.migration_strategy = CustomMigrationStrategy
end
Now, when you run bin/rails db:migrate, Rails will delegate all migration methods to your custom
strategy, giving you complete control over how migrations are executed.
Note: You will likely want to keep the default strategy outside of production. This setup lets you use advanced migration tooling in production while keeping local development fast and simple. We do this at Shopify.
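To see the delegation chain without a database, here is a minimal plain-Ruby sketch of the pattern. FakeConnection, SketchMigration, PassThroughStrategy, and NoDropStrategy are hypothetical stand-ins, not Rails classes; they only mirror the shape described above, where the migration forwards DSL calls to a strategy and the strategy decides what reaches the connection.

```ruby
# Hypothetical stand-in for a connection adapter: it records the calls
# it receives instead of executing SQL.
class FakeConnection
  attr_reader :executed

  def initialize
    @executed = []
  end

  def create_table(name, **options)
    @executed << [:create_table, name, options]
  end

  def drop_table(name, **options)
    @executed << [:drop_table, name, options]
  end
end

# Mirrors the shape of the default strategy: forward everything to the connection.
class PassThroughStrategy
  def initialize(migration)
    @migration = migration
  end

  private

  def method_missing(method, ...)
    connection.send(method, ...)
  end

  def respond_to_missing?(method, include_private = false)
    connection.respond_to?(method, include_private) || super
  end

  def connection
    @migration.connection
  end
end

# A custom strategy only overrides the operations it cares about.
class NoDropStrategy < PassThroughStrategy
  def drop_table(*)
    raise "Dropping tables is not supported!"
  end
end

# Stand-in for a migration: unknown DSL calls go to the strategy.
class SketchMigration
  attr_reader :connection

  def initialize(connection, strategy_class)
    @connection = connection
    @strategy = strategy_class.new(self)
  end

  private

  def method_missing(method, ...)
    @strategy.send(method, ...)
  end

  def respond_to_missing?(method, include_private = false)
    @strategy.respond_to?(method, include_private) || super
  end
end

connection = FakeConnection.new
migration = SketchMigration.new(connection, NoDropStrategy)
migration.create_table(:users) # forwarded to the connection

begin
  migration.drop_table(:users) # intercepted by the strategy
rescue RuntimeError => e
  puts e.message
end
```

Swapping NoDropStrategy for PassThroughStrategy restores the default behaviour without touching the migration itself, which is the point of the indirection.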
Serializing Production Migrations to JSON
Once Rails supported swappable migration backends, we implemented a custom strategy that serialized
migrations as JSON, making them easy to submit to a remote manager. To accomplish
this, our gem introduced a JsonSerializationStrategy class. This class implemented each schema
change method available in migrations, using Rails’s schema definition APIs to build the necessary
schema objects. We then converted these objects into JSON payloads that described each schema operation.
Here’s an example of how we capture create_table operations:
class JsonSerializationStrategy < ActiveRecord::Migration::DefaultStrategy
  attr_reader :operations

  # Rails instantiates the strategy with the migration, not the connection;
  # DefaultStrategy#connection then resolves to migration.connection.
  def initialize(migration)
    super
    @operations = []
  end

  def create_table(...)
    td = connection.build_create_table_definition(...)
    ddl = connection.schema_creation.accept(td)
    definition = extract_table_definition(td.name, ddl)

    operations << {
      type: :sql,
      op: :create_table,
      params: {
        name: td.name,
        definition: definition,
      },
    }
  end

  private
    # Strip the leading "CREATE TABLE <name> " so only the column and
    # option definitions remain.
    def extract_table_definition(table_name, ddl)
      table_name_pattern = /^CREATE TABLE #{connection.quote_table_name(table_name.to_s)} /
      ddl.sub(table_name_pattern, "")
    end
end
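As a rough illustration of the payload this produces, the sketch below runs the prefix-stripping step on a hand-written DDL string and serializes the result with Ruby’s standard JSON library. The DDL string and the hard-coded backtick quoting are assumptions for illustration; in the strategy above, both come from the connection adapter.

```ruby
require "json"

# Hypothetical standalone version of extract_table_definition. The quoting
# is hard-coded to MySQL-style backticks; a real adapter would supply it.
def extract_table_definition(table_name, ddl)
  quoted = "`#{table_name}`"
  ddl.sub(/\ACREATE TABLE #{Regexp.escape(quoted)} /, "")
end

# Illustrative DDL, written by hand rather than generated by Rails.
ddl = "CREATE TABLE `users` (`id` bigint NOT NULL AUTO_INCREMENT PRIMARY KEY, `email` varchar(255))"

operation = {
  type: :sql,
  op: :create_table,
  params: {
    name: "users",
    definition: extract_table_definition("users", ddl),
  },
}

# Symbols are emitted as JSON strings.
puts JSON.generate([operation])
```

Escaping the quoted name with Regexp.escape is a defensive choice in this sketch, so that a table name containing regular expression metacharacters cannot change what the pattern matches.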
Here’s a simplified look at how migrations are run in production, using the swappable strategy:
class ExternalMigrationsRunner
  def upload_migration(migration)
    # Run the migration, but since we're using the JsonSerializationStrategy,
    # we won't execute SQL; instead, the strategy captures all operations as JSON
    runnable_migration = migration.migration_class.new

    if runnable_migration.respond_to?(:change)
      runnable_migration.change
    elsif runnable_migration.respond_to?(:up)
      runnable_migration.up
    end

    # Extract the serialized operations from the strategy
    operations = runnable_migration.execution_strategy.operations

    # Upload to the migrations service via API
    ApiClient.upload_migration(
      name: migration.name,
      database: database_name,
      identifier: migration.version,
      operations: operations, # JSON representation of schema changes
      table_name: migration.table,
      author: migration.author
    )
  end
end
Configuring the Migration Strategy Automatically
Rather than requiring each application to configure the migration strategy in their config file for production, the schema migrations gem leveraged an initializer to set this automatically:
# lib/schema_migrations/railtie.rb
require "rails/railtie"

class Railtie < Rails::Railtie
  ...
  initializer "schema_migrations.migration_strategy_config" do |app|
    next unless Rails.env.production?

    app.config.active_record.migration_strategy = JsonSerializationStrategy
  end
end
This initializer ensures that any application that includes the schema migrations gem has its migrations intercepted and serialized in production environments.
Reimagining Safety Checks: From SQL Parsing to Runtime Analysis
While working on the upstream strategy feature, our team was simultaneously tackling another critical problem: safety checks. Before any migration runs in production at Shopify, the gem performs safety checks to catch common mistakes that could cause downtime, such as:
- Adding a NOT NULL column without a default value (check out this blog post if you’re interested in learning more)
- Renaming a column (breaks downstream consumers)
- Changing a column type in an incompatible way
These checks run in development too, giving developers immediate feedback before they deploy.
The old implementation of the gem’s safety checker relied on a RACC parser to analyze SQL strings, which
was brittle: every time SQL syntax changed or we encountered a new edge case, the parser had to be updated.
We wanted a standalone workflow for safety checking migrations, separate from the migrations
actually being executed and submitted to the manager. Consequently, we couldn’t rely on the migration strategy
to do this. Instead, we settled on a new approach that would allow us to move away from the RACC parser and
remove much of the complexity. We developed a MigrationOperationRecorder that “runs” a migration and
records every method call it makes:
class MigrationOperationRecorder
  def initialize(migration_class)
    @migration = migration_class.new
  end

  def record
    singleton_class = @migration.singleton_class
    singleton_class.include(RecordMigrationOperations)

    if @migration.respond_to?(:change)
      @migration.change
    elsif @migration.respond_to?(:up)
      @migration.up
    end

    @migration.method_calls
  end
end
The RecordMigrationOperations module works by leveraging the same method_missing mechanism that Rails
uses for migrations. Since ActiveRecord::Migration uses method_missing to route commands to the execution
strategy, we define RecordMigrationOperations#method_missing to store the method call instead:
module RecordMigrationOperations
  def method_missing(method, *args, **options, &block)
    # Go through the lazy accessor: @method_calls is nil until first use.
    method_calls << MigrationOperation.new(
      method: method,
      args: args,
      options: options
    )
  end

  def method_calls
    @method_calls ||= []
  end
end
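The recording mechanism can be tried in isolation. The sketch below is plain Ruby with hypothetical stand-ins (RecordedCall, RecordCalls, and an AddEmailToUsers class that is not a real ActiveRecord::Migration); it shows how a module included into an object’s singleton class takes precedence over the class’s own method_missing.

```ruby
# Hypothetical value object for one recorded DSL call.
RecordedCall = Struct.new(:name, :args, :options, keyword_init: true)

# Stand-in for RecordMigrationOperations.
module RecordCalls
  def method_calls
    @method_calls ||= []
  end

  private

  # Blocks (for example, the one passed to create_table) are ignored here,
  # so calls made inside them are not recorded by this sketch.
  def method_missing(name, *args, **options, &block)
    method_calls << RecordedCall.new(name: name, args: args, options: options)
    nil
  end
end

# Stand-in for a migration class whose DSL calls are normally routed
# through method_missing toward the database.
class AddEmailToUsers
  def change
    add_column :users, :email, :string, null: true
    add_index :users, :email, unique: true
  end

  private

  def method_missing(name, *)
    raise NotImplementedError, "would execute #{name} against the database"
  end
end

migration = AddEmailToUsers.new
# The included module sits between the singleton class and AddEmailToUsers
# in the ancestor chain, so its method_missing is found first.
migration.singleton_class.include(RecordCalls)
migration.change

migration.method_calls.each do |call|
  puts "#{call.name} #{call.args.inspect} #{call.options.inspect}"
end
```

Because only this one instance is extended, other instances of the same class keep their normal behaviour.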
Once operations are recorded, individual safety checks can inspect the migration data.
Here’s an example of the SingleTableCheck:
class SingleTableCheck < BaseSafetyCheck
  def initialize(migration)
    @inspected_migration = migration
  end

  def check
    # @inspected_migration is a specialized object containing info
    # about all of the operations the migration performs, as returned
    # from MigrationOperationRecorder#record
    tables = @inspected_migration.tables
    return if tables.one?

    raise SafetyCheckError,
      "You must work with exactly one table per migration. " \
      "Split tables #{tables.to_sentence} into #{tables.length} migrations."
  end
end
This check accesses @inspected_migration.tables, which is extracted during the analysis phase,
and validates that exactly one table is involved. If the check fails, it raises a SafetyCheckError
with a clear message telling developers how to fix the issue.
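The post does not show how the list of tables is derived, so the following is only a sketch under a stated assumption: that the first positional argument of each recorded call names the table. RecordedOperation, tables_for, and check_single_table! are hypothetical names, and the assumption does not hold for every migration method (add_foreign_key and rename_table, for instance, reference two tables).

```ruby
# Hypothetical recorded operation: the method name and its positional arguments.
RecordedOperation = Struct.new(:name, :args, keyword_init: true)

class SafetyCheckError < StandardError; end

# Assumption: the first positional argument of each call is the table name.
def tables_for(operations)
  operations.map { |operation| operation.args.first.to_s }.uniq
end

# Plain-Ruby version of the single-table rule; uses join instead of
# Active Support's to_sentence to stay dependency-free.
def check_single_table!(operations)
  tables = tables_for(operations)
  return if tables.one?

  raise SafetyCheckError,
    "You must work with exactly one table per migration. " \
    "Split tables #{tables.join(', ')} into #{tables.length} migrations."
end

operations = [
  RecordedOperation.new(name: :add_column, args: [:users, :email, :string]),
  RecordedOperation.new(name: :add_column, args: [:orders, :note, :text]),
]

begin
  check_single_table!(operations)
rescue SafetyCheckError => e
  puts e.message
end
```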
Why Not Use a Migration Strategy for Safety Checking?
You might wonder why we used method_missing for the MigrationOperationRecorder instead of
creating another strategy pattern. Couldn’t we use our newly built feature for safety checking?
The answer comes down to separation of concerns and simplicity. Safety checking and migration
execution serve different purposes:
- Migration execution needs to be swappable because different environments (development vs. production) require different behaviours. In development, we execute SQL directly. In production, we serialize to JSON and submit to a remote service.
- Safety checking needs to happen the same way everywhere. We’re analyzing which operations the migration is performing, not executing schema changes. The same safety checks run in development, CI, and production.
Using method_missing for safety checks gives us a simpler implementation that automatically
captures all migration DSL methods without needing to explicitly enumerate them all. A strategy
pattern would have required us to implement every migration method explicitly. Given that we only
wanted to record the migration methods being called and their arguments, opting for a simpler
method_missing approach made more sense.
Per-Adapter Migration Strategies
One challenge with using a global migration strategy is that it’s insufficient for applications using multiple database systems. Since its inception, Shopify has primarily used MySQL, but more recently we’ve been exploring running non-MySQL databases. Different databases have different requirements for how migrations should be serialized, which means that the migration strategy needs to be tailored to the database the migrations are running against.
We could make this work by having our gem’s migration strategy inspect the database adapter at runtime and dispatch to the appropriate serialization logic. This is not ideal, though: we’d be reimplementing adapter dispatch logic that Rails can handle natively. It felt like this was a missing piece in our upstream solution, so last month, we opened a PR to add per-adapter migration strategies to Rails. This feature will be available in Rails 8.2.
Instead of setting one global strategy:
config.active_record.migration_strategy = JsonSerializationStrategy
You can now register strategies directly on adapter classes:
ActiveSupport.on_load(:active_record_trilogyadapter) do
  ActiveRecord::ConnectionAdapters::TrilogyAdapter.migration_strategy =
    MysqlStrategy
end

ActiveSupport.on_load(:active_record_postgresqladapter) do
  ActiveRecord::ConnectionAdapters::PostgreSQLAdapter.migration_strategy =
    PostgreSQLStrategy
end
Rails automatically selects the correct strategy based on the database adapter in use
for each migration. For example, if you’re running migrations against a MySQL database
configured with the Trilogy adapter, Rails chooses MysqlStrategy. If your
migrations are running against a PostgreSQL database, Rails selects PostgreSQLStrategy.
If the current adapter does not have a strategy configured, Rails will fall back to using
the global strategy.
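The lookup order described here (adapter-specific strategy first, global strategy as the fallback) can be modelled in a few lines of plain Ruby. This is not how Rails implements it; every class and method name below is a hypothetical stand-in chosen for the sketch.

```ruby
# Hypothetical holder for the global strategy setting.
module SketchConfig
  class << self
    attr_accessor :global_strategy
  end
end

# Hypothetical adapter base class. The accessor lives on the singleton
# class, so each adapter subclass stores its own value (nil by default).
class SketchAdapter
  class << self
    attr_accessor :migration_strategy
  end
end

class MysqlSketchAdapter < SketchAdapter; end
class PostgresSketchAdapter < SketchAdapter; end
class SqliteSketchAdapter < SketchAdapter; end

# Placeholder strategy classes.
GlobalSketchStrategy = Class.new
MysqlSketchStrategy = Class.new
PostgresSketchStrategy = Class.new

# Prefer the strategy registered on the adapter; otherwise use the global one.
def strategy_for(adapter_class)
  adapter_class.migration_strategy || SketchConfig.global_strategy
end

SketchConfig.global_strategy = GlobalSketchStrategy
MysqlSketchAdapter.migration_strategy = MysqlSketchStrategy
PostgresSketchAdapter.migration_strategy = PostgresSketchStrategy

[MysqlSketchAdapter, PostgresSketchAdapter, SqliteSketchAdapter].each do |adapter|
  puts "#{adapter.name} uses #{strategy_for(adapter).name}"
end
```

SqliteSketchAdapter has no strategy registered in this sketch, so it resolves to the global one.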
Making Rails Work for You
One of Rails’s design philosophies is convention over configuration. The majority of Rails apps don’t need to think about how their migrations are performed, so we keep things simple with a default migration strategy. At the point where an application needs to customize how its migrations run, the framework provides a clear extension point. Applications can opt into configurable behaviour as their requirements evolve.
This is also a story about how working in the open benefits everyone. We could have kept our monkey patches internal to Shopify, continuing to patch Rails as needed. Instead, we built a more maintainable solution for ourselves while also providing the Rails community with a new tool for customizing migration behaviour. If you’re running into limitations with Rails for your specific use case, consider whether there’s an opportunity for an upstream contribution that could solve your problem while benefitting the rest of the community.