Optimizing Rails Migrations for Large Databases
Strategies to make ActiveRecord migrations efficient, safe, and fast for large-scale applications.
Introduction
How do you apply Rails migrations safely in production on large databases without downtime?
Rails’ ActiveRecord migrations make schema changes easy, but as your database grows, migrations can lock tables, cause slow queries, or even take down production.
This guide covers:
✅ Minimizing downtime during schema changes
✅ Using pt-online-schema-change for MySQL and pg_repack for PostgreSQL
✅ Best practices for large database migrations
By the end, you’ll be able to safely update large databases in production with minimal impact. 🚀
1. Understanding Rails Migrations and Their Challenges
Rails migrations modify the database schema, but large-scale databases present unique challenges:
💥 Long-running migrations → Table locks block reads/writes
💥 Downtime risks → Users experience failures if migrations are slow
💥 Massive data updates → Can cause performance degradation
Common Problematic Migrations
- Adding/removing columns on large tables
- Indexing large datasets (e.g., ADD INDEX users(email))
- Renaming columns (may rewrite the table and breaks code that still references the old name)
- Foreign key constraints (can cause locks)
Let’s explore strategies to avoid these issues.
2. Safe Strategies for Large Database Migrations
✅ 1. Use add_column Without a Default Value
When you add a column with a default, older databases (notably PostgreSQL before 11) rewrite every row while holding a table lock, blocking reads and writes until the rewrite finishes.
🚨 Bad Approach:
add_column :users, :status, :string, default: "active", null: false
✔️ Safe Approach:
add_column :users, :status, :string, null: true # Add column first
Then backfill data in batches:
User.in_batches.update_all(status: "active") # Avoids full-table lock
Finally, apply constraints:
change_column_null :users, :status, false
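Putting it together, here is a sketch of the full three-step rollout as separate migrations; the class names, Rails version tag, batch size, and the direct use of the User model are all illustrative rather than prescriptive:

```ruby
# Step 1: add the column with no default, so no rows need rewriting
class AddStatusToUsers < ActiveRecord::Migration[7.1]
  def change
    add_column :users, :status, :string, null: true
  end
end

# Step 2: backfill in batches, outside one long transaction
class BackfillUserStatus < ActiveRecord::Migration[7.1]
  disable_ddl_transaction!

  def up
    User.unscoped.in_batches(of: 5_000).update_all(status: "active")
  end

  def down; end
end

# Step 3: enforce NOT NULL once every row has a value
class EnforceUserStatusNotNull < ActiveRecord::Migration[7.1]
  def change
    change_column_null :users, :status, false
  end
end
```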
✅ 2. Adding Indexes Without Downtime
Indexes speed up queries, but large indexes on big tables can lock writes for minutes or hours.
🚨 Bad Approach:
add_index :users, :email
✔️ Safe Approach (PostgreSQL)
Use CONCURRENTLY so PostgreSQL builds the index without blocking writes:
execute "CREATE INDEX CONCURRENTLY index_users_on_email ON users(email);"
✔️ Safe Approach (MySQL)
Use pt-online-schema-change:
pt-online-schema-change --alter "ADD INDEX index_users_on_email(email)" --execute D=database,t=users
✅ Zero downtime indexing!
✅ 3. Removing Columns Safely
🚨 Bad Approach:
remove_column :users, :old_column
✔️ Safe Approach:
1️⃣ Stop reading and writing the column in application code
2️⃣ Deploy that change so no running process references the column
3️⃣ Run the migration to remove it
safety_assured { remove_column :users, :old_column }
👀 Why?
- Dropping the column itself is fast, but any deployed code that still references it will start raising errors.
- Removing it in phases ensures the application never queries a column that no longer exists.
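In an ActiveRecord app, "stop reading the column" usually also means marking it as ignored, because the model caches the table's columns and keeps selecting them. Here is a hedged sketch of the phased removal; safety_assured comes from the strong_migrations gem covered below, and the column and class names are examples:

```ruby
# Deploy 1: ActiveRecord stops selecting or writing the column
class User < ApplicationRecord
  self.ignored_columns += ["old_column"]
end

# Deploy 2: drop the column for real
class RemoveOldColumnFromUsers < ActiveRecord::Migration[7.1]
  def change
    safety_assured { remove_column :users, :old_column, :string }
  end
end
```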
✅ 4. Splitting Large Data Migrations
If you need to update millions of records, avoid a single update_all: one giant statement holds row locks for its entire runtime and can overwhelm the database.
🚨 Bad Approach (one statement that locks every matching row until it finishes):
User.update_all(status: "active")
✔️ Safe Approach: Process in Batches
User.in_batches(of: 5000).update_all(status: "active")
✅ Batching reduces lock contention and improves performance.
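If you also want to throttle the backfill, iterate over the batches explicitly and pause between them; the batch size and sleep interval below are illustrative and should be tuned against your own database:

```ruby
User.where(status: nil).in_batches(of: 5_000) do |batch|
  batch.update_all(status: "active") # one small UPDATE per batch
  sleep(0.05)                        # give replicas and other queries room to breathe
end
```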
✅ 5. Using strong_migrations to Prevent Risky Changes
Install strong_migrations to detect unsafe migrations:
bundle add strong_migrations
It warns about:
❌ Schema changes that hold long table locks
❌ Removing columns incorrectly
❌ Adding indexes without CONCURRENTLY
Run rails db:migrate as usual; if a migration is unsafe, strong_migrations raises an error and suggests a safer alternative.
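strong_migrations can also enforce lock and statement timeouts so a stuck migration fails fast instead of queueing behind live traffic. A sketch of an initializer, assuming a recent version of the gem (the values are only examples):

```ruby
# config/initializers/strong_migrations.rb
StrongMigrations.start_after = 20240101000000  # only check migrations newer than this version
StrongMigrations.lock_timeout = 10.seconds     # abort instead of waiting behind long-held locks
StrongMigrations.statement_timeout = 1.hour    # cap how long a single statement may run
```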
3. Handling Database-Specific Migrations
🛠 PostgreSQL Optimization
For large PostgreSQL databases:
- Use pg_repack to remove table bloat after migrations
- Use partial indexes for better performance:
execute "CREATE INDEX CONCURRENTLY index_active_users ON users(email) WHERE status = 'active';"
🛠 MySQL Optimization
For large MySQL databases:
- Use pt-online-schema-change to avoid downtime
- Use gh-ost (GitHub’s online schema migration tool)
4. Best Practices for Rails Migrations in Production
✅ Test migrations on a staging environment before running in production
✅ Keep migrations wrapped in transactions (the default) unless a statement such as CREATE INDEX CONCURRENTLY requires disable_ddl_transaction!
✅ Monitor database performance using tools like pg_stat_activity
✅ Use background jobs for large data transformations (see the job sketch after this list)
✅ Keep your schema version-controlled and rollback-friendly
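For the background-job point above, here is a hedged sketch of a self-re-enqueueing ActiveJob backfill; the job name, queue, batch size, and wait time are all illustrative:

```ruby
class BackfillUserStatusJob < ApplicationJob
  queue_as :low_priority

  BATCH_SIZE = 5_000

  def perform
    ids = User.where(status: nil).limit(BATCH_SIZE).pluck(:id)
    return if ids.empty?

    User.where(id: ids).update_all(status: "active")

    # Re-enqueue until there is nothing left to backfill
    self.class.set(wait: 30.seconds).perform_later
  end
end
```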
Conclusion
By following these best practices, you can:
✅ Run Rails migrations with minimal downtime
✅ Avoid table locks and performance issues
✅ Use tools like strong_migrations, pg_repack, and pt-online-schema-change
💡 Have you faced migration challenges in Rails? Share your experiences below! 🚀