Django’s makemigrations and migrate commands are great, but running them on large tables can bring your site to its knees, or worse, take it completely offline. The core issue is that ALTER TABLE statements, which migrations often generate, are typically blocking operations. When you’re modifying a significant table, this block can last for minutes or even hours, preventing any reads or writes to that table until the operation finishes.
Here’s how we can avoid that, step-by-step.
The Problem: Blocking ALTER TABLE
When Django needs to add a column, change a column type, or add an index to a large table, it often uses a standard SQL ALTER TABLE command. On many database systems (like PostgreSQL), these commands acquire an exclusive lock on the table. This lock means no other process, including your Django application, can read from or write to the table while the ALTER TABLE is running. For a table with millions of rows, this can take a very long time, leading to significant downtime.
The Solution: Incremental Changes with Multiple Migrations
The trick is to break down the operation into multiple, smaller migrations, each of which is a non-blocking or minimally blocking operation. This allows you to deploy changes incrementally, keeping your application online throughout the process.
Let’s walk through a common scenario: adding a new, non-nullable CharField to an existing, large model.
Scenario: You have a Product model with millions of Product instances. You want to add a new sku field:
# models.py
class Product(models.Model):
name = models.CharField(max_length=255)
# ... other fields
sku = models.CharField(max_length=50, unique=True) # New field
If you just run makemigrations and migrate, Django will likely generate a migration that tries to add the sku column and then immediately tries to set it for all existing rows. This is the blocking part.
Step 1: Add the Column as Nullable
First, create a migration that adds the new column, but allows it to be NULL initially. This is a fast, non-blocking operation on most databases.
-
Generate the initial migration: Modify your model to include the new field as nullable:
# models.py class Product(models.Model): name = models.CharField(max_length=255) # ... other fields sku = models.CharField(max_length=50, unique=True, null=True, blank=True) # Allow null for nowRun
python manage.py makemigrations your_app_name. This will create a migration file likeyour_app_name/migrations/00XX_add_sku_to_product.py. -
Edit the generated migration: Open the generated migration file. You’ll see something like:
from django.db import migrations, models class Migration(migrations.Migration): dependencies = [ ('your_app_name', '00XX_previous_migration'), ] operations = [ migrations.AddField( model_name='product', name='sku', field=models.CharField(blank=True, max_length=50, null=True), ), ]This is good. The
AddFieldoperation withnull=Trueis typically non-blocking. -
Apply this migration: Run
python manage.py migrate your_app_name. This will add theskucolumn to yourproductstable without locking it for long.
Step 2: Populate the New Column (Background Job)
Now that the column exists and allows NULLs, you need to populate it for all existing rows. This is where you’ll use a background worker.
-
Create a management command for population: Create a file
your_app_name/management/commands/populate_product_skus.py:from django.core.management.base import BaseCommand from django.db import transaction from your_app_name.models import Product from django.utils import timezone class Command(BaseCommand): help = 'Populates the sku field for existing Product instances.' def add_arguments(self, parser): parser.add_argument('--batch-size', type=int, default=1000, help='Number of records to process per batch.') def handle(self, *args, **options): batch_size = options['batch_size'] # Use a transaction for each batch to ensure atomicity and release locks quickly. # Using select_for_update can be problematic on very large datasets if not handled carefully. # For zero-downtime, we generally avoid explicit locking here. # The goal is to update rows that are still NULL. products_to_update = Product.objects.filter(sku__isnull=True) total_updated = 0 while True: batch = list(products_to_update[:batch_size]) if not batch: break with transaction.atomic(): for product in batch: # Generate your SKU logic here. # For example, a simple hash or a UUID. # Ensure your SKU generation logic is robust and unique. product.sku = f"PROD-{product.id}-{timezone.now().strftime('%Y%m%d%H%M%S')}" product.save(update_fields=['sku']) # Only update the sku field total_updated += len(batch) self.stdout.write(f'Updated {total_updated} products...') self.stdout.write(self.style.SUCCESS(f'Finished populating SKUs. Total updated: {total_updated}'))Important: The
product.save(update_fields=['sku'])is crucial. It tells Django to only update theskucolumn, making the update faster and less prone to conflicts. The SKU generation logicf"PROD-{product.id}-{timezone.now().strftime('%Y%m%d%H%M%S')}"is a placeholder; you’ll need to replace it with your actual, robust SKU generation strategy. -
Run the background job: You can run this command using a task queue like Celery, or simply by executing it in a separate process. For a very large table, run it in batches:
python manage.py populate_product_skus --batch-size=5000Monitor its progress. This process will run for a long time, but it won’t lock your table because it’s only updating individual rows, not performing a schema-level
ALTER TABLE.
Step 3: Make the Column Non-Nullable and Unique
Once the background job has populated the sku for all existing rows (you can verify this by running Product.objects.filter(sku__isnull=True).count() and ensuring it’s 0), you can make the change permanent.
-
Modify the model again: Change your model to enforce non-nullability and uniqueness:
# models.py class Product(models.Model): name = models.CharField(max_length=255) # ... other fields sku = models.CharField(max_length=50, unique=True, null=False, blank=False) # No more null/blank -
Generate the second migration: Run
python manage.py makemigrations your_app_name. This will create a new migration, e.g.,00YY_make_sku_not_nullable.py. -
Edit the second migration: Open this new migration file. It will likely look like this:
from django.db import migrations, models class Migration(migrations.Migration): dependencies = [ ('your_app_name', '00XX_add_sku_to_product'), # Depends on the previous one ] operations = [ migrations.AlterField( model_name='product', name='sku', field=models.CharField(max_length=50, unique=True), # Now without null=True, blank=True ), ]This
AlterFieldoperation is what we need to be careful about. On PostgreSQL,AlterFieldto removeNULLconstraint and add aUNIQUEconstraint can still be blocking.Crucially, for PostgreSQL and similar databases, this
AlterFieldoperation needs to be split further if it’s blocking.PostgreSQL Specifics for
AlterField: To makeAlterFieldto add aUNIQUEconstraint and removenull=Truenon-blocking on PostgreSQL, you often need to do it in three steps:-
Migration 1 (already done): Add the column as nullable.
-
Migration 2 (background job): Populate the column for existing rows.
-
Migration 3 (new): Add the
UNIQUEconstraint. This is a fast operation on PostgreSQL if the data already satisfies it.- Edit your
00YY_make_sku_not_nullable.pymigration. - Remove the
migrations.AlterFieldoperation. - Add a new operation to add the unique constraint:
from django.db import migrations, models class Migration(migrations.Migration): dependencies = [ ('your_app_name', '00XX_add_sku_to_product'), ] operations = [ # Add the unique constraint first. This is fast if data is already unique. migrations.AddConstraint( model_name='product', constraint=models.UniqueConstraint(fields=['sku'], name='sku_unique_constraint'), ), ] - Apply this migration:
python manage.py migrate your_app_name.
- Edit your
-
Migration 4 (new): Remove
null=Trueandblank=Truefrom the field definition and generate a migration for it.- Change your model to
sku = models.CharField(max_length=50, unique=True). - Run
python manage.py makemigrations your_app_name. This will generate a migration that removes the nullability. - Edit this migration. It might try to do an
AlterFieldagain. If it does, you might need to manually write SQL to set theNOT NULLconstraint. However, often, if theUNIQUEconstraint is already in place and the data is clean, thisAlterFieldwill be fast. The key is that theUNIQUEconstraint is already enforced. - Apply this final migration:
python manage.py migrate your_app_name.
- Change your model to
Why this multi-step approach? On PostgreSQL, adding a
UNIQUEconstraint is a metadata-only operation if the data already adheres to it, making it very fast. Then, changing the column definition toNOT NULLis also fast because the database knows noNULLs exist. If you try to do bothNOT NULLandUNIQUEin a singleAlterFieldon a large table, the database might need to scan the entire table to enforce both, which can be blocking. -
Step 4: Remove Nullability and Blankability
After the unique constraint is in place and verified, you can make the final model change.
-
Modify the model: Remove
null=Trueandblank=Truefrom theskufield definition inmodels.py. -
Generate and apply migration: Run
python manage.py makemigrations your_app_nameand thenpython manage.py migrate your_app_name. This finalAlterFieldshould now be quick because theUNIQUEconstraint is already applied and the data is clean.
Summary of Zero-Downtime Migration Steps:
- Add Column (Nullable):
models.CharField(..., null=True, blank=True)->makemigrations->migrate. (Fast) - Populate Data: Run a background job (e.g., Celery task or management command) to fill the new nullable column for all existing rows. (Long-running, non-blocking)
- Add Unique Constraint: Manually write SQL or use
migrations.AddConstraintto add the unique constraint. ->makemigrations(if needed) ->migrate. (Fast, if data is clean) - Enforce Not Null: Modify model to remove
null=True, blank=True->makemigrations->migrate. (Fast, if previous steps were successful)
This phased approach, especially the separation of adding the constraint and enforcing NOT NULL, is key to avoiding table locks on large datasets.
The next hurdle you’ll likely encounter is managing the complexity of these multi-step migrations, especially when rolling back or dealing with concurrent schema changes.