Database Migration Strategy
This document outlines the strategies for safely migrating databases in SyncApp without service disruptions.
Types of Migrations
1. Schema Migrations
Schema migrations involve changes to the database structure, such as:
- Adding or removing tables
- Adding, modifying, or removing columns
- Changing column types
- Adding or removing indexes or constraints
2. Data Migrations
Data migrations involve changes to the data itself, such as:
- Moving data between tables
- Transforming existing data
- Adding or correcting data
- Data cleanup operations
Migration Tools
SyncApp uses Django's built-in migration framework as the primary tool for database migrations. Django migrations provide:
- Automatic migration generation based on model changes
- Forward and backward migration capability
- Dependency tracking for migrations
- Migration history tracking
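For reference, a generated migration is a plain Python file that declares its dependencies and operations explicitly; a minimal sketch (the app, model, and field names are illustrative):

```python
# myapp/migrations/0002_add_created_at.py (illustrative)
from django.db import migrations, models

class Migration(migrations.Migration):
    # Dependency tracking: this migration is applied only after 0001_initial
    dependencies = [
        ('myapp', '0001_initial'),
    ]

    operations = [
        migrations.AddField(
            model_name='mymodel',
            name='created_at',
            field=models.DateTimeField(null=True),
        ),
    ]
```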
Migration Process
Development Environment
Migration Creation
```bash
python manage.py makemigrations [app_name]
```
Migration Review
- Review generated migration files in [app_name]/migrations/
- Check for potentially dangerous operations
- Add atomic = False for operations that shouldn't run inside a transaction
Migration Testing
```bash
python manage.py migrate [app_name]
```
Rollback Testing
```bash
python manage.py migrate [app_name] [previous_migration]
```
Production Environment
For production migrations, we follow a zero-downtime migration pattern:
Preparation
- Backup the database
- Schedule the migration during low-traffic periods
- Announce a maintenance window to users if necessary
Code Deployment Stages
a. Compatibility Phase
- Deploy code that can work with both the old and new schemas (a sketch follows this list)
- Make all model changes backward compatible
- Use multiple database queries if necessary
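For example, during a hypothetical rename of full_name to display_name, compatibility-phase code might read from whichever column is populated (the model and field names are illustrative, not from SyncApp):

```python
from django.db import models

class Profile(models.Model):
    full_name = models.CharField(max_length=200)                 # old column, still present
    display_name = models.CharField(max_length=200, null=True)   # new column, backfilled later

    @property
    def name(self):
        # Prefer the new column; fall back to the old one until the backfill completes
        return self.display_name or self.full_name
```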
b. Migration Phase
- Run migrations from a single server, even when multiple application servers are deployed
- Monitor database performance during migrations
c. Cleanup Phase
- Deploy code that removes support for the old schema
- Remove any temporary compatibility code
Best Practices
Migration Safety
Avoid Direct Column Type Changes

Instead of:

```python
# Risky direct type change
operations = [
    migrations.AlterField(
        model_name='mymodel',
        name='my_field',
        field=models.IntegerField(),
    ),
]
```
Use:
```python
# Safer approach
operations = [
    migrations.AddField(
        model_name='mymodel',
        name='my_field_new',
        field=models.IntegerField(null=True),
    ),
    migrations.RunPython(migrate_data_forward, migrate_data_backward),
    migrations.RemoveField(
        model_name='mymodel',
        name='my_field',
    ),
    migrations.RenameField(
        model_name='mymodel',
        old_name='my_field_new',
        new_name='my_field',
    ),
]
```
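The migrate_data_forward and migrate_data_backward callables referenced above are ordinary functions taking (apps, schema_editor); a minimal sketch, assuming the old field held numeric strings (the conversion logic is illustrative):

```python
def migrate_data_forward(apps, schema_editor):
    # Use the historical model, not a direct import
    MyModel = apps.get_model('myapp', 'MyModel')
    for obj in MyModel.objects.all().iterator():
        obj.my_field_new = int(obj.my_field)  # illustrative conversion
        obj.save(update_fields=['my_field_new'])

def migrate_data_backward(apps, schema_editor):
    MyModel = apps.get_model('myapp', 'MyModel')
    for obj in MyModel.objects.all().iterator():
        obj.my_field = str(obj.my_field_new)
        obj.save(update_fields=['my_field'])
```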
Set Non-Nullable Fields Carefully

Add fields with null=True first, then fill in the data, then make them null=False:

```python
# Step 1
migrations.AddField(
    model_name='mymodel',
    name='my_field',
    field=models.CharField(max_length=100, null=True),
)

# Step 2 (in the same or a separate migration)
migrations.RunPython(fill_default_values)

# Step 3 (in a separate migration after deployment)
migrations.AlterField(
    model_name='mymodel',
    name='my_field',
    field=models.CharField(max_length=100, null=False),
)
```
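The fill_default_values callable in Step 2 follows the same RunPython signature; a minimal sketch, assuming a constant default (the value is illustrative):

```python
def fill_default_values(apps, schema_editor):
    MyModel = apps.get_model('myapp', 'MyModel')
    # Backfill only the rows that are still NULL
    MyModel.objects.filter(my_field__isnull=True).update(my_field='default')
```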
Avoid Long Transactions

For large tables, use atomic = False and perform operations in batches:

```python
class Migration(migrations.Migration):
    atomic = False

    operations = [
        # Operations here will not be wrapped in a transaction
    ]
```
Index Creation Strategies

Use CONCURRENTLY for index creation on large tables. Note that CREATE INDEX CONCURRENTLY cannot run inside a transaction, so the migration must set atomic = False:

```python
migrations.RunSQL(
    "CREATE INDEX CONCURRENTLY idx_name ON myapp_mymodel(field_name);",
    "DROP INDEX CONCURRENTLY idx_name;",
)
```
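On Django 3.0+ with PostgreSQL, the same effect is available without raw SQL via AddIndexConcurrently; a sketch with illustrative model, field, and index names:

```python
from django.contrib.postgres.operations import AddIndexConcurrently
from django.db import migrations, models

class Migration(migrations.Migration):
    atomic = False  # required: concurrent index creation cannot run in a transaction

    dependencies = [
        ('myapp', '0003_previous'),  # illustrative
    ]

    operations = [
        AddIndexConcurrently(
            model_name='mymodel',
            index=models.Index(fields=['field_name'], name='idx_name'),
        ),
    ]
```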
Performance Considerations
Large Table Migrations
For tables with millions of rows:
- Run migrations during off-peak hours
- Break down migrations into smaller parts
- Use data migrations that process data in batches (e.g., 1000 rows at a time), as sketched below
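A minimal sketch of such a batched data migration, assuming a hypothetical backfill of MyModel.my_field (the batch size, names, and default value are illustrative):

```python
from django.db import migrations

BATCH_SIZE = 1000  # illustrative

def backfill_in_batches(apps, schema_editor):
    """Backfill a column in primary-key order, one batch at a time."""
    MyModel = apps.get_model('myapp', 'MyModel')
    last_pk = 0
    while True:
        batch = list(
            MyModel.objects.filter(pk__gt=last_pk, my_field__isnull=True)
            .order_by('pk')[:BATCH_SIZE]
        )
        if not batch:
            break
        for obj in batch:
            obj.my_field = 'default'
        MyModel.objects.bulk_update(batch, ['my_field'])
        last_pk = batch[-1].pk

class Migration(migrations.Migration):
    atomic = False  # each batch commits on its own instead of one long transaction

    dependencies = [
        ('myapp', '0002_add_my_field'),  # illustrative
    ]

    operations = [
        migrations.RunPython(backfill_in_batches, migrations.RunPython.noop),
    ]
```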
Database Load Monitoring
During migrations, monitor:
- Database CPU and memory usage
- Disk I/O
- Lock contention (a spot-check sketch follows this list)
- Query performance
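Lock contention in particular is worth spot-checking; a minimal sketch for PostgreSQL, run from a Django shell (pg_locks is a standard PostgreSQL view):

```python
from django.db import connection

def check_lock_contention():
    """Return locks that are currently being waited on (PostgreSQL)."""
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT locktype, relation::regclass, mode, pid "
            "FROM pg_locks WHERE NOT granted;"
        )
        return cursor.fetchall()
```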
Database Connection Control
During migrations, adjust connection pools to:
- Reduce web server connections if necessary
- Prevent timeouts during long-running migrations (an illustrative settings sketch follows)
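In Django, these adjustments live in the DATABASES setting; a sketch with illustrative values (the database name is hypothetical, and connect_timeout assumes the psycopg driver):

```python
# settings.py (illustrative values for a migration window)
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'syncapp',       # hypothetical database name
        'CONN_MAX_AGE': 0,       # close connections promptly instead of persisting them
        'OPTIONS': {
            'connect_timeout': 10,  # fail fast if the database is saturated
        },
    }
}
```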
Rollback Strategy
For each migration, plan a rollback strategy:
Automatic Rollbacks
For simple migrations, Django can automatically rollback:
```bash
python manage.py migrate [app_name] [previous_migration]
```
Manual Rollbacks
For complex migrations, prepare custom SQL scripts:
```bash
python manage.py dbshell < rollback_scripts/rollback_migration_xxxx.sql
```
Point-in-time Recovery
For catastrophic failures, restore from backup:
```bash
# Using pg_restore for PostgreSQL
pg_restore -d mydatabase backup_file.dump
```
Migration Testing
Test all migrations against a copy of production data:
Staging Environment
- Apply migrations to a staging environment with production-like data
- Measure migration time and resource usage
- Verify application functionality after migration
Migration Time Estimation
For large tables, estimate migration time with the custom migrate_estimate management command:

```bash
# Start with a small sample and extrapolate
python manage.py migrate_estimate [app_name] [migration]
```
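If migrate_estimate is unavailable in an environment, a rough timing harness run against a staging copy can stand in; a minimal sketch using Django's call_command (the app and migration names are illustrative):

```python
import time

from django.core.management import call_command

start = time.monotonic()
call_command('migrate', 'myapp', '0042')  # illustrative app/migration
print(f'Migration applied in {time.monotonic() - start:.1f}s')
```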
Monitoring and Alerts
During migration execution:
Progress Monitoring
For long-running migrations:
```bash
# Custom management command
python manage.py migration_progress [app_name] [migration]
```
Alert Setup
Configure alerts for:
- Migration duration exceeding threshold
- Database resource usage spikes
- Application errors during migration
Documentation
For each significant migration:
Migration Documentation
Document in migration file:
""" Migration: Add user preferences table Impact: None, creates a new table without modifying existing tables Duration estimate: <1 second Rollback: Run 'python manage.py migrate users 0012' """
Release Notes
Include migration information in release notes:
Database Changes:
- Added User Preferences table
- Modified Sync table to include 'last_modified' column
- Added an index to improve sync listing performance
Emergency Procedures
For migration emergencies:
Migration Termination
If a migration is causing severe issues:
- For PostgreSQL, identify the blocking PID and terminate it:

  ```sql
  SELECT pid, query FROM pg_stat_activity WHERE state = 'active';
  SELECT pg_cancel_backend(PID);     -- for gentle termination
  SELECT pg_terminate_backend(PID);  -- for forced termination
  ```
Replication Lag Management
If migration causes replication lag in read replicas:
- Monitor lag:

  ```sql
  SELECT now() - pg_last_xact_replay_timestamp();
  ```

- Reduce application load to allow replicas to catch up
Database Recovery
In case of data corruption or catastrophic failure:
- Switch to standby database if available
- Restore from backup for affected data only if possible
- Restore complete database if necessary