Zero Downtime Migration (ZDM) Patterns
Framework M provides the tools and patterns necessary to perform database schema updates without application downtime. This is achieved through a Two-Phase Migration strategy and a ZDM-Aware Readiness Probe.
1. The Two-Phase Migration Strategy
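To avoid breaking the application while migrations are running, every schema change must be forward-compatible: the application version that is already deployed must keep working against the updated schema.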
Phase 1: Forward-Compatible Schema Update
- Goal: Add new columns or tables without modifying existing ones.
- Action: Run m migrate.
- ZDM Property: The /ready probe will return 200 once mandatory Alembic revisions are applied.
- Code Change: Deploy application code that writes to both the old and new columns.
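The Phase 1 steps can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the production database and for the Alembic revision that m migrate would apply. The customer table, its full_name and display_name columns, and the function names are hypothetical and are not part of Framework M.

```python
import sqlite3


def apply_phase1(conn: sqlite3.Connection) -> None:
    """Phase 1: add the new column as nullable and leave the old column untouched."""
    # A column added without NOT NULL is nullable, so existing rows stay valid.
    conn.execute("ALTER TABLE customer ADD COLUMN display_name TEXT")


def create_customer(conn: sqlite3.Connection, full_name: str) -> int:
    """Dual write: populate both the legacy column and the new column."""
    cur = conn.execute(
        "INSERT INTO customer (full_name, display_name) VALUES (?, ?)",
        (full_name, full_name),
    )
    return cur.lastrowid


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)"
    )
    # Row written by the old code, before the migration.
    conn.execute("INSERT INTO customer (full_name) VALUES ('Ada')")
    apply_phase1(conn)
    # Row written by the new, dual-writing code.
    create_customer(conn, "Grace")
    return conn.execute(
        "SELECT full_name, display_name FROM customer ORDER BY id"
    ).fetchall()
```

Rows created before the migration keep a NULL in the new column until the Phase 2 backfill runs, which is why the column has to be nullable.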
Phase 2: Data Backfill & Cleanup
- Goal: Migrate existing data to the new schema and remove legacy fields.
- Action: Run a background job for backfilling.
- ZDM Property: The /ready probe remains positive during this phase, as the schema is already compatible.
- Final Step: Once backfill is complete, deploy a second version of the code that only uses the new column, and finally run a migration to drop the old column.
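A background backfill is usually written as a loop over small batches so that each transaction stays short. The sketch below shows one way to do that, again using sqlite3 as a stand-in database. The customer table, its columns, and the function name are hypothetical, and the way Framework M schedules background jobs is not shown.

```python
import sqlite3


def backfill_display_name(conn: sqlite3.Connection, batch_size: int = 500) -> int:
    """Copy full_name into display_name in batches; return the number of rows updated."""
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE customer
               SET display_name = full_name
             WHERE id IN (
                   SELECT id FROM customer
                    WHERE display_name IS NULL
                      AND full_name IS NOT NULL  -- guarantees the loop terminates
                    LIMIT ?
             )
            """,
            (batch_size,),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```

Because the job only touches rows where the new column is still NULL, it can be stopped and restarted safely, and it does not overwrite values already written by the dual-writing application code.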
2. ZDM-Aware Readiness Probe
The /ready endpoint uses the MigrationStatusUtility to determine if a pod should receive traffic. It distinguishes between mandatory and optional sync states.
- Mandatory (Blocks Readiness): Pending Alembic revisions (Phase 1).
- Optional (Allows Readiness): Incomplete declarative sync or background data backfills.
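The mandatory versus optional distinction can be sketched as a small decision function. This is not the actual MigrationStatusUtility API, which is not shown here: the MigrationState class, its fields, and the readiness function are hypothetical, and the 503 status for "not ready" is an assumption (the text above only specifies 200 for "ready").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationState:
    """Hypothetical snapshot of migration state (not Framework M's real API)."""

    pending_revisions: int          # mandatory: unapplied Alembic revisions
    declarative_sync_pending: bool  # optional
    backfill_running: bool          # optional


def readiness(state: MigrationState) -> tuple:
    """Return (http_status, message) for a /ready style endpoint."""
    if state.pending_revisions > 0:
        # Mandatory work outstanding: keep the pod out of rotation.
        # 503 is an assumed "not ready" status code.
        return 503, f"{state.pending_revisions} Alembic revision(s) pending"

    # Optional work is reported but does not block traffic.
    notes = []
    if state.declarative_sync_pending:
        notes.append("declarative sync in progress")
    if state.backfill_running:
        notes.append("backfill in progress")
    return 200, "; ".join(notes) or "ok"
```

For example, readiness(MigrationState(0, True, True)) reports both optional tasks in its message but still returns 200, while any pending revision returns 503.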
Graceful Sync
You can enable "Graceful Sync" via environment variables to allow pods to start even if minor declarative changes are still being synchronized.
CHECK_DECLARATIVE_SYNC=False
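How Framework M parses this variable internally is not documented here. As an illustration only, a boolean flag of this kind is commonly read as in the sketch below. The env_flag helper, its accepted spellings, and the explicit default argument are assumptions.

```python
import os

# Spellings treated as "false"; this set is an assumption for the sketch.
_FALSE_VALUES = {"", "0", "false", "no", "off"}


def env_flag(name: str, default: bool, environ=None) -> bool:
    """Read a boolean flag from the environment, case-insensitively."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() not in _FALSE_VALUES
```

With CHECK_DECLARATIVE_SYNC=False set, env_flag("CHECK_DECLARATIVE_SYNC", default=True) evaluates to False, which a readiness check could then use to skip the declarative sync check.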
3. Proof of Pattern: Integration Test
Framework M includes an official integration test that validates this pattern: libs/framework-m-standard/tests/adapters/db/test_zdm_scenario.py.
This test simulates:
- An "Old" application version.
- A Phase 1 migration adding a nullable column.
- Confirmation that the /ready probe stays positive and CRUD still works during the transition.
- Verification of a background backfill job.
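The core idea of the scenario can be reproduced in miniature. The sketch below is not the contents of test_zdm_scenario.py. It uses sqlite3 in memory, a hypothetical todo table, and hypothetical function names to check that statements issued by an "old" release keep working after a nullable column is added.

```python
import sqlite3


def old_version_crud(conn: sqlite3.Connection) -> bool:
    """Run CRUD as the 'old' release would: it only knows about the title column."""
    cur = conn.execute("INSERT INTO todo (title) VALUES (?)", ("write docs",))
    row_id = cur.lastrowid
    conn.execute("UPDATE todo SET title = ? WHERE id = ?", ("ship docs", row_id))
    title = conn.execute(
        "SELECT title FROM todo WHERE id = ?", (row_id,)
    ).fetchone()[0]
    conn.execute("DELETE FROM todo WHERE id = ?", (row_id,))
    remaining = conn.execute(
        "SELECT COUNT(*) FROM todo WHERE id = ?", (row_id,)
    ).fetchone()[0]
    return title == "ship docs" and remaining == 0


def run_scenario() -> dict:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE todo (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    before = old_version_crud(conn)
    # Phase 1 migration: add a nullable column the old code does not know about.
    conn.execute("ALTER TABLE todo ADD COLUMN priority INTEGER")
    after = old_version_crud(conn)
    return {"before": before, "after": after}
```

The old code passes both before and after the migration because it names its columns explicitly and the new column accepts NULL.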
4. Best Practices
- Always use Nullable: When adding columns in Phase 1, always make them nullable=True.
- Avoid DROP in Phase 1: Never drop a column in the same migration that adds a new one.
- Monitor Readiness: Observe the /ready status during your deployment pipeline to ensure traffic isn't routed to pods with missing mandatory revisions.
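For the readiness monitoring practice, a deployment pipeline step could poll the probe until it returns 200 or a timeout expires. The sketch below uses only the Python standard library. The function names, timeout values, and the example URL are assumptions, not part of Framework M.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str) -> int:
    """Return the HTTP status code for url, or 0 if the pod is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # non-2xx responses still carry a status code
    except OSError:
        return 0  # connection refused, DNS failure, timeout, ...


def wait_until_ready(
    check: Callable[[], int],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check() until it returns 200; give up after timeout_s seconds."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check() == 200:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

A pipeline step might call wait_until_ready(lambda: http_status("http://localhost:8000/ready")) and fail the rollout when it returns False. The host and port in that URL are placeholders.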