Databaseadvanced

Managing Database Migrations

Database migrations are the part of the system where a tiny mistake creates an outage that lasts hours. This is the playbook I follow when migrating a production schema: expand-then-contract, dual writes when needed, idempotent scripts, a real rollback plan, and verification that does not depend on the migration tool reporting success. Boring on purpose, because exciting migrations are the dangerous ones.

45 min7 steps

Steps

Tools

Outcomes

advanced

Difficulty

Technologies used

PostgreSQLPrismaDrizzleNode.js

The methodology

The phases, in order

Each phase below is something I actually run in a project. The descriptions are how I think about the work, not abstract definitions.

Phase

Phase 1 of 7

Choose a Migration Pattern

For anything bigger than adding a nullable column, I use expand-contract: add the new shape, dual-write from the app, backfill, switch reads, then drop the old shape. Each phase is a separate deploy, never combined. Feature flags gate the cutover so a regression rolls back without a schema change.

Phase

Phase 2 of 7

Backwards-compatible Schema Design

The new schema must coexist with the old one for at least one deploy cycle. That means no renaming columns in place, no narrowing constraints without a backfill, and no destructive enum changes. Anything that breaks reads is a flag-day migration and gets a written incident plan before I touch it. See refactoring without freezing for the broader pattern.

Phase

Phase 3 of 7

Idempotent Migration Scripts

Every migration is idempotent: rerunning it must be a no-op. I use IF NOT EXISTS, CREATE OR REPLACE, and explicit version checks. Scripts are versioned in the repo with Prisma or Drizzle, and never edited after they have been applied in any environment. Bad migrations get a follow-up migration, not a rewrite.

Phase

Phase 4 of 7

Backfill Strategy

For large backfills I batch in chunks, throttle to avoid hammering the primary, and run from a worker job rather than the migration tool. The job is checkpointed so a crash resumes from the last completed batch. For huge tables I run the backfill from a read replica when possible.

Phase

Phase 5 of 7

Rollback Plan

Before I run a migration in production I write the rollback steps next to it. Most rollbacks are just disabling a feature flag and removing the new code, because expand-contract leaves the old shape intact. The plan is shared with whoever is on-call so they can act without paging me.

Phase

Phase 6 of 7

Deployment and Cutover

Migrations run in CI before the app deploy, never inline with app startup. I deploy schema changes first, app changes second, and verify each step before moving on. The CI/CD playbook covers the automation.

Phase

Phase 7 of 7

Verification and Drift Checks

After deploy I run a verification job: row counts before and after, sampled hash comparisons, and a smoke query that exercises the new code path. A schema drift checker runs nightly so unauthorized changes in production get caught early. This sounds excessive until the one time it catches a real divergence.

Results

What You'll Achieve

Expected outcomes from implementing this playbook

Zero-downtime migrations with no scary cutovers

Safe rollback procedures, written before the migration runs

Data integrity verification post-migration

A process that scales from solo dev to a team of twenty

Start a project if you need help with a high-stakes migration.

Use this playbook

Want me to run this with you?

The playbook is the public version. The private version is me running it for your team against a real deadline. If you have a project on the line, that is usually the faster path.

Start a project Just ask a question

Related insights

More on this thinking

The studio journal

Essays and notes that pair with the playbooks.

Insights in Database

Filter the journal for pieces on this topic.

Related blueprints

Reference architectures

All blueprints

Production-grade reference systems I have shipped.

Labs

Experiments where I prototype the playbooks in public.

Next up

AI Integration · 120 min

Building RAG Applications

Retrieval-augmented generation looks simple in a demo and stays simple until your knowledge base is bigger than a thousand documents, chunks overlap badly, or relevance scores stop making sense. This is my end-to-end RAG playbook: document processing, embedding pipelines, retrieval tuning, prompt design, and the evaluation harness that tells you whether changes are actually improving results.

Database

Related Playbooks

Other playbooks in this category

intermediate

Building Real-time Features with Supabase

Real-time features sell products, but they also break in spectacular ways if you do not respect the limits of the underlying broker. This playbook is how I add live subscriptions, presence, and optimistic updates to a React app using Supabase, without ending up with a UI that flickers on every reconnection. It covers schema design, RLS, presence, and the patterns that keep real-time work feeling instant rather than fragile.

advanced

Multi-Tenant SaaS Architecture

Multi-tenancy is one of the highest-leverage architectural decisions in a SaaS, and almost impossible to fix later. This playbook is the model I use to design tenant isolation that scales from ten customers to ten thousand: shared schema with row-level isolation, tenant-scoped routing, configuration, billing, and an admin layer that lets you operate the platform without breaking customer trust.

intermediate

Choosing a Database in 2026

Database choice is one of the highest-cost, hardest-to-reverse technical decisions in a product. This playbook is the decision framework I walk teams through when they ask me to weigh in: define the workload first, pick a sensible default, evaluate hosted options on real criteria, plan for scale before you need it, prototype the hardest query, and document the choice so the next engineer does not have to relitigate it.

Securing Your API Endpoints

Building RAG Applications