The Production Deployment Checklist Senior Devs Never Skip (2026)

The exact 12-step checklist senior engineers run before every production deploy — migration order, feature flags, health checks, and rollback plans.

May 27, 2026 · 12 min read

Most outages aren't caused by bad code. They're caused by good code deployed in the wrong order.

Senior developers don't rely on memory before a deploy. They run a checklist — every single time, even for a one-line change.

Here's the exact checklist, and why each step exists.

Why checklists exist

Pilots don't skip the pre-flight checklist because they've flown 10,000 hours. They do it because they've flown 10,000 hours — enough to know exactly what happens when you skip a step.

The same principle applies to production deploys. Every step in this checklist exists because someone, somewhere, had an outage from skipping it.

The 12-step checklist

| #  | Check | Why it matters |
|----|-------|----------------|
| 1  | Env vars validate at build | Silent undefined in prod = 3 AM alert |
| 2  | Migrations run BEFORE deploy | New code can't see old schema |
| 3  | No drizzle-kit push in prod | Applies changes without migration files |
| 4  | Feature flag OFF for new features | Ship code off, turn on after smoke test |
| 5  | Error monitoring configured | First error hits Sentry, not a user |
| 6  | Health check endpoint responds | Load balancer needs /api/health |
| 7  | Rate limiting on auth endpoints | Login brute-force = account takeover |
| 8  | Secrets in env manager, not code | Rotating a secret shouldn't need a code change |
| 9  | Stripe webhooks tested | Webhook signature fails silently |
| 10 | Rollback plan ready | Know the previous deploy hash |
| 11 | Smoke test the critical path | Log in → do the main action → verify |
| 12 | Alert channel exists | Errors go somewhere humans actually see |

Step 1 — Env vars validate at build time

If you're using process.env.THING directly, your app will start and fail at runtime when THING is undefined. The error happens in production, at 2 AM, in front of your first real user.

With t3-env, the build fails — which is exactly what you want:

// src/lib/env.ts
import { createEnv } from '@t3-oss/env-nextjs'
import { z } from 'zod'
 
export const env = createEnv({
  server: {
    DATABASE_URL: z.string().url(),
    CLERK_SECRET_KEY: z.string().min(1),
    STRIPE_SECRET_KEY: z.string().min(1),
    STRIPE_WEBHOOK_SECRET: z.string().min(1),
    SENTRY_DSN: z.string().url(),
  },
  client: {
    NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY: z.string().min(1),
  },
  runtimeEnv: {
    DATABASE_URL: process.env.DATABASE_URL,
    CLERK_SECRET_KEY: process.env.CLERK_SECRET_KEY,
    STRIPE_SECRET_KEY: process.env.STRIPE_SECRET_KEY,
    STRIPE_WEBHOOK_SECRET: process.env.STRIPE_WEBHOOK_SECRET,
    SENTRY_DSN: process.env.SENTRY_DSN,
    NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY: process.env.NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY,
  },
})

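If STRIPE_WEBHOOK_SECRET is missing from Vercel, next build fails, provided env.ts is actually loaded during the build (t3-env recommends importing it from your Next.js config for this reason). You catch it before a single user sees anything.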

Pro tip

Add every new env var to env.ts the same moment you add it to .env.local. Never add one without the other.
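
If you're not using t3-env, the same fail-fast idea can be sketched without a library. This is a minimal, dependency-free illustration, not how t3-env works internally. The names findMissingEnv and assertEnv are hypothetical, and the environment is passed in as a plain record so the check is easy to test:

```typescript
// Hypothetical helpers: a dependency-free sketch of fail-fast env validation.
type EnvSource = Record<string, string | undefined>

// Returns the names of required variables that are missing or blank.
function findMissingEnv(source: EnvSource, required: readonly string[]): string[] {
  return required.filter((name) => {
    const value = source[name]
    return value === undefined || value.trim() === ''
  })
}

// Throws a single error listing every missing variable, so a build or boot
// script fails once with the full list instead of one variable at a time.
function assertEnv(source: EnvSource, required: readonly string[]): void {
  const missing = findMissingEnv(source, required)
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(', ')}`)
  }
}
```

Calling assertEnv(process.env, ['DATABASE_URL', 'STRIPE_WEBHOOK_SECRET']) at the top of a build script would approximate the behavior described above, without the type inference that t3-env adds.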

Step 2 — Migrations run before deploy, always

This is the most important rule in production database management.

❌ WRONG:   Deploy code → Run migrations
✅ CORRECT: Run migrations → Deploy code

Why: during a Vercel deployment, both the old and new versions of your app run simultaneously for a few seconds. The new code expects the new schema. If you deploy code first, new code breaks on the old schema during that window.
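
The overlap window can be reasoned about with a toy model. The sketch below is an illustration only: Schema, CodeVersion, and canRun are hypothetical names, and a schema is reduced to a set of column names. It shows why applying an additive migration first keeps both versions working, while deploying code first does not:

```typescript
// Toy model: a schema is just the set of columns that exist (hypothetical types).
type Schema = ReadonlySet<string>

interface CodeVersion {
  name: string
  columnsUsed: readonly string[] // columns this version reads or writes
}

// A version can run only if every column it touches exists in the schema.
function canRun(code: CodeVersion, schema: Schema): boolean {
  return code.columnsUsed.every((column) => schema.has(column))
}

const oldSchema: Schema = new Set(['id', 'email'])
const newSchema: Schema = new Set(['id', 'email', 'plan']) // additive migration

const oldCode: CodeVersion = { name: 'v1', columnsUsed: ['id', 'email'] }
const newCode: CodeVersion = { name: 'v2', columnsUsed: ['id', 'email', 'plan'] }

// Migrate first: during the overlap both versions see the new schema and work.
const migrateFirstIsSafe = canRun(oldCode, newSchema) && canRun(newCode, newSchema)

// Deploy first: during the overlap the new version hits the old schema and fails.
const deployFirstIsSafe = canRun(oldCode, oldSchema) && canRun(newCode, oldSchema)
```

In this model migrateFirstIsSafe is true and deployFirstIsSafe is false. The model only covers additive changes; dropping or renaming a column would break the old version even with the correct order.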

With Drizzle:

# Never in production
npx drizzle-kit push
 
# Always for production
npx drizzle-kit generate   # creates the migration file (run locally, then commit it)
npx drizzle-kit migrate    # applies the committed migrations to the database

Run migrations manually before Vercel deploys, or add a migration step to your GitHub Actions workflow, covered in the "Automate the checklist" section of this guide.

For the full breakdown of safe vs dangerous operations, see the zero-downtime migrations guide.

Step 3 — drizzle-kit push is banned in production

push applies your schema changes directly, without generating migration files. It's designed for development — fast iteration, no noise.

In production, it means:

  • No audit trail of what changed
  • No ability to roll back a migration
  • Risk of accidental data loss with no undo

Add this rule to your CLAUDE.md and your team's internal docs:

## Database rules
- Never use `drizzle-kit push` in production
- Always `generate` then `migrate`
- Migration files are committed alongside the code that requires them

Step 4 — Feature flags for every new feature

The classic failure mode:

❌ Ship → Users see broken feature → Emergency rollback
✅ Ship (flag OFF) → Smoke test in production → Turn flag ON → Gradual rollout

With Vercel Edge Config feature flags:

import { get } from '@vercel/edge-config'
 
export async function isNewDashboardEnabled(userId: string) {
  const config = await get<{ enabledUserIds: string[] }>('new-dashboard')
  return config?.enabledUserIds.includes(userId) ?? false
}

New feature ships disabled. You test it in production with your own account. When it works, you enable it for 5% of users. If something breaks at 5%, you turn the flag off — no rollback, no deploy, 10 seconds to fix.
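
The allowlist above covers testing with your own account, but the 5% step needs a way to pick the same users consistently. One common approach (an assumption here, not something the Edge Config snippet provides) is to hash the user ID into a bucket from 0 to 99 and compare it with the rollout percentage. The names bucketFor and isInRollout are hypothetical, and FNV-1a is used only because it is small and deterministic:

```typescript
// FNV-1a 32-bit hash over UTF-16 code units: small and deterministic.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) >>> 0
  }
  return hash >>> 0
}

// Maps a user to a stable bucket in [0, 99]. Including the flag name keeps a
// user from landing in the same bucket for every flag.
function bucketFor(flagName: string, userId: string): number {
  return fnv1a(`${flagName}:${userId}`) % 100
}

// True when the user's bucket falls below the rollout percentage (0 to 100).
function isInRollout(flagName: string, userId: string, percentage: number): boolean {
  if (percentage <= 0) return false
  if (percentage >= 100) return true
  return bucketFor(flagName, userId) < percentage
}
```

Because the bucket depends only on the flag name and user ID, raising the percentage from 5 to 50 keeps every user who already had the feature and adds new ones. The percentage itself could live in the same Edge Config item as the allowlist.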

Step 5 — Error monitoring before go-live

The key word is before. Your error monitoring must be live and verified before you ship the code that might error.

// sentry.client.config.ts
import * as Sentry from '@sentry/nextjs'
 
Sentry.init({
  dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
  environment: process.env.NODE_ENV,
  // Sample 10% of transactions in production — 100% in dev
  tracesSampleRate: process.env.NODE_ENV === 'production' ? 0.1 : 1.0,
  beforeSend(event) {
    // Don't send events in development
    if (process.env.NODE_ENV === 'development') return null
    return event
  },
})

Verify it works before deploying: throw a test error manually, confirm it shows up in your Sentry dashboard.

Full observability setup from day one — Sentry, PostHog, and structured logging.

Step 6 — Health check endpoint

Load balancers, uptime monitors, and deployment systems all need a URL to ping. If you don't have one, the first sign of a database outage is a user telling you.

// src/app/api/health/route.ts
import { db } from '@/lib/db'
import { sql } from 'drizzle-orm'
 
export const runtime = 'nodejs'
 
export async function GET() {
  try {
    await db.execute(sql`SELECT 1`)
    return Response.json(
      { status: 'ok', db: 'connected', ts: Date.now() },
      { headers: { 'Cache-Control': 'no-store' } }
    )
  } catch (err) {
    return Response.json(
      { status: 'error', db: 'disconnected' },
      { status: 503 }
    )
  }
}

This checks the actual database connection, not just that Next.js started. Set up an uptime monitor (BetterStack, UptimeRobot, Checkly) to hit /api/health every 60 seconds. If it returns 503, you get alerted before your users do.
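
If the endpoint later grows to cover more than the database (an assumption; the route above checks only the database), the pass/fail logic can be kept in one small function. CheckResult, HealthReport, and summarizeHealth are hypothetical names, not part of Next.js or Drizzle:

```typescript
// Hypothetical shapes for illustration; not part of Next.js or Drizzle.
type CheckResult = { name: string; ok: boolean }

interface HealthReport {
  status: 'ok' | 'error'
  httpStatus: 200 | 503
  failing: string[]
}

// Any failing dependency marks the whole service unhealthy, so the monitor
// sees a 503 rather than a 200 with an error buried in the body.
function summarizeHealth(checks: readonly CheckResult[]): HealthReport {
  const failing = checks.filter((check) => !check.ok).map((check) => check.name)
  const healthy = failing.length === 0
  return {
    status: healthy ? 'ok' : 'error',
    httpStatus: healthy ? 200 : 503,
    failing,
  }
}
```

The route handler would run each probe, pass the results to summarizeHealth, and use httpStatus as the response status, which keeps the status code and the body from drifting apart.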

Step 7 — Rate limiting on auth endpoints

Auth endpoints are the most targeted on any public app. Without rate limiting, a brute-force attack on your login endpoint is trivial — a script can try 10,000 passwords while you sleep.

// src/app/api/auth/login/route.ts
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'
 
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(5, '15 m'), // 5 attempts per 15 minutes per IP
  analytics: true,
})
 
export async function POST(request: Request) {
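  // x-forwarded-for can hold a comma-separated chain; the first entry is the client
  const ip = request.headers.get('x-forwarded-for')?.split(',')[0]?.trim() || 'unknown'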
  const { success, reset } = await ratelimit.limit(`login:${ip}`)
 
  if (!success) {
    return Response.json(
      { error: 'Too many attempts. Try again later.' },
      {
        status: 429,
        headers: { 'Retry-After': String(Math.ceil((reset - Date.now()) / 1000)) },
      }
    )
  }
 
  // proceed with auth logic
}

Full rate limiting guide with Upstash.
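
To see what "5 attempts per 15 minutes" means in practice, here is an in-memory sliding-window log limiter. It is a simplified stand-in for illustration, not Upstash's algorithm: Upstash keeps its state in Redis, while this sketch keeps it in one process and so would not work across serverless instances. SlidingWindowLimiter is a hypothetical name, and the current time is passed in to make the behavior easy to test:

```typescript
// In-memory sliding-window log limiter (illustration only, single process).
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>()
  private readonly maxRequests: number
  private readonly windowMs: number

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests
    this.windowMs = windowMs
  }

  // Checks an attempt for `key` at time `now` (milliseconds). Allowed attempts
  // are recorded; blocked attempts are not. `reset` is the timestamp at which
  // the oldest recorded attempt leaves the window.
  limit(key: string, now: number): { success: boolean; reset: number } {
    const windowStart = now - this.windowMs
    const recent = (this.hits.get(key) ?? []).filter((t) => t > windowStart)

    if (recent.length >= this.maxRequests) {
      this.hits.set(key, recent)
      const oldest = recent[0] ?? now
      return { success: false, reset: oldest + this.windowMs }
    }

    recent.push(now)
    this.hits.set(key, recent)
    const oldest = recent[0] ?? now
    return { success: true, reset: oldest + this.windowMs }
  }
}
```

With new SlidingWindowLimiter(5, 15 * 60 * 1000), the sixth attempt from one key inside 15 minutes is rejected, and the reset value feeds the same Retry-After calculation used in the route above.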

Step 8 — Secrets in your env manager, not in code

Three rules for secrets in production:

  1. Never in code: not even encrypted, not even in a comment
  2. Never in git: .env.local is gitignored for a reason
  3. Rotate without a code change: secrets change in Vercel's env dashboard, not in a commit

// Wrong: rotating this secret requires a code change, a review, and a deploy
const stripe = new Stripe('sk_live_abc123')
 
// Right: rotating means updating the var in the Vercel dashboard and redeploying; no code changes
import { env } from '@/lib/env'
const stripe = new Stripe(env.STRIPE_SECRET_KEY)

If a secret leaks, you want rotation to take a dashboard edit and a redeploy, not a code change, a review, and a merge. On Vercel, an updated variable applies only to new deployments, so redeploy after changing it.

Step 9 — Stripe webhook signature verification

This is the step that bites almost everyone. Stripe sends webhooks with a signature in the Stripe-Signature header. If you don't verify it, anyone can POST to your webhook endpoint and trigger fake payment events.

// src/app/api/webhooks/stripe/route.ts
import Stripe from 'stripe'
import { env } from '@/lib/env'
 
const stripe = new Stripe(env.STRIPE_SECRET_KEY)
 
export async function POST(request: Request) {
  // Must use raw text — JSON.parse() breaks the signature
  const body = await request.text()
  const signature = request.headers.get('stripe-signature')!
 
  let event: Stripe.Event
  try {
    event = stripe.webhooks.constructEvent(body, signature, env.STRIPE_WEBHOOK_SECRET)
  } catch {
    return new Response('Invalid signature', { status: 400 })
  }
 
  // Safe to handle now
  switch (event.type) {
    case 'customer.subscription.updated':
      // handle...
      break
  }
 
  return new Response(null, { status: 200 })
}

Test before every deploy that touches webhook logic:

stripe listen --forward-to localhost:3000/api/webhooks/stripe
stripe trigger customer.subscription.updated

The full idempotency and webhook guide is in the SaaS Stripe webhooks article.
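
Signature verification proves an event came from Stripe, but Stripe can deliver the same event more than once, which is the idempotency concern mentioned above. A minimal sketch of deduplicating by event ID follows. WebhookEvent and EventDeduplicator are hypothetical names, and the in-memory Set is an assumption for illustration; a real handler would need durable storage:

```typescript
// Hypothetical minimal event shape; real Stripe events carry many more fields.
interface WebhookEvent {
  id: string
  type: string
}

// Tracks processed event IDs in memory. In production this would need durable
// storage, for example a database table with a unique constraint on the ID.
class EventDeduplicator {
  private readonly seen = new Set<string>()

  // Runs `handler` only the first time an event ID is seen.
  // Returns true if the handler ran, false if the event was a duplicate.
  handleOnce(event: WebhookEvent, handler: (event: WebhookEvent) => void): boolean {
    if (this.seen.has(event.id)) return false
    handler(event)
    // Mark as seen only after the handler succeeds, so a thrown error
    // leaves the event eligible for a later retry.
    this.seen.add(event.id)
    return true
  }
}
```

A duplicate should still get a 200 response, since the event has already been handled and another retry would serve no purpose.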

Step 10 — Know your rollback plan before you deploy

Before clicking deploy, answer this question: if this breaks, what's the first step?

On Vercel:

  1. Dashboard → Deployments
  2. Find the last working deployment
  3. Click "..." → "Promote to Production"

This takes 30 seconds. But you need to know where it is before you're in panic mode at midnight.

Warning

Rolling back code doesn't roll back the database. If your deploy included a migration, rolling back the code leaves the new schema in place. This is why every migration must be backward compatible with the previous version of your code.

The expand-contract pattern ensures your migrations are always safe to roll back.

Step 11 — Smoke test the critical path

After every deploy, manually run through the one flow that would destroy you if it broke:

  1. Sign up or log in
  2. Do the core action (create a project, submit a form, process a payment)
  3. Verify the outcome (data is saved, email sent, webhook fired, UI updated)

This takes 2 minutes. Skip it once and you'll spend 2 hours recovering from the deploy you didn't check.

The critical path is different for every product. Know yours before you start deploying.

Step 12 — Alert channel that humans actually see

"Errors go to Sentry" is not an alert strategy if nobody checks Sentry.

The pattern that works:

Sentry error  → Slack #alerts (immediate)
503 health check → PagerDuty or email (immediate)
Stripe webhook failure → Slack #payments (immediate)
Daily summary → Slack #ops (every morning)

Set this up once. When something breaks at 2 AM, a human sees it within 5 minutes — not discovers it at 9 AM when users have been complaining for 7 hours.
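
The routing table above can be kept in code so it is reviewed like everything else. This is a sketch under assumptions: AlertKind, Route, and routeAlert are hypothetical names, the destinations are copied from the list above, and actually delivering to Slack or PagerDuty is left out:

```typescript
// Alert kinds and destinations taken from the routing list above.
type AlertKind =
  | 'sentry_error'
  | 'health_check_503'
  | 'stripe_webhook_failure'
  | 'daily_summary'

interface Route {
  destination: string
  immediate: boolean // false means batched, e.g. the morning summary
}

// Record<AlertKind, Route> makes the compiler reject a new alert kind
// that has no route, so nothing can silently go nowhere.
const ROUTES: Record<AlertKind, Route> = {
  sentry_error: { destination: 'Slack #alerts', immediate: true },
  health_check_503: { destination: 'PagerDuty or email', immediate: true },
  stripe_webhook_failure: { destination: 'Slack #payments', immediate: true },
  daily_summary: { destination: 'Slack #ops', immediate: false },
}

function routeAlert(kind: AlertKind): Route {
  return ROUTES[kind]
}
```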

The full deploy sequence

In order, every time:

1.  Merge PR to main
2.  CI runs: lint → typecheck → build (validates env vars)
3.  CI runs: database migrations
4.  Vercel auto-deploys
5.  Smoke test the critical path (2 minutes)
6.  Check Sentry for new errors (first 10 minutes)
7.  If new feature: turn flag ON for 5% of users
8.  Monitor for 30 minutes
9.  Roll out to 100%, or roll back

Automate the checklist

The best checklist is one that runs without you:

# .github/workflows/deploy.yml
name: Deploy
 
on:
  push:
    branches: [main]
 
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '22'
      - run: npm ci
      - run: npm run typecheck
      - run: npm run lint
      - run: npm run build  # fails if env vars missing
 
  migrate:
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npx drizzle-kit migrate
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
 
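# Note: Vercel's Git integration starts its own build on the same push, in
# parallel with this workflow, so nothing here makes it wait for migrate.
# To guarantee the order, turn off automatic deployments and trigger the
# production deploy from a job that declares `needs: migrate`.

check runs first and validates everything. migrate runs after check and updates the database. For migrations to be applied before the new code goes live, the deploy itself has to be triggered after migrate finishes; a push-triggered Vercel build runs independently of this workflow.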

What juniors skip (and why it hurts)

| Skip | Consequence |
|------|-------------|
| Env validation | undefined reads silently, crashes at runtime |
| Migration order | New code breaks on old schema during deploy window |
| Feature flags | Real users are your QA team |
| Health check | Outages discovered by users, not monitors |
| Rate limiting on auth | Login brute-forced while you sleep |
| Stripe signature | Anyone can fire fake payment events |
| Rollback plan | Panic decisions under pressure |
| Smoke test | Broken flow discovered by your best customer |

This checklist is 5 minutes before a deploy that saves 5 hours after one. Seniors run it on every push — even the "it's just a typo fix" ones. Especially those.

For the full project setup that makes all of this easier from day one, see How Senior Devs Start a Full-Stack Project in 2026.
