Claude Code
|stacknotice.com
12 min left|
0%
|2,400 words
Claude Code

Claude Code TDD: Force Red-Green-Refactor with Hooks & CLAUDE.md (2026)

Stop letting Claude write implementation before tests. Set up a real TDD workflow in Claude Code using CLAUDE.md rules, hooks for auto-test-runs, and red-green-refactor discipline.

C
Carlos Oliva
Software Developer
June 11, 202612 min read
Share:
Claude Code TDD: Force Red-Green-Refactor with Hooks & CLAUDE.md (2026)

The problem with AI-assisted development and TDD isn't that the AI can't write tests — it's that if you don't constrain it, the AI writes the implementation first, then writes tests that perfectly match that implementation. You end up with 100% test coverage and zero confidence that the tests actually catch bugs.

Real TDD means writing a failing test before any implementation exists. The test defines the behavior. The implementation satisfies the test. This guide shows how to configure Claude Code so that TDD isn't a guideline you might forget — it's the only workflow available.

The Problem: AI's Default Instinct

Give Claude Code a feature request without TDD constraints and this is what happens:

  1. Claude writes the implementation
  2. Claude writes tests that call the implementation
  3. Tests pass on the first run
  4. You have tests that test exactly what the code does, not what it should do

That's not TDD. Those tests will never catch a regression unless the code changes in a way that happens to break the exact path the tests exercise. They're documentation, not safety nets.

The Fix: Constrain the Workflow in CLAUDE.md

The most reliable way to enforce TDD with Claude Code is to make it non-negotiable in your CLAUDE.md. Claude Code reads and follows CLAUDE.md instructions precisely — this is how you make TDD a rule, not a suggestion.

# CLAUDE.md
 
## Development workflow — STRICT TDD (always follow this order)
 
1. **RED**: Write a failing test that describes the desired behavior.
   - Run the test suite: `npm test` — confirm the new test FAILS.
   - Do NOT write implementation before this step.
   - If tests pass before implementation exists, the test is wrong. Delete and rewrite.
 
2. **GREEN**: Write the *minimum* code to make the failing test pass.
   - No extra logic, no "while we're here" additions.
   - Run tests: `npm test` — confirm ALL tests pass.
 
3. **REFACTOR**: Clean up without changing behavior.
   - Run tests after every change: `npm test` — must stay green.
 
**Rules:**
- Never write implementation code without a failing test first.
- Never write more implementation than the current failing test requires.
- One test cycle at a time — complete RED → GREEN → REFACTOR before the next feature.
- If asked to add a feature, say "I'll write the test first" and do so.

This isn't magic — it's telling Claude Code exactly what order to do things in. Claude Code follows procedural instructions in CLAUDE.md reliably when they're specific.

Hooks: Auto-Run Tests After Every File Write

Claude Code hooks fire shell commands in response to tool events. Use PostToolUse hooks to automatically run your test suite every time Claude writes a file:

// .claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "npm test -- --reporter=verbose 2>&1 | tail -20"
          }
        ]
      }
    ]
  }
}

Now every time Claude writes or edits a file, your test suite runs automatically. The last 20 lines of output are shown in context, so Claude immediately sees whether the test is red or green.

For larger test suites where running everything is slow, scope to the file being changed:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "npm test -- --reporter=verbose --passWithNoTests 2>&1 | tail -30"
          }
        ]
      }
    ]
  }
}

With Vitest, you can also run in watch mode during development — but hooks work better for the TDD loop because they're synchronous with Claude's write operations.

For more on Claude Code hooks, see the complete hooks guide.

Setting Up the Test Environment

The examples below use Vitest with React Testing Library for unit/integration tests. See the Next.js testing guide with Vitest and Playwright for full setup.

npm install -D vitest @vitejs/plugin-react @testing-library/react @testing-library/user-event jsdom
// vitest.config.ts
import { defineConfig } from 'vitest/config'
import react from '@vitejs/plugin-react'
 
export default defineConfig({
  plugins: [react()],
  test: {
    environment: 'jsdom',
    setupFiles: ['./src/test/setup.ts'],
    globals: true,
  },
})
// src/test/setup.ts
import '@testing-library/jest-dom'

A Real TDD Cycle with Claude Code

Let's build a password strength validator. Here's exactly how to run this session with Claude Code.

Step 1: Write the Prompt

I need a `validatePassword` function in `src/lib/password.ts`.

Follow TDD strictly (as per CLAUDE.md):
1. Write a failing test in `src/lib/password.test.ts` first
2. Show me the test output (confirm it's RED)
3. Write the minimum implementation to make it pass
4. Confirm tests are GREEN
5. Refactor if needed

Requirements:
- At least 8 characters
- At least one uppercase letter
- At least one number
- Returns { valid: boolean, errors: string[] }

Step 2: Claude Writes the Test First

Claude follows the CLAUDE.md rules and writes:

// src/lib/password.test.ts
import { describe, it, expect } from 'vitest'
import { validatePassword } from './password'
 
describe('validatePassword', () => {
  it('returns valid for a strong password', () => {
    const result = validatePassword('SecurePass1')
    expect(result.valid).toBe(true)
    expect(result.errors).toHaveLength(0)
  })
 
  it('fails when password is too short', () => {
    const result = validatePassword('Sh0rt')
    expect(result.valid).toBe(false)
    expect(result.errors).toContain('Password must be at least 8 characters')
  })
 
  it('fails when no uppercase letter', () => {
    const result = validatePassword('lowercase1')
    expect(result.valid).toBe(false)
    expect(result.errors).toContain('Password must contain at least one uppercase letter')
  })
 
  it('fails when no number', () => {
    const result = validatePassword('NoNumbers')
    expect(result.valid).toBe(false)
    expect(result.errors).toContain('Password must contain at least one number')
  })
 
  it('returns multiple errors when multiple rules fail', () => {
    const result = validatePassword('weak')
    expect(result.valid).toBe(false)
    expect(result.errors).toHaveLength(3) // short, no uppercase, no number
  })
})

The hook fires and runs the tests. Output in context:

FAIL src/lib/password.test.ts
  ✗ validatePassword > returns valid for a strong password
    Error: Cannot find module './password'

Confirmed RED — the test fails because password.ts doesn't exist yet.

Step 3: Claude Writes Minimum Implementation

// src/lib/password.ts
export interface PasswordValidation {
  valid: boolean
  errors: string[]
}
 
export function validatePassword(password: string): PasswordValidation {
  const errors: string[] = []
 
  if (password.length < 8) {
    errors.push('Password must be at least 8 characters')
  }
 
  if (!/[A-Z]/.test(password)) {
    errors.push('Password must contain at least one uppercase letter')
  }
 
  if (!/[0-9]/.test(password)) {
    errors.push('Password must contain at least one number')
  }
 
  return { valid: errors.length === 0, errors }
}

Hook fires again. Output:

PASS src/lib/password.test.ts
  ✓ returns valid for a strong password (2ms)
  ✓ fails when password is too short (1ms)
  ✓ fails when no uppercase letter (1ms)
  ✓ fails when no number (1ms)
  ✓ returns multiple errors when multiple rules fail (1ms)

Test Files  1 passed (1)
Tests  5 passed (5)

GREEN. Now Claude refactors if needed (in this case the code is clean enough, nothing to refactor).

Testing React Components: The Same Discipline

TDD works for components too — and it's where the discipline pays off most, because component behavior is easy to test incorrectly.

Prompt:

Build a `<PasswordInput>` component that shows real-time strength feedback.
Follow TDD as per CLAUDE.md — test file first, then component.

Requirements:
- Input field with show/hide password toggle
- Strength indicator: weak / medium / strong
- Shows error messages from validatePassword
- Accessible (label, aria attributes)

Claude writes the test:

// src/components/PasswordInput.test.tsx
import { render, screen } from '@testing-library/react'
import userEvent from '@testing-library/user-event'
import { PasswordInput } from './PasswordInput'
 
describe('PasswordInput', () => {
  it('renders input with label', () => {
    render(<PasswordInput label="Password" />)
    expect(screen.getByLabelText('Password')).toBeInTheDocument()
  })
 
  it('toggles password visibility', async () => {
    const user = userEvent.setup()
    render(<PasswordInput label="Password" />)
 
    const input = screen.getByLabelText('Password')
    expect(input).toHaveAttribute('type', 'password')
 
    await user.click(screen.getByRole('button', { name: /show password/i }))
    expect(input).toHaveAttribute('type', 'text')
 
    await user.click(screen.getByRole('button', { name: /hide password/i }))
    expect(input).toHaveAttribute('type', 'password')
  })
 
  it('shows weak indicator for short passwords', async () => {
    const user = userEvent.setup()
    render(<PasswordInput label="Password" />)
 
    await user.type(screen.getByLabelText('Password'), 'weak')
    expect(screen.getByText('Weak')).toBeInTheDocument()
  })
 
  it('shows strong indicator for valid passwords', async () => {
    const user = userEvent.setup()
    render(<PasswordInput label="Password" />)
 
    await user.type(screen.getByLabelText('Password'), 'StrongPass1')
    expect(screen.getByText('Strong')).toBeInTheDocument()
  })
 
  it('shows error messages', async () => {
    const user = userEvent.setup()
    render(<PasswordInput label="Password" />)
 
    await user.type(screen.getByLabelText('Password'), 'nouppercase1')
    expect(screen.getByText('Password must contain at least one uppercase letter')).toBeInTheDocument()
  })
})

Hook confirms RED (component doesn't exist). Then Claude writes the component to satisfy exactly these tests — no extra features, no speculative behavior.

Handling Regressions: Where TDD Proves Its Value

Three weeks later, you decide to add a special character requirement. Without TDD, you'd change the validator and hope nothing breaks. With TDD:

Add a requirement to validatePassword: at least one special character (!@#$%^&*).
Follow TDD — write the failing test for the new requirement first.

Claude adds one test:

it('fails when no special character', () => {
  const result = validatePassword('NoSpecial1')
  expect(result.valid).toBe(false)
  expect(result.errors).toContain('Password must contain at least one special character')
})

Hook runs — this test fails, all others pass (RED for the right reason). Claude adds the check to the validator. All 6 tests pass. The existing tests act as a regression net — if the new code accidentally breaks the uppercase check, you'd see that immediately.

This is the value of TDD: confidence that changing one thing doesn't silently break another.

Dealing with Claude Skipping the TDD Process

Sometimes — especially mid-session when context is long — Claude starts writing implementation before tests. Use a short, firm intervention:

Stop. You wrote the implementation before the test.
Delete `src/lib/feature.ts`. Write the test first, confirm it's RED, then write the implementation.

Claude Code responds to explicit course-corrections. If this happens repeatedly, add a /compact to clear context and start fresh with the CLAUDE.md rules loaded cleanly.

You can also add a pre-commit hook to block commits when tests fail:

# .husky/pre-commit
#!/bin/sh
npm test -- --run

This catches any case where implementation was pushed without tests passing. See debugging workflows in Claude Code for more on hooks that catch problems before they ship.

Testing Server-Side Code: Route Handlers and Server Actions

TDD applies equally to backend code. For a Route Handler:

// app/api/users/route.test.ts
import { describe, it, expect, vi } from 'vitest'
import { GET } from './route'
 
// Mock the database
vi.mock('@/lib/db', () => ({
  db: {
    query: {
      users: {
        findMany: vi.fn().mockResolvedValue([
          { id: '1', name: 'Alice', email: 'alice@example.com' },
        ]),
      },
    },
  },
}))
 
describe('GET /api/users', () => {
  it('returns users list', async () => {
    const response = await GET(new Request('http://localhost/api/users'))
    const data = await response.json()
 
    expect(response.status).toBe(200)
    expect(data).toHaveLength(1)
    expect(data[0].name).toBe('Alice')
  })
 
  it('returns 401 when not authenticated', async () => {
    // Test auth behavior before implementing it
    const response = await GET(
      new Request('http://localhost/api/users', {
        headers: { Authorization: '' },
      })
    )
    expect(response.status).toBe(401)
  })
})

Write the failing test, confirm RED, then build the route handler to satisfy it.

The CLAUDE.md TDD Template

Copy this into your project's CLAUDE.md:

## TDD — Non-negotiable workflow
 
**Every feature, every time:**
1. Write the test file first. No exceptions.
2. Run `npm test` — confirm the new test FAILS. If it passes, the test is wrong.
3. Write MINIMUM implementation to make the test pass.
4. Run `npm test` — confirm ALL tests pass.
5. Refactor. Run tests again after every change.
 
**If you catch yourself writing implementation before tests:**
- Stop immediately
- Delete the implementation file
- Go back to step 1
 
**Test file naming:** `[filename].test.ts` or `[filename].test.tsx` in the same directory as the source file.
 
**Test structure:** describe → it → arrange → act → assert. No logic in tests.

Summary

The TDD workflow with Claude Code has two layers:

  1. CLAUDE.md rules — explicit, ordered instructions Claude follows. Red before green, always.
  2. PostToolUse hooks — automatic test runs after every file write. No manual npm test needed.

Together these create a feedback loop where Claude can't drift into implementation-first habits without immediately seeing a green test that should have been red. The test suite becomes the specification, and the spec gets written before the code.

For teams, commit your .claude/settings.json hooks alongside your CLAUDE.md — everyone on the project gets the same TDD enforcement automatically.

#claude-code#tdd#testing#vitest#typescript
Share:
C
Carlos Oliva
Software Developer · stacknotice.com

Software developer with hands-on experience building production apps with React, Next.js, Angular, TypeScript, and Spring Boot. I write practical guides on Claude Code, AI tools, and modern web development — covering the decisions and trade-offs that senior-level tutorials actually explain.

More about Carlos

Enjoyed this article?

Get weekly insights on Claude Code, React, and AI tools — practical guides for developers who build real things.

No spam. Unsubscribe anytime. By subscribing you agree to our Privacy Policy.