
TDD + Claude Code in Practice: Why Test-Driven Development Is the Best AI Coding Discipline

Many people let AI write code without tests. This is the biggest mistake. Real-world cases show how TDD triples Claude Code output quality.

Wayne 2026-03-18

Source: Claude Code Skills



TDD + Claude Code in Practice: Why Test-Driven Development Is the Best AI Coding Discipline

AI Writes Code Fast, but Untested Code Breaks Faster

Claude Code can generate hundreds of lines of code in seconds. That speed is exhilarating — and dangerously disarming. When you watch AI fluently produce a complete function, an API endpoint, or an entire module, it is easy to fall into an illusion: "It wrote that so fast, it must be correct."

The reality is different: AI-generated code without tests is fundamentally unverified guesswork.

We have observed this pattern repeatedly in real projects: a developer asks Claude Code to generate a feature, it appears to work, so they move on to the next feature. Days later, as the codebase grows, bugs begin cascading. Worse, the root cause often traces back to that original code that "looked fine."

This is why we rank Test-Driven Development (TDD) as the number one core methodology in the CC Skills system.

TDD in the AI Context: Pragmatism, Not Dogma

Traditional TDD is sometimes criticized as overly dogmatic — writing tests before implementation feels rigid in certain scenarios. But in the AI coding context, TDD gains new life because it perfectly addresses the core challenge of AI code generation: verification.

The basic TDD cycle in Claude Code works like this:

Step 1: You Write the Tests, Defining Expectations

// You write this test — it defines what "correct" means
describe('calculateShippingCost', () => {
  test('free shipping for orders of $100 or more', () => {
    expect(calculateShippingCost({ total: 150, region: 'domestic' })).toBe(0);
  });

  test('standard domestic shipping fee', () => {
    expect(calculateShippingCost({ total: 50, region: 'domestic' })).toBe(10);
  });

  test('international shipping surcharge', () => {
    expect(calculateShippingCost({ total: 50, region: 'international' })).toBe(25);
  });

  test('boundary case: exactly $100', () => {
    expect(calculateShippingCost({ total: 100, region: 'domestic' })).toBe(0);
  });
});

The critical point: the tests are written by you, not by the AI. Tests represent your business intent, your understanding of edge cases, your quality standards. Letting the AI write its own tests is like letting a student write their own exam.

Step 2: Let Claude Code Implement

Give the tests to Claude Code and ask it to generate implementation code that passes them. Your prompt can be simple:

"Please implement the calculateShippingCost function to pass all of the above test cases."
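A plausible result of that prompt is sketched below. Every threshold and fee here is derived from the four test cases, not from any real specification, and the tests never cover international orders of $100 or more, so the flat international fee is an assumption.

```javascript
// A hypothetical implementation satisfying the four tests above.
// Fees and thresholds come from the test cases, not from a real spec;
// the tests never cover international orders of $100+, so the flat
// international fee is an assumption.
function calculateShippingCost({ total, region }) {
  if (region === 'domestic') {
    return total >= 100 ? 0 : 10; // free at $100 or more, else $10
  }
  return 25; // flat international surcharge (assumed)
}
```

Notice how little room the tests leave for misinterpretation: the boundary test at exactly $100 forces the `>=` comparison, which is precisely the kind of detail an unconstrained generation can get wrong.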

Step 3: Run Tests, Verify Results

npm test -- --testPathPattern="shipping"

If all tests pass — excellent, the AI understood your intent and the generated code is correct. If some tests fail — that is equally good news, because you caught the problem immediately rather than discovering it three days after deployment through a user complaint.

Step 4: Refactor and Optimize

After tests pass, you can safely ask Claude Code to refactor — optimize performance, improve readability, reduce duplication. Because the tests are in place, any refactoring that breaks existing behavior is caught immediately.
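As an illustration, here is one refactor a reviewer might request for the shipping example: replacing branching with a rule table so new regions become data rather than code. The rule values are assumptions reconstructed from the earlier test cases; the point is that the refactor changes structure, not behavior, so the same four tests still pass.

```javascript
// A hypothetical table-driven refactor. Rule values are reconstructed
// from the earlier test cases; behavior is unchanged, so the original
// test suite still passes and guards the rewrite.
const SHIPPING_RULES = {
  domestic:      { freeThreshold: 100, fee: 10 },
  international: { freeThreshold: Infinity, fee: 25 }, // free tier assumed absent
};

function calculateShippingCost({ total, region }) {
  const rule = SHIPPING_RULES[region];
  if (!rule) throw new Error(`unknown region: ${region}`);
  return total >= rule.freeThreshold ? 0 : rule.fee;
}
```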

3x Quality Improvement: A Real Comparison

In one of our internal projects, we ran a comparison experiment. The same feature module (a user permission management system) was developed using two approaches:

Approach A: Direct Development Without Tests

  • Asked Claude Code to generate the complete permission management module
  • Manually reviewed the code, decided it "looked fine"
  • Discovered 7 bugs during integration testing after deployment
  • 2 of them were serious permission bypass vulnerabilities
  • Fixing all bugs took 4 hours
  • Total time: 20 minutes generation + 4 hours fixing = 4 hours 20 minutes

Approach B: TDD-Driven Development

  • Spent 40 minutes writing 22 test cases covering normal flows, edge cases, and error handling
  • Asked Claude Code to implement; first run: 18 tests passed, 4 failed
  • Corrected implementation for failing tests; all passed within 10 minutes
  • Discovered 1 bug during integration testing (a cross-module interaction not covered by unit tests)
  • Fix took 15 minutes
  • Total time: 40 min tests + 20 min implementation + 10 min corrections + 15 min fix = 1 hour 25 minutes

The time savings were roughly 3x, but the quality difference mattered more. Approach A left 2 security vulnerabilities that, if not caught in integration testing, could have had serious consequences in production. Approach B's only gap was a cross-module interaction issue — something inherently difficult to catch at the unit test level.

Common Objections and Responses

"Writing tests wastes too much time"

This is the most common objection and the easiest to counter. The comparison data above speaks for itself: time "saved" by skipping tests is repaid with compounding interest during the debugging phase. This is especially true in AI coding, where generation speed is so fast that without the constraint of tests, you accumulate massive amounts of unverified code in a very short time — a ticking time bomb.

"AI can write the tests for me"

It can, but you need to draw a clear line. Asking AI to supplement test cases (for example, after you have written 5 core cases, asking it to add edge cases) is reasonable. But core tests must be written by you, because:

  • Tests define business requirements, and only you know the business requirements
  • AI-written tests tend to "accommodate" their own implementation logic, creating circular validation
  • Tests are the "contract" between you and the AI, and contracts must be authored by humans

"My project is too small for tests"

The smaller the project, the lower the startup cost for TDD. Three test cases might take only 5 minutes. And the definition of "small project" is often dynamic — today's small script can become tomorrow's core service. Establishing testing habits from the start is far easier than retrofitting tests later.

"TDD does not suit exploratory development"

This objection has some merit. During pure prototyping phases, strict TDD can indeed be too constraining. Even so, we recommend at least "lightweight TDD" — write a few core tests for key functions. When the exploration phase ends and code enters formal development, those tests become your best safety net.

Getting Started with TDD in Claude Code

Three Steps to Quick Start

Step 1: Configure Your Test Environment

Ensure your project has a testing framework set up (Jest, Vitest, Pytest, etc.). If not, ask Claude Code to help:

"Please configure a Vitest testing environment for this project, including the base config file and a sample test."
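For reference, a minimal Vitest configuration of the kind that prompt might yield looks like this. It is a sketch, not necessarily what Claude Code will produce:

```javascript
// vitest.config.js — a minimal example configuration (illustrative)
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    environment: 'node', // use 'jsdom' for browser-style component tests
    include: ['**/*.test.{js,ts}'], // which files count as tests
  },
});
```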

Step 2: Build the "Tests First" Habit

Every time you start developing a new feature, force yourself to write at least 3 test cases:

  • One happy path test
  • One edge case test
  • One error handling test
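As a concrete sketch of that three-test minimum, here are the three kinds of tests for a small hypothetical helper, `parsePositiveInt` (not from the article), written as plain assertions so they run without any framework:

```javascript
// Hypothetical helper used to illustrate the three-test minimum.
function parsePositiveInt(s) {
  const n = Number(s);
  if (!Number.isInteger(n) || n <= 0) throw new Error(`not a positive integer: ${s}`);
  return n;
}

// 1. Happy path: a normal input parses correctly
if (parsePositiveInt('42') !== 42) throw new Error('happy path failed');

// 2. Edge case: the smallest valid value
if (parsePositiveInt('1') !== 1) throw new Error('edge case failed');

// 3. Error handling: invalid input must throw
let threw = false;
try { parsePositiveInt('-3'); } catch (e) { threw = true; }
if (!threw) throw new Error('error handling failed');
```

Three tests like these take a minute or two to write and already give Claude Code an unambiguous target for the function's contract.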

Step 3: Establish a TDD Workflow Skill

Add TDD-related skills to your Claude Code configuration. CC Skills provides complete TDD skill templates, including:

  • Test writing standards
  • AI implementation prompt templates
  • Verification checklists
  • Refactoring safety rules

Advanced Practices

Once you are comfortable with the basic TDD cycle, explore further:

  • Behavior-Driven Development (BDD): Describe expected behavior in natural language, then convert to tests
  • Contract Testing: Define contracts for API interfaces to ensure frontend-backend consistency
  • Snapshot Testing: Capture UI component output snapshots to prevent unintended changes
  • Mutation Testing: Verify that your tests actually test — if code is deliberately broken, do the tests catch it?
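Mutation testing is the easiest of these to try by hand. Below, a deliberately broken "mutant" of the shipping function from earlier flips `>=` to `>`; the boundary test (exactly $100) is what detects it. The rule values are assumptions carried over from the earlier test cases.

```javascript
// A hand-made "mutant": the domestic free-shipping comparison is flipped
// from >= to >. A good test suite should fail against this version.
function calculateShippingCostMutant({ total, region }) {
  if (region === 'domestic') {
    return total > 100 ? 0 : 10; // MUTANT: original rule was `total >= 100`
  }
  return 25;
}

// The boundary case from the original suite kills this mutant:
// an order of exactly $100 should ship free, but the mutant charges $10.
const killed = calculateShippingCostMutant({ total: 100, region: 'domestic' }) !== 0;
if (!killed) throw new Error('mutant survived: the test suite is too weak');
```

Tools such as Stryker automate this process, generating mutants in bulk and reporting which ones your suite fails to kill.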

Conclusion: Discipline Is the Foundation of Freedom

TDD looks like adding constraints to the development process, but what it actually delivers is freedom — freedom to refactor, freedom to let AI generate code, freedom to deploy without fearing regressions.

In the AI coding era, the barrier to code generation has dropped dramatically, but the standard for code quality should not drop with it. TDD is the most practical, most reliable discipline for maintaining that standard.

If you adopt only one skill from CC Skills, we strongly recommend this one.


Explore more AI coding best practices at claudecodeskills.wayjet.io.

