Multi-Agent System Development

Full Coverage

A complete methodology for building multi-agent collaborative systems, from architecture design through prompt design, SDK implementation, evaluation, and review.

Entry Condition

You need to build a multi-agent collaborative system.

Deliverables

A complete agent system that has passed evaluation, with each agent collaborating correctly within its assigned responsibilities.

7 Steps

Architecture Design

Solo

Identify which tasks need subagents, define the communication flow, and design the context isolation strategy.

Subagent-Driven Development

Agent Context Engineering

Quality Criteria

- Identify which tasks need independent subagents
- Define how agents communicate with each other
- Design the context isolation strategy (four-bucket method: Write, Select, Compress, Isolate)
- Give each agent a clear responsibility boundary
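One way to make these criteria checkable is to write the architecture down as data. The sketch below is illustrative only: `AgentSpec`, `ContextPolicy`, `find_overlaps`, and `find_dangling_links` are hypothetical names, and the one-line gloss given for each of the four buckets is an assumption about how the method is commonly applied, not something this workflow defines.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class ContextPolicy:
    """Hypothetical record of the four-bucket decisions for one agent."""
    write: str     # assumed meaning: what the agent persists outside its context window
    select: str    # assumed meaning: what it pulls back into context when needed
    compress: str  # assumed meaning: how long history is summarized or trimmed
    isolate: str   # assumed meaning: what stays private to this agent


@dataclass(frozen=True)
class AgentSpec:
    name: str
    responsibilities: frozenset
    sends_to: tuple
    context: ContextPolicy


def find_overlaps(agents):
    """Return (agent_a, agent_b, shared) for every pair that shares a responsibility."""
    overlaps = []
    for a, b in combinations(agents, 2):
        shared = a.responsibilities & b.responsibilities
        if shared:
            overlaps.append((a.name, b.name, sorted(shared)))
    return overlaps


def find_dangling_links(agents):
    """Return (sender, target) for every message target that is not a declared agent."""
    names = {a.name for a in agents}
    return [(a.name, t) for a in agents for t in a.sends_to if t not in names]
```

An empty result from both functions is a necessary condition for the verification gate below, not a sufficient one: it shows that boundaries and communication links are declared consistently, not that they are the right ones.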

Verification Gate

Does the architecture clearly separate agent responsibilities? Does the context strategy prevent context degradation?

Prompt Design

Solo

Design the system prompt, triggering conditions, tool permissions, and quality standards for each agent.

Agent System Prompt Design

Quality Criteria

- Each agent has complete frontmatter (name, description, model, tools)
- Each description includes 2-4 triggering examples
- Each system prompt covers role, responsibilities, process, and output format
- Tool permissions follow the principle of least privilege
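The first, second, and fourth criteria lend themselves to an automated lint. The sketch below assumes each agent definition is a text file that opens with a flat `key: value` block between `---` lines, and that triggering examples are marked with `<example>` tags inside the description. Both are assumptions about the file layout. The parser is deliberately not a full YAML parser, and the agent name, model value, and tool names in `EXAMPLE` are placeholders.

```python
import re

REQUIRED_FIELDS = ("name", "description", "model", "tools")

# Placeholder agent definition used only to exercise the checks.
EXAMPLE = """---
name: log-investigator
description: Investigates failing services. <example>CI job times out</example> <example>Deploy was rolled back</example>
model: inherit
tools: Read, Grep
---
You are a log investigator.
"""


def parse_frontmatter(text):
    """Parse a flat `key: value` block delimited by `---` lines (not full YAML)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening --- line")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("missing closing --- line")


def check_agent(fields):
    """Return a list of problems; an empty list means the frontmatter checks pass."""
    problems = [f"missing field: {k}" for k in REQUIRED_FIELDS if not fields.get(k)]
    examples = len(re.findall(r"<example>", fields.get("description", "")))
    if not 2 <= examples <= 4:
        problems.append(f"description has {examples} triggering examples, expected 2-4")
    return problems


def extra_tools(fields, allowed):
    """Return tools the agent requests beyond the allowed set (least-privilege check)."""
    requested = {t.strip() for t in fields.get("tools", "").split(",") if t.strip()}
    return sorted(requested - set(allowed))
```

Running `check_agent(parse_frontmatter(EXAMPLE))` returns an empty list. The remaining criterion, whether the system prompt covers role, responsibilities, process, and output format, still needs a human or LLM reviewer.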

Verification Gate

Are the triggering conditions clear for each agent? Do the system prompts cover edge cases?

Parallel Dispatch

Multi-Agent

Dispatch one agent per independent problem domain and investigate concurrently rather than queuing work sequentially.

Dispatching Parallel Agents

Quality Criteria

- One dedicated agent is dispatched per independent problem domain
- Agents run in parallel rather than queuing sequentially
- Each agent has a clear task boundary
- A result aggregation mechanism is in place
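The dispatch-and-aggregate pattern in these criteria can be sketched with Python's `asyncio`. The `investigate` coroutine is a stand-in for a real subagent call and only echoes its input; `dispatch` and the report layout are hypothetical names chosen for illustration.

```python
import asyncio


async def investigate(domain, task):
    """Placeholder for a real subagent call; here it only echoes its input."""
    await asyncio.sleep(0)  # stands in for model and tool latency
    return f"[{domain}] finished: {task}"


async def dispatch(tasks):
    """Run one investigation per domain concurrently and collect results by domain."""
    domains = list(tasks)
    outcomes = await asyncio.gather(
        *(investigate(d, tasks[d]) for d in domains),
        return_exceptions=True,  # one failing agent should not cancel the others
    )
    report = {"ok": {}, "failed": {}}
    for domain, outcome in zip(domains, outcomes):
        bucket = "failed" if isinstance(outcome, BaseException) else "ok"
        report[bucket][domain] = outcome
    return report
```

Because `tasks` is a dictionary keyed by domain, the one-agent-per-domain rule holds by construction. A call such as `asyncio.run(dispatch({"auth": "trace login failures", "billing": "check invoice totals"}))` returns a single report that the coordinating agent can summarize.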

Verification Gate

Are agents working on independent problem domains? Is there any task overlap or contention?

Interface Standardization

Solo

Use the Model Context Protocol (MCP) to standardize the interfaces between agents and external services.

MCP Builder

Quality Criteria

- The MCP specification is implemented correctly
- Tool definitions are clear and complete
- Error handling and edge cases are covered
- Interface documentation is complete
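To illustrate what a clear tool definition with error handling can look like, the sketch below uses plain dictionaries in the shape MCP uses for tools: a `name`, a `description`, and an `inputSchema` expressed as JSON Schema, with call results carrying a `content` list and an `isError` flag. It does not use an MCP SDK or speak the wire protocol. The `search_tickets` tool, its limits, and the `FAKE_TICKETS` data are hypothetical.

```python
import json

FAKE_TICKETS = ["Login fails on mobile", "Invoice total is wrong", "Login page is slow"]

# Hypothetical tool definition: name, description, and a JSON Schema for the arguments.
SEARCH_TICKETS_TOOL = {
    "name": "search_tickets",
    "description": "Search support tickets by keyword and return matching titles.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Keyword to search for"},
            "limit": {"type": "integer", "minimum": 1, "maximum": 50},
        },
        "required": ["query"],
    },
}


def text_result(text, is_error=False):
    """Build a tool result with a single text content item."""
    return {"content": [{"type": "text", "text": text}], "isError": is_error}


def call_search_tickets(arguments):
    """Validate arguments, then search; invalid input yields an error result, not an exception."""
    query = arguments.get("query")
    if not isinstance(query, str) or not query.strip():
        return text_result("`query` must be a non-empty string", is_error=True)
    limit = arguments.get("limit", 10)
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 50:
        return text_result("`limit` must be an integer from 1 to 50", is_error=True)
    matches = [t for t in FAKE_TICKETS if query.lower() in t.lower()][:limit]
    return text_result(json.dumps(matches))
```

Returning an error result rather than raising lets the calling agent read the message and retry with corrected arguments, which is the kind of edge-case coverage the criteria ask for.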

Verification Gate

Does the MCP server respond correctly? Do the tool definitions cover all functionality?

SDK Implementation

Solo

Use the Claude Agent SDK to turn the designs into runnable agent code.

Claude Agent SDK Development Guide

Quality Criteria

- The correct SDK APIs are used (query, tool, subagents)
- The hooks lifecycle is configured correctly
- The permission mode matches the security requirements
- Structured outputs are validated with JSON Schema
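The last criterion can be illustrated without calling the SDK at all. The sketch below is a standard-library-only check that covers a small subset of JSON Schema (`type`, `required`, and `properties`, applied recursively to objects). A real project would more likely rely on a full JSON Schema validator. The `FINDING_SCHEMA` fields and the function names are hypothetical.

```python
import json

TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}

# Hypothetical schema for one agent's structured output.
FINDING_SCHEMA = {
    "type": "object",
    "required": ["summary", "severity"],
    "properties": {
        "summary": {"type": "string"},
        "severity": {"type": "integer"},
        "files": {"type": "array"},
    },
}


def validate(instance, schema, path="$"):
    """Return a list of error strings; supports only type, required, and properties."""
    expected = schema.get("type")
    if expected:
        is_bool = isinstance(instance, bool)  # bool is a subclass of int in Python
        if not isinstance(instance, TYPE_MAP[expected]) or (is_bool and expected != "boolean"):
            return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}.{key}: required")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], sub, f"{path}.{key}"))
    return errors


def parse_agent_output(raw, schema):
    """Parse raw agent text as JSON and validate it; returns (data, errors)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None, ["$: not valid JSON"]
    return data, validate(data, schema)
```

Rejecting output at this boundary keeps a malformed result from one agent from propagating into the agents that consume it.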

Verification Gate

Can each agent start and respond correctly? Do hooks and permissions work as expected?

Evaluation & Testing

Solo

Systematically evaluate agent quality using LLM-as-Judge and a multi-dimensional scoring rubric.

Agent Evaluation Methodology

Quality Criteria

- A five-dimension rubric covers instruction following, completeness, efficiency, reasoning, and coherence
- The test set is stratified by complexity (simple, medium, complex)
- Bias mitigation measures are in place (position swapping, length normalization)
- The passing threshold is met (weighted score ≥ 3.5/5.0)
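The weighted score and the position-swap check can be sketched as follows. The five dimension names and the 3.5 threshold come from the criteria above; the weights, the 1-to-5 lower bound, and the `judge` callable's return values ("first", "second", or "tie") are assumptions made for illustration.

```python
DIMENSIONS = ("instruction_following", "completeness", "efficiency", "reasoning", "coherence")

# Assumed weights; the workflow does not specify how the dimensions are weighted.
WEIGHTS = {
    "instruction_following": 0.30,
    "completeness": 0.25,
    "efficiency": 0.15,
    "reasoning": 0.20,
    "coherence": 0.10,
}
PASS_THRESHOLD = 3.5


def weighted_score(scores):
    """Weighted mean of per-dimension scores, each assumed to lie in 1..5."""
    for dim in DIMENSIONS:
        if dim not in scores:
            raise ValueError(f"missing dimension: {dim}")
        if not 1 <= scores[dim] <= 5:
            raise ValueError(f"{dim} score out of range: {scores[dim]}")
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[d] * scores[d] for d in DIMENSIONS) / total_weight


def passes(scores):
    return weighted_score(scores) >= PASS_THRESHOLD


def position_swap_verdict(judge, a, b):
    """Judge twice with the candidates swapped; keep the verdict only if both runs agree."""
    forward = {"first": "a", "second": "b"}.get(judge(a, b), "tie")
    backward = {"first": "b", "second": "a"}.get(judge(b, a), "tie")
    return forward if forward == backward else "tie"
```

A judge that always prefers whichever answer it sees first produces conflicting verdicts across the two runs, so `position_swap_verdict` reports a tie instead of rewarding position. Length normalization and complexity stratification would need their own checks and are not covered here.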

Verification Gate

Does every dimension score meet the threshold? Has evaluation bias been detected and mitigated?

Review

Multi-Agent

Run a two-phase review of the agent system to confirm that the agents collaborate correctly.

Requesting Code Review

Receiving Code Review

Quality Criteria

- The two-phase review process is complete
- The agent collaboration logic has been verified
- Error handling paths are covered

Verification Gate

Are both review phases complete? Has agent collaboration been tested in practice?