Subagents
Planning & CoordinationClean Context Per Subtask
Subagents use independent messages[], keeping the main conversation clean
s01 > s02 > s03 > [ s04 ] s05 > s06 > s07 > s08 > s09 > s10 > s11 > s12
"When the main agent is too heavy, send a lightweight clone" -- subagents isolate context, each doing their own work without pollution.
Harness layer: Isolation -- temporary clones with independent context windows.
Problem
A complex task (e.g., "add docstrings to 20 files") bloats the main agent's context to overflow. Each file adds a large chunk. And sub-task intermediate state (errors, retries) pollutes the main context, affecting subsequent decisions.
Solution
Main Agent
├── spawn("Add docstring to a.py") → Subagent 1 (fresh context) → "Done"
├── spawn("Add docstring to b.py") → Subagent 2 (fresh context) → "Done"
└── spawn("Add docstring to c.py") → Subagent 3 (fresh context) → "Failed"
Each subagent has a fresh context window with just one instruction. Returns only a summary result.
Key Code
def subagent(task: str, tools=TOOLS) -> str:
messages = [{"role": "user", "content": task}]
for _ in range(MAX_TURNS):
response = client.messages.create(model=MODEL, system=SUB_SYSTEM, messages=messages, tools=tools)
messages.append({"role": "assistant", "content": response.content})
if response.stop_reason != "tool_use":
return extract_text(response) # Only return text summary
# ... execute tools, append results ...
return "Max turns exceeded"
What's New (s03 → s04)
| Component | s03 | s04 |
|---|---|---|
| Tools | 5 | 6 (+subagent) |
| Context | Single shared | Main/sub isolated |
| Complexity | Serial single-thread | Can delegate subtasks |
Deep Dive: Design Decisions
Q1: Can subagents spawn sub-subagents?
Technically yes, but not recommended. Recursive subagents cause context tree explosion, hard-to-trace errors, and cost runaway. Claude Code limits to one level: main can spawn subs, but subs can't spawn.
Q2: Why smaller max_tokens for subagents?
Subagent tasks are more focused, needing shorter responses. max_tokens=4000 vs main's 8000 reduces unnecessary generation. Subagents typically need 3-5 tool calls, not 20-50.
Q3: How do subagent results return to the main agent?
Text summary only — not the full message history. If you return full history, you defeat the purpose of context isolation.
Q4: How does this differ from s20's Parallel Teams?
s04 is serial subagents: main spawns one at a time, waits for completion. s20 is true parallelism: multiple worker processes running simultaneously with file locks.
Q5: Does the subagent use the same system prompt?
Usually not. Subagents get a more focused system prompt, removing planning/collaboration instructions, making them more execution-oriented.
Try It
cd learn-claude-code
python agents/s04_subagent.py
Recommended prompts:
"Add docstrings to all functions in hello.py"— observe context isolation"Create 3 utility files: string_utils, math_utils, file_utils"— observe multiple spawns
References
- Building Effective Agents: Orchestrator-Workers — Anthropic, Dec 2025. Defines the Orchestrator-Workers pattern. s04 is its single-threaded version.
- Claude Code: Subagent — Anthropic Docs. Claude Code's subagent implementation for delegating complex subtasks.
- Context Window Best Practices — Anthropic Docs. Context management strategies explaining benefits of isolating subtask contexts.