Workflow Patterns
Planning & Coordination -- Five Orchestration Strategies
Don't build agents — build workflow patterns; start simple, add complexity only when needed
"Don't reach for a framework until the simple approach fails." -- Start simple, add complexity only when proven necessary.
Harness layer: Planning & Coordination -- Five orchestration patterns, from simple to complex.
Problem
s01's single-loop agent has one mode: loop tool calls until done. But real tasks vary enormously — some need strict step sequences, some need parallel execution with voting, some need dynamic routing. We need a toolbox of orchestration patterns to choose from based on the task's nature.
Solution
Five patterns, increasing complexity:
┌─────────────────────────────────────────────────────────────┐
│ 1. Prompt Chaining [A] ──gate──► [B] ──gate──► [C] │
│ Sequential pipeline + quality gates │
├─────────────────────────────────────────────────────────────┤
│ 2. Routing ┌► [Math Agent] │
│ Classification routing In ─┤► [Code Agent] │
│ └► [Chat Agent] │
├─────────────────────────────────────────────────────────────┤
│ 3. Parallelization ┌► [A] ─┐ │
│ Parallel + aggregate In┤► [B] ─┼──► Aggregate │
│ └► [C] ─┘ │
├─────────────────────────────────────────────────────────────┤
│ 4. Orchestrator-Workers Orchestrator │
│ Dynamic decomposition ┌──┤──┐ │
│ [W1] [W2] [W3] │
├─────────────────────────────────────────────────────────────┤
│ 5. Evaluator-Optimizer [Generate] ──► [Evaluate] ──↺ │
│ Generate-evaluate loop │
└─────────────────────────────────────────────────────────────┘
Core Concepts
Pattern 1: Prompt Chaining (Sequential Pipeline)
Each step's output becomes the next step's input. Quality gates between steps halt the chain if quality is insufficient:
def prompt_chain(task):
    analysis = call_agent("Analyze this task", task)
    # Gate: check analysis completeness
    if "INSUFFICIENT" in analysis:
        return {"error": "Analysis incomplete", "details": analysis}
    code = call_agent("Implement based on analysis", analysis)
    # Gate: code must include error handling (both keywords required)
    if "try" not in code or "except" not in code:
        return {"error": "Missing error handling"}
    review = call_agent("Review this code", code)
    return review
Use when: Steps have clear dependencies and need intermediate quality checks.
Pattern 2: Routing (Classification-based)
Route requests to the most appropriate specialist agent based on input characteristics:
def route(user_input):
    category = classify(user_input)  # "math" | "code" | "chat"
    route_map = {
        "math": {"system": "You are a math expert.", "tools": [calculator]},
        "code": {"system": "You are a coding expert.", "tools": [bash, write_file]},
        "chat": {"system": "You are a friendly assistant.", "tools": []},
    }
    config = route_map[category]
    return call_agent(config["system"], user_input, tools=config["tools"])
Use when: Multiple input types, each requiring different tools and expertise.
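The `classify` step can itself be an LLM call, but for well-separated categories a cheap heuristic is a reasonable first cut. A minimal sketch (the keyword rules below are hypothetical, not a prescribed classifier):

```python
import re

def classify(user_input: str) -> str:
    """Cheap keyword heuristic; swap in an LLM classifier when rules miss."""
    text = user_input.lower()
    # Arithmetic expressions or math vocabulary -> math specialist
    if re.search(r"\b(calculate|solve|equation|integral|\d+\s*[+\-*/]\s*\d+)\b", text):
        return "math"
    # Programming vocabulary -> code specialist
    if re.search(r"\b(code|function|bug|python|implement|refactor)\b", text):
        return "code"
    return "chat"  # default route for everything else
```

Starting with rules keeps the router free and deterministic; the LLM upgrade only pays off once real traffic shows the rules misrouting.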
Pattern 3: Parallelization (Parallel + Aggregate)
Multiple agents process the same task in parallel, results are aggregated. Two variants:
# Variant A: Sectioning — each handles different aspects
def parallel_sections(code):
    results = run_parallel([
        ("Check security vulnerabilities", code),
        ("Check performance issues", code),
        ("Check code style", code),
    ])
    return aggregate(results)

# Variant B: Voting — each independently judges the same thing
def parallel_voting(question):
    answers = run_parallel([
        ("Answer this question", question),  # Agent 1
        ("Answer this question", question),  # Agent 2
        ("Answer this question", question),  # Agent 3
    ])
    return majority_vote(answers)
Use when: Tasks that can be independently partitioned, or high-stakes decisions needing redundant verification.
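Since the work is I/O-bound API calls, `run_parallel` can be a thin wrapper over a thread pool. A sketch assuming a `call_agent(prompt, payload)` function (stubbed here so the example is self-contained):

```python
from concurrent.futures import ThreadPoolExecutor

def call_agent(prompt: str, payload: str) -> str:
    # Stub standing in for a real LLM call; replace with your API client.
    return f"{prompt}: {payload}"

def run_parallel(jobs):
    """Run (prompt, payload) jobs concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = [pool.submit(call_agent, prompt, payload)
                   for prompt, payload in jobs]
        return [f.result() for f in futures]  # result order matches job order
```

Threads (not processes) are the right tool here because each job spends nearly all its time waiting on the network.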
Pattern 4: Orchestrator-Workers
An orchestrator dynamically decomposes tasks, delegates to specialized workers, and aggregates results:
import json

def orchestrator(task):
    # Orchestrator dynamically decomposes (not hardcoded steps)
    subtasks = call_agent("Break this into subtasks. Return JSON array.", task)
    results = []
    for subtask in json.loads(subtasks):
        result = call_agent(f"Complete this subtask: {subtask}",
                            context=task, tools=CODING_TOOLS)
        results.append(result)
    return call_agent("Combine these results", "\n".join(results))
Use when: Complex tasks where steps can't be predetermined.
Pattern 5: Evaluator-Optimizer
Generate → Evaluate → Improve loop. s13's TDAD is a concrete implementation of this pattern:
def evaluator_optimizer(task, max_rounds=3):
    output = None
    feedback = ""
    for _ in range(max_rounds):
        if output is None:
            output = call_agent("Implement this", task)
        else:
            output = call_agent(f"Improve based on feedback: {feedback}", output)
        eval_result = call_agent("Score 0-10. Be strict.", output)
        score = extract_score(eval_result)
        feedback = extract_feedback(eval_result)
        if score >= 8:
            return output
    return output
Use when: Output has clear quality criteria and can be iteratively improved.
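The loop above leans on `extract_score` and `extract_feedback`. Parsing is far more reliable if the evaluator prompt demands a fixed output format; the sketch below assumes a hypothetical `SCORE: n` / `FEEDBACK: ...` convention:

```python
import re

def extract_score(eval_result: str) -> float:
    """Parse 'SCORE: 7' or 'SCORE: 7.5'; a missing score counts as 0
    so the loop keeps iterating rather than exiting on a parse failure."""
    m = re.search(r"SCORE:\s*(\d+(?:\.\d+)?)", eval_result)
    return float(m.group(1)) if m else 0.0

def extract_feedback(eval_result: str) -> str:
    """Everything after 'FEEDBACK:'; fall back to the raw text."""
    m = re.search(r"FEEDBACK:\s*(.+)", eval_result, re.DOTALL)
    return m.group(1).strip() if m else eval_result
```

Treating parse failures as score 0 is a deliberate fail-safe choice: a garbled evaluation triggers another round instead of silently accepting the output.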
Key Code
# Pattern selector — auto-selects pattern based on task characteristics
def select_pattern(task_description):
    if needs_sequential_steps(task_description): return "prompt_chaining"
    if has_clear_categories(task_description): return "routing"
    if is_independently_decomposable(task_description): return "parallelization"
    if is_complex_unknown_structure(task_description): return "orchestrator_workers"
    if has_clear_quality_criteria(task_description): return "evaluator_optimizer"
    return "single_agent"  # Default: no complex pattern needed
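The predicates in `select_pattern` are left abstract above. In practice they can start as substring heuristics and graduate to an LLM classifier; the cue lists below are hypothetical illustrations, not a tested taxonomy:

```python
def _has_any(text: str, cues) -> bool:
    # Naive substring match; good enough for a first cut, prone to
    # false positives ("then" matches inside "authentic", etc.).
    t = text.lower()
    return any(cue in t for cue in cues)

def needs_sequential_steps(task):
    return _has_any(task, ["then", "after that", "step by step", "pipeline"])

def has_clear_categories(task):
    return _has_any(task, ["math question", "coding task", "which kind"])

def is_independently_decomposable(task):
    return _has_any(task, ["security, performance", "each aspect", "in parallel"])

def is_complex_unknown_structure(task):
    return _has_any(task, ["build a", "full app", "crud", "end to end"])

def has_clear_quality_criteria(task):
    return _has_any(task, ["keep improving", "until optimal", "score"])
```

The ordering in `select_pattern` matters: earlier checks win, so put the cheapest, most specific patterns first.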
What's New (s01 → s14)
| Aspect | s01 (Agent Loop) | s14 (Workflow Patterns) |
|---|---|---|
| Execution mode | Single while loop | Five selectable patterns |
| Task decomposition | None | Orchestrator dynamic breakdown |
| Parallel processing | None | Parallelization + Voting |
| Quality assurance | None | Evaluator-Optimizer loop |
| Routing strategy | All tasks handled the same | Routing by classification |
| Complexity management | "One size fits all" | Choose simplest sufficient pattern |
Deep Dive: Design Decisions
Q1: How do you choose the right workflow pattern? Is there a decision tree?
The single most important principle in Anthropic's "Building Effective Agents" is "start simple." From there, a decision tree:
What does your task need?
├── Fixed steps with dependencies? → Prompt Chaining
├── Clear input categories? → Routing
├── Can be split into independent subtasks? → Parallelization
├── Unknown number/type of steps? → Orchestrator-Workers
├── Clear quality criteria for iteration? → Evaluator-Optimizer
└── None of the above? → Single Agent Loop (s01)
Key rule: If a single Agent Loop can complete the task in 3-5 tool calls, don't use any pattern. Every abstraction layer has costs — slower, more tokens, harder to debug.
Q2: When should the Evaluator-Optimizer loop stop?
Three stopping conditions (don't rely on just one):
- Quality met: score >= threshold (e.g., 8/10)
- Improvement stalled: score unchanged for 2 consecutive rounds (±0.5 counts as stalled)
- Max rounds: hard upper limit (typically 3-5 rounds)
Anthropic's research: Agents produce the largest improvements in the first 2-3 rounds, with diminishing returns after. Beyond 5 rounds almost never adds value.
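The three stopping conditions combine naturally into one check. A sketch, where the threshold, stall window, and round limit are tunable assumptions:

```python
def should_stop(scores, threshold=8.0, stall_delta=0.5, max_rounds=5):
    """scores: list of evaluator scores so far, one per round."""
    if not scores:
        return False
    if scores[-1] >= threshold:                          # 1. quality met
        return True
    if (len(scores) >= 3
            and abs(scores[-1] - scores[-2]) <= stall_delta
            and abs(scores[-2] - scores[-3]) <= stall_delta):
        return True                                      # 2. improvement stalled
    return len(scores) >= max_rounds                     # 3. hard round limit
```

Checking all three conditions every round means a stalled loop exits early instead of burning its full round budget.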
Q3: What are the practical uses of the Voting variant of Parallelization?
Voting addresses model non-determinism. The same prompt given to a model three times can produce three different answers. Practical uses:
- High-stakes judgment: "Does this code have a security vulnerability?" — 3 independent agents vote, 2/3 consensus confirms
- Structured extraction: Extract JSON from natural language — take majority-consistent fields from 3 extraction results
- Classification: User intent classification — majority voting is more accurate than single classification
Cost tradeoff: Voting multiplies API cost by N (voter count). Use only when error cost far exceeds API cost.
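A `majority_vote` aggregator can be as small as a Counter over normalized answers; the sketch below requires a quorum and returns None when voters disagree (the normalization strategy is an assumption, adapt it to your answer format):

```python
from collections import Counter

def majority_vote(answers, quorum=2):
    """Return the answer at least `quorum` agents agree on,
    else None (caller should escalate or re-run)."""
    normalized = [a.strip().lower() for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    return winner if count >= quorum else None
```

Returning None on disagreement is the point of voting: a split vote is a signal to escalate, not a result to paper over.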
Q4: How do Prompt Chaining gates prevent error cascading?
Gates are circuit breakers in the chain. Without gates, Prompt Chaining is just a pipeline where errors amplify:
No gate: Analyze(wrong) → Implement(based on wrong analysis) → Review(garbage in, garbage out)
Gate: Analyze(wrong) → Gate: ❌ Incomplete detected → Halt, report to user
Gate implementations:
- Rule gates: Check for required keywords, correct format
- LLM gates: Another LLM call evaluates if intermediate result meets continuation criteria
- Code gates: Run tests, lint, type checks — pass to continue
Best gates are cheap and fast. A regex format check is more efficient than another LLM call.
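Rule and code gates are a few lines each. A sketch of both (the LLM gate would just be another `call_agent` call, so it is omitted; `ast.parse` is used here as the cheapest possible code gate, with tests/lint as stronger follow-ups):

```python
import ast
import re

def rule_gate(text: str, required_pattern: str) -> bool:
    """Rule gate: cheap regex/keyword check on an intermediate result."""
    return re.search(required_pattern, text) is not None

def code_gate(source: str) -> bool:
    """Code gate: does the generated Python at least parse?"""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False
```

Both gates run in microseconds, which is exactly the property the paragraph above asks for: a failed gate should cost almost nothing to detect.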
Q5: Why does Anthropic emphasize "start simple"? What are the costs of complex patterns?
Anthropic repeats this principle throughout their blog. Hidden costs of complex patterns:
| Cost Type | Impact |
|---|---|
| Token cost | Orchestrator-Workers consumes 3-10× more tokens than single agent |
| Latency | Multi-hop flows add 1-3 seconds API latency per hop |
| Debug difficulty | Multi-agent bugs require tracing cross-context data flows |
| Fragility | Longer chains have higher probability of end-to-end failure |
| Maintenance | More complex patterns = more code = harder to change |
Correct approach: Start with a single Agent Loop, upgrade only when you can prove it's insufficient. Each upgrade should be a data-driven decision, not "it feels like it should be more complex."
Try It
cd learn-claude-code
python agents/s14_workflow_patterns.py
Recommended prompts:
- "Analyze, implement, and review a Fibonacci function" — triggers Prompt Chaining
- "Is this a math question or coding task: calculate the area of a circle" — triggers Routing
- "Review this code for security, performance, and style" — triggers Parallelization
- "Build a TODO app with CRUD operations" — triggers Orchestrator-Workers
- "Write a sorting function and keep improving until optimal" — triggers Evaluator-Optimizer
References
- Building Effective Agents — Anthropic, Dec 2025. The original definition of five orchestration patterns, emphasizing "start simple" and when each pattern applies. Core reference for s14.
- Demystifying evals for AI agents — Anthropic, Jan 2026. Evaluator-Optimizer pattern applied to evaluation, including "Grade the outcome, not the path."
- Agentic Coding Trends 2026 — Anthropic, Mar 2026. Industry paradigm shift from "manual coding" to "orchestrating agents."
- Prompt Engineering Guide — Anthropic Docs. Foundations of Prompt Chaining and Routing, including classifier design and intermediate validation.