Learn Claude Code
s14

Workflow Patterns

Planning & Coordination

Five Orchestration Strategies

335 LOC · 4 tools · Chaining, Routing, Parallelization, Orchestrator-Worker, Evaluator-Optimizer
Don't build agents — build workflow patterns; start simple, add complexity only when needed

s13 > [ s14 ] s15 > s16 > s17 > s18 > s19 > s20 > s21

"Don't reach for a framework until the simple approach fails." -- Start simple, add complexity only when proven necessary.

Harness layer: Planning & Coordination -- Five orchestration patterns, from simple to complex.

Problem

s01's single-loop agent has one mode: loop tool calls until done. But real tasks vary enormously — some need strict step sequences, some need parallel execution with voting, some need dynamic routing. We need a toolbox of orchestration patterns to choose from based on the task's nature.

Solution

Five patterns, increasing complexity:

┌─────────────────────────────────────────────────────────────┐
│ 1. Prompt Chaining        [A] ──gate──► [B] ──gate──► [C]  │
│    Sequential pipeline + quality gates                      │
├─────────────────────────────────────────────────────────────┤
│ 2. Routing                     ┌► [Math Agent]              │
│    Classification routing  In ─┤► [Code Agent]              │
│                                └► [Chat Agent]              │
├─────────────────────────────────────────────────────────────┤
│ 3. Parallelization        ┌► [A] ─┐                         │
│    Parallel + aggregate In┤► [B] ─┼──► Aggregate            │
│                           └► [C] ─┘                         │
├─────────────────────────────────────────────────────────────┤
│ 4. Orchestrator-Workers         Orchestrator                │
│    Dynamic decomposition     ┌──┤──┐                        │
│                            [W1] [W2] [W3]                   │
├─────────────────────────────────────────────────────────────┤
│ 5. Evaluator-Optimizer    [Generate] ──► [Evaluate] ──↺     │
│    Generate-evaluate loop                                   │
└─────────────────────────────────────────────────────────────┘

Core Concepts

Pattern 1: Prompt Chaining (Sequential Pipeline)

Each step's output becomes the next step's input. Quality gates between steps halt the chain if quality is insufficient:

def prompt_chain(task):
    analysis = call_agent("Analyze this task", task)
    # Gate: check analysis completeness
    if "INSUFFICIENT" in analysis:
        return {"error": "Analysis incomplete", "details": analysis}
    code = call_agent("Implement based on analysis", analysis)
    # Gate: code must include error handling
    if "try" not in code and "except" not in code:
        return {"error": "Missing error handling"}
    review = call_agent("Review this code", code)
    return review

Use when: Steps have clear dependencies and need intermediate quality checks.

Pattern 2: Routing (Classification-based)

Route requests to the most appropriate specialist agent based on input characteristics:

def route(user_input):
    category = classify(user_input)  # "math" | "code" | "chat"
    route_map = {
        "math": {"system": "You are a math expert.", "tools": [calculator]},
        "code": {"system": "You are a coding expert.", "tools": [bash, write_file]},
        "chat": {"system": "You are a friendly assistant.", "tools": []},
    }
    config = route_map[category]
    return call_agent(config["system"], user_input, tools=config["tools"])

Use when: Multiple input types, each requiring different tools and expertise.
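The classify helper above is assumed. A minimal keyword-based sketch (the keyword lists are illustrative, not from the source; a production router would typically use a cheap LLM call or a trained classifier):

```python
def classify(user_input: str) -> str:
    """Cheap keyword-based classifier; falls back to 'chat'.
    Keyword lists are illustrative placeholders."""
    text = user_input.lower()
    if any(kw in text for kw in ("calculate", "solve", "equation", "integral")):
        return "math"
    if any(kw in text for kw in ("function", "bug", "refactor", "code", "implement")):
        return "code"
    return "chat"
```

A rule classifier like this is nearly free and deterministic; upgrade to an LLM classifier only when the categories are too fuzzy for keywords.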

Pattern 3: Parallelization (Parallel + Aggregate)

Multiple agents process the same task in parallel, results are aggregated. Two variants:

# Variant A: Sectioning — each handles different aspects
def parallel_sections(code):
    results = run_parallel([
        ("Check security vulnerabilities", code),
        ("Check performance issues", code),
        ("Check code style", code),
    ])
    return aggregate(results)

# Variant B: Voting — each independently judges the same thing
def parallel_voting(question):
    answers = run_parallel([
        ("Answer this question", question),  # Agent 1
        ("Answer this question", question),  # Agent 2
        ("Answer this question", question),  # Agent 3
    ])
    return majority_vote(answers)

Use when: Tasks that can be independently partitioned, or high-stakes decisions needing redundant verification.
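The run_parallel helper used above is assumed. A minimal sketch using ThreadPoolExecutor; the `call` parameter is a stand-in for call_agent, injected so the sketch stays self-contained:

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(jobs, call=None):
    """Run (prompt, payload) jobs concurrently; return results in submission order.
    `call` stands in for call_agent (an LLM call would go here)."""
    call = call or (lambda prompt, payload: f"{prompt}: {payload}")
    with ThreadPoolExecutor(max_workers=len(jobs) or 1) as pool:
        futures = [pool.submit(call, prompt, payload) for prompt, payload in jobs]
        return [f.result() for f in futures]  # .result() blocks until each job finishes
```

Threads suffice here because agent calls are I/O-bound (waiting on an API), so the GIL is not a bottleneck.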

Pattern 4: Orchestrator-Workers

An orchestrator dynamically decomposes tasks, delegates to specialized workers, and aggregates results:

import json

def orchestrator(task):
    # Orchestrator dynamically decomposes (not hardcoded steps)
    subtasks = call_agent("Break this into subtasks. Return JSON array.", task)
    results = []
    for subtask in json.loads(subtasks):
        result = call_agent(f"Complete this subtask: {subtask}",
                           context=task, tools=CODING_TOOLS)
        results.append(result)
    return call_agent("Combine these results", "\n".join(results))

Use when: Complex tasks where steps can't be predetermined.

Pattern 5: Evaluator-Optimizer

Generate → Evaluate → Improve loop. s13's TDAD is a concrete implementation of this pattern:

def evaluator_optimizer(task, max_rounds=3):
    output = None
    feedback = ""
    for round_num in range(max_rounds):
        if output is None:
            output = call_agent("Implement this", task)
        else:
            output = call_agent(f"Improve based on feedback: {feedback}", output)
        eval_result = call_agent("Score 0-10. Be strict.", output)
        score = extract_score(eval_result)
        feedback = extract_feedback(eval_result)
        if score >= 8:
            return output
    return output

Use when: Output has clear quality criteria and can be iteratively improved.

Key Code

# Pattern selector — auto-selects pattern based on task characteristics
def select_pattern(task_description):
    if needs_sequential_steps(task_description):
        return "prompt_chaining"
    if has_clear_categories(task_description):
        return "routing"
    if is_independently_decomposable(task_description):
        return "parallelization"
    if is_complex_unknown_structure(task_description):
        return "orchestrator_workers"
    if has_clear_quality_criteria(task_description):
        return "evaluator_optimizer"
    return "single_agent"  # Default: no complex pattern needed
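The predicates in the selector are assumed. A minimal keyword-heuristic sketch of one of them (the marker list is illustrative; a production selector might ask an LLM instead):

```python
def needs_sequential_steps(task_description: str) -> bool:
    """Heuristic: phrases that imply ordered, dependent steps.
    Marker list is an illustrative placeholder."""
    markers = ("then", "after that", "step by step", "first", "finally", "pipeline")
    text = task_description.lower()
    return any(m in text for m in markers)
```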

What's New (s01 → s14)

Aspect                | s01 (Agent Loop)           | s14 (Workflow Patterns)
--------------------- | -------------------------- | ----------------------------------
Execution mode        | Single while loop          | Five selectable patterns
Task decomposition    | None                       | Orchestrator dynamic breakdown
Parallel processing   | None                       | Parallelization + Voting
Quality assurance     | None                       | Evaluator-Optimizer loop
Routing strategy      | All tasks handled the same | Routing by classification
Complexity management | "One size fits all"        | Choose simplest sufficient pattern

Deep Dive: Design Decisions

Q1: How to choose the right Workflow pattern? Is there a decision tree?

Anthropic's single most important principle in "Building Effective Agents": "Start simple." Decision tree:

What does your task need?
├── Fixed steps with dependencies?          → Prompt Chaining
├── Clear input categories?                 → Routing
├── Can be split into independent subtasks? → Parallelization
├── Unknown number/type of steps?           → Orchestrator-Workers
├── Clear quality criteria for iteration?   → Evaluator-Optimizer
└── None of the above?                      → Single Agent Loop (s01)

Key rule: If a single Agent Loop can complete the task in 3-5 tool calls, don't use any pattern. Every abstraction layer has costs — slower, more tokens, harder to debug.

Q2: When should the Evaluator-Optimizer loop stop?

Three stopping conditions (don't rely on just one):

  1. Quality met: score >= threshold (e.g., 8/10)
  2. Improvement stalled: Score unchanged for 2 consecutive rounds (±0.5 counts as stalled)
  3. Max rounds: Hard upper limit (typically 3-5 rounds)
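The three conditions can be combined in one loop. A sketch where score_round is a hypothetical stand-in for one generate-evaluate round (not from the source):

```python
def run_until_stopped(score_round, threshold=8, stall_delta=0.5, max_rounds=5):
    """Apply all three stopping rules: quality met, improvement stalled, round cap.
    score_round(n) is assumed to return the quality score after round n."""
    prev = None
    for n in range(1, max_rounds + 1):
        score = score_round(n)
        if score >= threshold:
            return "quality_met", n
        if prev is not None and abs(score - prev) <= stall_delta:
            return "stalled", n
        prev = score
    return "max_rounds", max_rounds
```

Checking the stall condition before updating `prev` means two near-identical scores in a row end the loop, even if the absolute score is still below threshold.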

Anthropic's research finds that agents produce the largest improvements in the first 2-3 rounds, with diminishing returns after; iterating beyond 5 rounds almost never adds value.

Q3: What are the practical uses of the Voting variant of Parallelization?

Voting addresses model non-determinism. The same prompt given to a model three times can produce three different answers. Practical uses:

  1. High-stakes judgment: "Does this code have a security vulnerability?" — 3 independent agents vote, 2/3 consensus confirms
  2. Structured extraction: Extract JSON from natural language — take majority-consistent fields from 3 extraction results
  3. Classification: User intent classification — majority voting is more accurate than single classification

Cost tradeoff: Voting multiplies API cost by N (voter count). Use only when error cost far exceeds API cost.
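The majority_vote helper from Pattern 3 is assumed. A minimal sketch using collections.Counter, with light normalization so trivially different strings still match:

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer after light normalization.
    With equal counts, Counter.most_common keeps first-seen order (CPython 3.7+),
    so ties resolve to the earliest answer."""
    normalized = [a.strip().lower() for a in answers]
    winner, _count = Counter(normalized).most_common(1)[0]
    return winner
```

For structured outputs (e.g. extracted JSON), vote field by field instead of on the whole string.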

Q4: How do Prompt Chaining gates prevent error cascading?

Gates are circuit breakers in the chain. Without gates, Prompt Chaining is just a pipeline where errors amplify:

No gate:  Analyze(wrong) → Implement(based on wrong analysis) → Review(garbage in, garbage out)
Gate:     Analyze(wrong) → Gate: ❌ Incomplete detected → Halt, report to user

Gate implementations:

  • Rule gates: Check for required keywords, correct format
  • LLM gates: Another LLM call evaluates if intermediate result meets continuation criteria
  • Code gates: Run tests, lint, type checks — pass to continue

Best gates are cheap and fast. A regex format check is more efficient than another LLM call.
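A rule gate along these lines, as a sketch (the required-pattern list is illustrative, not from the source):

```python
import re

def rule_gate(code: str):
    """Cheap regex gate: require a function definition and error handling.
    Returns (passed, reason) so the chain can report why it halted."""
    checks = [
        (r"\bdef\s+\w+", "no function definition found"),
        (r"\btry\b|\bexcept\b", "missing error handling"),
    ]
    for pattern, reason in checks:
        if not re.search(pattern, code):
            return False, reason
    return True, "ok"
```

Returning a reason string (rather than a bare bool) is what makes the halt actionable for the user or a retry step.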

Q5: Why does Anthropic emphasize "start simple"? What are the costs of complex patterns?

Anthropic repeats this principle throughout their blog. Hidden costs of complex patterns:

Cost Type        | Impact
---------------- | ------
Token cost       | Orchestrator-Workers consumes 3-10× more tokens than a single agent
Latency          | Multi-hop flows add 1-3 seconds of API latency per hop
Debug difficulty | Multi-agent bugs require tracing cross-context data flows
Fragility        | Longer chains have a higher probability of end-to-end failure
Maintenance      | More complex patterns = more code = harder to change

Correct approach: Start with a single Agent Loop, upgrade only when you can prove it's insufficient. Each upgrade should be a data-driven decision, not "it feels like it should be more complex."

Try It

cd learn-claude-code
python agents/s14_workflow_patterns.py

Recommended prompts:

  • "Analyze, implement, and review a Fibonacci function" — triggers Prompt Chaining
  • "Is this a math question or coding task: calculate the area of a circle" — triggers Routing
  • "Review this code for security, performance, and style" — triggers Parallelization
  • "Build a TODO app with CRUD operations" — triggers Orchestrator-Workers
  • "Write a sorting function and keep improving until optimal" — triggers Evaluator-Optimizer

References

  • Building Effective Agents — Anthropic, Dec 2024. The original definition of the five orchestration patterns, emphasizing "start simple" and when each pattern applies. Core reference for s14.
  • Demystifying evals for AI agents — Anthropic, Jan 2026. Evaluator-Optimizer pattern applied to evaluation, including "Grade the outcome, not the path."
  • Agentic Coding Trends 2026 — Anthropic, Mar 2026. Industry paradigm shift from "manual coding" to "orchestrating agents."
  • Prompt Engineering Guide — Anthropic Docs. Foundations of Prompt Chaining and Routing, including classifier design and intermediate validation.