Tool Design (ACI)
Good tools are hard to misuse; tool design impacts agent quality more than prompt design
"Machines make different mistakes than humans." -- Design interfaces for agents, not humans.
Harness layer: Tools & Execution -- ACI = Agent-Computer Interface design principles.
Problem
s02 defined tools, but those interfaces were only "good enough." In production, poor tool design is the #1 cause of agent failure: the model picks the wrong tool, passes the wrong arguments, or gets back output too unstructured for the next step to process. We need systematic tool design principles: not HCI, but ACI.
Solution
Three ACI design principles (Anthropic):
┌────────────────────────────────────────────────────┐
│ 1. Poka-yoke (Error-proofing) │
│ — Make incorrect usage impossible │
│ 2. Structured Output │
│ — Return JSON, not free text │
│ 3. Precise Descriptions │
│ — Description is model's SOLE basis for choice │
└────────────────────────────────────────────────────┘
Additional: 4. Minimize required params; 5. Actionable errors; 6. Truncated returns
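As a rough sketch of item 6, truncated returns, the helper below caps a large tool output and tells the model that it was cut. The function name `truncate_output`, the `max_chars` limit, and the field names are all assumptions made for illustration, not part of any real API.

```python
import json


def truncate_output(text, max_chars=2000):
    """Return a JSON payload holding at most `max_chars` characters of `text`."""
    truncated = len(text) > max_chars
    return json.dumps({
        "output": text[:max_chars],
        "truncated": truncated,          # lets the model know data is missing
        "total_chars": len(text),        # lets the model judge how much was cut
        # Hypothetical hint text; only present when something was dropped.
        "hint": "Narrow the query or request a specific range." if truncated else None,
    })
```

Reporting `truncated` and `total_chars` explicitly keeps the model from treating a clipped result as the complete one.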
Core Concepts
Principle 1: Poka-yoke (Error-proofing)
Make incorrect usage impossible at the interface level:
# ❌ Error-prone: comma-separated string
"files": {"type": "string", "description": "Comma-separated file paths"}
# ✅ Error-proof: use array type directly
"files": {"type": "array", "items": {"type": "string"}}
# ❌ Error-prone: boolean as string
"dry_run": {"type": "string"} # Model might pass "yes", "true", "1"
# ✅ Error-proof: boolean type
"dry_run": {"type": "boolean"}
# ❌ Error-prone: free text for constrained values
"color": {"type": "string"}
# ✅ Error-proof: enum restriction
"color": {"type": "string", "enum": ["red", "green", "blue"]}
Principle 2: Structured Output
Tool returns must be structured JSON, not free text:
# ❌ Free text: model must parse natural language
def run_tests():
    return "3 tests passed, 2 failed. test_login failed with AssertionError..."
# ✅ Structured JSON: model can directly use fields
def run_tests():
    return json.dumps({
        "passed": 3, "failed": 2, "total": 5,
        "failures": [{"name": "test_login", "error": "AssertionError", "line": 42}],
        "summary": "❌ 2/5 FAILED"
    })
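The payoff shows up on the consuming side. The sketch below assumes a stand-in `run_tests` that returns the payload above and a hypothetical `next_action` helper that picks a follow-up step directly from the fields, with no natural-language parsing.

```python
import json


def run_tests():
    # Stand-in for a real test runner; the payload mirrors the example above.
    return json.dumps({
        "passed": 3, "failed": 2, "total": 5,
        "failures": [{"name": "test_login", "error": "AssertionError", "line": 42}],
    })


def next_action(tool_result):
    """Decide the follow-up step from structured fields only."""
    result = json.loads(tool_result)
    if result["failed"] == 0:
        return {"action": "done"}
    first = result["failures"][0]
    return {"action": "inspect", "test": first["name"], "line": first["line"]}
```

The same decision against the free-text version would need a regex or another model call, and either can misread the counts.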
Principle 3: Precise Descriptions
# ❌ Too vague
"description": "Run some tests."
# ✅ Precise: when to use + what it does + what it doesn't do
"description": (
    "Run pytest tests in the specified file. "
    "Returns structured results with pass/fail count. "
    "Use AFTER writing code that needs verification. "
    "Does NOT run linters; use 'lint' for that."
)
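For context, this is roughly how the description sits inside a complete tool definition. The `name` / `description` / `input_schema` layout follows the Claude tool-use format; the tool name `run_tests` and the `file` parameter are assumptions made for this sketch.

```python
RUN_TESTS_TOOL = {
    "name": "run_tests",  # hypothetical tool name
    "description": (
        "Run pytest tests in the specified file. "
        "Returns structured results with pass/fail count. "
        "Use AFTER writing code that needs verification. "
        "Does NOT run linters; use 'lint' for that."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            # Hypothetical parameter: the only input this tool needs.
            "file": {"type": "string", "description": "Path to the pytest file to run."},
        },
        "required": ["file"],
    },
}
```

The parameter description matters too: the model reads it alongside the tool description when deciding what to pass.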
Principle 4: Minimize Required Parameters
# ❌ Too many required
{"required": ["file", "start_line", "end_line", "content", "encoding"]}
# ✅ Only essential required, rest have defaults
{"required": ["file", "content"],
 "properties": {"encoding": {"type": "string", "default": "utf-8"}}}
Principle 5: Actionable Error Messages
# ❌ Not actionable
return {"error": "Invalid input"}
# ✅ Actionable: what went wrong + how to fix
return {"error": "file_not_found", "path": "/src/main.py",
        "suggestion": "Use 'read_file' to list available files first",
        "similar_files": ["src/app.py", "src/index.py"]}
Key Code
import json
import os

def execute_edit_file(args):
    path = args["path"]
    if not os.path.exists(path):
        return json.dumps({"error": "file_not_found", "path": path,
                           "suggestion": "Use read_file to verify the path exists."})
    with open(path) as f:
        content = f.read()
    if args["find"] not in content:
        # find_similar_substrings: fuzzy-matching helper returning candidate snippets.
        similar = find_similar_substrings(content, args["find"])
        return json.dumps({"error": "text_not_found",
                           "suggestion": "Exact text not found. Did you mean:",
                           "similar_matches": similar[:3]})
    # Replace at most `count` occurrences (default 1) and report the real number.
    count = args.get("count", 1)
    changes = min(content.count(args["find"]), count)
    new_content = content.replace(args["find"], args["replace"], count)
    with open(path, "w") as f:
        f.write(new_content)
    return json.dumps({"status": "success", "path": path, "changes_made": changes})
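`execute_edit_file` calls `find_similar_substrings` without defining it. The version below is one hypothetical implementation, not necessarily the one the lesson's script uses. It compares whole lines with `difflib`'s similarity ratio rather than a true edit distance, so it only handles single-line `find` text.

```python
import difflib


def find_similar_substrings(content, target, cutoff=0.6):
    """Return up to three lines of `content` that resemble `target`, best first."""
    # Compare stripped lines so indentation differences do not hide a match.
    candidates = [line.strip() for line in content.splitlines() if line.strip()]
    return difflib.get_close_matches(target.strip(), candidates, n=3, cutoff=cutoff)
```

Returning the closest real lines lets the model retry with text that actually exists in the file instead of guessing again.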
What's New (s02 → s21)
| Aspect | s02 (Tools) | s21 (ACI Tool Design) |
|---|---|---|
| Interface design | "Works fine" | Systematic ACI principles |
| Parameter types | Simple strings | Structured JSON Schema |
| Error handling | Exception text | Actionable structured errors |
| Description | Single line | Precise: when + what + what-not |
| Error prevention | None | Enums, arrays, defaults |
| Return values | Free text | Structured JSON |
Deep Dive: Design Decisions
Q1: Does structured JSON cost more tokens than free text?
Yes, roughly 10-20% more. But the benefits far outweigh the cost: parsing success jumps from ~70% to ~99%, which prevents cascading retries that waste far more tokens.
Q2: What if the model passes mismatched find text?
This is the most common agent error. The fix: fuzzy-match using edit distance, return similar snippets as suggestions, and recommend calling read_file first.
Q3: How long should tool descriptions be?
Anthropic recommends 2-4 sentences (40-80 words). Too short, and the model doesn't know when to use the tool; too long, and it ignores details and wastes tokens. Each tool definition consumes ~150-300 tokens.
Q4: What do ACI and HCI have in common?
They share many principles: error-proofing (grayed-out buttons vs enum params), consistency (UI style vs return format), feedback (progress bars vs structured results), and error recovery ("Did you mean..." in both). The core difference: HCI is visual and intuitive, while ACI is textual and rational.
Q5: How to test tool design automatically?
Test edge cases (e.g. file_not_found), verify that the model selects the correct tool, and check description quality (length, presence of the "Use" keyword).
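The description-quality check is the easiest of these to automate. A minimal sketch, assuming a hypothetical `lint_description` helper whose default bounds are the 40-80 word range from Q3:

```python
import re


def lint_description(description, min_words=40, max_words=80):
    """Return a list of issues found in a tool description; empty means OK."""
    issues = []
    words = len(description.split())
    if words < min_words:
        issues.append(f"too short: {words} words (min {min_words})")
    elif words > max_words:
        issues.append(f"too long: {words} words (max {max_words})")
    # Look for explicit usage guidance such as "Use AFTER writing code".
    if not re.search(r"\buse\b", description, flags=re.IGNORECASE):
        issues.append("missing usage guidance (no 'use' keyword)")
    return issues
```

Running it over every tool definition in a test suite catches vague descriptions like "Run some tests." before they reach the model. Checking that the model selects the correct tool still requires running real prompts against the agent.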
Try It
cd learn-claude-code
python agents/s21_tool_design.py
Recommended prompts:
- "Edit the file to fix the typo": observe how error-proofing handles mismatched find text
- "Run the tests and show results": observe structured output
- "Create a new file with a template": observe parameter minimization
References
- ACI Design Principles — Anthropic, Dec 2025. Original ACI definition: designing interfaces for agents matters as much as HCI for humans.
- Poka-yoke — Toyota Production System error-proofing concept applied to tool interface design.
- Claude Tool Use Guide — Anthropic Docs. Official tool definition guide including description best practices.
- Agentic Coding Trends 2026 — Anthropic, Mar 2026. Tool selection accuracy's impact on agent success rate.
- MCP Tool Schema Design — MCP Spec. inputSchema design guidelines using JSON Schema error-proofing.