s18

Auto Mode + Sandbox

Safety & Governance

Permission Tiers + Sandbox

234 LOC, 6 tools, PermissionManager (SAFE/ASK/DENY) + Sandbox

Safety should not be the enemy of efficiency; 95% auto-approve, 5% human oversight

"Safety should not be the enemy of efficiency." -- Security and speed can coexist.

Harness layer: Safety & Governance -- Three-tier permission system + sandbox isolation.

Problem

Earlier agents asked the user to confirm every tool call. For routine coding tasks (reading files, running lint, executing tests), confirming each call severely slows work. Removing every confirmation, however, leaves nothing between the agent and a destructive command such as rm -rf / or a data leak. We need a tiered permission system: safe operations run automatically, risky ones need approval, and dangerous ones are blocked.

Solution

Three Permission Tiers (SAFE / ASK / DENY)

Tool call request
     │
     ▼
┌─────────────────────────────┐
│     Permission Classifier    │
│  (classify by tool + args)   │
└──────────┬──────────────────┘
           │
    ┌──────┼──────┐
    ▼      ▼      ▼
 ┌──────┐ ┌────┐ ┌─────┐
 │ SAFE │ │ASK │ │DENY │
 │ Auto │ │Need│ │Block│
 │ exec │ │ ok │ │     │
 └──┬───┘ └─┬──┘ └──┬──┘
    │       │       └──► "Permission denied"
    │       └──► "Allow? [y/n]"
    └──────► [Execute in sandbox]

Core Concepts

Three Permission Tiers

Tier	Action	Examples
SAFE	Auto-execute, no confirmation	Read files, grep, lint, `git status`
ASK	Requires user confirmation	Write files, install packages, `git commit`
DENY	Blocked immediately	`rm -rf`, network requests, `/etc` access, `sudo`

Classifier Logic

import json
import re

class PermissionClassifier:
    SAFE_TOOLS = {"read_file", "grep", "lint", "git_status", "git_log"}
    DENY_PATTERNS = [r"rm\s+-rf", r"sudo\s+", r"curl|wget", r"/etc/|/var/"]

    def classify(self, tool_name, args):
        # Check DENY first so a SAFE tool cannot reach a blocked path
        # (for example read_file pointed at /etc/).
        if any(re.search(p, json.dumps(args)) for p in self.DENY_PATTERNS):
            return "DENY"
        if tool_name in self.SAFE_TOOLS:
            return "SAFE"
        return "ASK"  # Default: ask user
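To see how the tiers fall out for concrete calls, here is a minimal, self-contained sketch of the same rule-based idea. The standalone function `classify`, the `bash` tool name, and the sample arguments are illustrative assumptions. The sketch also assumes DENY patterns are checked before the SAFE allowlist, so that a normally safe tool pointed at `/etc/` is still blocked.

```python
import json
import re

SAFE_TOOLS = {"read_file", "grep", "lint", "git_status", "git_log"}
DENY_PATTERNS = [
    re.compile(p)
    for p in (r"rm\s+-rf", r"sudo\s+", r"curl|wget", r"/etc/|/var/")
]

def classify(tool_name, args):
    """Return "SAFE", "ASK", or "DENY" for one tool call."""
    serialized = json.dumps(args)  # match patterns against every argument at once
    if any(p.search(serialized) for p in DENY_PATTERNS):
        return "DENY"
    if tool_name in SAFE_TOOLS:
        return "SAFE"
    return "ASK"  # anything unrecognized falls back to asking the user

# Hypothetical sample calls, one per interesting case.
samples = [
    ("read_file", {"path": "README.md"}),
    ("read_file", {"path": "/etc/passwd"}),
    ("bash", {"command": "rm -rf build"}),
    ("write_file", {"path": "hello.py", "content": "print('hi')"}),
]
for name, args in samples:
    print(name, args, "->", classify(name, args))
```

Note that these are substring patterns: `curl|wget` also matches a file named `curly.txt`, so a pattern list like this tends to err on the side of blocking.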

Sandbox Isolation

Because Auto Mode executes some calls without confirmation, every approved call runs inside a sandbox:

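import os
import subprocess

class SandboxExecutor:
    def __init__(self):
        # Resolve once so symlinked roots (e.g. /tmp on macOS) compare correctly.
        self.allowed_dirs = [os.path.realpath(d) for d in ("/workspace", "/tmp")]
        self.network_enabled = False

    def execute(self, command):
        for path in extract_paths(command):
            # Resolve symlinks and "..", then match whole path components;
            # a bare startswith() would let "/workspace2" pass as "/workspace".
            real = os.path.realpath(path)
            if not any(real == d or real.startswith(d + os.sep) for d in self.allowed_dirs):
                raise PermissionError(f"Access denied: {path}")
        return subprocess.run(command, ...)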
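`extract_paths` is not defined in this section. The following is a rough sketch of what it and the allowlist check might look like. It assumes POSIX paths, simple shell command strings, that any token containing `/` counts as a path, and that relative paths are relative to `/workspace`. The helper names `is_allowed` and `check_command` are hypothetical.

```python
import os
import shlex

ALLOWED_DIRS = ["/workspace", "/tmp"]

def extract_paths(command):
    """Hypothetical helper: treat every token containing a slash as a path."""
    return [tok for tok in shlex.split(command) if "/" in tok]

def is_allowed(path, allowed_dirs=ALLOWED_DIRS):
    """True if the normalized path lies inside one of the allowed directories."""
    # Assumption: relative paths are resolved against /workspace.
    full = os.path.normpath(os.path.join("/workspace", path))
    # commonpath compares whole components, so /workspace2 is not under /workspace.
    return any(os.path.commonpath([full, d]) == d for d in allowed_dirs)

def check_command(command):
    """Raise PermissionError if any path in the command is outside the allowlist."""
    for path in extract_paths(command):
        if not is_allowed(path):
            raise PermissionError(f"Access denied: {path}")
    return True
```

`os.path.normpath` collapses `..` segments but does not follow symlinks, so this check alone covers only the first of the isolation layers.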

Key Code

def agent_loop(messages):
    classifier = PermissionClassifier()
    sandbox = SandboxExecutor()

    while True:
        response = client.messages.create(...)
        if response.stop_reason != "tool_use":
            return  # No more tool calls: the turn is finished
        # Tool calls are the content blocks whose type is "tool_use"
        for tool_use in (b for b in response.content if b.type == "tool_use"):
            level = classifier.classify(tool_use.name, tool_use.input)
            if level == "DENY":
                tool_result = f"❌ Permission denied: {tool_use.name}"
            elif level == "ASK":
                answer = input(f"🔒 Allow {tool_use.name}? [y/n] ")
                if answer.strip().lower() == "y":
                    # execute_tool is not shown in this section
                    tool_result = sandbox.execute_tool(tool_use.name, tool_use.input)
                else:
                    tool_result = "⏭️ Skipped by user"
            else:  # SAFE
                tool_result = sandbox.execute_tool(tool_use.name, tool_use.input)
            messages.append(...)
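The three branches of the loop can be exercised without an API client or a terminal prompt by pulling them into one function. This is a sketch under assumptions: `dispatch`, `execute`, and `confirm` are hypothetical names, and the executor and confirmation prompt are injected as plain callables so they can be replaced in tests.

```python
def dispatch(level, tool_name, tool_input, execute, confirm):
    """Route one tool call according to its permission tier.

    execute(tool_name, tool_input) runs the tool and returns its result.
    confirm(tool_name) returns True if the user approves the call.
    """
    if level not in {"SAFE", "ASK", "DENY"}:
        # Fail closed: an unknown tier must never fall through to execution.
        raise ValueError(f"Unknown permission tier: {level}")
    if level == "DENY":
        return f"Permission denied: {tool_name}"
    if level == "ASK" and not confirm(tool_name):
        return "Skipped by user"
    return execute(tool_name, tool_input)

def fake_execute(name, args):
    """Stand-in executor used only for this demonstration."""
    return f"ran {name}"

print(dispatch("SAFE", "read_file", {"path": "README.md"}, fake_execute, lambda n: False))
print(dispatch("ASK", "write_file", {"path": "hello.py"}, fake_execute, lambda n: True))
print(dispatch("ASK", "write_file", {"path": "hello.py"}, fake_execute, lambda n: False))
print(dispatch("DENY", "bash", {"command": "sudo reboot"}, fake_execute, lambda n: True))
```

Rejecting unknown tiers explicitly avoids the situation where a trailing `else` branch treats any unexpected value as SAFE.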

What's New (s02 → s18)

Aspect	s02 (Tools)	s18 (Auto Mode)
Permissions	All require confirmation	Three tiers
Auto execution	None	SAFE tier auto-runs
Danger protection	None	DENY blocklist
Sandbox	None	Filesystem + network isolation
User experience	Confirm everything	Only ASK tier needs confirmation

Deep Dive: Design Decisions

Q1: Should the classifier use rules or an LLM?

Rules, not an LLM. Reasons: (1) Speed: a rule check takes < 1ms, an LLM call 1-3s. (2) Determinism: security policies can't be probabilistic. (3) Auditability: rules can be fully listed and reviewed. (4) Cost: an extra LLM call for every tool call is expensive.
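One way to make the auditability point concrete is to store the rules as named data, so that every denial can report which rule fired. The sketch below is illustrative only: the `Rule` type, the rule names, and `first_denying_rule` are assumptions, and only the regex patterns come from the classifier above.

```python
import re
from typing import NamedTuple, Optional

class Rule(NamedTuple):
    name: str            # hypothetical label, useful in logs and reviews
    pattern: re.Pattern  # compiled once at import time

RULES = [
    Rule("recursive-delete", re.compile(r"rm\s+-rf")),
    Rule("privilege-escalation", re.compile(r"sudo\s+")),
    Rule("network-download", re.compile(r"curl|wget")),
    Rule("system-paths", re.compile(r"/etc/|/var/")),
]

def first_denying_rule(text: str) -> Optional[str]:
    """Return the name of the first rule that matches, or None if none do."""
    for rule in RULES:
        if rule.pattern.search(text):
            return rule.name
    return None

# The full policy can be listed for review in one pass.
for rule in RULES:
    print(rule.name, rule.pattern.pattern)
```

The same input always produces the same rule name, which is the determinism that a per-call LLM judgment cannot guarantee.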

Q2: What's the difference between Auto Mode and skip-permissions?

Auto Mode keeps DENY tier protection. Skip-permissions removes every check and has zero safety net, so it is appropriate only inside fully isolated CI/CD containers.

Q3: How does the sandbox isolate filesystem access?

Three layers: (1) Path allowlist. (2) Symlink detection (prevent escape). (3) OS-level sandbox (macOS sandbox-exec, Linux seccomp-bpf).
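Layer (2) can be illustrated with a small experiment: create a symlink inside a workspace that points outside it, then compare the resolved path against the workspace root. This is a sketch assuming a POSIX system where the process may create symlinks. The function name `resolves_inside` and the directory layout are hypothetical.

```python
import os
import tempfile

def resolves_inside(path, root):
    """True if path, after following symlinks, is still under root."""
    real_root = os.path.realpath(root)
    real_path = os.path.realpath(path)
    return os.path.commonpath([real_path, real_root]) == real_root

with tempfile.TemporaryDirectory() as base:
    workspace = os.path.join(base, "workspace")
    outside = os.path.join(base, "outside")
    os.mkdir(workspace)
    os.mkdir(outside)

    # A link that lives inside the workspace but points outside of it.
    link = os.path.join(workspace, "escape")
    os.symlink(outside, link)

    honest = resolves_inside(os.path.join(workspace, "notes.txt"), workspace)
    escaped = resolves_inside(os.path.join(link, "secret.txt"), workspace)
    print("plain file inside workspace:", honest)
    print("path through symlink inside workspace:", escaped)
```

A check like this runs before the command does, so a link could in principle be swapped in between the check and the use. That gap is one reason to keep an OS-level sandbox as the third layer.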

Q4: What if the SAFE allowlist is too conservative?

Users can extend it via config: {"allowedTools": ["npm_test", "git_diff", "prettier_format"]}. Before adding a tool, consider the worst case: what happens if the agent misunderstands the parameters? For write_file, the worst case is overwriting an important file. For read_file, it is reading the wrong file, a much lower risk.
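How the config is stored and loaded is not specified here. As a minimal sketch, assume it arrives as a JSON string and that `allowedTools` is merged into the built-in SAFE set. The names `DEFAULT_SAFE_TOOLS` and `load_safe_tools` are hypothetical.

```python
import json

DEFAULT_SAFE_TOOLS = frozenset({"read_file", "grep", "lint", "git_status", "git_log"})

def load_safe_tools(config_text, defaults=DEFAULT_SAFE_TOOLS):
    """Return the built-in SAFE tools plus any user-listed allowedTools."""
    config = json.loads(config_text)
    extra = config.get("allowedTools", [])
    # Reject malformed config instead of silently widening the allowlist.
    if not isinstance(extra, list) or not all(isinstance(t, str) for t in extra):
        raise ValueError("allowedTools must be a list of tool names")
    return set(defaults) | set(extra)

config_text = '{"allowedTools": ["npm_test", "git_diff", "prettier_format"]}'
safe_tools = load_safe_tools(config_text)
print(sorted(safe_tools))
```

The merge only ever adds to the SAFE tier. Tools that are not listed keep their default ASK classification.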

Q5: How does enterprise permission management differ from personal use?

Enterprise adds: mandatory audit logging, non-overridable DENY lists, centralized policy distribution, MCP Server approval requirements.
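Two of these additions, a non-overridable DENY list and audit logging, can be sketched in a few lines. Everything here is an assumption made for illustration: the function names, the record fields, and the idea that organization policy is expressed as a set of denied tool names.

```python
import json
import time

def merge_policy(org_denied_tools, user_allowed_tools):
    """Return the user's SAFE additions, minus anything the organization denies."""
    denied = set(org_denied_tools)
    # Organization policy wins: a user cannot re-allow a denied tool.
    return {tool for tool in user_allowed_tools if tool not in denied}

def audit_record(tool_name, level, decision, now=None):
    """Build one JSON line describing a permission decision."""
    return json.dumps(
        {
            "ts": time.time() if now is None else now,
            "tool": tool_name,
            "level": level,
            "decision": decision,
        },
        sort_keys=True,
    )

allowed = merge_policy({"deploy_prod"}, ["npm_test", "deploy_prod"])
print(sorted(allowed))
print(audit_record("write_file", "ASK", "approved"))
```

Writing one JSON object per decision keeps the log append-only and easy to ship to a central collector.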

Try It

cd learn-claude-code
python agents/s18_auto_mode.py

Recommended prompts:

"Read the README file" — observe SAFE auto-execution
"Delete the temp files" — observe DENY blocking
"Write a new file called hello.py" — observe ASK confirmation
"Run the test suite" — observe SAFE/ASK tier classification

References

Claude Code: Auto Mode — Anthropic Docs. Auto Mode settings including --auto and --dangerously-skip-permissions.
Claude Code: Native Sandboxing — Anthropic, Feb 2026. macOS Sandbox and Linux seccomp integration.
Building Effective Agents — Anthropic, Dec 2025. Balancing safety and efficiency.
OWASP Top 10 for LLM Applications — OWASP, 2025. Prompt Injection attacks via malicious tool calls; DENY mechanism defends against this.
Agentic Coding Trends 2026 — Anthropic, Mar 2026. Enterprise agent permission management trends.