Learn Claude Code
s20

Parallel Teams

Collaboration

File-Lock Task Board

264 LOC · 6 tools · TaskBoard + parallel Workers + atomic claiming
The key to scaling is decoupling: a task board plus file locks gives near-linear scalability


"16 parallel agents compiled a 100K-line C compiler in hours." -- Parallelization turns agents from tools into factories.

Harness layer: Concurrency & Collaboration -- File-lock task boards for safe parallel execution.

Problem

s09-s11 taught agent teamwork, but within a single conversation. For massive tasks (refactoring 100 files), we need real parallelism: multiple agent processes working simultaneously, each handling different file sets. Core challenge: preventing conflicts when multiple Workers modify the same file.

Solution

                    ┌──────────────┐
                    │ Orchestrator │  ← Decompose tasks, assign to board
                    └─────┬────────┘
             ┌────────────┼────────────┐
             ▼            ▼            ▼
        ┌──────────┐ ┌──────────┐ ┌──────────┐
        │ Worker 1 │ │ Worker 2 │ │ Worker 3 │  ← Independent processes
        │ a.py,b.py│ │ c.py,d.py│ │ e.py,f.py│
        └────┬─────┘ └────┬─────┘ └────┬─────┘
             ▼            ▼            ▼
        ┌─────────────────────────────────────┐
        │      Task Board (File Locks)        │
        │  a.py🔒W1   b.py🔒W1   c.py🔒W2     │
        │  File lock: whoever locks, owns it  │
        └─────────────────────────────────────┘

Core Concepts

File-Based Task Board

import json
import os

class FileBasedTaskBoard:
    def __init__(self, board_dir=".task_board"):
        self.board_dir = board_dir
        os.makedirs(board_dir, exist_ok=True)

    def claim_task(self, worker_id, task):
        """Worker claims task and locks files (O_CREAT | O_EXCL is atomic)."""
        lock_file = f"{self.board_dir}/{task['id']}.lock"
        try:
            fd = os.open(lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.write(fd, str(worker_id).encode())
            os.close(fd)
            return True
        except FileExistsError:
            return False  # Already claimed by another Worker

    def complete_task(self, worker_id, task_id, result):
        """Mark task done."""
        with open(f"{self.board_dir}/{task_id}.result", "w") as f:
            json.dump({"worker": worker_id, "result": result, "status": "done"}, f)
        os.remove(f"{self.board_dir}/{task_id}.lock")
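
The atomic claiming primitive can be exercised standalone. A minimal sketch, with a temporary directory standing in for the board directory and `claim` mirroring `claim_task` above:

```python
import os
import tempfile

board_dir = tempfile.mkdtemp()  # stand-in for .task_board/

def claim(worker_id, task_id):
    """Mirror of claim_task: O_CREAT | O_EXCL makes the race loser fail."""
    lock_file = os.path.join(board_dir, f"{task_id}.lock")
    try:
        fd = os.open(lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.write(fd, worker_id.encode())
        os.close(fd)
        return True
    except FileExistsError:
        return False

assert claim("W1", "task-1") is True    # first claim wins
assert claim("W2", "task-1") is False   # lock already held by W1
```

However the two claims are interleaved across processes, exactly one `os.open` succeeds, because the OS resolves the race, not Python.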

Parallel Workers

Each Worker is an independent process with its own context window:

import multiprocessing

def run_worker(worker_id, task, board):
    if not board.claim_task(worker_id, task):
        return  # Task already claimed
    messages = [{"role": "user", "content": task["instructions"]}]
    result = run_agent_loop(messages, tools=CODING_TOOLS)
    board.complete_task(worker_id, task["id"], result)

def parallel_execute(tasks, board):
    processes = []
    for i, task in enumerate(tasks):
        p = multiprocessing.Process(target=run_worker, args=(i, task, board))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
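
The Orchestrator's decomposition step is what makes exclusive assignment possible. A hypothetical helper (`decompose` is not part of the repo's code) that splits a file list into non-overlapping tasks, so each file appears in exactly one task by construction:

```python
def decompose(files, chunk_size=2):
    """Split files into non-overlapping tasks for exclusive assignment."""
    tasks = []
    for i in range(0, len(files), chunk_size):
        chunk = files[i:i + chunk_size]
        tasks.append({
            "id": f"task-{i // chunk_size}",
            "files": chunk,
            "instructions": f"Refactor: {', '.join(chunk)}",
        })
    return tasks

tasks = decompose(["a.py", "b.py", "c.py", "d.py", "e.py"])
assert len(tasks) == 3
assert tasks[2]["files"] == ["e.py"]  # last chunk may be smaller
```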

What's New (s09 → s20)

Aspect               s09 (Agent Teams)            s20 (Parallel Teams)
Parallel mode        Multi-role in same process   True multi-process parallel
Conflict avoidance   Protocols and messages       File locks + exclusive assignment
Scalability          2-5 agents                   16+ worker processes
Communication        Shared messages array        Task board files
Scale                Single feature               Entire codebase (100+ files)

Deep Dive: Design Decisions

Q1: Why file locks instead of a database?

File locks have zero dependencies, are directly observable (ls .task_board/), and os.O_CREAT | os.O_EXCL provides OS-level atomicity. Production scenarios use SQLite or Redis, but the principle is identical.
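
For reference, the same claim-or-fail semantics translate directly to SQLite. A sketch under an assumed one-table schema (not part of this lesson's code): an INSERT on a primary key is atomic, so the losing claimer gets IntegrityError instead of FileExistsError.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locks (task_id TEXT PRIMARY KEY, worker TEXT)")

def claim(worker_id, task_id):
    """Atomic claim: the PRIMARY KEY constraint rejects a second INSERT."""
    try:
        with conn:
            conn.execute("INSERT INTO locks VALUES (?, ?)", (task_id, worker_id))
        return True
    except sqlite3.IntegrityError:
        return False

assert claim("W1", "task-1") is True
assert claim("W2", "task-1") is False  # task-1 row already exists
```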

Q2: What if a Worker fails? Will tasks get stuck?

Three strategies: (1) Timeout release: auto-release locks after N minutes. (2) Heartbeat detection: each Worker periodically touches its lock file, so a stale timestamp reveals a dead Worker. (3) Retry assignment: released tasks return to the available pool.
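
Strategy (1) fits the file-based board naturally, since a lock file's mtime records when it was claimed. A sketch, reusing the `.lock` naming from the TaskBoard above (the 10-minute budget is an assumed value):

```python
import os
import time

TIMEOUT = 600  # assumed per-task budget, in seconds

def release_stale_locks(board_dir, timeout=TIMEOUT):
    """Remove locks older than `timeout` so their tasks rejoin the pool."""
    released = []
    for name in os.listdir(board_dir):
        if not name.endswith(".lock"):
            continue
        path = os.path.join(board_dir, name)
        if time.time() - os.path.getmtime(path) > timeout:
            os.remove(path)
            released.append(name)
    return released
```

Run periodically by the Orchestrator, this turns a crashed Worker into a bounded delay rather than a stuck task.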

Q3: What if files have dependencies?

Group dependent files to the same Worker. For complex dependencies, use two-phase execution: interface phase → implementation phase.
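
The grouping step can be sketched as follows, where the `deps` mapping is a hypothetical input (e.g. produced by import analysis); the point is that a file and its dependencies land in one task, so no Worker ever waits on a file locked by another process:

```python
def group_by_dependency(files, deps):
    """Route each file plus everything it depends on into one group."""
    groups, assigned = [], set()
    for f in files:
        if f in assigned:
            continue
        group = {f} | set(deps.get(f, ()))
        assigned |= group
        groups.append(sorted(group))
    return groups

assert group_by_dependency(
    ["a.py", "b.py", "c.py"],
    {"a.py": ["b.py"]},          # a.py imports b.py → same Worker
) == [["a.py", "b.py"], ["c.py"]]
```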

Q4: Can Workers do duplicate work?

Prevented by: (1) Exclusive assignment — each file to one Worker only. (2) Optimistic locking — check before writing. (3) Incremental checks — skip already-completed files.
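
Check (2), optimistic locking, can be sketched with content hashing (`fingerprint` and `safe_write` are illustrative names, not repo functions): record a hash when the file is read, and refuse to write if the hash has changed underneath.

```python
import hashlib

def fingerprint(path):
    """Hash the file's current contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def safe_write(path, new_text, expected_fp):
    """Write only if nobody modified the file since we fingerprinted it."""
    if fingerprint(path) != expected_fp:
        return False  # file changed underneath us: re-read and retry
    with open(path, "w") as f:
        f.write(new_text)
    return True
```

With exclusive assignment this check should never fire; it is a cheap safety net against assignment bugs.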

Q5: Processes or threads for parallel agents?

Processes in Python, not threads. No GIL limitation, full memory isolation, Worker crash doesn't affect others. For I/O-bound agents, asyncio is also viable but processes are simpler and safer.

Try It

cd learn-claude-code
python agents/s20_parallel_teams.py

Recommended prompts:

  • "Refactor all files in src/ to use async/await" — watch task decomposition and parallel execution
  • "Add type hints to all Python files" — observe multi-Worker parallel work
  • "Add docstrings to all functions" — observe Task Board lock mechanism

References