Hypd AI automation agency logo

Hypd

AI Tools / Published June 15, 2026

How to Run Claude Code Subagents That Actually Work: The Queue Pattern

Three concurrent agents, one shared queue file, and four rules that prevent the mistakes that make multi-agent Claude Code setups fail — too many agents, no coordination, no state tracking, no timeouts.

Last updated: June 15, 2026

claude codesubagentsautomationagentsworkflowai tools
Bottom line

Three concurrent agents, one shared queue file, and four rules that prevent the mistakes that make multi-agent Claude Code setups fail — too many agents, no coordination, no state tracking, no timeouts.

This guide is reviewed for clarity, service accuracy, and AI-search readability. The next quarterly content review is tracked internally before unsupported metrics or client proof are added.

Why most multi-agent setups break

Spawning multiple Claude Code agents sounds straightforward. In practice, most setups fail for the same four reasons: too many agents running at once (context pressure collapses quality), no shared coordination layer (agents repeat work or conflict), no state tracking (you can't tell what's done or in progress), and no timeouts (one stuck agent blocks the whole pipeline).

The pattern below solves all four. It's not complex — one queue file, three agents, and a few rules in your CLAUDE.md.

The four rules

  • Cap at 3 concurrent agents. More than 3 creates context pressure that degrades output quality. Three is the number where parallel throughput outpaces the quality cost.
  • Use a shared queue file. All coordination happens through a single JSON file. One file is the source of truth.
  • Track state per task. Every task needs one of four states:
    prompt
    pending
    ,
    prompt
    in_progress
    ,
    prompt
    completed
    ,
    prompt
    failed
    . Without state, you can't restart safely and you can't tell when the work is done.
  • Set timeouts and retry limits. Every task gets a timeout. Failed tasks retry with exponential backoff (1s, 2s, 4s). After two retries, escalate to a human.

The CLAUDE.md subagents block

Add this block to your CLAUDE.md. It tells Claude exactly how to behave when spawning and managing subagents — without it, the agent makes its own decisions about concurrency and retry logic, which usually means neither.

text
## SUBAGENTS
Max concurrent agents: 3
Default timeout: 30 minutes
Retry policy: exponential backoff (1s, 2s, 4s, 8s)
Max retries per task: 2
Task types: refunds, support, data, reporting

The queue file structure

The queue lives at a fixed path — typically

prompt
task_queue.json
at the project root. Every task is an object with these fields:

json
[
  {
    "task_id": "task_001",
    "type": "refund",
    "status": "pending",
    "assigned_agent": null,
    "retries": 0,
    "result": null,
    "data": { "order_id": "ORD-123", "amount": 49.99 }
  },
  {
    "task_id": "task_002",
    "type": "support",
    "status": "in_progress",
    "assigned_agent": "agent_1",
    "retries": 0,
    "result": null,
    "data": { "ticket_id": "TKT-456", "issue": "login failure" }
  }
]

The subagent template

Each subagent lives in

prompt
.claude/agents/<name>.md
. The YAML frontmatter tells Claude what the agent is for, what tools it can use, and which model to run it on.

markdown
---
name: refund-processor
description: Processes customer refund requests from the task queue
tools:
  - Read
  - Write
  - Bash
model: claude-haiku-4-5-20251001
---

You are a refund processing agent. Your job:
1. Read your assigned task from task_queue.json
2. Process the refund via the payment API
3. Update task status to completed or failed
4. Write the result back to the queue

Never process a task that is not assigned to you.
Never modify tasks assigned to other agents.

How the parent agent spawns and monitors

The parent agent reads the queue, claims up to 3 pending tasks by setting their status to

prompt
in_progress
and assigning itself, then spawns one subagent per claimed task. It polls the queue file for completions and respawns failed tasks with backoff until the retry limit is hit.

The key invariant: the parent never modifies tasks it didn't claim. If an agent crashes mid-task, the task stays in

prompt
in_progress
until the parent detects a timeout, resets it to
prompt
pending
, and retries.

What this looks like at scale

A real example: an e-commerce team processing refund requests. Before the queue pattern, one agent handled requests sequentially — about 10 refunds per hour. With three subagents running in parallel from a shared queue, throughput hit 180 refunds per hour. Error rate dropped 97% because each agent only handles its assigned tasks and state transitions are explicit.

The math: 3 agents × 60 refunds/hour each = 180/hour.

Five things to monitor

  • Queue depth — how many tasks are pending. A growing queue means agents can't keep up.
  • Agent utilization — are all three agents actually processing, or is one idle?
  • Task completion time — average time from
    prompt
    in_progress
    to
    prompt
    completed
    . Outliers point to API latency or task complexity issues.
  • Error rate — ratio of failed to completed. Above 5% means something systematic is breaking.
  • Retry rate — how often tasks hit retry logic. High retry rates under low error rates usually mean timeout thresholds are too tight.

Frequently asked questions

Why cap at 3 agents instead of running as many as possible? Each subagent runs in its own context window. Beyond 3, the parent agent's context fills with coordination overhead and reasoning quality degrades. Three is empirically the point where throughput gains from parallelism outpace the quality cost from context pressure.

What happens if two agents try to claim the same task simultaneously? Each agent reads the queue, claims a pending task in a single Write operation, then re-reads to confirm the claim persisted. If two agents claim the same task, one sees its claim overwritten and moves to the next pending task. Optimistic concurrency — it doesn't prevent double-claiming but detects and resolves it immediately.

Can I use this pattern with Claude Code cloud routines? Yes, with one constraint: cloud routines have a 1-hour minimum interval. For high-throughput queues, run the parent agent in a loop or as a local scheduled task. Cloud routines work well for the daily kickoff — starting the parent agent that then manages the queue for the session.