Subagents vs Multiple Claude Code Instances: Practical Differences

Claude Code gives you two ways to have “more than one agent at a time” working for you:

Subagents – scoped helpers spawned from inside a single Claude Code session, managed via /agents.
Multiple Claude Code instances – separate claude processes, each in a different working directory, each with its own session.

Both look superficially similar (“two Claudes doing things in parallel”), but they are different architectures with different trade-offs. This article walks through the practical differences and when to reach for which.

The two shapes in one paragraph each

Subagents. One parent session, multiple isolated children inside it. The parent dispatches work, each child runs in its own context window with its own tool allowlist and possibly its own model, returns a final reply, and the parent continues. Children cannot spawn grandchildren. The whole thing lives under one terminal, one auth, one set of project files.

Multiple instances. Two or more independent claude processes, started in different directories (often in different iTerm tabs, tmux panes, or VS Code windows). Each has its own session, its own conversation history, its own CLAUDE.md, its own .claude/agents/, its own memory, its own auth. They share nothing except the filesystem and whatever side channels you wire up (git, files on disk, a database).

What each one actually shares

Aspect	Subagents	Multiple instances
Working directory	Same as parent	Each instance picks its own
Project files (CLAUDE.md, `.claude/agents/`)	Inherited from parent’s project	Per-instance, based on that instance’s cwd
Conversation history	Parent has its own; each subagent has its own; parent only sees final reply	Each instance fully isolated; no shared transcript
Tool boundaries	Can be scoped per subagent	Each instance has its own settings
Model	Per-subagent (`inherit`, `sonnet`, `opus`, `haiku`)	Per-instance (set at launch or via `/model`)
Auth / API quota	Single bill, parent + children	Separate sessions, each counted independently
Crash blast radius	A failing subagent doesn’t kill the parent, but the parent sits idle while the subagent runs	Each instance fails independently; the others keep working
Coordination	Parent orchestrates explicitly or by automatic delegation	You orchestrate, by switching terminals or by writing files both instances watch

Coordination model: who’s holding the plan

This is the single biggest difference in practice.

With subagents, the parent is the planner. The parent decides what work to dispatch, picks the subagent, hands it a scoped task, waits for the reply, integrates it, and decides what’s next. The subagent is a worker; the parent is the foreman. You only talk to one Claude (the parent).

With multiple instances, you are the planner. Each instance is independent and has no idea the others exist. If instance A finishes a refactor and instance B needs to know, you have to tell B (or wire a side channel like a shared STATUS.md file). The orchestration logic lives in your head and on your keyboard.

This sounds like a small difference, but it changes how the work feels. Subagent work is one conversation with delegation underneath. Multi-instance work is N parallel conversations you have to keep coherent yourself.
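The shared STATUS.md side channel mentioned above can be as simple as an append-only file that each instance (or you) writes to and reads from. The following is a minimal sketch under assumptions, not a Claude Code feature: the function names, the line format, and the instance labels are all hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def post_status(status_file: Path, instance: str, message: str) -> None:
    """Append one timestamped note. Appending keeps earlier notes intact."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    with status_file.open("a", encoding="utf-8") as fh:
        # Hypothetical line format: "- [time] instance: message"
        fh.write(f"- [{stamp}] {instance}: {message}\n")


def read_status(status_file: Path) -> list[str]:
    """Return every note posted so far, oldest first."""
    if not status_file.exists():
        return []
    return status_file.read_text(encoding="utf-8").splitlines()
```

In practice you might ask instance A to run something like `post_status(Path("STATUS.md"), "api", "refactor merged")` when it finishes, and ask instance B to read the file before it starts. The file only carries the message; deciding who reads it and when is still your job.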

Context isolation: what the parent sees

A subagent runs in its own context window. The parent does not see the subagent’s working transcript – only the subagent’s final reply. This is by design: it keeps the parent’s context lean, and it lets a subagent burn through a long search or a deep dive without polluting the parent’s history. The downside is that you can’t ask the parent to “show me what the subagent tried”: once the subagent returns, its intermediate state is gone.

With multiple instances there’s no isolation question to ask – each instance is its own world. Want to see what instance B has been doing? Switch to that terminal and scroll. Everything’s still there.

Memory and state

Claude Code’s per-project memory (~/.claude/projects/<path-slug>/memory/) is keyed by the project’s directory.

Subagents share their parent’s memory directory. They’re operating in the same project, so any memory the parent wrote is visible to subagents that need it, and vice versa.

Multiple instances in different directories have different memory directories. Instance A in ~/code/foo writes to one memory store. Instance B in ~/code/bar writes to another. Two instances started in the same directory, by contrast, share one memory directory, because both resolve to the same project.

This matters when you start using memory for persistent context. Subagents inherit it for free; multi-instance setups don’t, and you need to plan accordingly.
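To make the directory keying concrete, here is a small sketch of how a working directory could map to a memory directory. The slug rule used here (path separators replaced by hyphens) and the `/home/dev/.claude` location are assumptions for illustration only; the real slug scheme is whatever Claude Code implements.

```python
from pathlib import PurePosixPath


def path_slug(project_dir: str) -> str:
    # ASSUMPTION: the slug is the absolute path with "/" replaced by "-".
    # PurePosixPath also normalizes a trailing slash away.
    return str(PurePosixPath(project_dir)).replace("/", "-")


def memory_dir(project_dir: str, claude_home: str = "/home/dev/.claude") -> PurePosixPath:
    # Mirrors the layout <claude home>/projects/<path-slug>/memory/
    return PurePosixPath(claude_home) / "projects" / path_slug(project_dir) / "memory"


foo = memory_dir("/home/dev/code/foo")
bar = memory_dir("/home/dev/code/bar")
foo_again = memory_dir("/home/dev/code/foo/")  # second instance, same directory
```

Under this assumption `foo` and `bar` are different paths, while `foo` and `foo_again` are the same path. That is the whole point: memory follows the directory, not the process.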

Cost and latency

Subagents. You pay for both parent and child tokens while a subagent is running. The parent sits idle waiting, so its tokens are mostly the “send the task, receive the reply” part. The child does the actual reasoning. Total cost is roughly child cost plus a small parent surcharge for orchestration. Latency is sequential from the parent’s perspective – the parent isn’t doing anything else while the subagent runs.

Multiple instances. Each instance is fully active in parallel. If you have three instances each grinding on a different task, you’re paying for all three concurrently. Wall-clock time goes down; total token spend goes up. There’s no orchestrator overhead because there’s no orchestrator.

Practical rule: subagents are cheaper when you only need one thing at a time but want it scoped. Multiple instances are faster when you have genuinely independent work and you’re willing to pay for parallelism.
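The trade-off can be sketched as back-of-the-envelope arithmetic. This is a toy model under stated assumptions, not a pricing formula: the `Task` type, the per-dispatch surcharge, and the numbers in the usage note are all hypothetical, and every task is assumed to take more than zero minutes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    tokens: int      # tokens the worker spends on the task (hypothetical)
    minutes: float   # wall-clock time the task takes (hypothetical, > 0)


def subagent_estimate(tasks: list[Task], surcharge_per_dispatch: int = 1_000):
    """One parent dispatching tasks one after another.

    Returns (wall_clock_minutes, tokens_per_minute).
    """
    if not tasks:
        raise ValueError("need at least one task")
    wall_clock = sum(t.minutes for t in tasks)  # sequential: times add up
    tokens = sum(t.tokens + surcharge_per_dispatch for t in tasks)
    return wall_clock, tokens / wall_clock


def multi_instance_estimate(tasks: list[Task]):
    """One independent instance per task, all running at once.

    Returns (wall_clock_minutes, tokens_per_minute).
    """
    if not tasks:
        raise ValueError("need at least one task")
    wall_clock = max(t.minutes for t in tasks)  # parallel: slowest task wins
    tokens = sum(t.tokens for t in tasks)       # no orchestrator surcharge
    return wall_clock, tokens / wall_clock
```

With three made-up tasks of 30,000 tokens and 10 minutes each, `subagent_estimate` gives 30 minutes at 3,100 tokens per minute, and `multi_instance_estimate` gives 10 minutes at 9,000 tokens per minute. The wall clock shrinks and the spend rate rises, which is the shape of the trade-off described above.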

Tool boundaries

Subagents have a clean way to scope tools: a per-subagent tools allowlist or disallowedTools denylist in the frontmatter. A research subagent can be defined as “Read, Grep, Glob, WebFetch only” and you have a structural guarantee it cannot edit code or shell out.

Multiple instances inherit whatever tool set their session was launched with. You can vary per instance via settings.json or --allowedTools, but there’s no per-task boundary – the boundary is per-process, set once at startup.

For “trust the agent but limit the blast radius” patterns, subagents are the natural fit.
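As an illustration of the per-subagent allowlist, the sketch below renders a subagent definition file with a read-only tool set. The frontmatter keys `name`, `description`, `tools`, and `model` follow the subagent file format; the helper function, the `research-scout` name, and the prompt text are hypothetical.

```python
def render_subagent(name: str, description: str, tools: list[str],
                    model: str = "inherit", prompt: str = "") -> str:
    """Build the text of a subagent definition: frontmatter, then the prompt.

    Values are written as plain scalars, so keep them free of characters
    that would need YAML quoting (for example a colon followed by a space).
    """
    lines = [
        "---",
        f"name: {name}",
        f"description: {description}",
        f"tools: {', '.join(tools)}",   # the allowlist: anything absent is unavailable
        f"model: {model}",
        "---",
        "",
        prompt.strip(),
        "",
    ]
    return "\n".join(lines)


definition = render_subagent(
    name="research-scout",                        # hypothetical subagent name
    description="Read-only codebase research",
    tools=["Read", "Grep", "Glob", "WebFetch"],   # the read-only set from the text
    model="haiku",
    prompt="Investigate the question and report findings with file paths.",
)
```

Saving `definition` as `.claude/agents/research-scout.md` and committing it would give every developer on the project the same scoped helper. Because the write and shell tools are absent from the list, the restriction is structural rather than a matter of prompt wording.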

When to reach for subagents

The work consists of several kinds of small tasks within one project (research, then write, then review).
You want different tool scopes for different kinds of work (read-only research, then a separate edit step).
You want different models for different steps (haiku for grep-heavy search, opus for the final synthesis) without juggling sessions.
You want team-shared conventions checked into git (.claude/agents/changelog-writer.md) so every developer’s session has the same helpers.
You only need one thing at a time and would rather not have N terminal tabs to keep track of.

When to reach for multiple instances

The work is genuinely parallel and the tasks are mostly independent (refactor module A in ~/code/foo while module B in ~/code/bar gets new features).
The tasks are in different projects with different CLAUDE.md files and different conventions.
You want isolation so a long-running grind in one place doesn’t block the other.
You want full visibility into each agent’s transcript, not just the final reply.
You’re comfortable being the orchestrator – writing the STATUS file, switching terminals, holding the plan.

When you actually want both

The hybrid pattern is common. Two claude instances, each in its own project directory, each with its own .claude/agents/ defining a small set of project-specific subagents. The instances are independent (parallel projects), and within each instance you get the subagent benefits (scoped tools, scoped models, team-shared definitions).

Example:

iTerm tab 1: claude running in ~/code/api. .claude/agents/ includes a migration-scout subagent for the Express -> Fastify migration in flight.
iTerm tab 2: claude running in ~/code/web. .claude/agents/ includes a component-reviewer for the design-system overhaul.

Each tab orchestrates its own work via subagents. You orchestrate between tabs by writing files, checking git status, and switching contexts deliberately.

Common confusion: “background agents” are a third thing

Both subagents and multi-instance setups happen inside terminal sessions you can see. Claude Code also has background agents (managed via the claude agents CLI command, not /agents) – detached sessions running server-side that you can check on later. Those are a third architecture entirely. They’re closer to multi-instance than to subagents – separate sessions, separate transcripts, separate cost – but the lifecycle is different (start, detach, query, attach back) and they’re aimed at long-running grinds you don’t want tied to a terminal window.

If you find yourself wanting “multiple Claudes but I don’t want to keep terminals open,” background agents are likely what you actually want.

Decision shortcut

A rough heuristic:

One project, one task, one transcript: just talk to one Claude. Skip the question.
One project, repeated patterns of work: subagents.
Multiple projects: multiple instances, one per project.
Long-running work you’d like to walk away from: background agents.
Multiple projects and repeated patterns within each: multiple instances each with their own subagents.
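The heuristic above can be written down as a small function. This is only a sketch: the function name, its parameters, and the order in which the rules are checked (walk-away work first, then project count) are assumptions, since the list itself does not rank the cases.

```python
def pick_shape(projects: int, repeated_patterns: bool = False,
               walk_away: bool = False) -> str:
    """Map a rough description of the work to the smallest shape that fits."""
    if projects < 1:
        raise ValueError("projects must be at least 1")
    if walk_away:
        # Long-running work you'd like to leave unattended.
        return "background agents"
    if projects > 1:
        if repeated_patterns:
            return "multiple instances, each with subagents"
        return "multiple instances"
    if repeated_patterns:
        return "subagents"
    # One project, one task, one transcript.
    return "single conversation"
```

For example, `pick_shape(1)` returns `"single conversation"` and `pick_shape(2, repeated_patterns=True)` returns `"multiple instances, each with subagents"`. The default branch is deliberately the simplest option.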

The fastest way to over-engineer a Claude Code setup is to reach for subagents when a normal conversation would do, or to spin up multiple instances when one instance with two subagents would have been simpler. Default to the smallest shape that fits the work.