Patterns
Pattern 09 of 26
Multi-Agent Orchestration
What to reach for when one agent is not enough
Some tasks are too large to fit cleanly in one agent pass. Multi-agent orchestration splits the work: a planner that breaks down the goal, a researcher that gathers information, a coder that implements, a reviewer that checks the output. Each agent has its own narrow scope, tools, and instructions. The orchestrator coordinates them. The complexity is in the coordination layer, not in any individual agent.
Why it matters
Context windows are finite. Specialization helps quality. When I run a research-heavy coding task, a single general-purpose agent gets sloppy at both halves. Separate agents for research and implementation each do their part better. The tradeoff is coordination overhead. Whether that tradeoff is worth it depends on how long and how complex the task is.
Deep Dive
Multi-agent orchestration is what you reach for when a single context window is not enough or when you need genuinely parallel work. The standard pattern has a supervisor orchestrator that understands the goal and delegates subtasks to workers. Each worker has its own system prompt, its own tools, and a narrow scope. A researcher agent does not write code. A coder agent does not do literature review. They each do one thing well. The orchestrator takes their outputs and assembles them into a result. Anthropic documented both primary topologies in their December 2024 guide: supervisor and peer-to-peer.
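The supervisor pattern above can be sketched in a few lines. This is a toy, not any framework's API: the `Agent` class, the stub worker functions, and the fixed three-step plan are all illustrative stand-ins for real model calls and a model-driven planner.

```python
# Minimal supervisor-worker sketch. Agent, Supervisor, and the stub
# workers are illustrative names, not a specific framework's API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    system_prompt: str          # narrow, role-specific instructions
    run: Callable[[str], str]   # stand-in for a model call with this prompt

# Stub workers: each does one thing and nothing else.
def researcher(task: str) -> str:
    return f"notes on: {task}"

def coder(task: str) -> str:
    return f"code implementing: {task}"

def reviewer(task: str) -> str:
    return f"review of: {task}"

class Supervisor:
    """Holds the global plan; workers never talk to each other directly."""
    def __init__(self, workers: dict[str, Agent]):
        self.workers = workers

    def handle(self, goal: str) -> str:
        # A real supervisor would plan with a model; here the plan is fixed.
        notes = self.workers["researcher"].run(goal)
        draft = self.workers["coder"].run(notes)
        verdict = self.workers["reviewer"].run(draft)
        return f"{draft}\n{verdict}"

workers = {
    "researcher": Agent("researcher", "Gather facts only.", researcher),
    "coder": Agent("coder", "Write code only.", coder),
    "reviewer": Agent("reviewer", "Check output only.", reviewer),
}
result = Supervisor(workers).handle("add retry logic to the HTTP client")
```

The structural point is that all routing lives in `Supervisor.handle`: the workers are pure functions of their input, which is what makes each one easy to test and swap.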
Supervisor topology is simpler to reason about because there is one agent with a global view of the task state. Peer-to-peer is more flexible but harder to trace when something goes wrong. In my experience, most production systems start with supervisor and stay there. Peer-to-peer makes sense when the supervisor itself becomes a bottleneck, which usually only happens at scale. The 2024 IJCAI survey on LLM multi-agents is the most thorough academic treatment of both topologies and worth reading if you are designing a system from scratch.
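The traceability difference between the two topologies shows up even in a toy. In the sketch below (all names hypothetical), the supervisor variant is a single routing function over global state, while in the peer-to-peer variant each agent names its own successor, so the only record of control flow is the trace you keep yourself.

```python
# Toy contrast of the two topologies; real frameworks (LangGraph,
# OpenAI Agents SDK) provide hardened versions of these primitives.

# Supervisor: one routing function owns the global task state.
def supervisor_route(state: dict) -> str:
    if "notes" not in state:
        return "researcher"
    if "code" not in state:
        return "coder"
    return "done"

# Peer-to-peer: each agent hands off directly to whoever it chooses,
# so control flow is distributed across the agents themselves.
def researcher(state: dict) -> str:
    state["notes"] = "findings"
    return "coder"              # no supervisor in the loop

def coder(state: dict) -> str:
    state["code"] = "patch"
    return "done"

def run_peer_to_peer(start: str, state: dict) -> list[str]:
    agents = {"researcher": researcher, "coder": coder}
    trace, current = [start], start
    while current != "done":
        current = agents[current](state)
        trace.append(current)   # without this, the path is unrecoverable
    return trace

trace = run_peer_to_peer("researcher", {})
```

With the supervisor, you debug one function; with peer-to-peer, you debug a trace that emerges from every agent's local decision.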
Coordination is where people underestimate the cost. Each inter-agent message adds latency and token spend. Parallel agents reading and writing shared state can produce race conditions that are hard to reproduce. A worker that fails partway through can block the orchestrator indefinitely if you have not built explicit failure handling. These are all solvable engineering problems, but they need deliberate design. LangGraph, OpenAI Agents SDK, and CrewAI all have primitives for state management, error propagation, and agent termination specifically because teams kept hitting these issues. Learn those primitives before you ship anything.
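The failure-handling point deserves a concrete shape. A minimal sketch using only the standard library, with `time.sleep` standing in for a slow model call: the timeout plus an explicit error result is what keeps one stuck worker from blocking the orchestrator indefinitely.

```python
# Deliberate failure handling for parallel workers, standard library only.
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutTimeout
import time

def run_worker(name: str, delay: float) -> str:
    time.sleep(delay)           # stand-in for a slow model call
    return f"{name}: done"

def orchestrate(tasks: dict[str, float], timeout: float = 0.5) -> dict[str, str]:
    results: dict[str, str] = {}
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(run_worker, name, delay)
                   for name, delay in tasks.items()}
        for name, fut in futures.items():
            try:
                # Never wait forever on any single worker.
                results[name] = fut.result(timeout=timeout)
            except FutTimeout:
                fut.cancel()    # best effort; a running thread keeps going
                results[name] = f"{name}: timed out"
    return results

# The slow "coder" times out; the orchestrator still returns a result.
out = orchestrate({"researcher": 0.1, "coder": 2.0})
```

The same principle applies regardless of framework: every worker call gets a deadline, and every failure mode maps to an explicit result the orchestrator can act on, rather than an exception that stalls the whole run.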