Yes, You Should Build Multi-Agent Systems.
Cognition AI’s recent post, “Don’t Build Multi-Agents,” claims that multi-agent systems are fragile, ineffective, and unnecessary. They argue that coordination between agents leads to “dispersed decision-making” and poor context sharing—calling multi-agent setups “the wrong way of building agents.” Their solution? Stick to one monolithic AI.
It’s a bold take—and we believe it’s wrong.
Multi-agent systems aren’t just viable—they’re already advancing real-world AI software and research. Dismissing them now is like dismissing multi-core processors in 2005: a short-sighted view we’ll likely look back on with amusement. Let’s unpack Cognition’s claims—and explain why yes, you should build multi-agents.
Cosine’s AutoPM: A Multi-Agent System Breaking Records
Let’s start with AutoPM—our real-world proof that multi-agent systems aren’t fragile; they’re transformative.
AutoPM is an AI product manager that orchestrates multiple specialised agents to handle complex software development tasks. It autonomously breaks down abstract goals into structured subtasks, then plans and executes each one—without user intervention. The result? Fully formed, production-ready pull requests delivered faster and cleaner than ever.
This isn’t theory—it’s deployed, working technology. AutoPM relentlessly iterates until a task is completed to spec, using planning, coding, and other agents in concert. We call it “a multi-agent innovation” for a reason.
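To make that concrete, here is a minimal sketch of the kind of orchestration loop we are describing: decompose a goal, hand each subtask to specialised agents, and iterate until the output meets spec. This is an illustration of the general pattern, not AutoPM's internals; the run_agent helper, the role names, and the acceptance check are stand-ins for real LLM-backed agents.

```python
# Hypothetical orchestration loop: decompose a goal into subtasks, then plan,
# execute, and review each one until it meets its acceptance criteria.
# run_agent() is a stub standing in for a call to an LLM-backed agent.

from dataclasses import dataclass

@dataclass
class Subtask:
    description: str
    acceptance_criteria: str

def run_agent(role: str, prompt: str) -> str:
    """Stand-in for a specialised agent (planner, coder, reviewer)."""
    return f"[{role}] output for: {prompt[:60]}"

def decompose(goal: str) -> list[Subtask]:
    # A planning agent would break the abstract goal into concrete subtasks.
    plan = run_agent("planner", f"Break this goal into subtasks: {goal}")
    return [Subtask(description=plan, acceptance_criteria="passes review")]

def meets_spec(review: str) -> bool:
    # A reviewing agent decides whether the work satisfies the criteria.
    return "APPROVED" in review  # stub condition, for illustration only

def orchestrate(goal: str, max_iterations: int = 3) -> list[str]:
    results = []
    for task in decompose(goal):
        work = ""
        for _ in range(max_iterations):
            work = run_agent("coder", f"Implement: {task.description}\nPrior attempt: {work}")
            review = run_agent("reviewer", f"Check '{task.acceptance_criteria}': {work}")
            if meets_spec(review):
                break  # this subtask is done to spec; move to the next one
        results.append(work)
    return results

if __name__ == "__main__":
    print(orchestrate("Add pagination to the /users endpoint"))
```

The design choice that matters here is the gate: nothing ships until a reviewing agent signs off against explicit criteria, which is what keeps the combined output coherent rather than fragmented.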
And it performs. In our announcement, we shared that AutoPM recently hit 72% on the SWE-Lancer Diamond benchmark—an industry record—generating the equivalent of $155,550 in value. That success stems directly from its ability to delegate and parallelise across agents. If that’s “fragile,” we need more of it.
Cognition claims that multi-agent systems lead to miscommunication and inconsistency. AutoPM proves otherwise. With the right architecture, agents can share context, coordinate effectively, and scale autonomously. It didn’t buckle under complexity—it thrived on it.
AutoPM is living proof: multi-agent systems don’t just work—they excel.
Anthropic Is All-In on Multi-Agent Systems
If multi-agent systems don’t work, why is Anthropic—one of the world’s top AI labs—deploying one?
In a June 13, 2025 engineering report, Anthropic announced that Claude Research now uses multiple Claude agents to explore complex topics in parallel. What started as a prototype is now powering a real product—proof that multi-agent systems aren’t just viable, but valuable.
Anthropic reports that once their models reached a certain capability, “multi-agent systems [became] a vital way to scale performance.” Their multi-agent setup—an orchestrator agent delegating to sub-agents—beat a single-agent baseline by 90.2% on internal benchmarks. When tasked with finding all board members of IT companies in the S&P 500, the single agent failed; the multi-agent system succeeded by decomposing and parallelising the task.
They also addressed the exact concerns Cognition raises—coordination, duplication, context-sharing—by refining their architecture and prompts. A lead agent manages sub-agents and integrates their outputs, solving the very problems sceptics cite.
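For readers who want the shape of that architecture, here is a small, hypothetical sketch of the orchestrator pattern: a lead agent splits a broad query, fans it out to sub-agents that work concurrently, and merges what comes back. It is not Anthropic's code; query_subagent and the sharding are placeholders for real model calls.

```python
# Illustrative lead-agent pattern: decompose a broad query, fan it out to
# sub-agents running concurrently, then integrate their findings.
# Not Anthropic's code; query_subagent() is a placeholder for a real model call.

import asyncio

async def query_subagent(subquery: str) -> str:
    """Stand-in for one sub-agent researching a narrow slice of the problem."""
    await asyncio.sleep(0.1)  # simulates independent work in its own context
    return f"findings for: {subquery}"

def decompose_query(query: str, shards: list[str]) -> list[str]:
    # The lead agent splits the question into independent sub-queries,
    # e.g. one slice of companies per sub-agent for the S&P 500 example.
    return [f"{query} (scope: {shard})" for shard in shards]

async def lead_agent(query: str, shards: list[str]) -> str:
    subqueries = decompose_query(query, shards)
    # Sub-agents run in parallel, each with its own context window.
    findings = await asyncio.gather(*(query_subagent(q) for q in subqueries))
    # The lead agent integrates the partial results into one answer.
    return "\n".join(findings)

if __name__ == "__main__":
    slices = ["companies A-H", "companies I-Q", "companies R-Z"]
    print(asyncio.run(lead_agent("List the board members of each IT company", slices)))
```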
Anthropic isn’t experimenting idly. They’re shipping multi-agent systems because they work—especially for complex, large-scale tasks that exceed a single model’s scope.
Breaking the “Fragile System” Fallacy
Cognition argues that multi-agent systems are inherently fragile—prone to miscommunication, conflicting outputs, and coordination failures. Their example? A toy scenario where two agents build mismatched parts of a Flappy Bird clone. It’s a straw man—and one that ignores how real multi-agent systems are actually built.
In practice, both Cosine’s AutoPM and Anthropic’s Claude Research system avoid these issues through intentional design. Each uses an orchestrator agent to maintain global context and ensure sub-agents stay aligned. Subtasks aren’t tackled in isolation—they’re coordinated under a shared vision. That’s why AutoPM produces polished pull requests, not disjointed code fragments.
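A simple way to picture that shared vision: every sub-agent prompt carries the same global brief alongside its own instructions, so independently produced pieces still fit together. The structure below is hypothetical (a minimal sketch, not Cosine's or Anthropic's internals), reusing the Flappy Bird example to show how interface contracts travel with each subtask.

```python
# Hypothetical structure: each sub-agent prompt bundles the shared global brief
# (goal, constraints, interface contracts) with its own subtask, so pieces
# produced independently still compose.

from dataclasses import dataclass, field

@dataclass
class GlobalBrief:
    goal: str
    constraints: list[str] = field(default_factory=list)
    interfaces: dict[str, str] = field(default_factory=dict)  # contracts between parts

@dataclass
class Subtask:
    name: str
    instructions: str

def build_prompt(brief: GlobalBrief, task: Subtask) -> str:
    # Every sub-agent sees the same shared context plus its own slice of work.
    contracts = "\n".join(f"- {sig}: {meaning}" for sig, meaning in brief.interfaces.items())
    return (
        f"Overall goal: {brief.goal}\n"
        f"Constraints: {', '.join(brief.constraints)}\n"
        f"Interface contracts:\n{contracts}\n"
        f"Your subtask ({task.name}): {task.instructions}"
    )

if __name__ == "__main__":
    brief = GlobalBrief(
        goal="Build a Flappy Bird clone",
        constraints=["Python + pygame", "shared art style defined up front"],
        interfaces={"Bird.update(dt)": "advances physics", "Pipes.collides(bird)": "returns bool"},
    )
    print(build_prompt(brief, Subtask("physics", "Implement the Bird class")))
```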
The claim that “no one is putting a dedicated effort” into multi-agent coordination is also mistaken. Anthropic publishes detailed research on orchestrator patterns and prompt strategies. OpenAI is exploring swarm-based approaches. Microsoft is investing in multi-agent tooling via AutoGen. Far from neglected, this space is a hotbed of active innovation.
Cognition also overlooks the key advantage of multi-agent systems: scale. By parallelising work, they break through single-agent limitations like token windows and serial execution. As Anthropic notes, they’re “efficiency multipliers” for large models. Yes, early systems are more costly—so were early multi-core CPUs. The answer isn’t to retreat to single-thread thinking, but to improve coordination and efficiency over time.
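The scaling argument is easiest to see as a map-reduce over agents: shard the input so each sub-agent stays within its own context window, run the shards in parallel, and let a final step work only with the condensed results. The sketch below is illustrative only; the token budget, the summarise_shard stand-in, and the crude word-count estimate are assumptions, not anyone's production system.

```python
# Illustrative map-reduce over agents: shard the input so each sub-agent fits
# within its own token budget, summarise shards in parallel, then reduce.
# summarise_shard() stands in for a model call; the budget is an assumption.

from concurrent.futures import ThreadPoolExecutor

TOKENS_PER_SHARD = 8_000  # assumed per-agent budget, for illustration only

def shard(documents: list[str], budget: int = TOKENS_PER_SHARD) -> list[list[str]]:
    """Greedily pack documents into shards that each fit one agent's window."""
    shards, current, used = [], [], 0
    for doc in documents:
        cost = len(doc.split())  # crude token estimate
        if current and used + cost > budget:
            shards.append(current)
            current, used = [], 0
        current.append(doc)
        used += cost
    if current:
        shards.append(current)
    return shards

def summarise_shard(docs: list[str]) -> str:
    # Stand-in for a sub-agent condensing its shard, independently of the others.
    return f"summary of {len(docs)} documents"

def research(documents: list[str]) -> str:
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(summarise_shard, shard(documents)))
    # The final step only ever sees the condensed partial results.
    return " | ".join(partials)

if __name__ == "__main__":
    corpus = [f"report {i} " * 2_000 for i in range(10)]  # far larger than one window
    print(research(corpus))
```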
Cognition’s stance isn’t principled—it’s status quo bias. Real systems are solving the very problems they raise, and scaling beyond what single agents can achieve.
Dismissing Multi-Agent Systems in 2025 Is a Mistake
History has a message: scaling requires parallelism. In the early 2000s, some insisted we didn’t need multi-core processors—just faster single cores. They were wrong. By 2005, multi-core computing had arrived, unlocking a new era of performance. Today, every smartphone runs on multiple cores. The same shift is happening in AI.
As single agents approach their limits, the path forward is clear: more agents, more specialisation, more parallel execution. Dismissing multi-agent systems now is like rejecting multi-core in 2005—spectacularly short-sighted.
The proof is already here. Cosine’s AutoPM is breaking benchmarks with a multi-agent architecture. Anthropic is investing deeply and shipping production systems. The so-called “fragility” is being solved by active research—not left as an unsolved problem.
The bottom line: yes, you should build multi-agents. The frontier of AI is being built by systems that coordinate, specialise, and scale. Sticking to single-agent thinking is clinging to the past.
We’re entering the age of AI parallelism. Embrace it—or risk being the one in 2030 still bragging about their single-core model while the rest of the world has moved on.
Multi-agent systems aren’t a fad. They’re working, scaling, and defining the future. Don’t get left behind.