Hypothesis

Today's best coding agents lose nearly half their capability when paired up to share work.

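[Shocking] Stanford's CooperBench finds that when two top coding agents collaborate, performance drops by nearly 50%! This overturns the intuition that "more agents is always better." More unsettling still, the failures cluster in the sweet spot of "medium-difficulty" tasks, exactly the range that should benefit most from collaboration. For multi-agent architecture designers this is a stern warning: the bottleneck in scaling agent systems is not compute but "social intelligence."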

CooperBench 50-percent-drop multi-agent coordination-gap shocking

Tags

Annotators

URL