Both illustrate how decomposing complex tasks across specialized agents can address problems that monolithic models handle poorly.
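This view sets out the advantages of multi-agent architectures for complex tasks, offering a new approach to problems that a single model struggles to handle.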
Workspace agents can gather context from the right systems, follow team processes, ask for approval when needed, and keep work moving across tools.
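Many people may assume that AI tools struggle to understand and carry out complex team processes, but the author stresses that workspace agents can do both, challenging the assumed limits of AI on complex tasks.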
But the real power of agents comes when they can work as a team. Instead of lone-wolf bots carrying out single tasks, such as using a browser to make a restaurant reservation or sending you a summary of your inbox, new tools can yoke together multiple agents, give each of them a different job, and orchestrate their behaviors so that they all pull together to complete more complex tasks than an individual agent could do by itself.
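The mainstream view may hold that AI agents will do their work independently, but the author argues that their real power lies in teamwork: by coordinating, they can complete tasks more complex than any single agent could.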
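As a rough illustration of "give each of them a different job, and orchestrate their behaviors", here is a minimal sketch. It assumes a simple sequential hand-off, and every name in it (`Agent`, `orchestrate`, the three stand-in agents) is hypothetical rather than taken from any tool mentioned in the excerpt.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Agent:
    """A hypothetical specialized agent: a name plus the one job it performs."""
    name: str
    run: Callable[[str], str]


def orchestrate(task: str, pipeline: List[Agent]) -> Dict[str, str]:
    """Pass the task through each agent in order, feeding each output to the next."""
    results: Dict[str, str] = {}
    current = task
    for agent in pipeline:
        current = agent.run(current)
        results[agent.name] = current
    return results


# Stand-in agents (assumption): a real system would call a model or a tool here.
researcher = Agent("researcher", lambda t: f"notes on: {t}")
writer = Agent("writer", lambda notes: f"draft from [{notes}]")
reviewer = Agent("reviewer", lambda draft: f"approved: {draft}")

report = orchestrate("quarterly summary", [researcher, writer, reviewer])
print(report["reviewer"])
# prints: approved: draft from [notes on: quarterly summary]
```

Real orchestration layers would likely add branching, retries, and approval steps; the sequential pipeline is only the simplest shape the idea can take.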
It maintains 97% skill compliance across 40 complex skills on MM Claw, each skill exceeding 2,000 tokens.
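A 97% skill compliance rate is a very high figure, especially for complex skills that each exceed 2,000 tokens. It suggests that M2.7 can not only understand complex instructions but also stay consistent and reliable over long tasks. For developers building complex agent workflows, this data point is especially valuable, because it implies the model can reliably execute multi-step, high-complexity tasks.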
we may see a growing divergence between the capabilities we can measure and the capabilities we actually care about.
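The gap between "capabilities we can measure" and "capabilities we actually care about" is widening; this is the most profound insight of the whole article. All current benchmarks favor tasks that are "clean, self-contained, and automatically gradable", whereas real work is "messy, cross-system, and dependent on human judgment". As AI extends to longer tasks, this gap between measurement and reality will not narrow; it will only widen faster. This means that future debates over whether AI can replace a given kind of work will become increasingly hard to settle with data, because the data itself cannot capture the nature of real work.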