Hypothesis

2 Matching Annotations

May 2026
www.anthropic.com

https://www.anthropic.com/research/glasswing-initial-update

1
1. fxp007 22 May 2026
  
  in Public
  
  their rate of bug-finding has increased by more than a factor of ten
  
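  A tenfold increase in the bug-finding rate is a key performance metric, indicating a revolutionary breakthrough in the efficiency of AI models for security testing. This data point is especially valuable because it directly quantifies the performance gain of AI over traditional security methods. However, the article provides no concrete benchmark data, such as how many bugs were previously found per hour, so the relative "tenfold" improvement lacks an absolute reference point.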
  
  data-point performance-metrics efficiency-gain

Apr 2026
blog.skypilot.co

https://blog.skypilot.co/research-driven-agents/

1
1. fxp007 17 Apr 2026
  
  in Public
  
  Coding agents working from code alone generate shallow hypotheses. Adding a research phase — arxiv papers, competing forks, other backends — produced 5 kernel fusions that made llama.cpp CPU inference 15% faster.
  
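  This statement reveals a key limitation of AI agents in code optimization: optimization based on code alone produces shallow hypotheses. By introducing a research phase, which includes reading academic papers and studying competing projects and backend implementations, the agent was able to find deeper optimization opportunities and achieve a significant performance gain. This suggests that AI agents need broader contextual information to produce meaningful innovation.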
  
  ai-optimization research-phase performance-gain

