1 Matching Annotations
  1. Last 7 days
    1. The model reportedly scored 93.9% on SWE-bench Verified and 77.8% on SWE-bench Pro, but its strongest signal came from real-world results, including uncovering a 27-year-old flaw in OpenBSD, a 16-year-old vulnerability in FFmpeg, and autonomously chaining Linux kernel exploits without human input.

      这些惊人的安全漏洞发现能力表明AI已经超越了传统安全工具,能够自主发现几十年未被发现的漏洞。特别是能够自主链接Linux内核漏洞的能力,展示了AI在网络安全领域的革命性潜力,这可能彻底改变安全研究和漏洞修复的方式。