    1. For the computer-use work that sits at the heart of XBOW's autonomous penetration testing, the new Claude Opus 4.7 is a step change: 98.5% on our visual-acuity benchmark versus 54.5% for Opus 4.6.

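      The jump from 54.5% to 98.5% on the visual-acuity benchmark is a striking improvement and points to a breakthrough for AI in cybersecurity. The remark that 'our single biggest Opus pain point effectively disappeared' indicates that the gain resolved a key bottleneck in real-world use.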