3 Matching Annotations
  1. Apr 2026
    1. Tasks where correctness is harder to verify may not have seen the same speedup, so the acceleration we document here may not be as general as the headline numbers suggest.

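      Mainstream media and the public may assume that AI capabilities are accelerating across every domain, but the authors explicitly note that tasks whose correctness is hard to verify may not have seen the same speedup. This challenges the assumption that AI progress is universal.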

    1. Opus 4.7 handles complex, long-running tasks with rigor and consistency, pays precise attention to instructions, and devises ways to verify its own outputs before reporting back.

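      This shows Claude Opus 4.7's significant progress in verifying its own work and carrying out complex tasks, marking an important step for AI models from simple response toward truly autonomous work. This self-verification mechanism greatly improves the reliability of AI output.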