1 Matching Annotations
  1. Last 7 days
    1. The three metrics where we find acceleration are concentrated in programming and mathematics. These are areas that labs have explicitly targeted for improvement, and they share an important property: correctness is easy to verify automatically.

      大多数人可能认为AI能力的加速是跨领域普遍发生的,但作者指出加速主要集中在编程和数学领域,因为这些领域正确性容易自动验证。这一发现挑战了人们对AI能力普遍提升的假设,暗示加速可能是有选择性的。