8 Matching Annotations
  1. May 2026
    1. the overall accuracy of predicting the risk of natural disaster—aggregated across 20 categories such as wildfires, floods, and tornadoes—was increased by 5%.

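      Although a 5% gain in disaster prediction accuracy may look small, it is an aggregate improvement across 20 different disaster categories, which makes it highly valuable for disaster early-warning systems. An improvement of this kind could save lives and reduce economic losses, especially in high-risk areas.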

  2. Apr 2026
    1. Using these ability scores, the method predicts performance on new tasks with ~88% accuracy, including for models such as GPT-4o and Llama-3.1.

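      Surprisingly, the ADeLe method can predict how AI models will perform on new tasks with about 88% accuracy, and this holds for advanced large models such as GPT-4o and Llama-3.1. This predictive power far exceeds that of traditional evaluation methods and marks a revolutionary breakthrough for AI performance evaluation, letting researchers anticipate more reliably how a model will perform on tasks it has not seen.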

  3. May 2021
  4. Oct 2020
  5. Sep 2020
  6. Aug 2020
  7. Jun 2020