19 Matching Annotations
  1. Last 7 days
    1. Andrej Karpathy built a simple automation pipeline for AI agents to optimize training in 5-minute increments.

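      This case shows AI systems applied to automated research. A 5-minute optimization increment is a fine-grained timescale, suggesting that AI systems can already run rapidly iterating experiments. The 61K+ GitHub stars indicate that the approach has drawn wide attention in the AI research community.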

  2. May 2026
    1. Scientific publication compresses a branching, iterative research process into a linear narrative, discarding the majority of what was discovered along the way.

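      Most people assume that a scientific paper fully documents the research process, but the author argues that conventional papers discard most of what was discovered and present only a linear narrative, which constitutes the so-called 'story tax.' This view challenges the academic community's common belief in the completeness of publications.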

  3. Apr 2026
    1. An AI researcher subsequently gifted them each a ChatGPT Pro subscription to encourage their 'vibe mathing.'

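      Most people assume that serious mathematical research requires rigorous methods and deep expertise, but the author describes this style of research with the informal term 'vibe mathing,' challenging the conventional norms of academic research methodology.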

    1. two participants gave it 9/10 and one "11/10"

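      A 2-hour tabletop-style exercise that three top AI safety researchers rated between 9 and 11 out of 10 is itself a signal: serious AI research organizations are preparing for the future through role-play. This methodology (rehearsing workflows under future capabilities) has precedents in other fields, such as military wargames, disaster drills, and scenario planning, but applying it to the evolution of AI capabilities reflects METR's distinctive research taste.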

    1. Large language models (LLMs) sometimes appear to exhibit emotional reactions. We investigate why this is the case in Claude Sonnet 4.5 and explore implications for alignment-relevant behavior.

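      [Insight] This sentence points to a new paradigm for AI research: rather than asking "what can the model do," ask "why does the model behave this way." Using emotion as an entry point for understanding model behavior essentially brings the methodology of psychology into AI interpretability research. The takeaway for practitioners: the most valuable AI research in the future may lie not in algorithmic innovation but in "finding mechanistic explanations for known phenomena," as this paper does.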

  4. Aug 2025
  5. Apr 2024
  6. Feb 2024
  7. Dec 2021
  8. Nov 2021
    1. (the VTA is also part of this system, but is too small to image with standard fMRI methods, but see [35] for successful imaging methods).

      All imaging studies face questions of validity and should (and many do) link to comprehensive details on instrumentation, methodology, and interpretation. Apparently, the professional consensus remains that, properly executed and interpreted, fMRI and other functional imaging techniques based on detection of oxygenation can lead to highly valid conclusions. (See Nautil.us article.)

  9. Jul 2021
  10. Jun 2021
  11. Oct 2020
  12. Sep 2020
  13. Jun 2020