18 Matching Annotations
  1. May 2026
    1. The Scientist AI is going to be trained using essentially the same machine learning techniques: stochastic gradient descent on large neural nets, transformers, whatever works best. It doesn't care about what is the architecture of the neural net. So all of the effort that is currently being done to improve, for example, memory and other properties and continual learning, can just be applied directly to the Scientist AI.

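      Bengio explains that the Scientist AI will be trained with the same underlying techniques as existing models, so implementation costs should not rise significantly. This challenges the common assumption that safety and capability must be traded off, and it offers a practical path to safe AI.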

    1. The benchmark tasks were meticulously constructed to be realistic, involving the hard work of hundreds of experts and likely millions of dollars — placing it among the most expensive economics papers of all time.

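      The author notes that the GDPval benchmark likely cost millions of dollars and drew on hundreds of experts. The figure shows how expensive AI benchmarks have become, and it hints that resources for such evaluations may be unevenly distributed. Given the gap between that cost and the benchmark's actual economic impact, the high-input, low-output pattern deserves scrutiny.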

    1. When our engineers no longer spend time supervising Codex sessions, the economics of code changes completely. The perceived cost of each change drops because we're no longer investing human effort in driving the implementation itself.

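      Most people assume AI coding adds supervision cost. The author argues that with the Symphony system, human supervision cost drops sharply because the AI completes most of the implementation work on its own. This challenges the usual view of AI coding's cost structure and suggests that the right orchestration could fundamentally change the economics of software development.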

  2. Apr 2026
    1. Resolution increases make them more expensive, then efficiency gains reduce costs - a sawtooth pattern.

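      Most people probably expect AI costs to fall or rise monotonically. The author instead describes a sawtooth pattern: higher resolution raises costs, then efficiency gains bring them back down. This oscillation challenges conventional expectations about how technology costs evolve.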

    1. This is the part people miss about AI-native companies - the $113k is not a cost, it is your headcount budget allocated differently.

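      Most people treat AI spending as an additional expense; the author argues it is really a reallocation of the headcount budget. This challenges traditional cost accounting and implies that AI is an investment rather than a cost, though it may also understate AI's real costs and the complexity of maintaining it.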

    1. The extra tokens bought something measurable. +5pp on strict instruction-following. Small. Real. So: is that worth 1.3–1.45x more tokens per prompt?

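      This is a surprising value trade-off. Anthropic accepted a token cost increase of up to 45% per prompt in exchange for only a 5 percentage point gain in strict instruction-following. The lopsided exchange suggests that 'small but real' improvements in model optimization can be expensive, which challenges the common assumption that technical improvements should be worth their price.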
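
As a rough back-of-the-envelope sketch of the trade-off in the quote above, the snippet below computes how many extra tokens (as a percentage of the baseline) are spent per percentage point of instruction-following gained. The 1.3x to 1.45x multipliers and the +5pp gain come from the quote; the function name and the "extra tokens per point" framing are illustrative assumptions, not the original author's metric.

```python
def extra_token_pct_per_point(token_multiplier: float, gain_pp: float) -> float:
    """Extra tokens (as % of baseline) spent per percentage point gained."""
    if gain_pp <= 0:
        raise ValueError("gain_pp must be positive")
    return (token_multiplier - 1.0) * 100.0 / gain_pp


GAIN_PP = 5.0  # +5pp on strict instruction-following (from the quote)

# Lower and upper ends of the quoted 1.3x to 1.45x token range
low = extra_token_pct_per_point(1.30, GAIN_PP)
high = extra_token_pct_per_point(1.45, GAIN_PP)

print(f"{low:.1f}% to {high:.1f}% extra tokens per percentage point")
```

Under these figures, each percentage point costs roughly 6% to 9% more tokens per prompt. Whether that is worthwhile depends on how much a failed instruction costs downstream, which the quote leaves open.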

    1. Eight out of eight models detected Mythos's flagship FreeBSD exploit, including one with only 3.6 billion active parameters costing $0.11 per million tokens.

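      This is a surprising finding: even small, cheap models can match expensive proprietary models at detecting security vulnerabilities. It challenges the assumption that AI security work requires frontier models and points to cost-effective AI security solutions.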

    1. As the cost of software development falls, trusted partners with broad adoption can expand faster than anyone else.

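      As development costs fall, broad adoption and trust become the key drivers of expansion. This suggests the winners of the AI era may not be the most technically advanced companies but those that build a trusted ecosystem fastest.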

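    1. Future evaluation systems must consider success rate, cost, and latency together. This is closer to how cloud computing is assessed than to how traditional software is.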

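      This view shows that evaluating AI skills requires new dimensions, cost in particular. It reflects a challenge specific to the AI era and suggests that future skill markets may price by resource consumption, a fundamental difference from traditional software markets.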
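
One way to consider the three axes named in the quote (success rate, cost, latency) together, without collapsing them into a single score, is a Pareto comparison: a configuration is kept unless another one is at least as good on every axis and strictly better on one. The sketch below is a minimal illustration of that idea. The `RunStats` type, the skill names, and all numbers are hypothetical and do not come from the annotated article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunStats:
    name: str
    success_rate: float  # fraction of tasks solved, higher is better
    cost_usd: float      # average cost per task, lower is better
    latency_s: float     # average seconds per task, lower is better


def dominates(a: RunStats, b: RunStats) -> bool:
    """True if a is no worse than b on every axis and strictly better on one."""
    no_worse = (a.success_rate >= b.success_rate
                and a.cost_usd <= b.cost_usd
                and a.latency_s <= b.latency_s)
    strictly_better = (a.success_rate > b.success_rate
                       or a.cost_usd < b.cost_usd
                       or a.latency_s < b.latency_s)
    return no_worse and strictly_better


def pareto_front(runs: list[RunStats]) -> list[RunStats]:
    """Keep only the runs that no other run dominates."""
    return [r for r in runs if not any(dominates(o, r) for o in runs)]


# Hypothetical measurements, for illustration only
runs = [
    RunStats("skill-a", 0.95, 0.40, 120.0),
    RunStats("skill-b", 0.95, 0.90, 200.0),  # same success as skill-a, but pricier and slower
    RunStats("skill-c", 1.00, 1.50, 300.0),  # most reliable, most expensive
    RunStats("skill-d", 0.80, 0.10, 30.0),   # cheapest and fastest, least reliable
]

front = [r.name for r in pareto_front(runs)]
print(front)
```

Here only `skill-b` is dropped, because `skill-a` matches its success rate at lower cost and latency. The remaining three represent real trade-offs, and choosing among them needs a stated budget or latency target.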

    1. Performance: dev-browser: 3m53s, $0.88, 100% success rate — beats MCP configs, Chrome extensions, 'browser skill' stacks.

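      Surprisingly, this new approach beats traditional methods not only on functionality but also on performance metrics. The 100% success rate and relatively low cost point to technical maturity and practicality, and could quickly make existing browser automation solutions obsolete.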

    1. It defines a frontier model as any AI model trained using more than $100 million in computational costs, which likely could apply to America's largest AI labs, like OpenAI, Google, xAI, Anthropic, and Meta.

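      Surprisingly, the threshold for a frontier AI model is set at more than $100 million in compute, which underlines how high the barrier to entry for AI development has become. Only a handful of tech giants can afford compute at that scale, which may be reshaping competition in the AI industry and creating a new technological monopoly.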

    1. Total cost: ~$29 ($20 in CPU VMs, $9 in API calls) over ~3 hours with 4 VMs.

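      Surprisingly, for only $29 and 3 hours, the AI agents achieved a notable performance improvement (15.1% on x86 and 5% on ARM). This low-cost, high-return style of optimization overturns the traditional view that performance optimization requires large amounts of human effort and time.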
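
The figures in the quote can be broken down a little further. The sketch below uses only the quoted numbers ($20 of CPU VMs, $9 of API calls, about 3 hours, 4 VMs) and assumes, for illustration, that all four VMs ran for the full three hours. The quote does not state this, so the per-VM-hour rate is an estimate.

```python
# Figures taken from the quote
vm_cost_usd = 20.0
api_cost_usd = 9.0
hours = 3.0
num_vms = 4

total_usd = vm_cost_usd + api_cost_usd

# Assumption: every VM was up for the whole run
vm_hours = hours * num_vms
usd_per_vm_hour = vm_cost_usd / vm_hours

# Share of the bill that went to model API calls rather than compute
api_share = api_cost_usd / total_usd

print(f"total: ${total_usd:.0f}")
print(f"VM rate: ${usd_per_vm_hour:.2f} per VM-hour")
print(f"API share of total: {api_share:.0%}")
```

Under that assumption the VMs cost about $1.67 per VM-hour, and API calls account for roughly 31% of the bill, so compute rather than model usage is the larger line item in this run.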

    1. Artificial Analysis has also positioned Gemini 3.1 Flash TTS within its 'most attractive quadrant' for its ideal blend of high-quality speech generation and low cost.

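      Surprisingly, the model is not only high quality but also highly cost-effective, earning a place in the 'most attractive quadrant'. This suggests Google has made a notable advance in balancing AI performance with commercial viability, which most users would not have expected.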

    1. since reasoning models and agentic AI can rack up quite a bill

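      The article flags an often-overlooked constraint: the cost of using AI. Discussions of AI replacing humans tend to assume AI is the cheap option, but the high compute costs of reasoning models and agents mean that capability coverage alone does not make AI an economically viable substitute. Cost-benefit analysis remains the decisive threshold.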

    1. The cost of understanding what happens in a video has dropped by a factor of roughly 40, while the quality of that understanding has improved dramatically.

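      Most people assume AI video analysis is still early-stage and expensive, but the author points out that its cost has fallen by a factor of roughly 40 while quality has improved. This counterintuitive observation suggests video analysis may already have crossed the threshold of practicality, which would enable entirely new categories of applications and challenges conventional views of what AI video processing can do.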

    1. By using SAM, the Alta team has been able to process more than 20 million images without incurring exorbitant costs, allowing them to focus on building the best possible product for their users.

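      Most people might assume a startup must rely on expensive third-party APIs to process images at scale, but by using the open-source SAM model, the Alta team processed more than 20 million images without incurring exorbitant costs. This challenges the industry consensus that high-quality AI services must be expensive and shows the cost-effectiveness of open-source models.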

  3. Sep 2023