3 Matching Annotations
  1. Last 7 days
    1. Gemma4-31B worked in an iterative-correction loop (with a long-term memory bank) for 2 hours to solve a problem that baseline GPT-5.4-Pro couldn't

      Surprisingly, the smaller Gemma4-31B model, working for 2 hours in an iterative-correction loop with a long-term memory bank, solved a problem that GPT-5.4-Pro could not. This suggests that architectural innovation and reasoning ability may matter more than raw scale, pointing to a new direction for AI development.
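
      The annotated result can be pictured as a simple control loop. The sketch below is purely illustrative: `MemoryBank`, `iterative_correction`, and the toy model/checker are hypothetical names, not the actual system's API.

```python
# Minimal sketch of an iterative-correction loop backed by a long-term
# memory bank. Each failed attempt deposits a "lesson" that is replayed
# into the next prompt. All names here are illustrative assumptions.

class MemoryBank:
    """Accumulates lessons from failed attempts across iterations."""
    def __init__(self):
        self.entries = []

    def add(self, note):
        self.entries.append(note)

    def recall(self):
        return "\n".join(self.entries)


def iterative_correction(problem, run_model, check_solution, max_iters=10):
    """Repeatedly attempt the problem, feeding failure notes back in."""
    memory = MemoryBank()
    for i in range(max_iters):
        prompt = f"{problem}\nLessons so far:\n{memory.recall()}"
        answer = run_model(prompt)
        ok, feedback = check_solution(answer)
        if ok:
            return answer, i + 1
        memory.add(f"Attempt {i + 1} failed: {feedback}")
    return None, max_iters


# Toy usage: this stand-in "model" only succeeds after the memory bank
# has accumulated two failure notes.
def toy_model(prompt):
    return "correct" if prompt.count("failed") >= 2 else "wrong"

def toy_checker(answer):
    return (answer == "correct", "answer did not verify")

answer, iters = iterative_correction("toy problem", toy_model, toy_checker)
print(answer, iters)  # → correct 3
```

      The point of the loop is that progress compounds across attempts, which is how a smaller model can eventually pass a problem that a stronger model fails in a single shot.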

    2. We replace persistent autograd graphs with stateless layer templates, binding weights dynamically as they stream in, eliminating persistent graph metadata while providing flexibility in scheduling.

      Surprisingly, the research team replaced the traditional persistent autograd graph with stateless layer templates and dynamic weight binding, which not only eliminates graph-metadata overhead but also adds scheduling flexibility. This architectural innovation may be the key breakthrough that enables training ten-billion-parameter models on a single GPU.
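
      The idea of a stateless layer template can be sketched in a few lines: the layer is a pure function with no stored parameters, and weights are bound only at call time as they stream in. This is a hedged illustration of the concept, not the paper's implementation; `linear_template` and `stream_weights` are invented names.

```python
# Stateless layer template sketch: parameters are arguments, not module
# state, so no persistent graph/module metadata needs to be kept and each
# weight chunk can be freed right after it is consumed.
import numpy as np

def linear_template(x, weights):
    """A stateless linear+ReLU layer; weights are bound per call."""
    return np.maximum(x @ weights["W"] + weights["b"], 0.0)

def stream_weights(layer_shapes, rng):
    """Simulate weights arriving one layer at a time (e.g. from storage)."""
    for d_in, d_out in layer_shapes:
        yield {"W": rng.standard_normal((d_in, d_out)) * 0.1,
               "b": np.zeros(d_out)}

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
# One shared template serves every layer; each weight dict is bound
# dynamically as it streams in, then goes out of scope.
for w in stream_weights([(8, 16), (16, 16), (16, 2)], rng):
    x = linear_template(x, w)
print(x.shape)  # → (4, 2)
```

      Because the template never owns its weights, the scheduler is free to decide when each layer's parameters are fetched, used, and discarded.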

  2. Apr 2026
    1. our DFC is architecturally designed with three distinct sections: a shared dictionary, a "French-only" section, and an "English-only" section

      The three-section architecture of the Dedicated Feature Crosscoder (DFC) is the core technical breakthrough of this work: by building a "shared dictionary" and two "exclusive dictionaries" separately, it forces model-difference features into their own representation space rather than letting them blend into the shared features. Surprisingly, the design of such a far-reaching safety tool turns out to be highly isomorphic to lexicography.
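
      The partitioned dictionary can be illustrated with a toy decoder. This sketch only shows the three-section layout from the quote; the dimensions, names, and the training objective of the real DFC are all assumptions for illustration.

```python
# Toy crosscoder decoder with a three-section dictionary: a shared
# section plus "French-only" and "English-only" exclusive sections.
# Zeroing the opposite exclusive section at decode time is what forces
# cross-language differences into the dedicated features.
import numpy as np

d_model, n_shared, n_fr, n_en = 16, 32, 8, 8
rng = np.random.default_rng(0)

shared = rng.standard_normal((n_shared, d_model))   # shared dictionary
fr_only = rng.standard_normal((n_fr, d_model))      # "French-only" section
en_only = rng.standard_normal((n_en, d_model))      # "English-only" section

def decode(latents, language):
    """Reconstruct an activation from sparse latents.

    Shared latents decode for both languages; each exclusive section
    contributes only to its own language's reconstruction.
    """
    z_shared = latents[:n_shared]
    z_fr = latents[n_shared:n_shared + n_fr]
    z_en = latents[n_shared + n_fr:]
    out = z_shared @ shared
    if language == "fr":
        out += z_fr @ fr_only
    elif language == "en":
        out += z_en @ en_only
    return out

z = np.zeros(n_shared + n_fr + n_en)
z[0] = 1.0            # a shared feature: visible to both sides
z[n_shared] = 1.0     # a French-only feature
fr_recon = decode(z, "fr")
en_recon = decode(z, "en")
# The two reconstructions differ exactly by the French-only contribution.
print(np.allclose(fr_recon - en_recon, fr_only[0]))  # → True
```

      The lexicography analogy is visible in the code: the shared section is the common vocabulary, and each exclusive section is a dedicated appendix for entries that exist on only one side.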