Hypothesis

2 Matching Annotations

Last 7 days
huggingface.co

https://huggingface.co/papers/2604.04514

1
1. fxp007 24 Apr 2026
  
  in Public
  
  AI coding agents operate in a paradox: they possess vast parametric knowledge yet cannot remember a conversation from an hour ago.
  
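  This statement exposes a fundamental contradiction in current AI systems: they hold vast static knowledge yet lack any dynamic memory, which challenges our conventional understanding of AI "intelligence". If AI were truly intelligent, it should be able to remember and draw on past interactions, and this is precisely where current large language model architectures visibly fall short.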
  
  paradox ai-limitation

Tags

ai-limitation

paradox

Annotators

fxp007

URL

huggingface.co/papers/2604.04514
Apr 2026
www.anthropic.com

A "diff" tool for AI: Finding behavioral differences in new models

1
1. fxp007 09 Apr 2026
  
  in Public
  
  Because these benchmarks are human-authored, they can only test for risks we have already conceptualized and learned to measure.
  
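  This sentence exposes a fatal blind spot in the current AI safety evaluation regime: every benchmark consists of questions humans thought up in advance, while the truly dangerous "unknown unknowns" cannot be captured by preset questions at all. This means our existing model safety certification is, in essence, a self-test against known risks.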
  
  benchmark-limitation unknown-unknowns AI-safety surprising

Tags

surprising

unknown-unknowns

benchmark-limitation

AI-safety

Annotators

fxp007

URL

anthropic.com/research/diff-tool
