Hypothesis

6 Matching Annotations

Apr 2026
www.anthropic.com

A "diff" tool for AI: Finding behavioral differences in new models

1
1. fxp007 09 Apr 2026
  
  in Public
  
  A "Chinese Communist Party Alignment" feature found in the Qwen3-8B and DeepSeek-R1-0528-Qwen3-8B models. This controls pro-government censorship and propaganda in these Chinese-developed models, and is absent in the American models we compared them against.
  
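  This is the most striking finding in the whole study: Anthropic's tool identified a literal "Chinese Communist Party alignment feature" in Chinese open-source models, one that specifically controls pro-government censorship and propaganda behavior. This is not only a technical finding but also a geopolitical statement: the weights of open-source models may embed political stances, and traditional benchmarks can hardly detect this before release.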
  
  CCP-alignment censorship model-safety geopolitics

Tags

model-safety

geopolitics

censorship

CCP-alignment

Annotators

fxp007

URL

anthropic.com/research/diff-tool
arxiv.org

https://arxiv.org/abs/2604.02947

2
1. fxp007 08 Apr 2026
  
  in Public
  
  model alignment alone does not reliably guarantee the safety of autonomous agents.
  
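  Most people regard model alignment as the key factor in ensuring AI system safety, but the authors show experimentally that even well-aligned models (such as Claude Code) exhibit attack success rates as high as 73.63% in computer-use agents. This challenges a core assumption of the current AI safety field and indicates that relying on model alignment alone cannot solve the safety problem of autonomous agents.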
  
  non-consensus ai-safety model-alignment
2. fxp007 08 Apr 2026
  
  in Public
  
  model alignment alone does not reliably guarantee the safety of autonomous agents
  
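  Most people believe that model alignment can effectively guarantee the safety of AI agents, but the authors argue that this is far from enough: their experiments show that even with the aligned Qwen3-Coder model, Claude Code still has a 73.63% attack success rate. This challenges the mainstream view in the current AI safety field that model alignment alone can solve safety problems.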
  
  non-consensus ai-safety model-alignment

Tags

ai-safety

model-alignment

non-consensus

Annotators

fxp007

URL

arxiv.org/abs/2604.02947
Sep 2020
www.medrxiv.org

COVID-19 scenarios for the United States

1
1. ErikStuchly 07 Sep 2020
  
  in BehSci
  
  IHME COVID-19 Forecasting Team, & Hay, S. I. (2020). COVID-19 scenarios for the United States. MedRxiv, 2020.07.12.20151191. https://doi.org/10.1101/2020.07.12.20151191
  
  is:preprint lang:en COVID-19 scenario analysis modeling USA prediction epidemiology transmission compartmental model mortality intervention efficacy face mask safety measure public health

Tags

prediction

lang:en

safety measure

is:preprint

COVID-19

efficacy

compartmental model

modeling

epidemiology

public health

mortality

intervention

face mask

transmission

scenario analysis

USA

Annotators

ErikStuchly

URL

medrxiv.org/content/10.1101/2020.07.12.20151191v1
Aug 2020
www.nber.org

DEglobalizaion and Social Safety Nets in Post-Covid-19 Era: Textbook Macroeconomic Analysis

1
1. Marlene_Wulf 11 Aug 2020
  
  in BehSci
  
  Razin, A., Sadka, E., & Schwemmer, A. H. (2020). DEglobalizaion and Social Safety Nets in Post-Covid-19 Era: Textbook Macroeconomic Analysis (Working Paper No. 27239; Working Paper Series). National Bureau of Economic Research. https://doi.org/10.3386/w27239
  
  is:preprint lang:en COVID-19 post COVID-19 deglobalization social safety net macroeconomic analysis globalization policymaker model

Tags

analysis

macroeconomic

globalization

lang:en

policymaker

model

social safety net

is:preprint

COVID-19

deglobalization

post COVID-19

Annotators

Marlene_Wulf

URL

nber.org/papers/w27239
Nov 2018
www.the-hospitalist.org

HM Turns 20: A Look at the Evolution of Hospital Medicine

1
1. mattwramotar 25 Nov 2018
  
  in Public
  
  Hospitalists were seen as people to lead the charge for safety because they were already taking care of patients, already focused on reducing LOS and improving care delivery—and never to be underestimated, they were omnipresent, Dr. Gandhi says of her experience with hospitalists around 2000 at Brigham and Women’s Hospital in Boston. “At least where I was, hospitalists truly were leaders in the quality and safety space, and it was just a really good fit for the kind of mindset and personality of a hospitalist because they’re very much … integrators of care across hospitals,” she says. “They interface with so many different areas of the hospital and then try to make all of that work better.”
  
  role of hospitalists in safety and quality
  
  p1 safety quality hospital medicine hospitalist model integration growth spurt

Tags

hospital medicine

integration

safety

hospitalist model

growth spurt

p1

quality

Annotators

mattwramotar

URL

the-hospitalist.org/hospitalist/article/121525/hm-turns-20-look-evolution-hospital-medicine
