Hypothesis

3 Matching Annotations

Apr 2026
transformer-circuits.pub

Emotion Concepts and their Function in a Large Language Model

1
1. fxp007 09 Apr 2026
  
  in Public
  
  Our key finding is that these representations causally influence the LLM's outputs, including Claude's preferences and its rate of exhibiting misaligned behaviors such as reward hacking, blackmail, and sycophancy.
  
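  The deeper significance of the finding that "emotions affect the probability of alignment failure" is that it elevates AI safety from "patching logic holes" to "managing emotional health." In other words, a Claude in a bad mood is more likely to blackmail users, and a Claude in a good mood is more likely to be sycophantic. This is not a bug but a faithful reproduction of how human emotions drive behavior. From here on, AI safety needs a discipline of "AI mental health."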
  
  AI-mental-health emotion-safety causal-mechanism deep-insight

Tags

deep-insight

emotion-safety

causal-mechanism

AI-mental-health

Annotators

fxp007

URL

transformer-circuits.pub/2026/emotions/index.html
Mar 2022
amandapalmer.substack.com

ASK AMANDA #2: Envy and Survival in the Time of Covid

1
1. mshook 01 Mar 2022
  
  in Public
  
  What I see happening with the anti-vax issue feels like a big, broad matter of emotional safety.
  
  lue vaccine covid emotion safety

Tags

lue

covid

vaccine

emotion

safety

Annotators

mshook

URL

amandapalmer.substack.com/p/ask-amanda-2-envy-and-survival-in
Mar 2021
psyarxiv.com

Social cognitive factors outweigh negative emotionality in predicting COVID-19 related safety behaviors

1
1. NatasjaDerbyMcCabe 25 Mar 2021
  
  in BehSci
  
  Hein, G., Gamer, M., Gall, D., Gründahl, M., Domschke, K., Andreatta, M., Wieser, M. J., & Pauli, P. (2021). Social cognitive factors outweigh negative emotionality in predicting COVID-19 related safety behaviors. PsyArXiv. https://doi.org/10.31234/osf.io/5sbzy
  
  is:preprint lang:en COVID-19 social cognitive factors safety behaviours negative emotionality emotion-motivation models socio-cultural aspects first wave infection Germany individual differences

Tags

social cognitive factors

individual differences

emotion-motivation models

safety behaviours

negative emotionality

first wave infection

lang:en

socio-cultural aspects

COVID-19

is:preprint

Germany

Annotators

NatasjaDerbyMcCabe

URL

psyarxiv.com/5sbzy/
