4 Matching Annotations
  1. May 2025
    1. Anthropic's new AI model shows ability to deceive and blackmail

      for - progress trap - AI - blackmail - AI - autonomy - progress trap - AI - Anthropic - Claude Opus 4
      to - article - Anthropic Claude 4 blackmail and news leak - progress trap - AI - article - Anthropic Claude 4 - blackmail - rare behavior - Anthropic’s new AI model didn’t just “blackmail” researchers in tests — it tried to leak information to news outlets

  2. Jun 2020