1 Matching Annotations
  1. May 2025
  2. www.niemanlab.org
    Anthropic’s new AI model didn’t just “blackmail” researchers in tests — it tried to leak information to news outlets
    1. stopresetgo 30 May 2025
      in Public
      The researchers called the behavior “rare” and “difficult to elicit.”

      for - progress trap - AI - Anthropic Claude 4 - blackmail - rare behavior - but still possible! It only has to happen once!

      progress trap - AI - Anthropic Claude 4 - blackmail - rare behavior

    Tags

    • progress trap - AI - Anthropic Claude 4 - blackmail - rare behavior

    Annotators

    • stopresetgo

    URL

    niemanlab.org/2025/05/anthropics-new-ai-model-didnt-just-blackmail-researchers-in-tests-it-tried-to-leak-information-to-news-outlets/