2 Matching Annotations
  1. Apr 2025
    1. we may be tempted to conduct our own activities aimed at introducing trojans or backdoors into adversary models. This could end up being necessary, but it could also trigger dangerous loss of control behaviors and runaway escalation.

      for - progress trap - ASI - introducing trojans and backdoors in adversary ASI - can backfire

    2. highly capable and context-aware AI systems can invent dangerously creative strategies to achieve their internal goals that their developers never anticipated or intended them to pursue.

      for - progress trap - ASI - progress trap - AGI