1 Matching Annotations
  1. Last 7 days
    1. The standard AI judges use to define "safe" are measured wrong. They punish action. They ignore inaction.

      令人惊讶的是:当前AI安全评估标准存在根本性缺陷——它们只惩罚错误行动,却忽视错误的不作为。这种评估方式导致AI模型被优化为看起来安全,但实际上可能因为过度谨慎而变得真正危险。