8 Matching Annotations
  1. Last 7 days
    1. We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass.

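      This is an important technical claim that calls into question the justification for the government's action. Anthropic states that the vulnerabilities found were already known and minor, and that other models can find them as well. This needs independent verification to determine whether the government's response was excessive and whether Fable 5 is really as safe as Anthropic describes.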

  2. Jun 2026
    1. We have instituted strong safeguards that greatly reduce the likelihood that Fable is misused for tasks related to cybersecurity (among others). In fact, our safeguards are so strong that many users have complained that they are overly broad.

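      This is an important self-defense statement concerning Anthropic's assessment of its own safeguards. The effectiveness of these safeguards and the authenticity of the user complaints both need to be verified. It is also worth looking more closely at the standards and evaluation methods for AI model safeguards, and at how different stakeholders understand 'overly strict'.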

    2. Our understanding is that the government believes it has become aware of a method of bypassing, or 'jailbreaking' Fable 5. We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities.

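      This passage contains technical details that need verification. Anthropic claims that the 'jailbreak' method the government found can only identify some known, minor vulnerabilities, and that other publicly available models can find these vulnerabilities too. The truthfulness and accuracy of this technical assessment, and the severity of the security issues the government is concerned about, need independent verification.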

    3. The potential jailbreaks that have been disclosed to us are either entirely benign responses or are minor findings that provide no Mythos-specific uplift.

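      Most people assume that AI model vulnerabilities discovered by the government must be serious security threats, but the author holds that the disclosed potential jailbreaks are either entirely benign responses or minor findings that provide no Mythos-specific uplift. This challenges the government's prevailing view of how severe AI security threats are.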

    1. The company says it has only seen evidence of this kind of jailbreak being used to find 'minor' and 'relatively simple' software vulnerabilities

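      Most people assume that any security flaw in an AI model could lead to serious consequences, but the author points out that the so-called 'jailbreak' Anthropic identified can only find 'minor' and 'relatively simple' software vulnerabilities. This challenges the government's assessment of the severity of the model's security threat and implies that the government overreacted.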

  3. Aug 2022
  4. Apr 2020