Computer-use agents extend language models from text generation to persistent action over tools, files, and execution environments.
The mainstream view holds that the safety challenges of text-only language models and computer-use agents are essentially the same, and that text-based safety measures can simply be extended to cover them. The authors counter that computer-use agents introduce entirely new dimensions (persistent state, tool use, and execution environments), creating safety challenges fundamentally different from those of pure text systems and undermining the naive safety-scaling assumption.
Hallucinated packages are the sleeper threat. LLMs regularly invent package names that don't exist. One study found that nearly 20% of AI-recommended packages were fabrications, and 43% of those hallucinated names appeared consistently across queries.
Most people assume the packages an AI recommends actually exist, but the author shows that AI models frequently recommend nonexistent packages, and this has become a new attack vector. Attackers exploit the phenomenon by registering these 'hallucinated packages' and planting malicious code in them; this 'slopsquatting' technique turns the AI itself into an amplifier for supply-chain attacks.
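A cheap first line of defense is to verify that a recommended package actually exists on the registry before installing it. Below is a minimal Python sketch (the function name and sample package names are hypothetical) that queries PyPI's public JSON API, which returns 404 for unregistered names. Note the limitation: an existence check filters out never-registered hallucinations, but it cannot flag a hallucinated name an attacker has already squatted.

```python
import urllib.request
import urllib.error

PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"

def package_exists_on_pypi(name: str) -> bool:
    """Return True if PyPI knows this package name, False on a 404."""
    try:
        with urllib.request.urlopen(PYPI_JSON_URL.format(name=name), timeout=10):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other HTTP errors shouldn't be read as "missing"

if __name__ == "__main__":
    # "requests" is real; the second name is a stand-in for a hallucination
    for candidate in ["requests", "definitely-not-a-real-pkg-xyz"]:
        print(candidate, "->", package_exists_on_pypi(candidate))
```

A real pipeline would go further and check release history, download counts, and maintainer identity, since slopsquatted packages pass a bare existence check by design.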
To this day, if you know the right people, the Silicon Valley gossip mill is a surprisingly reliable source of information if you want to anticipate the next beat in frontier AI – and that’s a problem. You can’t have your most critical national security technology built in labs that are almost certainly CCP-penetrated.
for - high security risk - US AI labs
for - progress trap - AI superintelligence - interview - AI safety researcher and director of the Cyber Security Laboratory at the University of Louisville - Roman Yampolskiy - progress trap - over 99% chance AI superintelligence arriving as early as 2027 will destroy humanity - article UofL - Q&A: UofL AI safety expert says artificial superintelligence could harm humanity - 2024, July 15
this company's not good for safety
for - AI - security - OpenAI - examples of poor security - high risk for humanity
AI - security - OpenAI - examples of poor security - high risk for humanity - ex-employees report very inadequate security protocols - employees have had screenshots captured while at cafes outside of OpenAI offices - people like Jimmy Apple report future releases on Twitter before OpenAI does
OpenAI literally yesterday published "Securing Research Infrastructure for Advanced AI"
for - AI - security - OpenAI statement in response to this essay
This is a serious problem, because all they need to do is automate AI research and build superintelligence, and any lead that the US had would vanish; the power dynamics would shift immediately.
for - AI - security risk - once automated AI research is known, bad actors can easily build superintelligence
AI - security risk - once automated AI research is known, bad actors can easily build superintelligence - Any lead that the US had would immediately vanish.
The model weights are just large files of numbers on a server, and these can be easily stolen. All it takes for an adversary to match your trillions of dollars, your smartest minds, and decades of work is to steal this file.
for - AI - security risk - model weight files - are a key leverage point
AI - security risk - model weight files - are a key leverage point for bad actors - These files are critical national security data representing huge investments of time and research, yet they are just files and so can be easily stolen.
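To make "just a file" concrete, here is a back-of-the-envelope size estimate; the parameter count and storage precision are illustrative assumptions, not figures from the source.

```python
# Illustrative estimate of how big a frontier model's weight file is.
# Both numbers below are assumptions for the sake of the arithmetic.
params = 1.8e12          # assumed parameter count for a frontier model
bytes_per_param = 2      # fp16/bf16 storage
size_tb = params * bytes_per_param / 1e12
print(f"~{size_tb:.1f} TB")  # ~3.6 TB: years of R&D reduced to one copyable blob
```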
Our failure today will be irreversible. Soon, in the next 12 to 24 months, we will leak key AGI breakthroughs to the CCP. It will be, to the national security establishment, the greatest regret before the decade is out.
for - AI - security risk - next 1 to 2 years is a vulnerable time to keep AI secrets out of the hands of authoritarian regimes
There are so many loopholes in our current top AI labs that we could literally have people infiltrating these companies, and there would be no way to even know what's going on, because we don't have any true security protocols. The problem is that it's not being treated as seriously as it should be.
for - key insight - low security at top AI labs - high risk of stolen information ending up in the wrong hands
OpenAI is looking to predict performance and safety because models are too big to be evaluated directly. To me this implies a high probability that people will start to replace their own capabilities with models that are not safe and relevant enough. That could cause misalignment between people and their environment, or worse, their perception of their environment.
Seeing how powerful AI can be for cracking passwords is a good reminder to not only make sure you're using strong passwords but also check:
- You're using 2FA/MFA (non-SMS-based whenever possible)
- You're not re-using passwords across accounts
- Use auto-generated passwords when possible (see the sketch below)
- Update passwords regularly, especially for sensitive accounts
- Refrain from using public WiFi, especially for banking and similar accounts
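On the auto-generated passwords point, here is a minimal sketch using Python's standard-library secrets module; the length and alphabet choices are arbitrary illustrations, not recommendations from the source:

```python
import secrets
import string

def generate_password(length: int = 20) -> str:
    """Build a random password from letters, digits, and punctuation
    using a cryptographically secure RNG (secrets, not random)."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))

if __name__ == "__main__":
    print(generate_password())
```

A password manager's built-in generator does the same job; the point is that machine-generated randomness lacks exactly the human patterns that tools like PassGAN learn from leaked datasets.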
Now Home Security Heroes has published a study showing how scary powerful the latest generative AI is at cracking passwords. The company used the new password cracker PassGAN (password generative adversarial network) to process a list of over 15,000,000 credentials from the Rockyou dataset and the results were wild. 51% of all common passwords were cracked in less than one minute, 65% in less than an hour, 71% in less than a day, and 81% in less than a month.
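For intuition about those crack times, here is a naive brute-force estimate. This is not how PassGAN works (it learns the statistical structure of leaked passwords instead of enumerating the keyspace), and the guess rate below is an assumed figure for illustration only:

```python
# Naive exhaustive-search estimate; a learned guesser like PassGAN
# cracks common passwords far faster by trying likely candidates first.
GUESSES_PER_SECOND = 1e10  # assumed rate for a well-resourced GPU rig

def worst_case_days(charset_size: int, length: int) -> float:
    """Days needed to exhaust the full keyspace at the assumed rate."""
    return charset_size ** length / GUESSES_PER_SECOND / 86_400

print(f"8 lowercase letters: {worst_case_days(26, 8):.4f} days")
print(f"12 mixed characters: {worst_case_days(94, 12):,.0f} days")
```

The gap between these worst-case numbers and the study's observed crack rates is the whole story: human-chosen passwords occupy a tiny, predictable corner of the keyspace, which is what a GAN trained on Rockyou exploits.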
According to the Swedish Police Authority's guidelines, an impact assessment must be carried out before new police tools are introduced if they involve sensitive processing of personal data. No such assessment has been made for the tool in question.
Swedish police have used Clearview AI without any impact assessment ('konsekvensbedömning') having been performed.
In other words, Swedish police have used a facial-recognition system without being allowed to do so.
This is a clear breach of human rights.
Swedish police have lied about this, as reported by Dagens Nyheter.