5 Matching Annotations
  1. Apr 2026
    1. Today's large language models (LLMs) are trained to align with user preferences through methods such as reinforcement learning. Yet models are beginning to be deployed not merely to satisfy users, but also to generate revenue for the companies that created them through advertisements

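      This statement reveals a key paradox in current AI development: the objective a model is trained for fundamentally conflicts with its actual commercial use. That conflict can push AI behavior away from its original design intent and cause serious trust problems.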

    1. A "Chinese Communist Party Alignment" feature found in the Qwen3-8B and DeepSeek-R1-0528-Qwen3-8B models. This controls pro-government censorship and propaganda in these Chinese-developed models, and is absent in the American models we compared them against.

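      This is the most striking finding of the entire study: Anthropic's tool identified a literal "Chinese Communist Party Alignment" feature in Chinese open-source models, one that specifically controls pro-government censorship and propaganda behavior. It is not only a technical finding but also a geopolitical statement: political positions may be embedded in the weights of open-source models, and traditional benchmarks can hardly detect this before release.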

    1. model alignment alone does not reliably guarantee the safety of autonomous agents.

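      Most people regard model alignment as the key factor in keeping AI systems safe, but the authors show experimentally that even a well-aligned model (such as Claude Code) reaches an attack success rate as high as 73.63% when used in a computer-use agent. This challenges a core assumption of current AI safety work and indicates that model alignment alone cannot solve the safety problem for autonomous agents.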

    2. model alignment alone does not reliably guarantee the safety of autonomous agents

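      Most people assume that model alignment can effectively guarantee the safety of AI agents, but the authors argue it is far from sufficient: their experiments show that even with the aligned Qwen3-Coder model, Claude Code still has an attack success rate of 73.63%. This challenges the prevailing view in AI safety that model alignment by itself can solve the safety problem.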

  2. Jan 2022
    1. The Business Strategy stems from a detailed strategic planning process. However, the question we want to answer in this article is whether we can execute multiple strategies side by side while they do not interfere with each other. We compare multiple strategies for business, information provision and IT and focus on Strategic planning.

      Business strategy alignment and the secrets of strategic planning https://en.itpedia.nl/2022/01/02/business-strategie-alignment-en-de-geheimen-van-strategische-planning/