1 Matching Annotations
  1. Last 7 days
    1. our DFC is architecturally designed with three distinct sections: A shared dictionary, A "French-only" section, An "English-only" section

      Dedicated Feature Crosscoder(DFC)的三段式架构设计是这项研究的核心技术突破:通过分别建立「共享词典」和两个「专属词典」,强制让模型差异特征有独立的表示空间,而非被混入共享特征中。令人惊讶的是,如此影响深远的安全工具,其设计思路竟然与字典编纂学高度同构。