8 Matching Annotations
  1. Apr 2026
    1. we use the distance preference characterized by these centers to score keys according to their positions, and also leverage Q/K norms as an additional signal for importance estimation

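      Most people assume that KV cache compression relies mainly on attention scores or content similarity, but the authors propose using the distance preference determined by the vector centers, together with Q/K norms, as signals for importance estimation. This shifts the attention mechanism from the conventional basis of content similarity toward geometric features, and is an entirely new approach to compression.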

  2. Sep 2023
  3. Mar 2023
  4. Jan 2023
    1. Data that combine multiplicatively, like rates, are actually very common outside of economics too. The key is to recognize when a measured variable is affected by many (semi) independent forces, each of which scales that variable up or down, rather than simply adding or subtracting a fixed amount to it. This is often true in the natural sciences.
  5. Jul 2021
  6. Jun 2020
  7. Mar 2018
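
The Jan 2023 highlight on data that combine multiplicatively can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `simulate_multiplicative`, the factor range 0.8 to 1.25, and the sample sizes are hypothetical choices made for illustration and do not come from the annotated article. It shows that when many independent forces each scale a value, the logarithm of the result is a sum of independent terms, so the geometric mean tends to describe a typical value better than the arithmetic mean does.

```python
import math
import random
import statistics


def simulate_multiplicative(n_samples, n_factors, rng):
    """Return n_samples values, each the product of n_factors independent scalings."""
    samples = []
    for _ in range(n_samples):
        value = 1.0
        for _ in range(n_factors):
            # Hypothetical force: scales the value down by up to 20% or up by up to 25%.
            value *= rng.uniform(0.8, 1.25)
        samples.append(value)
    return samples


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = simulate_multiplicative(n_samples=5000, n_factors=30, rng=rng)

# Taking logs turns the product of factors into a sum of independent terms.
logs = [math.log(v) for v in samples]

arithmetic_mean = statistics.fmean(samples)
geometric_mean = math.exp(statistics.fmean(logs))
median = statistics.median(samples)

print(f"arithmetic mean: {arithmetic_mean:.3f}")
print(f"geometric mean:  {geometric_mean:.3f}")
print(f"median:          {median:.3f}")
```

In a run like this the arithmetic mean is pulled upward by a few large products, while the geometric mean stays close to the median. That gap is one practical way to notice that a variable is being scaled rather than shifted.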