3 Matching Annotations
  1. Aug 2023
    1. Fortunately, this problem can be resolved by the following observation: The final rounding decisions for column i are only affected by updates performed on this very column, and so updates to later columns are irrelevant at this point in the process.

      The algorithm itself is fine, but a direct implementation performs update computations very frequently, which makes memory access inefficient. The fix is to process the matrix in blocks: quantize all parameters within a block first, then perform the batched update computation.

      Choose the read-block size so that the fastest part of the GPU can finish the computation in one pass, rather than constantly swapping data in and out.
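
      The lazy batched update described in these notes can be sketched as follows. This is a simplified NumPy illustration, not the paper's actual implementation: the helper name `quantize_blocked`, the uniform grid step `step`, and the block size are all hypothetical choices. Columns inside the current block are updated immediately after each column is quantized; columns after the block receive one batched update per block, which is the memory-efficiency trick the notes refer to.

      ```python
      import numpy as np

      def quantize_blocked(W, Hinv, block=128, step=0.1):
          """Sketch of lazy batched updates (simplified, illustrative only):
          quantize columns inside a block one by one with immediate in-block
          updates, and defer the update of all later columns until the whole
          block is finished."""
          W = W.copy()
          Q = np.zeros_like(W)
          n = W.shape[1]
          for i1 in range(0, n, block):
              i2 = min(i1 + block, n)
              Err = np.zeros((W.shape[0], i2 - i1))  # per-column scaled errors
              for j in range(i1, i2):
                  w = W[:, j]
                  q = np.round(w / step) * step      # round to a uniform grid
                  err = (w - q) / Hinv[j, j]
                  Q[:, j] = q
                  # immediate update, but only for columns inside the block
                  W[:, j:i2] -= np.outer(err, Hinv[j, j:i2])
                  Err[:, j - i1] = err
              # one batched update for every column after the block
              W[:, i2:] -= Err @ Hinv[i1:i2, i2:]
          return Q
      ```

      With `Hinv` set to the identity, the cross-column updates vanish and the sketch reduces to plain round-to-nearest, which makes its behavior easy to check.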

  2. Jul 2023
    1. The AdaRound method (Nagel et al., 2020) computes a data-dependent rounding by annealing a penalty term, which encourages weights to move towards grid points corresponding to quantization levels. BitSplit (Wang et al., 2020) constructs quantized values bit-by-bit using a squared error objective on the residual error, while AdaQuant (Hubara et al., 2021) performs direct optimization based on straight-through estimates. BRECQ (Li et al., 2021) introduces Fisher information into the objective, and optimizes layers within a single residual block jointly. Finally, Optimal Brain Quantization (OBQ) (Frantar et al., 2022) generalizes the classic Optimal Brain Surgeon (OBS) second-order weight pruning framework (Hassibi et al., 1993; Singh & Alistarh, 2020; Frantar et al., 2021) to apply to quantization. OBQ quantizes weights one-by-one, in order of quantization error, always adjusting the remaining weights.

      Typically, accurate methods operate by quantizing individual layers or small blocks of consecutive layers; quantizing weights one by one while adjusting the remaining weights is effective. 1. AdaRound (Nagel et al., 2020) computes a data-dependent rounding by annealing a penalty term, encouraging weights to move towards the grid points corresponding to quantization levels. 2. BitSplit (Wang et al., 2020) constructs quantized values bit-by-bit using a squared-error objective on the residual. 3. AdaQuant (Hubara et al., 2021) performs direct optimization based on straight-through estimates. 4. BRECQ (Li et al., 2021) introduces Fisher information into the objective and jointly optimizes the layers within a single residual block. 5. Optimal Brain Quantization (OBQ) (Frantar et al., 2022) generalizes the classic Optimal Brain Surgeon (OBS) second-order weight pruning framework; OBQ quantizes weights one by one, in order of quantization error, always adjusting the remaining weights. While these methods produce good results for models up to roughly 100 million parameters within a few GPU hours, scaling them to networks orders of magnitude larger is challenging.

    2. our method currently does not provide speedups for the actual multiplications, due to the lack of hardware support for mixed-precision operands (e.g. FP16 x INT4) on mainstream architectures

      Mainstream architectures lack hardware support for mixed-precision multiplication, so the method does not demonstrate speedups for the multiplications themselves.