5 Matching Annotations
  1. Last 7 days
    1. A high loss scale in the L-BFGS-B minimizer, caused by averaging Huber-loss values over examples instead of summing them, which led to premature termination of the optimization.

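      Technical details, such as averaging the loss over examples rather than summing it, can cause the optimization to terminate prematurely and distort the scaling-law fit. It is a reminder that implementation details matter: even a seemingly minor difference can lead to significantly different results.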

  2. Mar 2021
  3. Nov 2020
  4. Apr 2020
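
The point made in the first annotation (averaging versus summing the Huber loss) can be illustrated with a small sketch. It does not reproduce the annotated paper's fit. It only evaluates the relative-decrease stopping test that SciPy documents for L-BFGS-B, `(f_k - f_{k+1}) / max(|f_k|, |f_{k+1}|, 1) <= ftol`, on made-up numbers. The residual values, the Huber `delta`, the example count `n`, and the `ftol` value are all assumptions chosen for illustration.

```python
# Sketch: how the reduction (mean vs. sum) changes a scale-dependent stopping test.
# All numbers below are hypothetical; only the form of the test follows SciPy's docs.

def huber(r, delta=1e-3):
    """Huber loss of a single residual r with threshold delta."""
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)

def ftol_stop(f_prev, f_next, ftol):
    """Relative-decrease test; the floor of 1 in the denominator makes it scale-dependent."""
    return (f_prev - f_next) / max(abs(f_prev), abs(f_next), 1.0) <= ftol

n = 200                      # hypothetical number of examples
res_prev = [0.05] * n        # hypothetical residuals at iterate k
res_next = [0.04] * n        # hypothetical residuals at iterate k+1 (a real improvement)
ftol = 1e-4                  # hypothetical tolerance, not SciPy's default

sum_prev = sum(huber(r) for r in res_prev)
sum_next = sum(huber(r) for r in res_next)
mean_prev, mean_next = sum_prev / n, sum_next / n

mean_stops = ftol_stop(mean_prev, mean_next, ftol)  # averaged objective
sum_stops = ftol_stop(sum_prev, sum_next, ftol)     # summed objective

print("mean objective stops:", mean_stops)
print("sum objective stops: ", sum_stops)
```

With these assumed numbers the same step counts as converged for the averaged objective but not for the summed one, because averaging shrinks the decrease by a factor of `n` while the denominator stays pinned at 1. Gradient-based tests such as `gtol` are affected in the same way, since the gradient of the mean is the gradient of the sum divided by `n`.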