11 Matching Annotations
  1. Jun 2019
    1. However, this doesn’t mean that Min-Max scaling is not useful at all! A popular application is image processing, where pixel intensities have to be normalized to fit within a certain range (i.e., 0 to 255 for the RGB color range). Also, typical neural network algorithms require data on a 0-1 scale.

      Use min-max scaling for image processing & neural networks.
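
      As a rough illustration of the note above, here is a minimal sketch of min-max scaling using only the Python standard library. The function name `min_max_scale`, its `new_min`/`new_max` parameters, and the sample pixel values are assumptions made for this example, not taken from the quoted article.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so the range is zero; map everything to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


# Hypothetical 8-bit pixel intensities, rescaled to the 0-1 range.
pixels = [0, 64, 128, 255]
scaled = min_max_scale(pixels)
```

      Because the result depends only on the observed minimum and maximum, a single extreme value compresses everything else into a narrow band, which is one reason the quoted passages prefer standardization in other settings.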

    2. in clustering analyses, standardization may be especially crucial in order to compare similarities between features based on certain distance measures. Another prominent example is the Principal Component Analysis, where we usually prefer standardization over Min-Max scaling, since we are interested in the components that maximize the variance

      Use standardization, not min-max scaling, for clustering and PCA.

    3. As a rule of thumb I’d say: When in doubt, just standardize the data, it shouldn’t hurt.
    4. The result of standardization (or Z-score normalization) is that the features will be rescaled so that they’ll have the properties of a standard normal distribution with \(\mu = 0\) and \(\sigma = 1\), where \(\mu\) is the mean (average) and \(\sigma\) is the standard deviation from the mean.
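
      The Z-score rescaling described above can be sketched as follows. This is a minimal example under stated assumptions: the function name `standardize` and the sample data are hypothetical, and the population standard deviation is used (a sample standard deviation would be an equally reasonable choice).

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # assumption: population, not sample, std
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column.
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
z = standardize(heights)
```

      The output has mean 0 and standard deviation 1, but the shape of the distribution is unchanged: a skewed feature stays skewed after standardization.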
  2. Mar 2019
    1. One of the challenges of deep learning is that the gradients with respect to the weights in one layer are highly dependent on the outputs of the neurons in the previous layer, especially if these outputs change in a highly correlated way. Batch normalization [Ioffe and Szegedy, 2015] was proposed to reduce such undesirable “covariate shift”. The method normalizes the summed inputs to each hidden unit over the training cases. Specifically, for the \(i^{th}\) summed input in the \(l^{th}\) layer, the batch normalization method rescales the summed inputs according to their variances under the distribution of the data

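      Batch normalization was introduced to address the strong dependence between a neuron's inputs and the values currently being computed. Computing the expectation exactly would require all of the samples, which is clearly impractical, so the statistics are instead estimated over the same mini-batch used for training. That shifts the limitation onto the mini-batch size and makes the method hard to apply to RNNs, hence the need for LayerNormalization.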
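
      To make the contrast in this note concrete, the sketch below normalizes the same small matrix of summed inputs two ways: over the mini-batch (per hidden unit) and over the layer (per sample). It is a simplified illustration only: the function names, the `eps` value, and the toy data are assumptions, and the learned gain and bias that both methods normally apply are omitted.

```python
import statistics


def _normalize(xs, eps=1e-5):
    """Subtract the mean and divide by sqrt(variance + eps)."""
    mu = statistics.fmean(xs)
    var = statistics.pvariance(xs)
    return [(x - mu) / (var + eps) ** 0.5 for x in xs]


def batch_norm(batch):
    """Normalize each hidden unit (column) across the samples in the mini-batch."""
    columns = [_normalize(list(col)) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]


def layer_norm(batch):
    """Normalize each sample (row) across its own hidden units."""
    return [_normalize(row) for row in batch]


# Toy mini-batch: 2 samples (rows) x 3 hidden units (columns).
batch = [[1.0, 2.0, 3.0],
         [4.0, 6.0, 8.0]]
bn = batch_norm(batch)
ln = layer_norm(batch)

# With a single sample, batch statistics carry no information (every output is 0),
# while layer normalization gives the same result as before for that sample.
bn_single = batch_norm([batch[0]])
ln_single = layer_norm([batch[0]])
```

      The single-sample case shows why the dependence on mini-batch size matters: the per-sample statistics used by layer normalization do not change with the batch, which is what makes it easier to apply at each time step of an RNN.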

  3. Feb 2019
    1. LocalNorm: Robust Image Classification through Dynamically Regularized Normalization

      Proposes a new LocalNorm. With everyone having this much fun with Norm variants, it looks like the next step is to study some GeneralizedNorm or AnyRandomNorm or whatever... [doge]

    2. Fixup Initialization: Residual Learning Without Normalization

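      When it comes to fitting behavior, the design of regularization and BN is always subtle, especially once the learning rate gets involved. The authors of this paper have also discussed these questions on Reddit in connection with their work.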

  4. Dec 2018
    1. Generalized Batch Normalization: Towards Accelerating Deep Neural Networks

      The core idea is this one sentence: Generalized Batch Normalization (GBN) to be identical to conventional BN but with

      1. standard deviation replaced by a more general deviation measure D(x)

      2. and the mean replaced by a corresponding statistic S(x).
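
      The quoted definition can be read as a normalization with two pluggable functions. The sketch below is only an illustration of that reading: `generalized_normalize`, the `eps` term, and the specific robust choices (median for S(x), mean absolute deviation about the median for D(x)) are assumptions made for this example, not necessarily the statistics studied in the paper. For simplicity `eps` is added to D(x) directly rather than under a square root.

```python
import statistics


def generalized_normalize(xs, S, D, eps=1e-5):
    """Compute (x - S(xs)) / (D(xs) + eps) for each x, with S and D supplied by the caller."""
    s = S(xs)
    d = D(xs)
    return [(x - s) / (d + eps) for x in xs]


def mean_abs_dev_about_median(xs):
    """One possible deviation measure D(x): mean absolute deviation about the median."""
    m = statistics.median(xs)
    return statistics.fmean([abs(x - m) for x in xs])


# Hypothetical activations of one hidden unit over a mini-batch, with one outlier.
acts = [1.0, 2.0, 3.0, 4.0, 100.0]

# Conventional choice: S = mean, D = standard deviation.
conventional = generalized_normalize(acts, statistics.fmean, statistics.pstdev)

# An illustrative alternative: S = median, D = mean absolute deviation.
robust = generalized_normalize(acts, statistics.median, mean_abs_dev_about_median)
```

      With the mean and standard deviation plugged in, this reduces to the usual batch-normalization transform (up to where `eps` is placed); swapping in other statistics changes how strongly an outlier such as 100.0 shifts the center and scale.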

    2. How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift)


  5. Sep 2018
    1. Normalization



      How would I construct a sequence with mean 0 and standard deviation 0.1?

      1. \(x_i \leftarrow x_i - \mu\)

      2. \(x_i \leftarrow x_i / \sigma\)

      3. \(x_i \leftarrow x_i * 0.1\)

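      After these three normalization steps, the data keeps the shape of its original distribution while being rescaled to mean 0 and standard deviation 0.1.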
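
      The three steps above can be sketched directly in code. This is a minimal example, assuming the population standard deviation for \(\sigma\) and a non-constant input; the function name `rescale_to` and the sample data are hypothetical.

```python
import statistics


def rescale_to(values, target_std=0.1):
    """Shift to mean 0, scale to std 1, then scale to target_std."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # assumption: population std, and sigma != 0
    centered = [v - mu for v in values]      # step 1: x_i <- x_i - mu
    unit = [v / sigma for v in centered]     # step 2: x_i <- x_i / sigma
    return [v * target_std for v in unit]    # step 3: x_i <- x_i * 0.1


# Hypothetical input sequence with mean 5 and standard deviation 2.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
out = rescale_to(data)
```

      Every step is a shift or a positive scaling, so the ordering of the values and their relative spacing are preserved; only the location and spread change.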