5 Matching Annotations
 Jul 2019

sebastianraschka.com

in clustering analyses, standardization may be especially crucial in order to compare similarities between features based on certain distance measures. Another prominent example is the Principal Component Analysis, where we usually prefer standardization over MinMax scaling, since we are interested in the components that maximize the variance
Use standardization, not min-max scaling, for clustering and PCA.

As a rule of thumb I’d say: When in doubt, just standardize the data, it shouldn’t hurt.


scikit-learn.org

many elements used in the objective function of a learning algorithm (such as the RBF kernel of Support Vector Machines or the l1 and l2 regularizers of linear models) assume that all features are centered around zero and have variance in the same order. If a feature has a variance that is orders of magnitude larger than others, it might dominate the objective function and make the estimator unable to learn from other features correctly as expected.
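A minimal sketch of the "dominating feature" effect described above, using only the Python standard library. The income and age values are made up for illustration, and the helper names (`euclidean`, `standardize_columns`) are hypothetical. It compares Euclidean distances, the quantity whose square the RBF kernel is built on, before and after standardizing each column.

```python
import math
from statistics import fmean, pstdev

# Made-up rows: (income in dollars, age in years).
rows = [(30000.0, 25.0), (30500.0, 60.0), (90000.0, 26.0)]


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def standardize_columns(data):
    """Rescale every column to mean 0 and population standard deviation 1."""
    columns = list(zip(*data))
    stats = [(fmean(col), pstdev(col)) for col in columns]
    return [
        tuple((x - mean) / std for x, (mean, std) in zip(row, stats))
        for row in data
    ]


# Rows 0 and 1 differ mainly in age; rows 0 and 2 differ mainly in income.
raw_d01 = euclidean(rows[0], rows[1])
raw_d02 = euclidean(rows[0], rows[2])

scaled = standardize_columns(rows)
std_d01 = euclidean(scaled[0], scaled[1])
std_d02 = euclidean(scaled[0], scaled[2])

print(f"raw:          d(0,1)={raw_d01:.1f}  d(0,2)={raw_d02:.1f}")
print(f"standardized: d(0,1)={std_d01:.3f}  d(0,2)={std_d02:.3f}")
```

On the raw data the income column decides almost everything, so a 35-year age gap barely registers; after standardization both differences contribute on a comparable scale.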

 Jun 2019

sebastianraschka.com

However, this doesn’t mean that MinMax scaling is not useful at all! A popular application is image processing, where pixel intensities have to be normalized to fit within a certain range (i.e., 0 to 255 for the RGB color range). Also, typical neural network algorithms require data on a 0-1 scale.
Use min-max scaling for image processing & neural networks.
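As a rough illustration of min-max scaling, the sketch below maps made-up 8-bit pixel intensities onto the 0-1 range. The function name `min_max_scale` and its parameters are hypothetical; in practice a library class such as scikit-learn's `MinMaxScaler` would typically be used instead.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly map values so min(values) -> new_min and max(values) -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot min-max scale a constant feature")
    span = hi - lo
    return [new_min + (x - lo) * (new_max - new_min) / span for x in values]


# Made-up 8-bit pixel intensities.
pixels = [0, 64, 128, 255]
scaled_pixels = min_max_scale(pixels)
print(scaled_pixels)
```

Because the mapping depends only on the observed minimum and maximum, a single outlier can compress all other values into a narrow band, which is one reason the two annotations above treat the scaling methods as suited to different tasks.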

The result of standardization (or Z-score normalization) is that the features will be rescaled so that they’ll have the properties of a standard normal distribution with μ = 0 and σ = 1, where μ is the mean (average) and σ is the standard deviation from the mean.
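The rescaling described above is usually written as z = (x − μ) / σ. The sketch below applies it to made-up sample data using only the standard library; the function name `standardize` is hypothetical, and the population standard deviation is an assumption here (scikit-learn's `StandardScaler` makes the same choice).

```python
from statistics import fmean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(x - mu) / sigma for x in values]


# Made-up sample data.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
z_scores = standardize(heights_cm)

print([round(z, 3) for z in z_scores])
print("mean:", round(fmean(z_scores), 6), "std:", round(pstdev(z_scores), 6))
```

Standardizing guarantees zero mean and unit standard deviation, but it does not change the shape of the distribution: skewed data stays skewed.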
