- Nov 2018
Multi-dimensional scaling (MDS) and Principal Coordinate Analysis (PCoA) are very similar to PCA, except that instead of converting correlations into a 2-D graph, they convert distances among the samples into a 2-D graph.
So, in order to do MDS or PCoA, we have to calculate the distance between every pair of cells: Cell1 and Cell2, Cell1 and Cell3, and so on. With four cells, that gives six pairs:
- 1 2
- 1 3
- 1 4
- 2 3
- 2 4
- 3 4
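The pair list above can be generated rather than written out by hand. This is a minimal sketch, assuming four cells labelled 1 to 4 as in the list; the variable names are made up for illustration.

```python
from itertools import combinations

# Hypothetical cell labels, matching the four cells in the list above.
cells = [1, 2, 3, 4]

# Every unordered pair of distinct cells: n * (n - 1) / 2 pairs in total.
pairs = list(combinations(cells, 2))

for a, b in pairs:
    print(a, b)
```

For n cells this always produces n(n-1)/2 pairs, so the number of distances grows quadratically with the number of cells.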
One very common way to calculate the distance between two things is to calculate the Euclidean distance.
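As a rough illustration, the Euclidean distance between two cells is the square root of the summed squared differences across genes. The sketch below assumes a small made-up expression matrix (rows are cells, columns are genes); the numbers and the helper name `euclidean_distance_matrix` are not from the original notes.

```python
import numpy as np

# Hypothetical expression matrix: 4 cells (rows) x 3 genes (columns).
expr = np.array([
    [10.0, 6.0, 12.0],
    [11.0, 4.0, 9.0],
    [8.0, 5.0, 10.0],
    [3.0, 3.0, 2.5],
])

def euclidean_distance_matrix(data):
    """Return the matrix of Euclidean distances between all pairs of rows."""
    # Broadcasting yields the gene-wise difference for every pair of cells.
    diff = data[:, None, :] - data[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

dist = euclidean_distance_matrix(expr)
print(np.round(dist, 2))
```

The result is symmetric with zeros on the diagonal, so only the upper triangle holds distinct values.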
And once we have calculated the distance between every pair of cells, MDS and PCoA reduce them to a 2-D graph.
The bad news is that if we used the Euclidean distance, the graph would be identical to a PCA graph!
In other words, clustering based on minimizing the linear distances is the same as maximizing the linear correlations.
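The equivalence can be checked numerically. The sketch below assumes classical (Torgerson) MDS, which double-centers the squared distance matrix and takes its top eigenvectors, and compares it with PCA computed by SVD on randomly generated data. The data and variable names are illustrative assumptions, and the two sets of coordinates agree only up to a sign flip of each axis.

```python
import numpy as np

rng = np.random.default_rng(0)
expr = rng.normal(size=(6, 4))  # hypothetical data: 6 cells x 4 genes

# PCA via SVD of the column-centered data; keep the first two components.
centered = expr - expr.mean(axis=0)
u, s, _ = np.linalg.svd(centered, full_matrices=False)
pca_coords = u[:, :2] * s[:2]

# Classical MDS: start from squared Euclidean distances between cells.
diff = expr[:, None, :] - expr[None, :, :]
d2 = (diff ** 2).sum(axis=-1)

# Double centering turns squared distances into an inner-product matrix.
n = d2.shape[0]
j = np.eye(n) - np.ones((n, n)) / n
b = -0.5 * j @ d2 @ j

# The two largest eigenpairs give the 2-D coordinates.
eigvals, eigvecs = np.linalg.eigh(b)
top = np.argsort(eigvals)[::-1][:2]
mds_coords = eigvecs[:, top] * np.sqrt(eigvals[top])

# Each axis may be mirrored, so compare absolute values.
same_up_to_sign = np.allclose(np.abs(pca_coords), np.abs(mds_coords), atol=1e-6)
print(same_up_to_sign)
```

A mirrored axis does not change which cells sit close together, so the two plots show the same structure.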
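I think this is why Professor Hung-yi Lee (李宏毅) said at the start of his t-SNE lecture that other unsupervised dimensionality-reduction algorithms focus only on keeping within-cluster distances small, whereas t-SNE also considers keeping between-cluster distances large.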
The good news is that there are tons of other ways to measure distance!
For example, another way to measure the distance between cells is to calculate the average of the absolute values of the log fold changes among genes.
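A minimal sketch of this distance, assuming log base 2 and strictly positive expression values (the notes do not specify the base, and zeros would need a pseudocount first). The matrix values and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical expression matrix: 3 cells (rows) x 3 genes (columns).
# All values are strictly positive so the logarithm is defined.
expr = np.array([
    [8.0, 2.0, 4.0],
    [4.0, 2.0, 16.0],
    [8.0, 1.0, 4.0],
])

# The log fold change between two cells for one gene is
# log2(a / b) = log2(a) - log2(b), so work on log-transformed values.
log_expr = np.log2(expr)

# Average absolute log fold change across genes, for every pair of cells.
lfc_dist = np.abs(log_expr[:, None, :] - log_expr[None, :, :]).mean(axis=-1)
print(lfc_dist)
```

For the first two cells the per-gene log fold changes are 1, 0 and -2, so their distance is (1 + 0 + 2) / 3 = 1. A matrix like `lfc_dist` can then be passed to MDS or PCoA in place of the Euclidean distances.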
Finally, we get a plot that is different from the PCA plot.
A biologist might choose to use log fold change to calculate distance because they are frequently interested in log fold changes among genes...
But there are lots of distances to choose from...
- Manhattan Distance
- Hamming Distance
- Great Circle Distance
In summary:
- PCA creates plots based on correlations among samples.
- MDS and PCoA create plots based on distances among samples.
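For reference, the Manhattan, Hamming and great-circle distances mentioned earlier can be sketched in plain Python. The function names, the haversine formulation for the great-circle distance, and the default Earth radius of 6371 km are choices made for this illustration, not something the notes specify.

```python
import math

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def great_circle(lat1, lon1, lat2, lon2, radius=6371.0):
    """Haversine distance between two (latitude, longitude) points in degrees.

    The default radius is an assumed mean Earth radius in kilometres.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(h))

print(manhattan([1, 2, 3], [4, 0, 3]))
print(hamming("ACGT", "ACCA"))
print(great_circle(0.0, 0.0, 0.0, 180.0))
```

Which one fits depends on the data: Hamming suits categorical sequences, while the great-circle distance only makes sense for points on a sphere.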