On 2017 Jan 05, Pablo Tamayo commented:
A key aspect of the Gene Set Enrichment Analysis (GSEA) approach is the preservation of gene-gene dependency in estimating the significance of enrichment (i.e., coordinate expression) of a set of genes. Because genes are not independent variables, modeling their dependencies, as GSEA does, better mirrors the underlying biology of the systems we seek to study. While simpler statistical tests may reduce computational complexity, they do so at the expense of making unrealistic assumptions not supported by the data. In our response (http://www.ncbi.nlm.nih.gov/pubmed/23070592) to the claims in http://www.ncbi.nlm.nih.gov/pubmed/20048385 we carefully considered the assumptions of the proposed “simplified” SEA method and its results. This included a comparative analysis on a collection of 50 benchmarks. By randomizing phenotypes, we showed that gene-gene correlations produce significant variance inflation in SEA results, which in turn produces very high false positive rates and p- and q-values that overstate significance, while GSEA does not show this behavior. These results provide strong empirical evidence that gene-gene correlations cannot be ignored, and they agree with the extensive literature providing theoretical or empirical evidence against the gene independence assumption. See, e.g., https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3091509/, https://www.ncbi.nlm.nih.gov/pubmed/16646853, and https://www.ncbi.nlm.nih.gov/pubmed/17303618.
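The variance inflation argument above can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the SEA or GSEA procedure and not the benchmark analysis from our response: per-gene null scores are assumed to be standard normal with a single common pairwise correlation `rho`, and the set-level test is a plain z-test that assumes gene independence. The function names, the set size of 50 genes, and the number of trials are hypothetical choices made for illustration.

```python
import math
import random


def variance_inflation(rho, n_genes=50):
    """Variance of the scaled set mean relative to the independence case.

    For equicorrelated unit-variance scores this is 1 + (m - 1) * rho.
    """
    return 1.0 + (n_genes - 1) * rho


def false_positive_rate(rho, n_genes=50, n_trials=2000, seed=0):
    """Fraction of null gene sets called significant at alpha = 0.05
    by a z-test that (wrongly, when rho > 0) assumes independent genes."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        # A shared factor gives every pair of genes correlation rho
        # while keeping each gene's score at unit variance.
        shared = rng.gauss(0.0, 1.0)
        scores = [
            math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(n_genes)
        ]
        # Under independence, sqrt(m) * mean(scores) would be N(0, 1).
        stat = math.sqrt(n_genes) * sum(scores) / n_genes
        if abs(stat) > 1.96:
            rejections += 1
    return rejections / n_trials


if __name__ == "__main__":
    for rho in (0.0, 0.1):
        print(rho, variance_inflation(rho), false_positive_rate(rho))
```

With `rho = 0` the rejection rate should stay near the nominal 0.05. Even a modest assumed correlation of 0.1 across 50 genes inflates the variance of the set statistic by a factor of 5.9, and the independence-based test then rejects far more often than 5% of the time although no true enrichment is present.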
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.