- Jul 2018
-
www.pnas.org
-
On 2017 Jan 05, Pablo Tamayo commented:
A key aspect of the Gene Set Enrichment Analysis (GSEA) approach is the preservation of gene-gene dependency in estimating the significance of enrichment (i.e., coordinate expression) of a set of genes. Because we know that genes are not independent variables, modeling dependencies as done by GSEA better mirrors the underlying biology of the systems we seek to study. While simpler statistical tests may reduce computational complexity, they do so at the expense of making unrealistic assumptions not supported by the data. In our response (http://www.ncbi.nlm.nih.gov/pubmed/23070592) to the claims in http://www.ncbi.nlm.nih.gov/pubmed/20048385 we carefully considered the assumptions of the proposed “simplified” SEA method and its results. This included a comparative analysis on a large collection of 50 benchmarks. By randomizing phenotypes, we showed that gene-gene correlations produce significant variance inflation in SEA results, which in turn produces very high false positive rates and inflated p- and q-values, while GSEA does not. These results provide strong empirical evidence that gene-gene correlations cannot be ignored and agree with the extensive literature providing theoretical or empirical evidence against the gene independence assumption. See, e.g., https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3091509/ https://www.ncbi.nlm.nih.gov/pubmed/16646853 https://www.ncbi.nlm.nih.gov/pubmed/17303618
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
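The phenotype-randomization argument in the comment above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the GSEA or SEA implementation: the data are synthetic null data (no phenotype effect), every pair of genes in the set has the same correlation `rho`, gene variances are assumed known and equal to 1, and the set score is a simple scaled sum of per-gene z-statistics. All function names and parameter values are hypothetical. The sketch compares a test that assumes gene independence with a test whose null distribution comes from permuting phenotype labels.

```python
import math
import random


def simulate_col_sums(rng, n_samples, n_genes, rho):
    """Null data: return, for each sample, the sum of its gene-set expression values.

    Each gene has unit variance and every pair of genes has correlation rho,
    induced by a latent factor shared within a sample (an assumption of this sketch).
    """
    sums = []
    for _ in range(n_samples):
        shared = rng.gauss(0.0, 1.0)
        genes = [
            math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(n_genes)
        ]
        sums.append(sum(genes))
    return sums


def set_score(col_sums, labels, n_genes):
    """Sum of per-gene z-statistics divided by sqrt(n_genes).

    With known unit variance, the sum over genes of the per-gene mean
    differences equals the mean difference of the per-sample sums.
    If genes were independent, this score would be N(0, 1) under the null.
    """
    a = [s for s, lab in zip(col_sums, labels) if lab == 1]
    b = [s for s, lab in zip(col_sums, labels) if lab == 0]
    se = math.sqrt(1.0 / len(a) + 1.0 / len(b))
    return (sum(a) / len(a) - sum(b) / len(b)) / (se * math.sqrt(n_genes))


def false_positive_rates(n_datasets=200, n_perm=100, n_per_group=10,
                         n_genes=20, rho=0.5, alpha=0.05, seed=0):
    """Return (independence-based FPR, phenotype-permutation FPR) on null data."""
    rng = random.Random(seed)
    labels = [1] * n_per_group + [0] * n_per_group
    naive_hits = 0
    perm_hits = 0
    for _ in range(n_datasets):
        col_sums = simulate_col_sums(rng, 2 * n_per_group, n_genes, rho)
        observed = set_score(col_sums, labels, n_genes)

        # Test 1: two-sided normal p-value, valid only if genes are independent.
        p_naive = math.erfc(abs(observed) / math.sqrt(2.0))
        naive_hits += p_naive <= alpha

        # Test 2: shuffle phenotype labels, which keeps gene-gene correlation intact.
        exceed = 0
        for _ in range(n_perm):
            shuffled = labels[:]
            rng.shuffle(shuffled)
            if abs(set_score(col_sums, shuffled, n_genes)) >= abs(observed):
                exceed += 1
        p_perm = (1 + exceed) / (n_perm + 1)
        perm_hits += p_perm <= alpha
    return naive_hits / n_datasets, perm_hits / n_datasets


if __name__ == "__main__":
    naive_fpr, perm_fpr = false_positive_rates()
    print(f"independence-based FPR: {naive_fpr:.3f}")
    print(f"permutation-based FPR:  {perm_fpr:.3f}")
```

Under these assumptions the score has variance 1 + (n_genes - 1) * rho, which is 10.5 for the defaults, rather than the variance of 1 that the independence-based test assumes. That is the variance inflation the comment describes, and it is why the first test rejects far more often than the nominal 5% while the label-permutation test does not.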
On 2013 Nov 12, James Hadfield commented:
Who would run a differential gene expression experiment today and NOT look at gene set enrichment analysis? It is difficult to remember the impact this paper had on the field and what we were doing just five years earlier with the first microarrays!
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2013 Jun 28, Rafael Irizarry commented:
I agree with Rob that this is an important paper. The idea of analyzing differential expression for groups of genes, as opposed to individual genes, was an important step forward in the analysis of gene expression data. In one of the papers Rob links to we point out that the method would have worked just as well (or perhaps better) using simple (existing) statistical tests rather than the novel versions of the KS-test presented in the paper. Examples of some of these simpler tests are implemented in the Bioconductor limma package. The critique is not just about power, but about interpretability and ease of implementation. But these critiques should not take away from the important contribution made by this paper.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2013 Jun 16, Robert Tibshirani commented:
This is an important paper, one that introduced the idea of analyzing the differential expression of groups of genes, rather than individual genes. The advantage is both in power and interpretability. Importantly, this group has also created and curated an impressive collection of gene sets for this purpose. See http://www.broadinstitute.org/gsea/index.jsp
The particular test here, the KS statistic, has been criticized for lack of power, and alternatives have been suggested (see, e.g., http://arxiv.org/pdf/math/0610667.pdf and http://www.ncbi.nlm.nih.gov/pubmed/20048385).
A response to the latter is in http://www.ncbi.nlm.nih.gov/pubmed/23070592
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
-