- Jul 2018
-
europepmc.org
-
On 2017 Sep 13, Haibao Tang commented:
No major flaws in "Identification of individuals by trait prediction using whole-genome sequencing data"
For a complete discussion, please also read the authors' response to Erlich's critique:
http://www.biorxiv.org/content/early/2017/09/11/187542
Abstract
In a recently published PNAS article, we studied the identifiability of genomic samples using machine learning methods [Lippert et al., 2017]. In a response, Erlich [2017] argued that our work contained major flaws. The main technical critique of Erlich [2017] builds on a simulation experiment that shows that our proposed algorithm, which uses only a genomic sample for identification, performed no better than a strategy that uses demographic variables. Below, we show why this comparison is misleading and provide a detailed discussion of the key critical points in our analysis that have been brought up in Erlich [2017] and in the media. We also want to point out that it is not only faces that may be derived from DNA, but a wide range of phenotypes and demographic variables. In this light, the main contribution of Lippert et al. [2017] is an algorithm that identifies genomes of individuals by combining DNA-based predictive models for multiple traits.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
-
On 2017 Sep 08, Yaniv Erlich commented:
Major flaws in "Identification of individuals by trait prediction using whole-genome"
Check the following bioRxiv link for a full explanation of the methodological problems in this paper: http://www.biorxiv.org/content/early/2017/09/06/185330
Abstract
Genetic privacy is an area of active research. While it is important to identify new risks, it is equally crucial to supply policymakers with accurate information based on scientific evidence. Recently, Lippert et al. (PNAS, 2017) investigated the status of genetic privacy using trait predictions from whole genome sequencing. The authors sequenced a cohort of about 1000 individuals and collected a range of demographic, visible, and digital traits such as age, sex, height, face morphology, and a voice signature. They attempted to use the genetic features to predict those traits and re-identify the individuals from a small pool using the trait predictions. Here, I report major flaws in the Lippert et al. manuscript. In short, the authors' technique performs similarly to a simple baseline procedure, does not utilize the power of whole genome markers, uses technically wrong metrics, and finally does not really identify anyone.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
-