4 Matching Annotations
  1. Jul 2018
    1. On 2014 Sep 15, Casey M Bergman commented:

      In follow-up work estimating allele frequencies of the TE insertion site data in this paper, we identified a small error in the data processing underlying the S1, S2, S3, and S4 supplementary files. These files reported read support counts from only the first strain in which each TE insertion was identified, rather than the total read count merged across all strains in the dataset.

      For TE insertions that are present in more than one strain, the corrected number of reads supporting the insertion is higher than originally reported. This affects 461 TE insertions identified using 454 sequencing and 1,606 TE insertions identified using Illumina sequencing.

      The location, strand and TE family for 3,379 out of 3,386 TE insertion sites identified using 454 sequencing in Linheiro & Bergman 2012 are unchanged. For 7 out of the 3,386 TE insertions identified using 454 sequencing, properly merging reads across strains led to differences in location, strand or TE family. The location, strand and TE family for all 8,024 TE insertion sites identified using Illumina sequencing in Linheiro & Bergman 2012 are unchanged.

      None of the main conclusions of Linheiro & Bergman 2012 are affected by this error, since the Illumina data set formed the basis of the target site duplication and motif analyses. However, four values in the first paragraph of the results should be corrected to read as follows (original --> corrected):

      "For the 454 data, we processed 209,979,997 reads from a total of 34 strains and retained 44,254 --> 53,940 reads (0.021% --> 0.026% of the total) across 34 strains that included a TE start/end for a TIR or LTR element that could be mapped to the reference genome. For the Illumina data we processed 7,835,189,604 reads from a total of 176 strains and retained 65,488 --> 97,854 reads (0.00084% --> 0.00124% of the total) across 166 strains that uniquely matched a start or end of a TE for a TIR and LTR element that could be mapped to the reference genome."

      Revised versions of Files S1, S2, S3, and S4 that correct this error can be found here:

      Revised version of File S1 from Linheiro & Bergman 2012. http://dx.doi.org/10.6084/m9.figshare.1170046

      Revised version of File S2 from Linheiro & Bergman 2012. http://dx.doi.org/10.6084/m9.figshare.1170047

      Revised version of File S3 from Linheiro & Bergman 2012. http://dx.doi.org/10.6084/m9.figshare.683836

      Revised version of File S4 from Linheiro & Bergman 2012. http://dx.doi.org/10.6084/m9.figshare.683834

      In addition to making these revised files, we have also generated alternate versions of the S3 and S4 .bed files that encode the number of DGRP strains in which the TE insertion is found (rather than the read support count) in the score field. These alternate versions allow estimation of the allele frequency of TE insertions in the DGRP population, and can be found here:

      Alternate version of File S3 from Linheiro & Bergman 2012. http://dx.doi.org/10.6084/m9.figshare.1168882

      Alternate version of File S4 from Linheiro & Bergman 2012. http://dx.doi.org/10.6084/m9.figshare.1168883
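
As an illustration of how the strain counts in the score field could be turned into allele frequency estimates, here is a minimal Python sketch. It is not taken from the files or the authors' pipeline. It assumes the alternate files are tab-delimited six-column BED (chrom, start, end, name, score, strand), that each strain contributes one sampled allele, and that the caller supplies the number of strains surveyed (`n_strains`). The function name and the example records are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, int, int, str, float]


def allele_frequencies(bed_lines: Iterable[str], n_strains: int) -> Iterator[Record]:
    """Yield (chrom, start, end, name, frequency) for each BED record.

    Assumes column 5 (score) holds the number of strains carrying the
    insertion, as described for the alternate S3/S4 files.
    """
    if n_strains <= 0:
        raise ValueError("n_strains must be positive")
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        chrom, start, end, name = fields[0], int(fields[1]), int(fields[2]), fields[3]
        strain_count = int(fields[4])
        yield (chrom, start, end, name, strain_count / n_strains)


# Hypothetical records, not taken from the real files.
example = [
    "2L\t1000\t1005\thypothetical_TE_a\t17\t+",
    "3R\t2500\t2504\thypothetical_TE_b\t1\t-",
]
freqs = list(allele_frequencies(example, n_strains=34))
```

The choice of `n_strains` matters: it should be the number of strains actually surveyed with the relevant sequencing technology, which the caller has to take from the paper rather than from the .bed file itself.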

      Finally, to allow determination of which DGRP strain each TE insertion was detected in, we have generated .zip archives of strain-specific .bed files (with read support count in the score field). These datasets can be found here:

      Strain-specific annotation files for data in File S3 from Linheiro & Bergman 2012. http://dx.doi.org/10.6084/m9.figshare.1168885

      Strain-specific annotation files for data in File S4 from Linheiro & Bergman 2012. http://dx.doi.org/10.6084/m9.figshare.1168884
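
The relationship between the strain-specific files and the merged counts described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the processing used to produce the revised files: it treats two records as the same insertion only when chromosome, coordinates, TE family and strand match exactly, and the strain names and records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (chrom, start, end, te_family, read_support, strand)
BedRecord = Tuple[str, int, int, str, int, str]
SiteKey = Tuple[str, int, int, str, str]


def merge_read_support(
    per_strain: Dict[str, List[BedRecord]],
) -> Tuple[Dict[SiteKey, int], Dict[SiteKey, int]]:
    """Return total read support and number of strains per insertion site."""
    total_reads: Dict[SiteKey, int] = defaultdict(int)
    strain_count: Dict[SiteKey, int] = defaultdict(int)
    for records in per_strain.values():
        for chrom, start, end, family, reads, strand in records:
            key = (chrom, start, end, family, strand)
            total_reads[key] += reads   # sum reads over all strains
            strain_count[key] += 1      # count strains carrying the site
    return dict(total_reads), dict(strain_count)


# Hypothetical strain-specific records.
per_strain = {
    "strain_A": [("2L", 1000, 1005, "hypothetical_TE", 3, "+")],
    "strain_B": [
        ("2L", 1000, 1005, "hypothetical_TE", 5, "+"),
        ("X", 42, 46, "hypothetical_TE", 2, "-"),
    ],
}
totals, counts = merge_read_support(per_strain)
shared = ("2L", 1000, 1005, "hypothetical_TE", "+")
```

In this toy example the shared site has 8 supporting reads in total across 2 strains, whereas taking the count from the first strain alone would give 3.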

      The new alternate and strain-specific files correspond to data in the revised S1, S2, S3, and S4 files.

      We apologize for any inconvenience this error may have caused.


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.

    2. On 2014 Sep 15, Casey M Bergman commented:

      We have released an implementation of the approach described in this paper to detect non-reference TE insertions using next-generation whole-genome resequencing data here: https://github.com/bergmanlab/ngs_te_mapper


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
