592 Matching Annotations
  1. Dec 2017
  2. alleledb.gersteinlab.org
    1. AlleleDB is a repository, providing genomic annotation of cis-regulatory single nucleotide variants (SNVs) associated with allele-specific binding (ASB) and expression (ASE).
  3. Nov 2017
    1. find /volume1/Movies /volume1/Music /volume1/Photos "/volume1/Home Videos" "/volume1/Music Videos" -type d -exec chmod 755 {} \;

      recursively changing permissions with find, specifically for directories versus files
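
      A minimal sketch of the directories-versus-files split, run on a throwaway tree rather than the real volumes. The `644` mode for files is my assumption (the command above only sets directories), and `$root` is a hypothetical stand-in for the media folders:

      ```shell
      # Hypothetical scratch tree standing in for the media folders.
      root=$(mktemp -d)
      mkdir -p "$root/Home Videos/2017"
      touch "$root/Home Videos/2017/clip.mp4"

      # Directories need the execute bit to be traversable: rwxr-xr-x.
      find "$root" -type d -exec chmod 755 {} +
      # Regular files get no execute bit (assumed mode): rw-r--r--.
      find "$root" -type f -exec chmod 644 {} +
      ```

      `{} +` batches many paths into one `chmod` call, whereas `{} \;` runs `chmod` once per path; both handle names containing spaces.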

    1. MCC

      Matthews correlation coefficient
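
      standard definition from the binary confusion matrix (counts TP, TN, FP, FN), noted for reference and not quoted from the paper; ranges from -1 to +1, with 0 meaning no better than chance:

      ```latex
      \mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}
                          {\sqrt{(TP+FP)\,(TP+FN)\,(TN+FP)\,(TN+FN)}}
      ```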

    2. novel method developed within the MAQC-III project utilizing the expression distributions, corrected for noise and batch effects, and assisted by random resampling, to compute DEG scores related to the Wilcoxon U test (Magic, see Additional file 1: Supplementary Note 2)
    1. These results suggest that deep sequencing is necessary for accurate determination of the expression level of genes

      or better quantification methods

    1. EGR2 peaks overlapped with a SOX10 peak when allowing separation distance as large as 1000 bp and 11.09% of the SOX10 peaks overlapped with an EGR2 peak with the same separation distance

      unclear

    2. Using 40 sets of randomized peak sequences, the occurrence of the motif never exceeded 74%

      unclear

    3. MOSAiCS implements a model-based approach where the background distribution for unbound regions take into account systematic biases such as mappability and GC content and the peak regions are described with a two component Negative Binomial mixture model
    1. pairwise overlaps using Fisher’s test and mutual exclusion (Leiserson et al., 2016)
    2. CRISPR screening has emerged as a powerful method for identifying critical functional dependencies in vitro (Koike-Yusa et al., 2014, Shalem et al., 2014)
    1. plasma cells

      B cells

    2. polymorphonuclear (PMN) cell

      granulocyte

    3. we determined the number of histologies needed to identify genes with maximal prognostic power

      histologies=samples

    4. All microarray studies in PRECOG were consistently normalized and pre-processed

      no RNA-seq

    5. CIBERSORT, a computational approach for inferring leukocyte representation in bulk tumor transcriptomes
    1. Clone 1 is the founding clone; 12.74% of the tumour cells contain only this set of mutations

      derivation unclear; not provided in supplemental information

    1. Clonal evolution in relapsed acute myeloid leukemia revealed by whole genome sequencing
    2. Comparison of SNVs detected in the whole genome sequencing data with SNPs genotyped using arrays

      inference of "%diploid coverage"

    1. tier 1 contains all changes in the amino acid coding regions of annotated exons, consensus splice-site regions, and RNA genes (including microRNA genes). Tier 2 contains changes in highly conserved regions of the genome or regions that have regulatory potential. Tier 3 contains mutations in the nonrepetitive part of the genome that does not meet tier 2 criteria, and tier 4 contains mutations in the remainder of the genome
  4. Oct 2017
    1. Using their expression data and the same fold-change categories, we investigated the influence of both affinity and cooperative effects based on GraphProt predictions of Ago2 binding sites in comparison to the available CLIP-seq data.

      Could do the same since expression microarray data are available, but they show complete lack of differential expression when over-expressing our proteins of interest.

    2. allows the evaluation of putative binding sites with a meaningful score that reflects the biological functionality

      score = prediction margin. Part of standard GraphProt output?

    3. Prediction margins

      Part of standard GraphProt output?

    4. TIA-1 has been described as an ARE-binding protein and binds both U-rich and AU-rich elements.
    5. logos are a mere visualization aid and do not represent the full extent of the information captured by GraphProt models
    6. tenfold cross-validation technique

      How do AUROCs look for our proteins of interest compared to the AUROCs for the iCLIP'ed proteins in Additional File 2?

    7. The following describes a typical biological application of computational target detection. A published CLIP-seq experiment for a protein of interest is available for kidney cells, but the targets of that protein are required for liver cells. The original CLIP-seq targets may have missed many correct targets due to differential expression in the two tissues and the costs for a second CLIP-seq experiment in liver cells may not be within the budget or the experiment is otherwise not possible. We provide a solution that uses an accurate protein-binding model from the kidney CLIP-seq data, which can be used to identify potential targets in the entire transcriptome. Transcripts targeted in liver cells can be identified with improved specificity when target prediction is combined with tissue-specific transcript expression data.

      use case

    8. Peak detection leads to high-fidelity binding sites; however, it again increases the number of false negatives. Therefore, to complete the RBP interactome, computational discovery of missing binding sites is essential.

      iCLIP data are not comprehensive

    9. GraphProt: modeling binding preferences of RNA-binding proteins
    1. Artemis is a free genome browser and annotation tool that allows visualisation of sequence features, next generation data and the results of analyses within the context of the sequence, and also its six-frame translation
    1. equal to the frequency of the higher expressed eQTL allele in the population

      should be equal to product of frequency of high eQTL allele and major coding allele, though the latter will be close to 1 for the rare coding mutations studied here
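
      the product written out, as a sketch: symbols are my own shorthand, and treating the two alleles as independent (no linkage disequilibrium) is an assumption:

      ```latex
      f_{\text{hap}} = p_{\text{high}} \cdot p_{\text{major}},
      \qquad p_{\text{major}} = 1 - \mathrm{MAF}
      % For the rare variants studied (MAF < 1%), p_major > 0.99, so
      % 0.99\, p_{\text{high}} < f_{\text{hap}} \le p_{\text{high}}
      ```

      i.e. the shortcut is off by less than 1% in relative terms under that assumption.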

    2. 211,575 rare (MAF < 1%) coding variants at thousands of genes

      not necessarily pathogenic

    3. proportiona

      inversely proportional?

    4. Modified penetrance of coding variants by cis-regulatory variation shapes human traits
  5. Sep 2017
    1. The projection score - an evaluation criterion for variable subset selection in PCA visualization

      "variable" typically means gene or locus in the context of biological data.

    1. How can I capture STDERR from an external command?

      problem arises when using backticks to execute external commands
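
      backticks only return STDOUT; the usual fix is a shell redirection inside the command string. A minimal sketch of the redirections in plain shell, where `noisy` is a made-up stand-in for the external command:

      ```shell
      # Hypothetical command that writes one line to each stream.
      noisy() { echo "to stdout"; echo "to stderr" >&2; }

      # Command substitution captures STDOUT only; STDERR is discarded here
      # so it does not leak to the terminal.
      out_only=$(noisy 2>/dev/null)
      # 2>&1 sends STDERR to the captured stream, so both are captured.
      both=$(noisy 2>&1)
      # Order matters: point STDERR at the capture first, then drop STDOUT.
      err_only=$(noisy 2>&1 >/dev/null)
      ```

      the same redirections can be placed inside the backticks, since the command string is handed to the shell.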

    1. Major flaws in "Identification of individuals by trait prediction using whole-genome sequencing data"

      re Venter study in PNAS, claiming to be able to identify people based on whole genome data

    1. Plot a course through the genome. Inspired by Google Maps, a suite of tools is allowing researchers to chart the complex conformations of chromosomes.

      mentioned tools are focused on (capture) Hi-C data

    1. One Test May Spot Cancer, Infections, Diabetes and More

      based on cell free DNA fragments in blood; DNA methylation patterns and fragment length distributions can inform on organ of origin.

  6. Aug 2017
    1. Diverse growth trends and climate responses across Eurasia's boreal forest

      implies limitations of using macroscopic tree ring features for climate reconstructions, which are influenced by many different factors

    1. A Technical Perspective in Modern Tree-ring Research - How to Overcome Dendroecological and Wood Anatomical Challenges

      microtome-generated sections along the entire length of wood core samples for anatomical studies

  7. Jul 2017
    1. global dye-bias equalization step to control for the different average intensities in the red and green channels. This procedure scales the background-corrected intensities, dividing by the average intensity of the positive control probes in the same channel, red or green, and multiplying by the average intensity of all positive controls in a reference array.
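
      the scaling as I read it, per channel; the symbols are my own shorthand, not the source's:

      ```latex
      % c \in \{\text{red}, \text{green}\}
      % I_c^{\text{bg}}      : background-corrected intensity in channel c
      % \bar{P}_c            : mean positive-control intensity, same array, channel c
      % \bar{P}^{\text{ref}} : mean intensity of all positive controls, reference array
      I_c^{\text{eq}} = I_c^{\text{bg}} \cdot \frac{\bar{P}^{\text{ref}}}{\bar{P}_c}
      ```
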
    1. Napoleon oak genome sequencing project web site: example of public engagement in tree genomics

  8. Jun 2017
    1. Canonical Poly(A) Polymerase Activity Promotes the Decay of a Wide Variety of Mammalian Nuclear RNAs

      Cordycepin, a modified adenosine produced by a species of fungus, inhibits polyA tail elongation; the RNA-seq data in this paper do NOT include cordycepin-treated samples

    1. Disruption of a novel imprinted zinc-finger gene, ZNF215, in Beckwith-Wiedemann syndrome

      demonstrates imprinted expression, but ICR is unknown

    1. Fig 5. Highest ranked ASE genes from (A) brain and (B) liver.

      observation of allele-specific expression in brain

  9. Apr 2017
    1. I'm the developer of pyGeno. Here's a little script that does just that for the Gene TPST2, by using segment trees

      recipe for merging transcripts of a gene into a single compound transcript

  10. Sep 2016
    1. The P. tetraurelia MAC genome [1] was assembled from 13× Sanger sequencing reads from different insert size libraries of strain d4-2 DNA. Strain d4-2 only differs from strain 51 at a few loci.

      SRA accession ERR138952

    1. You must quit IGV and restart for this preference to take effect. The genome should appear in the drop-down list.

      restart may be insufficient; had to modify prefs.properties in ~/igv (removing old cached genome values) before I could see my genomes

    1. 90,000 tiny introns (between 20 and 34 nt in length)
    2. MIC and MAC determination during the P. tetraurelia sexual cycle

      def. maternal: recipient of gametic nucleus

    1. Paramecium IESs are unique sequence elements between 26 and 882 bp in length
    2. hypothetical pathways for scnRNA-mediated recruitment of the endonuclease in Paramecium

      nucleotide modifications: possibly 6mA

    3. The “genome-scanning” model, as envisioned in Paramecium

      subtraction of MAC RNA from MIC small RNA = targets (IES) for excision

    4. Nuclear dimorphism and DNA rearrangements in the ciliates Paramecium tetraurelia

      tetraurelia: imprecise repeat v precise (splicing-like) IES excision

  11. Aug 2016
    1. One current project in Dr Schulz’s lab is to characterise a selection of interesting loci in detail using isoform specific primers and qRT-PCR.

      Would be better to end with making an explicit connection between the ENCODE tissue-specific RNAseq data and the Setd2 knock-down RNAseq data. Would it make sense to focus on loci showing evidence for tissue-specific polyA as well as being dependent on Setd2 for correct splicing?

    2. not significant (<0.0274)

      I would call that marginally significant

    3. in

      as

    4. whereupon

      but

    5. Also these DNA damages as

      DNA damage like

    6. damages

      damage

    7. evinced

      exhibited

    8. remarkable

      substantial

    9. For some loci even the used tissues can differ in terms of strain and developmental stage between the qRT-PCR and bisulfite sequencing.

      German sentence structure: splitting the predicate (differ ... between). Not done in English. very awkward to read.

    10. a different and relatively unclear pattern

      different and inconsistent patterns

    11. Presumed that the mechanism of poly(A) site selection/alternative polyadenylation may operate genome-wide in a tissue-specific manner, and thus, contribute to the complexity of the mammalian transcriptome,

      use of very long prepositional phrases at the start of sentences makes reading difficult. stick to simple subject-predicate-object sentence structure.

    12. Presumed

      Hypothesising

    13. . This is seen in a different way
    14. really low

      no or low (<10%)

    15. The data displayed that it is roughly possible to

      My data suggest that it is possible to qualitatively

    16. a totally reliable method

      considered quantitative

    17. from

      determined using

    18. Assuming the

      The

    19. AAA indicates poly(A) site

      nice and useful figure but: you primed the cDNA synthesis with random hexamers. the qRT-PCR results are therefore not specific to polyadenylated transcripts. so, above figure shows models consistent with the data rather than summarisations of the data (you did not directly measure polyA).

    20. random hexamers

      beware that this implies non-polyadenylated transcripts also are represented in the sample

    21. of

      for the

    22. in detail in drafted simplified images
    23. s
    24. directs

      based on direct

    25. normalised tissue

      reference tissue

    26. unexpected based on the RNA-seq data

      not if you look at the UCSC data (see comment above)

    27. a
    28. for example
    29. Based on the RNA-seq data

      depends in this case on whether you look at the scatter plot or the UCSC genome browser: they are not telling the same story for some reason. my corrections below reflect what UCSC shows, which results in flipping of placenta and thymus.

    30. Thymus

      Placenta

    31. placenta

      thymus

    32. placenta

      thymus

    33. thymus

      placenta

    34. Thymus

      Placenta

    35. placenta

      thymus

    36. , compromised

      resulted in

    37. liver adult

      adult liver

    38. different

      the different

    39. opposed to liver adult

      relative to adult liver

    40. and stretched

      :

    41. more transcripts terminate across

      see above; will stop pointing this out

    42. conversion rate was calculated as 72.55%

      low conversion rate leads to over-estimation of methylation, which could explain the 45% methylation seen in placenta
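
      rough size of the effect, as a sketch only. It assumes the 45% is uncorrected and that every unconverted unmethylated cytosine is read as methylated; symbols are mine (c = conversion rate, m = true methylation):

      ```latex
      m_{\text{obs}} = m + (1 - m)(1 - c)
      \;\Longrightarrow\;
      m = \frac{m_{\text{obs}} - (1 - c)}{c}
      % c = 0.7255, m_obs = 0.45:
      % m = (0.45 - 0.2745) / 0.7255 \approx 0.24
      ```

      under those assumptions the true level would be nearer 24% than 45%.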

    43. used for measurements of heart

      primer sets are not tissue-specific; you used them for all tissues; only the measurements themselves are tissue-specific

    44. arose from

      for

    45. Adck2 encodes for a kinase

      The host transcript of the CGI is non-coding. Your upstream primers amplify both the coding transcript and the non-coding host transcript. That is a limitation. Could explain the inconsistencies re the RNAseq data.

    46. the expression of transcripts

      transcription

    47. qRT-PCR

      qRT-PCR cannot show transcription termination: all it can do is verify the RNAseq data, i.e., relatively more transcription upstream of an active CGI compared to transcription across the CGI. It is important to be precise about what qRT-PCR can and cannot do.

    48. transcripts terminating

      transcription

    49. transcripts terminate across

      transcription extends across

    50. transcripts terminating

      transcription

    51. 1) Tissue with high CGI activity and more transcripts terminating upstream than across and 2) Tissue with low CGI activity and more transcripts terminating across than upstream, as described in the chapter ‘Loci selection’.

      The data do not show transcripts terminating upstream or downstream of the CGI, they are merely consistent with the hypothesis.

    52. s a
