21 Matching Annotations

Feb 2023
gatk.broadinstitute.org gatk.broadinstitute.org

Variant Quality Score Recalibration (VQSR)

1
1. shenzhongjun 08 Feb 2023
  
  in Public
  
  This tool expects large numbers of variant sites in order to achieve decent modeling with the Gaussian mixture model. It's difficult to put a hard number on minimum requirements because it depends a lot on the quality of the data (clean, well-behaved data requires fewer sites because the clustering tends to be less noisy), but empirically we find that in humans, the procedure tends to work well enough with at least one whole genome or 30 exomes. Anything smaller than that scale is likely to run into difficulties, especially for the indel recalibration.
  
  VQSR needs at least one WGS sample or 30 WES samples before it can be run.
  
  GATK VQSR

Tags

VQSR

GATK

Annotators

shenzhongjun

URL

gatk.broadinstitute.org/hc/en-us/articles/360035531612-Variant-Quality-Score-Recalibration-VQSR-
Jul 2022
www.sohu.com www.sohu.com

STAR-Fusion: a closer look at fusion genes_reads

1
1. shenzhongjun 25 Jul 2022
  
  in Public
  
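  Reads are aligned to the reference genome with STAR, and junction reads (a single read that contains the breakpoint between the two fused genes) and spanning reads (read pairs in which R1 and R2 align to different genes) are selected as candidate fusion gene sequences.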
  
  How STAR-Fusion defines fusion-supporting reads
  
  fusion

Tags

fusion

Annotators

shenzhongjun

URL

sohu.com/a/426522668_100295124
www.ncbi.nlm.nih.gov www.ncbi.nlm.nih.gov

Comparison of Structural and Short Variants Detected by Linked-Read and Whole-Exome Sequencing in Multiple Myeloma

1
1. shenzhongjun 25 Jul 2022
  
  in Public
  
  Small variants were called using the previously established GATK protocol [48]. Briefly, raw sequencing reads were trimmed and filtered using the Trimmomatic software [54]. Paired-end reads passing processing were then aligned to the GRCh38 human reference genome using Burrows-Wheeler Aligner, duplicates were marked with Picard, and alignment quality was improved using the Genome Analysis Toolkit [51] local realigner and base quality score recalibrator. Short somatic variants were then called using MuTect2 [55]. The analysis protocol with version information of all tools and details of reference datasets used in analyses have been explained earlier [48]. Following variant calling, variants were annotated with Annovar [56] and variants not passing all MuTect2 filters, falling into intronic and intergenic regions, classified as synonymous or non-frameshift variation, with an ExAC [57], ESP [58], 1KG [59] minor allele frequency higher than 1%, with a variant calling quality less than 40, residing in sites covered by less than 10 reads, with variant allele frequency less than 2% or higher than 30%, and with SNV strand-odd-ratio higher than 3 or indel strand-odd-ratio higher than 11 were removed.
  
  A useful reference for WES SNV filtering.
  
  wes snv
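The post-calling filters listed in the quoted passage can be sketched as a single predicate. This is only an illustration of the stated thresholds, not the authors' pipeline: the `Variant` class, its field names, and the region and consequence labels are hypothetical, and a real workflow would read these values from an annotated VCF or Annovar table.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # All field names are hypothetical; they stand in for annotated VCF fields.
    passes_mutect2_filters: bool
    region: str                 # e.g. "exonic", "intronic", "intergenic"
    consequence: str            # e.g. "nonsynonymous", "synonymous", "nonframeshift"
    max_population_maf: float   # highest MAF across ExAC, ESP and 1KG
    qual: float                 # variant calling quality
    depth: int                  # reads covering the site
    vaf: float                  # variant allele frequency, 0..1
    strand_odds_ratio: float
    is_indel: bool


def keep_variant(v: Variant) -> bool:
    """Return True if the variant survives every filter named in the passage."""
    if not v.passes_mutect2_filters:
        return False
    if v.region in {"intronic", "intergenic"}:
        return False
    if v.consequence in {"synonymous", "nonframeshift"}:
        return False
    if v.max_population_maf > 0.01:          # population MAF higher than 1%
        return False
    if v.qual < 40:                          # calling quality less than 40
        return False
    if v.depth < 10:                         # covered by less than 10 reads
        return False
    if v.vaf < 0.02 or v.vaf > 0.30:         # VAF outside 2%..30%
        return False
    sor_limit = 11 if v.is_indel else 3      # indel vs SNV strand-odds-ratio cap
    if v.strand_odds_ratio > sor_limit:
        return False
    return True


example = Variant(True, "exonic", "nonsynonymous", 0.001, 60.0, 50, 0.10, 1.5, False)
kept = keep_variant(example)
```

Writing each threshold as its own early return keeps the mapping to the sentence in the paper easy to audit, at the cost of some verbosity.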

Tags

wes

snv

Annotators

shenzhongjun

URL

ncbi.nlm.nih.gov/pmc/articles/PMC7999337/
Nov 2021
gatk.broadinstitute.org gatk.broadinstitute.org

Base Quality Score Recalibration (BQSR)

1
1. shenzhongjun 24 Nov 2021
  
  in Public
  
  For example we can identify that, for a given run, whenever we called two A nucleotides in a row, the next base we called had a 1% higher rate of error. So any base call that comes after AA in a read should have its quality score reduced by 1%.
  
  Bases that follow a run of identical bases are more likely to be miscalled.

Annotators

shenzhongjun

URL

gatk.broadinstitute.org/hc/en-us/articles/360035890531-Base-Quality-Score-Recalibration-BQSR-
support.sentieon.com support.sentieon.com

2. Typical usage for DNAseq® — Sentieon 202010.04 documentation

1
1. shenzhongjun 08 Nov 2021
  
  in Public
  
  Base quality score recalibration (BQSR): This step modifies the quality scores assigned to individual read bases of the sequence read data. This action removes experimental biases caused by the sequencing methodology.
  
  Should this step be run on UMI data?

Annotators

shenzhongjun

URL

support.sentieon.com/manual/DNAseq_usage/dnaseq/
support.sentieon.com support.sentieon.com

Unique Molecular Identifiers — Sentieon Appnotes 202010.04 documentation

4
1. shenzhongjun 05 Nov 2021
  
  in Public
  
  -K $BWA_K_SIZE \
  
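  If this is not set, the default is derived automatically from the number of threads supplied. To keep results identical across thread counts, the official documentation recommends setting it to 10000000 (https://support.sentieon.com/manual/DNAseq_usage/dnaseq/#map-reads-to-reference). However, when I actually set it to 10000000, not a single read was aligned. The cause still needs to be investigated.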
2. shenzhongjun 04 Nov 2021
  
  in Public
  
  The --umi_post_process option is used to instruct the tool to perform the necessary post-processing of the consensus reads.
  
  The official documentation does not say which operations this actually performs. Worth asking about.
3. shenzhongjun 04 Nov 2021
  
  in Public
  
  util sort
  
  This tool supports multithreading; the official recommendation is to give it the same number of threads as bwa.
4. shenzhongjun 04 Nov 2021
  
  in Public
  
  identify errors raised during sample preparation.
  
  Errors introduced during sample preparation are detected by comparing the two strands.

Annotators

shenzhongjun

URL

support.sentieon.com/appnotes/umi/
Oct 2021
med.china.com.cn med.china.com.cn

Worth bookmarking | Ten must-know questions and answers about tumor genetic testing!__中国医疗

2
1. shenzhongjun 25 Oct 2021
  
  in Public
  
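  It is best to provide peripheral blood as a control, which ensures that every variant detected is specific to the tumor cells. Some companies instead filter against public SNP databases such as 1000g, ExAC_ALL, and gnomAD, but the vast majority of those SNP data come from non-Chinese populations, and SNP data for the Chinese population are very scarce. To make sure germline variants are filtered out accurately, it is still best to provide peripheral blood as a control.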
  
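  When calculating TMB it is best to provide a matched control: tumor mutations minus control mutations gives the somatic mutations, which is more accurate than filtering against databases built from foreign populations.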
2. shenzhongjun 25 Oct 2021
  
  in Public
  
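  For example, if the ctDNA input is 33 ng (10,000 copies) and the library conversion rate is 50%, that leaves 5,000 copies. However much the sequencing depth is increased, in theory at most 5,000 copies can be sequenced, so the LOD cannot reach 1 in 10,000.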
  
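  The fundamental constraints on the limit of detection are total DNA input and library conversion rate. 33 ng of DNA yields about 10,000 copies; at a 50% conversion rate, 5,000 copies remain. Extra sequencing depth can recover at most those 5,000 copies, so the best achievable LOD is 2 in 10,000 (1/5,000), and 1 in 10,000 is out of reach.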
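The arithmetic in this note can be written out as a short sketch. The figure of roughly 3.3 pg of DNA per haploid human genome copy is an assumption added here; it is consistent with the note's own "33 ng is about 10,000 copies". The function names are illustrative only.

```python
# Assumed mass of one haploid human genome copy, in picograms (not from the source).
PG_PER_HAPLOID_GENOME = 3.3


def genome_copies(input_ng: float, pg_per_copy: float = PG_PER_HAPLOID_GENOME) -> float:
    """Approximate number of haploid genome copies in a given DNA input."""
    return input_ng * 1000.0 / pg_per_copy


def best_case_lod(input_ng: float, conversion_rate: float) -> float:
    """Lowest allele fraction that could be seen even with unlimited depth.

    At least one mutant molecule must survive library construction, so the
    floor is 1 divided by the number of molecules that make it into the library.
    """
    if not 0.0 < conversion_rate <= 1.0:
        raise ValueError("conversion_rate must be in (0, 1]")
    usable_copies = genome_copies(input_ng) * conversion_rate
    return 1.0 / usable_copies


copies = genome_copies(33)        # about 10,000 copies
lod = best_case_lod(33, 0.5)      # about 2 in 10,000
```

This is a best-case floor: it ignores sampling noise and sequencing error, both of which push the practical LOD higher.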

Annotators

shenzhongjun

URL

med.china.com.cn/content/pid/237773/tid/1026
med.china.com.cn med.china.com.cn

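Annual feature: top journal CA publishes the latest cancer data: 10 million cancer deaths worldwide, with China accounting for more than 30%__中国医疗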

1
1. shenzhongjun 12 Oct 2021
  
  in Public
  
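  In 2020 China had 4.57 million new cancer cases. Breast cancer ranks first worldwide by number of new cases, but in China it ranks fourth, behind lung, colorectal, and stomach cancer.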
  
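  With a population of 1.4 billion, that is a cancer incidence of 0.326% (4.57 million / 1.4 billion). According to the Seventh National Population Census, China has 260 million people aged 60 and over, of whom 190 million are aged 65 and over.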

Annotators

shenzhongjun

URL

med.china.com.cn/content/pid/235506/tid/1026
Jul 2021
www.hedaoxueyuan.com www.hedaoxueyuan.com

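A Study of Huang Yuanji's Inner Alchemy (Neidan) Thought (Part 1) _Hedao Academy (合道学院), alchemical cultivation, introduction to cultivation, transmission of lost teachings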

1
1. shenzhongjun 07 Jul 2021
  
  in Public
  
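  In fact, the more important reason for placing Huang Yuanji's alchemical thought in the Middle School (中派) is his theory of the "Mysterious Pass" (玄关): all of his alchemical formulas pivot on the "single aperture of the Mysterious Pass" (玄关一窍). In the empty, still, and obscure state of the Mysterious Pass there is neither yin nor yang and yet both yin and yang, neither emptiness nor substance and yet both emptiness and substance; this zero point of the Middle Way is precisely the standard that all the classical schools upheld in common. In addition, in the Leyutang Yulu (《乐育堂语录》) Huang Yuanji pointed out something no earlier alchemical classic had articulated: the "living zi hour" of yang arising (阳生活子时) in the midst of everyday life and ordinary human relations. One dwells in the dust yet is beyond it, lives in the world without being clouded by it, and transcends real life without leaving it; every action is exactly right, following the heart's desire without ever overstepping the rules; whether walking, standing, sitting, or lying down, yang arises everywhere and at every moment, and this shore is itself the other shore. This view of yang arising reflects the core spirit of Chinese culture: hold to the mean according to the time, and act according to the mean. The living zi hour of yang arising was originally an alchemical phenomenon; Huang Yuanji integrated it with the "joy of Confucius and Yan Hui" (孔颜之乐), with the chaste women and martyrs who gave their lives for righteousness, and with the solitary radiance of a mind stripped of selfish desire. This idea overturned the narrow alchemical view that only sitting in stillness deep in the mountains can trigger the arising of yang, and its significance is extraordinary in the history of alchemy and indeed in the history of Chinese thought.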
  
  Brilliant! Brilliant!

Annotators

shenzhongjun

URL

hedaoxueyuan.com/wenhua/148.html
May 2021
satijalab.org satijalab.org

Seurat - Guided Clustering Tutorial

1
1. shenzhongjun 08 May 2021
  
  in Public
  
  For example, performing downstream analyses with only 5 PCs does significantly and adversely affect results
  
  Take note of this: surprisingly, using only 5 PCs is not enough.
  
  single-cell, single-cell transcriptomics

Tags

single-cell transcriptomics

single-cell

Annotators

shenzhongjun

URL

satijalab.org/seurat/articles/pbmc3k_tutorial.html
Apr 2021
academic.oup.com academic.oup.com

Quality control, imputation and analysis of genome-wide genotyping data from the Illumina HumanCoreExome microarray

3
1. shenzhongjun 26 Apr 2021
  
  in Public
  
  This protocol uses a window of 1500 variants, shifted by 10% for each new round of comparisons, and a threshold of R² > 0.2. The window size of 1500 variants corresponds to the large, high LD chromosome 8 inversion, while the shift of 10% represents a trade-off between efficiency and thoroughness
  
  I tested this: after LD pruning, no associated loci remained. Were the earlier hits possibly false positives?
  
  gwas
2. shenzhongjun 26 Apr 2021
  
  in Public
  
  It is necessary to remove rare variants from GWAS because the certainty of the genotype call is reduced by their low minor allele count. Even in common variants, however, genotyping and genotype recalling are subject to technical error, with the result that a proportion of variants and samples are of low quality, and should be removed from the analysis.
  
  So genotype calls for rare variants are not very reliable?
  
  gwas
3. shenzhongjun 26 Apr 2021
  
  in Public
  
  For the smallest studies, where fewer than 1000 individuals are investigated, a cut-off of 5% should be considered—this is in line with the analysis program GenABEL, for example, which uses a minor allele count of 5 as its cut-off [18].
  
  For fewer than 1,000 samples a MAF cut-off of 5% is recommended; test this when there is time.
  
  gwas
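The minor allele frequency cut-off discussed in these notes can be sketched as follows. This is a minimal illustration, assuming genotypes are coded as the count of alternate alleles (0, 1, or 2) with `None` for a missing call; the function names are hypothetical, and in practice this filter is normally applied with a tool such as PLINK rather than hand-written code.

```python
from typing import Optional, Sequence, Tuple


def minor_allele_stats(genotypes: Sequence[Optional[int]]) -> Tuple[float, int]:
    """Return (minor allele frequency, minor allele count) for one variant.

    Genotypes are alternate-allele counts (0, 1, 2); None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0, 0
    alt_count = sum(called)
    total_alleles = 2 * len(called)
    # The minor allele is whichever of ref/alt is rarer in this sample set.
    minor_count = min(alt_count, total_alleles - alt_count)
    return minor_count / total_alleles, minor_count


def passes_maf(genotypes: Sequence[Optional[int]], min_maf: float = 0.05) -> bool:
    """True if the variant's MAF is at or above the cut-off (5% by default)."""
    maf, _ = minor_allele_stats(genotypes)
    return maf >= min_maf


# 100 samples, 5 heterozygotes: MAF = 5 / 200 = 0.025, below a 5% cut-off.
rare_variant = [0] * 95 + [1] * 5
rare_passes = passes_maf(rare_variant)
```

Returning the minor allele count alongside the frequency makes it easy to apply a count-based cut-off instead, since the quoted passage mentions both.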

Tags

gwas

Annotators

shenzhongjun

URL

academic.oup.com/bfg/article/15/4/298/2412127
Mar 2021
www.ncbi.nlm.nih.gov www.ncbi.nlm.nih.gov

Data quality control in genetic case-control association studies

2
1. shenzhongjun 19 Mar 2021
  
  in Public
  
  The expectation is that IBD = 1 for duplicates or monozygotic twins, IBD = 0.5 for first-degree relatives, IBD = 0.25 for second-degree relatives and IBD = 0.125 for third-degree relatives. Due to genotyping error, LD and population structure there is often some variation around these theoretical values and it is typical to remove one individual from each pair with an IBD > 0.1875, which is halfway between the expected IBD for third- and second-degree relatives. For these same reasons an IBD > 0.98 identifies duplicates.
  
  IBD filtering criteria. The IBD value here is PI_HAT in the plink output.
  
  gwas qc
2. shenzhongjun 17 Mar 2021
  
  in Public
  
  The method works best when only independent SNPs are included in the analysis. To achieve this, regions of extended linkage disequilibrium (LD) (such as the HLA) are entirely removed from the dataset [8] and remaining regions are typically pruned so that no pair of SNPs within a given window (say, 50kb) is correlated (typically taken as r² > 0.2)
  
  LD filtering; in practice, though, very few loci remain after filtering and the association signal is very weak.
  
  gwas qc
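The relatedness thresholds quoted in the first note above (PI_HAT > 0.98 for duplicates, > 0.1875 for related pairs) can be sketched as a small filter. The input format, the labels, and the rule of dropping the second sample of each flagged pair are assumptions made for illustration; the source only says to remove one individual from each pair, and a common refinement is to drop the sample with the lower call rate.

```python
from typing import Iterable, Set, Tuple

DUPLICATE_THRESHOLD = 0.98    # PI_HAT above this: duplicate or monozygotic twin
RELATED_THRESHOLD = 0.1875    # halfway between third- and second-degree relatives


def classify_pair(pi_hat: float) -> str:
    """Label a sample pair from its PI_HAT value."""
    if pi_hat > DUPLICATE_THRESHOLD:
        return "duplicate_or_mz_twin"
    if pi_hat > RELATED_THRESHOLD:
        return "related"
    return "unrelated"


def samples_to_remove(pairs: Iterable[Tuple[str, str, float]]) -> Set[str]:
    """Pick one sample to drop from each flagged pair.

    pairs holds (sample_id_1, sample_id_2, pi_hat) tuples. A pair is skipped
    if one of its members has already been dropped. Dropping the second
    member is an arbitrary choice made for this sketch.
    """
    removed: Set[str] = set()
    for id1, id2, pi_hat in pairs:
        if classify_pair(pi_hat) == "unrelated":
            continue
        if id1 in removed or id2 in removed:
            continue
        removed.add(id2)
    return removed


pairs = [("A", "B", 0.5), ("B", "C", 0.25), ("D", "E", 0.05), ("F", "G", 0.99)]
dropped = samples_to_remove(pairs)
```

Because pairs are processed greedily in input order, the set of dropped samples can depend on that order; sorting pairs by PI_HAT first would make the result deterministic.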

Tags

gwas

qc

Annotators

shenzhongjun

URL

ncbi.nlm.nih.gov/pmc/articles/PMC3025522/
www.nature.com www.nature.com

A genome-wide association meta-analysis of prognostic outcomes following cognitive behavioural therapy in individuals with anxiety and depressive disorders

2
1. shenzhongjun 11 Mar 2021
  
  in Public
  
  Our primary aim was to generate a cohort large enough to examine the heritability of prognostic therapy outcomes. However, the meta-analysis estimate of SNP heritability was low and non-significant (h2SNP = 0.09, SE = 0.17). A sample size of 2724 has 80% and 99% power to detect a SNP-heritability of 33% and 50%, respectively94. To achieve 80% power to detect a heritability of 20%, a sample of 4500 individuals will be required. A meta-analysis of 2 799 individuals was sufficient to detect a significant heritability estimate for therapy outcome to antidepressant drugs (h2SNP = 0.42, SE = 0.18) and this was the first evidence of a genetic component for treatments outcome of any kind
  
  The statistical approach is worth learning from.
  
  gwas meta-analysis
2. shenzhongjun 11 Mar 2021
  
  in Public
  
  The meta-analysis sample (n = 2724) had 80% power to detect variants explaining 1.5% of the variance and 42% power to detect variants explaining 1% of the variance. Therefore, it is not especially surprising that we do not detect any variants at genome-wide significance. Typically, GWAS of psychological traits have required tens of thousands of participants to detect SNPs at genome-wide significance
  
  GWAS of psychiatric disorders is hard: even a meta-analysis of 2,724 samples came back negative.
  
  gwas meta-analysis

Tags

gwas

meta-analysis

Annotators

shenzhongjun

URL

nature.com/articles/s41398-019-0481-y

shenzhongjun

Annotations: 21

Joined: February 26, 2021
