35 Matching Annotations

Jun 2017
arxiv.org

1706.05137.pdf

1
1. colelyman 27 Jun 2017
  
  in Public
  
  adding these computational blocks never hurts performance, even on tasks they were not designed for
  
  It is interesting that domain-specific mechanisms never hurt performance in other domains. I wonder how performance would change if you applied these mechanisms independently on cross-domain tasks.

Annotators

colelyman

URL

arxiv.org/pdf/1706.05137.pdf
May 2017
arxiv.org

1701.08734.pdf

4
1. colelyman 16 May 2017
  
  in Public
  
  reset to their random initial values
  
  Are the values reset to the original random initial values, or to new random values?
2. colelyman 16 May 2017
  
  in Public
  
  A maximum of N distinct modules per layer are permitted in a pathway (typically N = 3 or 4).
  
  Why would this be beneficial? Obviously this limit prevents pathways from using entire layers, but how does that help the agents find a suitable pathway? Does this make the agent less localized?
  
  question
3. colelyman 16 May 2017
  
  in Public
  
  Async Advantage Actor-Critic (A3C
  
  Asynchronous Advantage Actor-Critic (A3C)
4. colelyman 15 May 2017
  
  in Public
  
  Agents are pathways (views) through the network which determine the subset of parameters that are used and updated by the forwards and backwards passes of the backpropagation algorithm
  
  Agents find pathways in the giant neural network that would be useful in other networks.

Tags

question

Annotators

colelyman

URL

arxiv.org/pdf/1701.08734.pdf
Mar 2017
www.fractal.org

Introduction to Fractal Geometry

2
1. colelyman 29 Mar 2017
  
  in Public
  
  many natural phenomena are better described using a dimension between two whole numbers
  
  This is a nice definition of non-integer dimension.
2. colelyman 29 Mar 2017
  
  in Public
  
  If you look carefully at a fern leaf, you will notice that every little leaf - part of the bigger one - has the same shape as the whole fern leaf. You can say that the fern leaf is self-similar. The same is with fractals: you can magnify them many times and after every step you will see the same shape, which is characteristic of that particular fractal.
  
  This is a nice example of self-similarity.
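The non-integer dimension noted in the first annotation above can be made concrete with the similarity dimension, D = log(N) / log(s), for a shape built from N copies of itself, each shrunk by a factor s. The sketch below is only an illustration: the Koch curve is an assumed example that does not come from the annotated page, and the function name is hypothetical.

```python
import math


def similarity_dimension(copies: int, scale: float) -> float:
    """Similarity dimension of a self-similar shape made of `copies`
    pieces, each shrunk by the factor `scale` relative to the whole."""
    return math.log(copies) / math.log(scale)


# A line segment splits into 2 copies at half size: dimension 1.
line = similarity_dimension(2, 2)

# Koch curve (assumed example): 4 copies, each one third the size.
koch = similarity_dimension(4, 3)

print(line, round(koch, 4))  # 1.0 1.2619
```

The Koch curve's value falls between 1 and 2, which is the kind of "dimension between two whole numbers" the highlighted passage describes.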

Annotators

colelyman

URL

fractal.org/Bewustzijns-Besturings-Model/Fractals-Useful-Beauty.htm
www.nature.com

How does multiple testing correction work?

1
1. colelyman 20 Mar 2017
  
  in Public
  
  This probability—the probability that a score at least as large as the observed score would occur in data drawn according to the null hypothesis—is called the P-value.
  
  A good description of p-value.
  
  definition
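The P-value definition quoted above can be sketched as an empirical estimate: score many datasets drawn under the null hypothesis and count how often the score is at least as large as the observed one. The scores below are made up for illustration and the function name is hypothetical; this is not the procedure from the annotated article.

```python
def empirical_p_value(observed: float, null_scores: list[float]) -> float:
    """Fraction of null-hypothesis scores at least as large as `observed`."""
    if not null_scores:
        raise ValueError("need at least one null score")
    hits = sum(1 for score in null_scores if score >= observed)
    return hits / len(null_scores)


# Made-up scores standing in for data drawn under the null hypothesis.
null_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.05, 0.6, 0.3, 0.9, 0.15]

p = empirical_p_value(0.7, null_scores)
print(p)  # 0.2, because 2 of the 10 null scores are >= 0.7
```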

Tags

definition

Annotators

colelyman

URL

nature.com/articles/nbt1209-1135
Feb 2017
Local file

Tidy data

1
1. colelyman 20 Feb 2017
  
  in Public
  
  Like families, tidy datasets are all alike but every messy dataset is messy in its own way.
  
  There are many ways to do something wrong, but only one way to do it right! A standard defines the single correct form, and every departure from it is incorrect in its own way.
Annotators

colelyman
bmcgenomics.biomedcentral.com

An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data

1
1. colelyman 17 Feb 2017
  
  in Public
  
  https://sourceforge.net/projects/aaf-phylogeny/
  
  The code is updated on GitHub at this repository.

Annotators

colelyman

URL

bmcgenomics.biomedcentral.com/articles/10.1186/s12864-015-1647-5
goldenhelix.com

GWAS_e-book.pdf

6
1. colelyman 07 Feb 2017
  
  in Public
  
  Very large sample sizes may be required to achieve such significance levels, especially for rare disease alleles and alleles with small effect sizes
  
  The irony is that the largest sample sizes are needed for exactly the cases that are hardest to sample: rare disease alleles and alleles with small effect sizes.
2. colelyman 07 Feb 2017
  
  in Public
  
  Manhattan Plot
  
  It is helpful to note what the axes of a Manhattan Plot represent. As stated, the x-axis is the locus (location of a nucleotide) of the SNP in the genome, and the y-axis is the negative base-10 logarithm of the p-value. This means that the highest points in the Manhattan Plot have the lowest p-values, and are therefore the most statistically significant SNPs.
3. colelyman 07 Feb 2017
  
  in Public
  
  GLM
  
  General Linear Model (GLM)- a statistical linear model that generalizes multiple linear regression to allow multiple dependent variables.
  
  definition
4. colelyman 07 Feb 2017
  
  in Public
  
  ANOVA
  
  ANOVA compares one nominal variable with one measurement variable. In this case the measurement variable would be the genotype of an individual, and the nominal variable is whether that individual is a case or a control.
5. colelyman 07 Feb 2017
  
  in Public
  
  consanguinity
  
  Consanguinity- the property of being related by blood to someone, in essence sharing a common ancestor with them.
  
  definition
6. colelyman 07 Feb 2017
  
  in Public
  
  The specific set of alleles observed together on a single chromosome, or part of a chromosome, is called a haplotype.
  
  This is a good definition of a haplotype.
  
  definition
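For the Manhattan Plot annotation above, the y-axis transform can be sketched in a few lines: each SNP's height is the negative base-10 logarithm of its p-value, so smaller p-values plot higher. The SNP names and p-values below are hypothetical, and the function name is illustrative only.

```python
import math


def manhattan_height(p_value: float) -> float:
    """Height of a SNP on a Manhattan plot: -log10 of its p-value."""
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p_value)


# Hypothetical SNPs with made-up association p-values.
snps = {"snp_a": 0.05, "snp_b": 1e-8, "snp_c": 0.5}

heights = {name: manhattan_height(p) for name, p in snps.items()}

# The tallest point is the SNP with the smallest p-value.
top = max(heights, key=heights.get)
print(top, heights)
```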

Tags

definition

Annotators

colelyman

URL

goldenhelix.com/media/pdfs/ebooks/GWAS_e-book.pdf
bmcbioinformatics.biomedcentral.com

Computational algorithms to predict Gene Ontology annotations

12
1. colelyman 01 Feb 2017
  
  in Public
  
  Method
  
  It seems that no weighting scheme/method consistently performs best for every organism and every validation criterion, except perhaps SIM-ATN.
2. colelyman 01 Feb 2017
  
  in Public
  
  pLSA performances are always improved by the NTN schema
  
  This is not the case for the Bos taurus or the Danio rerio cmp groups.
3. colelyman 01 Feb 2017
  
  in Public
  
  APrate
  
  APrate- Annotation Predicted, equivalent to a false positive.
4. colelyman 01 Feb 2017
  
  in Public
  
  ACrate
  
  ACrate- Annotation Confirmed, equivalent to a true positive.
5. colelyman 01 Feb 2017
  
  in Public
  
  Receiver Operating Characteristic (ROC) curves
  
  Receiver Operating Characteristic (ROC) curves- these curves plot the true positive rate against the false positive rate as the classification threshold varies.
  
  definition
6. colelyman 01 Feb 2017
  
  in Public
  
  −102,118
  
  Why were there 102,118 fewer annotations after 4 years?
7. colelyman 01 Feb 2017
  
  in Public
  
  the new values for P(f|t) as:
  
  missing equation
  
  correction
8. colelyman 01 Feb 2017
  
  in Public
  
  we can interpret each of those vectors as multinomial distributions of probabilities over the set of topics
  
  I think that this is very similar to a softmax layer in a neural network.
9. colelyman 01 Feb 2017
  
  in Public
  
  overcome this issue, by adding a gene clustering step and defining a specific model for each cluster,
  
  I still see a limitation for genes that have few or no annotations. How can you cluster a gene based on its annotation terms if it has none? What if a gene has only a few terms?
10. colelyman 01 Feb 2017
  
  in Public
  
  orthonormal
  
  Orthonormal matrix- a square matrix whose rows are mutually orthogonal unit vectors, and whose columns are as well.
  
  Orthogonal- when two vectors are perpendicular.
  
  Unit vector- a vector of length one.
  
  definition
11. colelyman 01 Feb 2017
  
  in Public
  
  For each function term f it provides an estimation of the importance of an annotation to that term, decreasing the relevance of the annotations to common terms, such as the ones close to the ontology root
  
  The inverse gene frequency (IGF) function quantifies how much an annotation to a term should count: annotations to common terms, such as those close to the ontology root, count for less.
  
  Basically, the terms that are more specific are more important than the general terms.
12. colelyman 01 Feb 2017
  
  in Public
  
  the relevance of a function term for a given gene is proportional to the number of descendant of that terms that are annotated to the gene and (b) if a term is rare (i.e. it is annotated only to a small subset of G), it is a better discriminator among the set of genes than common function terms
  
  This is the basis for the weighted matrix:
  
  The relevance of a function term increases as there are more descendant terms annotated to the gene.
  
  The rarer a term is, the better it discriminates among genes.
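The rarity idea in the last two annotations can be sketched by analogy with inverse document frequency from text mining. The formula used here, log(number of genes / number of genes annotated to the term), is an assumption made for illustration and may differ from the paper's exact IGF definition. The gene and term names are hypothetical.

```python
import math


def inverse_gene_frequency(annotations: dict[str, set[str]]) -> dict[str, float]:
    """Weight each term by log(total genes / genes annotated to the term).

    Assumed IDF-style formula, not necessarily the paper's definition.
    """
    n_genes = len(annotations)
    counts: dict[str, int] = {}
    for terms in annotations.values():
        for term in terms:
            counts[term] = counts.get(term, 0) + 1
    return {term: math.log(n_genes / count) for term, count in counts.items()}


# Hypothetical gene-to-term annotations.
annotations = {
    "gene1": {"root_term", "specific_term"},
    "gene2": {"root_term"},
    "gene3": {"root_term", "mid_term"},
    "gene4": {"root_term", "mid_term"},
}

igf = inverse_gene_frequency(annotations)
print(igf)
```

A term annotated to every gene gets weight 0, while a term annotated to a single gene gets the largest weight, which matches the note that rare terms discriminate better.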

Tags

definition

correction

Annotators

colelyman

URL

bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-16-S6-S4
Jan 2017
www.biostathandbook.com

Correlation and linear regression - Handbook of Biological Statistics

2
1. colelyman 25 Jan 2017
  
  in Public
  
  There are three things you can do with this kind of data.
  
  Hypothesis test: use a t-test or something similar to see whether the data support your hypothesis.
  
  Find how tightly the variables are associated: calculate r. The closer |r| is to 1, the stronger the relationship between the two variables (a.k.a. a large |r| means that one can accurately predict one variable from the other).
  
  Determine the equation of a line that generalizes the data, allowing for predictions given only one variable.
2. colelyman 25 Jan 2017
  
  in Public
  
  Graph of my pulse rate vs. speed on an elliptical exercise machine.
  
  I believe that the labels for the x- and y-axes should be switched.
  
  correction
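The second and third uses listed in the first annotation above (measuring association with r, and fitting a line for prediction) can be sketched with ordinary least squares. The speed and pulse values below are made up for illustration and are not the handbook's data; the function name is hypothetical.

```python
import math


def pearson_r_and_line(xs: list[float], ys: list[float]) -> tuple[float, float, float]:
    """Return (r, slope, intercept) for the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Made-up measurements: machine speed (x) and pulse rate (y).
speed = [0, 1, 2, 3, 4]
pulse = [60, 70, 82, 89, 101]

r, slope, intercept = pearson_r_and_line(speed, pulse)

# Predict the pulse rate at a speed that was not measured.
predicted = slope * 2.5 + intercept
print(round(r, 4), slope, intercept, predicted)
```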

Tags

correction

Annotators

colelyman

URL

biostathandbook.com/linearregression.html
neuralnetworksanddeeplearning.com

Neural Networks and Deep Learning

1
1. colelyman 19 Jan 2017
  
  in Public
  
  Their most successful network had hidden layers containing 2,500, 2,000, 1,500, 1,000, and 500 neurons, respectively. They used ideas similar to Simard et al to expand their training data. But apart from that, they used few other tricks, including no convolutional layers: it was a plain, vanilla network, of the kind that, with enough patience, could have been trained in the 1980s
  
  I find it interesting that even simple neural networks can achieve good results. However, I'm not sure that this network would be considered simple: with so many parameters, it is hardly a simple function.
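To put a rough number on "so many parameters", the sketch below counts the weights and biases of a fully connected network with the hidden layers quoted above. The 784 inputs and 10 outputs are assumptions (28x28 MNIST images and ten digit classes) and are not stated in the highlighted passage.

```python
def count_parameters(layer_sizes: list[int]) -> int:
    """Weights plus biases for a fully connected network with these layer sizes."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Assumed input (784) and output (10) sizes; hidden sizes are from the quote.
layer_sizes = [784, 2500, 2000, 1500, 1000, 500, 10]

total = count_parameters(layer_sizes)
print(f"{total:,}")  # 11,972,510
```

Under those assumptions the "plain, vanilla" network has roughly 12 million trainable parameters.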

Annotators

colelyman

URL

neuralnetworksanddeeplearning.com/chap6.html
www.biostathandbook.com

Kinds of variables - Handbook of Biological Statistics

3
1. colelyman 18 Jan 2017
  
  in Public
  
  Personally, I don't see how treating values of a Likert item as a measurement variable will cause any statistical problems.
  
  I agree that there wouldn't be any problems, but one issue could be in comparing studies that use different intervals for their Likert items.
2. colelyman 18 Jan 2017
  
  in Public
  
  Converting measurement variables to nominal variables
  
  This is an important concept in data science: binning measured values into categories generalizes the data and gives you a high-level picture of it. It is important to keep the original measured values so that you can do more precise testing later.
3. colelyman 18 Jan 2017
  
  in Public
  
  You might plot 52.3% on a graph as a simple way of summarizing the data, but you should use the 34 female and 31 male numbers in all statistical tests.
  
  The best way to display data can differ fundamentally from the best way to analyze it statistically. Keep in mind to use raw values, as opposed to percentages, averages, etc., for statistical tests.
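The point about raw counts can be illustrated with an exact two-sided binomial test. The 34 and 31 counts come from the quoted passage; the null hypothesis of equal proportions, the tenfold-larger sample, and the function name are assumptions added for illustration. Both samples have the same 52.3% share, yet they give different p-values, which is why the percentage alone is not enough.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed one."""
    probs = [
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = probs[successes]
    # Small tolerance so that ties are not lost to floating-point rounding.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


# Counts from the quoted passage: 34 females out of 65 individuals.
p_small = binomial_two_sided_p(34, 65)

# Hypothetical sample ten times larger with the same percentage.
p_large = binomial_two_sided_p(340, 650)

print(p_small, p_large)
```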

Annotators

colelyman

URL

biostathandbook.com/variabletypes.html
genome.cshlp.org

OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes

1
1. colelyman 17 Jan 2017
  
  in Public
  
  orthologous groups
  
  Orthologous groups are groups of genes that descended from a common ancestral gene. Orthologous genes typically have the same (or similar) function, but may vary in sequence.
  
  definition

Tags

definition

Annotators

colelyman

URL

genome.cshlp.org/content/13/9/2178.full

Cole Lyman

Annotations: 35

Joined: January 17, 2017

Link: colelyman.com

ORCID: 0000-0001-7921-165X
