Hypothesis

31 Matching Annotations

Mar 2025
www.biorxiv.org

Systematic comparison of Generative AI-Protein Models reveals fundamental differences between structural and sequence-based approaches

1
1. ryanayork 31 Mar 2025
  
  in Arcadia Science
  
  Monomers generated through structural diffusion appear to only occupy a small region in comparison to both the UniRef50 and PISCES sequences, whereas generative models of sequences appear to more evenly populate the space of similar length natural proteins.
  
  Taxonomic biases of the training data likely also play an important role here. The data sources aren't equal in how they've sampled the protein universe. This is especially apparent when comparing the structure and sequence databases. For example, certain taxa (e.g., humans) are overrepresented in the PDB, while others dominate UniRef.
  
  It's not hard to imagine how the distribution differences in the t-SNE might reflect this, especially given the strong overlap of sequence-based methods with the UniRef samples. Do you know if the same is true for the structure-based methods? If you visualized where, say, PDB proteins are, would there be strong overlap?
  
  Any ideas on how to disentangle approach from the taxonomic makeup of training data?

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2025.03.23.644844v1
Feb 2025
www.biorxiv.org

Brain size dependent speciation and extinction rates in birds and the cognitive buffer hypothesis

1
1. ryanayork 07 Feb 2025
  
  in Arcadia Science
  
  Passeriformes (perching birds) and non-passeriforms show distinct relative brain size-dependent diversification patterns when fitting BiQuaSSE models, which allow both groups to have different speciation and extinction rates (Fig. 1D-G).
  
  Did you perform any comparisons other than Passeriformes vs. not?
  
  An examination of the 3 parameters (speciation, extinction, diversification) as a function of all taxonomic comparisons seems like a useful and potentially more agnostic analysis. More generally, I wonder about the extent to which taxonomic coarseness might influence sensitivity to detecting "cognitive buffer" over "behavioral drive." Might there be smaller clades of large-brained species that defy the macro-level extinction trends?

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2025.02.04.634049v1.full.pdf
www.biorxiv.org

Protein codes promote selective subcellular compartmentalization

1
1. ryanayork 07 Feb 2025
  
  in Arcadia Science
  
  The area under the receiver operator curve (AUC-ROC) showed that protein compartments could be predicted with remarkable accuracy (0.83-0.95) across the 12 different compartments (Fig. 1D).
  
  ESM2 performance can be sensitive to the makeup of training data used (e.g. https://www.biorxiv.org/content/10.1101/2024.03.07.584001v1.abstract). Specifically, class biases in training data can be recapitulated in generated sequences.
  
  Given that AUC-ROC varies as a function of compartment type (Fig 1D) and the compartments themselves are associated with diverse input sequence numbers (Fig 1B), I wonder if you examined possible biases in ProtGPS's behavior? Does ProtGPS more readily generate sequences that are suited for certain compartments than others? Is this explainable by the statistical distribution of the training data?

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2024.04.15.589616v2
Dec 2024
www.biorxiv.org

Fast evolutionary turnover and overlapping variances of sex-biased gene expression patterns defy a simple binary classification of sexes

1
1. ryanayork 06 Dec 2024
  
  in Arcadia Science
  
  Nine age-matched adult females and adult males each were chosen from each of the four taxa, 72 individuals are included in total in the overall analysis. As somatic organs we included brain (whole brain), heart, liver (left medial lobe), kidney (right) and mammary gland (fourth, right). Note that the mammary glands in mice have similar sizes in both sexes before lactation and are therefore directly comparable.
  
  There's some evidence of sex-specific cell type heterogeneity in organs (e.g. https://pmc.ncbi.nlm.nih.gov/articles/PMC10210449/; https://pmc.ncbi.nlm.nih.gov/articles/PMC7615307/#S1). It seems possible that consistent sex-specific organ heterogeneity might be another explanation for the patterns you see and, if present, could change interpretations/conclusions. E.g., sex-biased differences could arise from cell number variation rather than intrinsic transcriptional differences. How much of a concern is that here?

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2024.05.22.595301v2
www.biorxiv.org

Adaptive cellular evolution in the intestinal tracts of hyperdiverse African cichlid fishes

1
1. ryanayork 05 Dec 2024
  
  in Arcadia Science
  
  46.4% - 86.2% of sequencing reads (mean 67.5%) mapped confidently to the reference genome, and 32.5% - 75.6% (mean 52.8%) mapped confidently to the reference transcriptome (Table S2). We obtained a cell / gene count matrix for each sample, which consisted of 1,133-8,226 cells (mean 4,498 cells), with a means of 33,058 reads, 2,255 UMI counts, and 713 detected genes (Figure S1A-C). In total, we detected in each sample between 20,877 and 26,535 genes (mean 24,195 genes).
  
  Does mapping percentage or gene count vary with phylogeny and/or ecology? Put differently, is there any reason to worry that technical variation here might influence your sensitivity for detecting cell type abundance, especially given the low number of replicates per species?

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2024.11.28.625862v1
Oct 2024
www.biorxiv.org

Evolutionary trends in the emergence of skeletal cell types

1
1. ryanayork 04 Oct 2024
  
  in Arcadia Science
  
  The phylostratigraphy map of M. musculus and D. rerio was constructed by comparing 22,769 M. musculus and 25,787 D. rerio protein sequences with the protein sequence database by blastp algorithm V2.9.0 with a 10^-3 e-value threshold [101].
  
  Can you expand on why you chose blastp? There are a number of other (likely more sensitive) alignment methods. Given that many of the analyses in this manuscript rely on specific assumptions with respect to evolutionary age, it seems that identifying the most accurate approach possible would be useful.
  
  Also, why use this specific e-value threshold for all proteins? Proteins often vary in e-value distributions due to differences in sequence length/composition, evolutionary history, etc. Methods that account for this (e.g. OrthoFinder) might be worth exploring.
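
  A minimal sketch of why a single cutoff treats proteins unevenly. Under the standard Karlin-Altschul statistics that BLAST uses, the expected number of chance hits for a normalized bit score S' is E = m * n * 2^(-S'), where m is the query length and n the database length. The query lengths, database length, and bit score below are hypothetical, and the sketch ignores BLAST's effective-length corrections.

```python
def evalue_from_bitscore(query_len: int, db_len: int, bit_score: float) -> float:
    """Expected number of chance hits: E = m * n * 2^(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


DB_LEN = 10**9        # hypothetical total database length (residues)
THRESHOLD = 1e-3      # the fixed cutoff quoted from the manuscript
BIT_SCORE = 50.0      # hypothetical alignment score, held constant

for query_len in (100, 400, 1600):
    e = evalue_from_bitscore(query_len, DB_LEN, BIT_SCORE)
    # The same bit score passes for short queries and fails for long ones.
    print(query_len, f"{e:.2e}", "pass" if e <= THRESHOLD else "fail")
```

  With these illustrative numbers, the same alignment score clears the threshold for the shorter queries but not for the longest one, which is one concrete way length alone can shift inferred gene age.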

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2024.09.26.615131v1
Sep 2024
www.biorxiv.org

Description of a novel extremophile green algae, Chlamydomonas pacifica, and its potential as a biotechnology host

1
1. ryanayork 06 Sep 2024
  
  in Arcadia Science
  
  The ability to move strategically allows these algae to seek desirable niches for growth and survival, especially in extreme habitats where resources are scarce or conditions are rapidly changing.
  
  Is C. pacifica's capacity to live in higher salinity environments accompanied by variation in its motility patterns relative to non-extremophiles? Given that the Reynolds number varies with salinity, it might be enlightening to measure C. pacifica's speed distribution at different salinity concentrations. I wonder if these experiments might uncover more interesting axes of diversity within C. pacifica that differentiate it from species like C. reinhardtii.
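
  For context, the Reynolds number is Re = ρ v L / μ (fluid density × speed × characteristic length / dynamic viscosity). The sketch below only illustrates how salinity enters through ρ and μ. All values are rough, assumed numbers (approximate fresh and sea water properties near 20 °C, a ~10 μm cell swimming at ~100 μm/s), not measurements of C. pacifica.

```python
def reynolds_number(density: float, speed: float, length: float, viscosity: float) -> float:
    """Re = rho * v * L / mu, all quantities in SI units."""
    return density * speed * length / viscosity


# Assumed, approximate fluid properties near 20 degrees C (illustrative only).
MEDIA = {
    "fresh water": {"density": 998.0, "viscosity": 1.00e-3},
    "sea water":   {"density": 1025.0, "viscosity": 1.08e-3},
}

CELL_LENGTH = 10e-6   # m, hypothetical cell diameter
SWIM_SPEED = 100e-6   # m/s, hypothetical swimming speed

for name, props in MEDIA.items():
    re = reynolds_number(props["density"], SWIM_SPEED, CELL_LENGTH, props["viscosity"])
    print(f"{name}: Re = {re:.2e}")
```

  Under these assumptions Re stays far below 1 in both media, so any salinity effect would show up as a modest shift within the viscous regime rather than a change of regime. Measured speed distributions would be needed to say more.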

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2024.09.03.611117v1
Aug 2024
www.ncbi.nlm.nih.gov

Rapid protein evolution by few-shot learning with a protein language model

1
1. ryanayork 02 Aug 2024
  
  in Arcadia Science
  
  A schematic showing the evolution of higher activity variants with EVOLVEpro. The mutagenesis landscape of proteins is often conceptualized as a complex terrain with numerous potential paths. Shown here is a gray road that conceptualizes the protein mutagenesis landscape where traversing upwards results in higher protein activity and traversing downwards reduces protein fitness. Traditional frameworks of evolutionary plausibility attempt to navigate this terrain based on natural selection, which is constrained by historical and environmental factors.
  
  In the manuscript, "fitness" generally refers to landscapes learned by pLMs. However, at other times, it is used to describe the actual landscapes traversed by evolution (via processes like natural selection). Given the limitations of pLMs - including those you cover in the introduction - it feels dangerous to conflate these two. It is far from established that language models are able to infer the true structure of evolutionary processes, much less model the complex activities of natural selection.
  
  This feels important to note since the discontinuity between fitness and trait distributions has been recognized for a long time (e.g. Fisher 1930). Many factors contribute to this relationship, both at the individual gene/protein level and at the level of genetic interactions. It is likely that variation in relationships between pLM fitness/activity will also be affected by multiple such factors (as evidenced by the differences observed even here across the 5 proteins of focus). It is also likely that these will at least somewhat differ from the factors influencing empirical fitness landscapes. Delineating these differences clearly seems to be useful for future model development/refinement.

Annotators

ryanayork

URL

ncbi.nlm.nih.gov/pmc/articles/PMC11275896/
Jul 2024
www.biorxiv.org

Design of highly functional genome editors by modeling the universe of CRISPR-Cas sequences

1
1. ryanayork 12 Jul 2024
  
  in Arcadia Science
  
  Strikingly, the resulting landscape was dominated by generated proteins, which comprised 94.1% of the total phylogenetic diversity (as measured by cumulative branch length) and resulted in a 10.3-fold increase in diversity relative to the entire CRISPR-Cas Atlas (Fig. 2b). Novel phylogenetic groups were distributed across the tree, suggesting that the model has captured the full diversity of Cas9 and is not overfitting to any particular lineage.
  
  I find it hard to interpret the importance of these results without more context.
  
  For example, how surprising is it to see this enrichment given the initial n of natural and generated proteins?
  
  How might decisions with respect to tree construction affect the branch length distribution? It seems possible that you would get a different outcome if you varied the mmseqs parameters or implemented different criteria for choosing representative proteins.
  
  Furthermore - though novel phylogenetic groups are distributed throughout the tree - it would be interesting to know if the overall distribution across clades is predicted by the abundance of natural proteins across the tree. I.e. do clades with more natural proteins in the training data tend to produce more generated proteins?
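
  One simple way to ask the last question is a rank correlation between per-clade counts of natural and generated proteins. The sketch below implements Spearman's correlation with average ranks for ties. The clade counts are made-up placeholders, not values from the manuscript.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-clade counts (placeholders for illustration).
natural_per_clade = [12, 40, 7, 95, 23]
generated_per_clade = [30, 110, 15, 400, 60]
print(spearman(natural_per_clade, generated_per_clade))
```

  A rho near 1 would suggest that generated diversity largely tracks training abundance. A weak correlation would be more consistent with the model populating clades independently of how well they were sampled.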

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2024.04.22.590591v1
www.biorxiv.org

Estimates of molecular convergence reveal pleiotropic genes underlying adaptive variation across teleost fish

1
1. ryanayork 01 Jul 2024
  
  in Arcadia Science
  
  We ran the analysis using a rooted time-calibrated species tree obtained from timetree.org 33.
  
  What was the rationale for using a tree from timetree.org as opposed to inferring one from the gene families?
  
  I imagine that comparing the effects of using timetree vs. an inferred tree on CSUBST outputs would be enlightening. Such a comparison could be an empirical way to assess the effects of topological error in this data set (and would be a nice complement to some of the analyses in Fukushima and Pollock).

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2024.06.24.600426v1
May 2024
www.biorxiv.org

Protein language models are biased by unequal sequence sampling across the tree of life

2
1. ryanayork 10 May 2024
  
  in Arcadia Science
  
  where n_j is the raw sequence count for species j, d(i, j) is the time to last common ancestor between species i and j collected from the TimeTree of Life resource (Kumar et al., 2022), and α ∈ R≥0 is a hyperparameter used to scale d appropriately. Under the assumption that mutations occur at a fixed rate, [inline equation] gives the expected overlap in sequence between two species’ orthologs, to approximate the effective sequence counts they contribute to each other.
  
  It's great that even with the use of fixed rates you see a substantial increase in fraction of bias explained. Since mutation rates obviously do vary, I wonder just how much better you might do using a model that doesn't explicitly fix them...
2. ryanayork 10 May 2024
  
  in Arcadia Science
  
  Under the assumption that mutations occur at a fixed rate, [inline equation] gives the expected overlap in sequence between two species’ orthologs, to approximate the effective sequence counts they contribute to each other.
  
  What does it look like if you just use the branch lengths from the phylogeny to do this weighting? I would guess you get at least some increase in the Spearman correlations and it's a straightforward approach.

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2024.03.07.584001v1
Mar 2024
www.biorxiv.org

Behavioral sequences across multiple animal species in the wild share common structural features

1
1. ryanayork 12 Mar 2024
  
  in Arcadia Science
  
  For each behavior, all individuals seem to exhibit very similar bout duration distributions.
  
  It is hard not to notice that the distributions for certain states (e.g. Meerkat vigilant/resting state) are noisier than others. It would be interesting to see a comparison of the variance of these distributions as a function of species and/or state to see if the claim in this sentence is statistically supported.
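
  A minimal sketch of one way to test this, assuming bout durations are available per individual or per state: a permutation test on the log ratio of sample variances, after centering each group on its median so that differences in location do not masquerade as differences in spread. The two groups below are synthetic placeholders, and the function names are hypothetical.

```python
import math
import random
import statistics


def centered(values):
    """Subtract the group median so only spread differs between groups."""
    med = statistics.median(values)
    return [v - med for v in values]


def log_variance_ratio(a, b):
    return math.log(statistics.variance(a) / statistics.variance(b))


def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in variance."""
    rng = random.Random(seed)
    a, b = centered(a), centered(b)
    observed = abs(log_variance_ratio(a, b))
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(log_variance_ratio(perm_a, perm_b)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic placeholder data: a tight and a noisy duration distribution.
tight_state = [i * 0.1 for i in range(-10, 11)]
noisy_state = [i * 1.0 for i in range(-10, 11)]
print(permutation_pvalue(tight_state, noisy_state))
```

  The same comparison could be run for every pair of individuals within a state, with a multiple-testing correction, to see whether "very similar" holds uniformly or only for the less noisy states.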

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2024.01.20.576411v3
Feb 2024
www.biorxiv.org

Evolution of novel sensory organs in fish with legs

2
1. ryanayork 02 Feb 2024
  
  in Arcadia Science
  
  We hypothesize that sea robins initially developed fin ray-like legs for locomotion. Ancestral organs then evolved limited sensory capability to facilitate manipulation of the visible substrate in search of food. Finally, evolution of sensory papillae further specialized legs to localize and uncover buried prey.
  
  How much history/ecological data are available for these species? It could be interesting to pair the phylogenetic patterns with other trait data to explicitly test different evolutionary hypotheses, e.g. is there a relationship with prey type? Substrate? Depth? Biotic diversity?
2. ryanayork 02 Feb 2024
  
  in Arcadia Science
  
  To test this ability, we developed a simple behavioral assay in which sea robins (Prionotus carolinus) were housed in a controlled tank with either mussels or capsules containing crude or filtered mussel extract buried in sand without visual cues (Fig. 1a, b, Supplementary movie 1). Sea robins alternated between short bouts of swimming and walking (Fig. 1b) and appeared to “scratch” at the sand surface with their legs while walking, which we hypothesized represented sensory behavior.
  
  Do these behaviors vary at all as a function of what prey are used? I'm guessing you tested squid and crabs with P. carolinus as you did with P. evolans?
  
  Presumably motile prey (squid/crabs) would give off a different set of cues than less/non-motile prey (mussels)? Specifically, I wonder if there is a tradeoff between chemo- and mechanosensation that is dependent on the amount of movement? Examining this relationship could be a potential route into the neural computations underlying digging behavior...

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.10.14.562285v1
Jan 2024
www.biorxiv.org

Single-cell eQTL mapping in yeast reveals a tradeoff between growth and reproduction

1
1. ryanayork 05 Jan 2024
  
  in Arcadia Science
  
  We classified individual haploid yeast cells into five different cell cycle stages (M/G1, G1, G1/S, S, G2/M) via unsupervised clustering of the expression of 787 cell-cycle-regulated genes30 in combination with 22 cell-cycle-informative marker genes (Figures 1B, S2 and S3)
  
  How sparse is this matrix? Given an average of ~1,500 UMIs and ~800 cell-cycle genes, I'm assuming the distribution of expression for the cell-cycle genes is quite distributed/uneven across the cells?
  
  If very uneven, I wonder if some of the cell cycle designations might be driven by sparsity as opposed to canonical expression signatures associated with each stage? One way to parse this out might be to look at the loadings of the PCs used as input to clustering/UMAP/etc. Do any show signatures of extreme sparsity (e.g. binary expression of only one or several genes)?
  
  More broadly, it might be helpful to report the average # of cell-cycle genes detected in each cell.
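
  The two summary numbers asked about here are cheap to compute. A minimal sketch, assuming a cells × genes UMI count matrix restricted to the cell-cycle genes; the toy matrix is a placeholder, not data from the manuscript.

```python
def sparsity(matrix):
    """Fraction of entries in a cells x genes count matrix that are zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for count in row if count == 0)
    return zeros / total


def genes_detected_per_cell(matrix):
    """Number of genes with at least one UMI in each cell (row)."""
    return [sum(1 for count in row if count > 0) for row in matrix]


# Toy placeholder: 4 cells x 4 cell-cycle genes.
toy_counts = [
    [0, 2, 0, 0],
    [1, 0, 0, 3],
    [0, 0, 0, 0],
    [4, 1, 0, 0],
]

detected = genes_detected_per_cell(toy_counts)
print("sparsity:", sparsity(toy_counts))
print("mean genes detected per cell:", sum(detected) / len(detected))
```

  Reporting these per cell-cycle cluster, rather than only overall, would show directly whether any stage assignment coincides with unusually low detection.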

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.12.07.570640v2
Dec 2023
www.biorxiv.org

Contextual and Combinatorial Structure in Sperm Whale Vocalisations

1
1. ryanayork 22 Dec 2023
  
  in Arcadia Science
  
  Second, we evaluated whether sequences of codas reflect longer-term trends. To do so, we collected coda triples of the same discrete coda type, and measured the correlation between tempo drift across adjacent pairs. We found a significant positive correlation, compared to a null hypothesis that drift between adjacent pairs is uncorrelated (test: Spearman’s rank-order correlation (two-sided), r(2586) = 0.57, p = 2e−220, 95% CI = [0.54, 0.60], n = 2588). Thus, rubato is distributed across sequences of multiple codas. Finally, we evaluated whether rubato is perceived and controlled by measuring whales’ ability to match their interlocutors’ coda durations when chorusing. We measured the average absolute difference in duration between (1) pairs of overlapping codas from different whales, and (2) pairs of non-overlapping codas of the same discrete coda type. Durations are significantly more closely matched for overlapping codas (0.099s on average) than would be expected under a null hypothesis that chorusing whales match only discrete coda type (which would give a drift of 0.129s on average) (test: permutation test (one-sided), p = 0.0001, n = 908; see Supplementary Section 6).
  
  I wonder if calculating the autocorrelation of coda durations might be a nice complementary measure here. Autocorrelation could give you a sense of the time scale over which the rubatos decay and, seemingly, might also provide a sense for the timescale of longer-term trends.
  
  Similarly, I wonder if cross-correlation might be useful for comparing the information quantity shared with interlocutors? The correlation value would be interesting, in addition to any patterns of temporal lag between codas. It might be a comprehensive metric for comparing the similarities of codas over time (as opposed to just looking at overlapping codas).
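
  To make the two suggestions concrete, here is a minimal sketch of both measures, assuming coda durations are ordered in time and (for the cross-correlation) that two whales' sequences have equal length and are aligned by coda index. The duration values are invented for illustration.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a sequence at a non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    num = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    return num / denom


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lagged_cross_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag], for lag >= 0."""
    return pearson(x[:len(x) - lag], y[lag:])


# Invented coda durations in seconds; whale_b echoes whale_a one coda later.
whale_a = [1.10, 1.14, 1.09, 1.20, 1.17, 1.25, 1.19, 1.28]
whale_b = [1.00] + whale_a[:-1]

for lag in range(4):
    print(lag, round(autocorrelation(whale_a, lag), 3),
          round(lagged_cross_correlation(whale_a, whale_b, lag), 3))
```

  The lag at which the autocorrelation falls toward zero gives a rough timescale for rubato, and a peak in the cross-correlation at a nonzero lag would indicate that one whale's durations follow the other's with a delay.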

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.12.06.570484v1
Nov 2023
www.biorxiv.org

IL-15-induced modulation of NK cell migration identified using a data-driven approach

3
1. ryanayork 03 Nov 2023
  
  in Arcadia Science
  
  cellPLATO performs UMAP on morphological/motility parameters then uses HDBSCAN cluster analysis to define behavioural clusters
  
  It is hard to tell from the text if HDBSCAN is run on the behavioral parameters or on the UMAP output. If the latter, then I would take extreme caution in thinking about the generalizability of the method given the numerous issues with clustering on nonlinear manifolds. Either way, it would also be helpful to report more information on what the specific morphological/motility parameters are and any normalizations/manipulations that were done on them prior to UMAP and clustering.
  
  Also, any justification for choosing UMAP and HDBSCAN would be useful.
2. ryanayork 03 Nov 2023
  
  in Arcadia Science
  
  UMAPs 1, 2 and 3
  
  This might be a slightly confusing way to refer to UMAP dimensions (is it accepted that a UMAP dimension = a single 'UMAP'?)
3. ryanayork 03 Nov 2023
  
  in Arcadia Science
  
  We first investigated two fundamental measurements of cell migration and morphology, namely cell speed and cell area. When comparing conditions, the median migration speed of NK cells on VCAM-1 was 3.48 μm/min and 2.54 μm/min on ICAM-1 (Fig. 2A). The effect size distribution for VCAM-1 was greater, demonstrating statistical significance (p < 0.00001) (41), and its distribution did not overlap with the control condition (ICAM-1). NK cells migrating on VCAM-1 also had smaller median cell area (114 μm²) compared with ICAM-1 (175 μm²) (Fig. 2B), with nonoverlapping effect size distribution (p < 0.00001).
  
  Does donor identity have any effect here? Do the donors differ at all in their speed/area distributions and effect sizes? This would be useful to know here and for many other analyses presented in the manuscript. More broadly, it is a little hard to assess the generalizability of the behavioral results presented here (including the cellPLATO analyses) without knowing more about the influence of experimental variables like this.

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.10.28.564355v1
Oct 2023
www.biorxiv.org

Diverse prey capture strategies in teleost larvae

1
1. ryanayork 13 Oct 2023
  
  in Arcadia Science
  
  We next adapted an experimental paradigm used to study prey capture in zebrafish for these other species (Mearns et al., 2020). Individual larvae were placed in chambers with prey items (either artemia or paramecia).
  
  These species are ecologically diverse (e.g. benthic vs. riverine) and likely possess corresponding sensory differences. Given this, it seems possible that their prey capture behaviors may vary as a function of sensory environment. For example, benthic species may display different repertoires in dark conditions.
  
  Have you tested the effect of varying the sensory environment on prey capture behaviors? Is there intra-specific variation? Are species-specific behaviors invariant? Whatever the outcome, these experiments would help refine the picture of how these behaviors evolved and could lead to more specific sensorineural hypotheses.

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.10.03.560453v1
Sep 2023
www.biorxiv.org

Evolutionary prediction for new echolocators

1
1. ryanayork 29 Sep 2023
  
  in Arcadia Science
  
  Finally, 14 convergent amino acid substitutions with high confidence among known echolocating mammalian lineages were obtained (Table S3), and these sites were found to be effective in differentiating echolocating and nonecholocating mammals (Fig. 1A; Fig. S2).
  
  I wonder if it might be worth including a brief comment on the identity of these genes and/or their potential relationships with echolocation? Do they seem to be sensible functional hits? Seeing as the echolocation score appears to work quite well, it would be interesting to know a bit more about any molecular context for these predictive loci.

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.09.13.556757v1
www.biorxiv.org

Systematic creation and phenotyping of Mendelian disease models in C. elegans: towards large-scale drug repurposing

2
1. ryanayork 01 Sep 2023
  
  in Arcadia Science
  
  phonotypes
  
  phenotypes
2. ryanayork 01 Sep 2023
  
  in Arcadia Science
  
  (1) at least two orthology prediction algorithms agree the human and worm genes are orthologs; (2) the WormBase (version WS270) (Harris et al., 2020) gene description includes either ‘neuro’ or ‘musc’ (this captures variants of neuronal, neural, muscle, muscular etc.);
  
  I'm wondering about how varying these criteria would affect the number of/which genes were detected.
  
  For criterion 1, what was the rationale for choosing agreement between ≥2 algorithms? From Fig 1C, it's hard to tell if there is a relationship between % homology and # of agreeing algorithms. What benefit do you get from using this cutoff? What are the tradeoffs? It might be helpful to include a figure similar to 1C, but including the full set of genes before filtering, and to walk through the outcomes of different cutoffs.
  
  Similarly, for criterion 2, what type of/how many hits do you get if you don't select for 'neuro' or 'musc'? Is there any chance that, though you are using a behavioral readout, genes not annotated 'neuro'/'musc' might still contribute to a behavioral phenotype (e.g. via pleiotropy/epistasis)? Would be useful to include a statement of your thinking on this!

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.08.25.554786v1
Jun 2023
www.biorxiv.org

Reconstructing cell type evolution across species through cell phylogenies of single-cell RNAseq data

2
1. ryanayork 02 Jun 2023
  
  in Arcadia Science
  
  Our study is unique in that instead of using gene expression values directly, we use principal components calculated from gene expression values as our phylogenetic characters. In addition, we remove later principal components that may represent highly heterogeneous cell-specific signal.
  
  Seems like it would be worth including a direct comparison of Brownian motion to other evolutionary models. The computational overhead shouldn't be very high and, if the comparison supports the use of Brownian motion, it could be a more compelling argument than this.
2. ryanayork 02 Jun 2023
  
  in Arcadia Science
  
  This dataset was chosen for the uniformity of sampling, consistency of lab and sequencing protocols, the high quality of its cell type annotations, and the abundance of genomic resources available for the five model species. UMI counts were downloaded as CSV files from the NCBI GEO database (GSE146188). A file containing meta-data, including cluster assignment and cell type labels, was obtained from the Broad Institute Single Cell Portal
  
  I wonder about the effect of scRNA-seq methodology on downstream results here. How do droplet-based approaches (like that used for van Zyl et al.) compare to others (e.g. Smart-seq2) when generating cell type trees? There can be substantial differences in the # of genes detected by these methods, with droplet-based approaches often generating datasets with fewer genes. Does this affect the estimation of rank and/or the outputs of the PCA you use for evolutionary modeling? It seems like this would be an important issue to solve since droplet-based methods are essentially downsampling informative data in a nonrandom way that may bias evolutionary inference.
  
  TLDR: are cell tree topologies consistent independent of sequencing methodologies?

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.05.18.541372v1
www.biorxiv.org

Canalisation and plasticity on the developmental manifold of Caenorhabditis elegans

1
1. ryanayork 02 Jun 2023
  
  in Arcadia Science
  
  we designed and constructed a low-cost parallel imaging platform capable of measuring C. elegans growth for 60 individual animals simultaneously over the course of their ≈ 70 hour development at a temporal resolution of 0.001 Hz, resulting in a time series of ≈ 200 observations per animal. In addition to length and area measured automatically, egg hatching, and first egg-laying by mature adults are manually recorded.
  
  Is there a reason for the coarse sampling at 0.001 Hz? Mechanical constraints of the XY plotting robot? Data size constraints? Obviously faster sampling would open up locomotion/behavior as a read out of other possibly interesting, orthogonal phenotypes (with their own developmental modes). Given the video data you are already collecting, it seems like if faster sampling is possible this would be a relatively straightforward - and informative - set of phenotypes to add in?
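
  As a quick check of the numbers in the quoted passage: 0.001 Hz is one frame every 1,000 s, so 70 hours gives 70 × 3600 × 0.001 = 252 frames, the same order as the ≈ 200 observations reported. The 1 Hz rate in the sketch below is a hypothetical comparison point, not a value from the manuscript.

```python
def n_observations(rate_hz: float, duration_hours: float) -> int:
    """Number of frames collected at a fixed sampling rate."""
    return round(rate_hz * duration_hours * 3600)


DEVELOPMENT_HOURS = 70          # approximate duration quoted above

print(n_observations(0.001, DEVELOPMENT_HOURS))   # rate quoted in the manuscript
print(n_observations(1.0, DEVELOPMENT_HOURS))     # hypothetical faster rate
```

  Going from growth curves to locomotion would therefore mean roughly a thousandfold increase in frames per animal at 1 Hz, which is the data-size side of the tradeoff raised above.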

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.04.14.536891v2
May 2023
www.biorxiv.org

Canalisation and plasticity on the developmental manifold of Caenorhabditis elegans

1
1. ryanayork 05 May 2023
  
  in Public
  
  we designed and constructed a low-cost parallel imaging platform capable of measuring C. elegans growth for 60 individual animals simultaneously over the course of their ≈ 70 hour development at a temporal resolution of 0.001 Hz, resulting in a time series of ≈ 200 observations per animal. In addition to length and area measured automatically, egg hatching, and first egg-laying by mature adults are manually recorded.
  
  Is there a reason for the coarse sampling at 0.001 Hz? Mechanical constraints of the XY plotting robot? Data size constraints? Obviously faster sampling would open up locomotion/behavior as a read out of other possibly interesting, orthogonal phenotypes (with their own developmental modes). Given the video data you are already collecting, it seems like if faster sampling is possible this would be a relatively straightforward - and informative - set of phenotypes to add in?

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.04.14.536891v2
Mar 2023
www.biorxiv.org

JAX Animal Behavior System (JABS): A video-based phenotyping platform for the laboratory mouse

2
1. ryanayork 04 Mar 2023
  
  in Public
  
  This inter-annotator variability can be associated with (a) subjective differences of behavior definition among human labelers, (b) varying level of annotator’s expertise, and (c) training within and across labs.
  
  What about intra-annotator variability? Seemingly this could also be an important contributor to inter-annotator variation. Might it make sense to compare multiple annotations from a single annotator and use the average as the basis for the ethograph generation?
2. ryanayork 04 Mar 2023
  
  in Public
  
  In order to test inter-annotator variability, we generated a set of single mouse behavior classifiers for two simple behaviors, left and right turn. We inferred behavior from all four classifiers on a large set of videos and compared the two pairs of classifiers from each annotator
  
  How do these comparisons look for other (potentially more 'complex') behaviors? Presumably, turning should be among the more straightforward behaviors for a human to recognize. Do the patterns of inter-annotator agreement change with other behaviors (e.g. grooming) and, if so, would accounting for this increase/decrease performance of the neural network? This is a general risk when using human-based annotations for behavioral classification and it seems to me not easily solved by focussing on a single behavior.
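
  Both questions (intra-annotator consistency in the first comment, and agreement across behaviors here) could be quantified with a chance-corrected agreement score such as Cohen's kappa, computed per behavior and per annotator pair, or between two labeling passes by the same annotator. A minimal sketch with invented frame labels:

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label sequences.

    Undefined (division by zero) if chance agreement is exactly 1,
    i.e. both sequences use a single identical label throughout.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Invented per-frame labels from two passes by the same annotator.
first_pass = ["turn", "none", "turn", "none", "groom", "none"]
second_pass = ["turn", "none", "none", "none", "groom", "none"]

print(cohens_kappa(first_pass, second_pass))
```

  Comparing within-annotator kappa to between-annotator kappa for each behavior would show how much of the inter-annotator disagreement is simply labeling noise.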

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2022.01.13.476229v2
Feb 2023
www.biorxiv.org

The nuclear to cytoplasmic ratio drives cellularization in the close animal relative Sphaeroforma arctica

1
1. ryanayork 04 Feb 2023
  
  in Public
  
  To assess nuclear density, we measured the average distance from each nucleus to its nearest neighbour
  
  I wonder if it might be useful to do some analyses of the specific spatial orientation of nuclei across the centrifugation experiments. While it is sensible that density may be the primary signal driving cellularization, it is also interesting to consider that there may be higher-order relationships between cell distribution or spatial organization that are predictive of the different outcomes (i.e. Flip, lysis, irregular invaginations), since centrifugation is a relatively forceful and disruptive approach. Spatial relationships of nuclei could theoretically be extracted by segmentation/registration, followed by some basic statistical comparisons (e.g. PCA) to uncover relationships between the images.

Annotators

ryanayork

URL

biorxiv.org/content/10.1101/2023.01.19.524795v1
