21 Matching Annotations
  1. Nov 2025
  2. Oct 2025
    1. The accuracy is increased with the number of barcodes used for matching.

      The steep drop is quite noticeable from 3->6 bits, but it seems to reach diminishing returns by 12-18. Do you have ideas as to why this plateau occurs? It suggests we're no longer diversity limited. My guess is that it's either residual merge errors contaminating the segment-averaged barcode, or the KDTree matcher's local radius obscuring some true pairs.
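
To make the radius concern concrete, here is a minimal sketch. It uses a brute-force distance check as a stand-in for a radius-limited KD-tree query; the segment coordinates, the 6-bit barcodes, and the function names are all made up for illustration and are not taken from the paper's pipeline.

```python
import math


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    return sum(x != y for x, y in zip(a, b))


def match_within_radius(segments, radius, max_mismatch=0):
    """Pair segments whose centroids lie within `radius` and whose barcodes agree.

    Brute-force stand-in for a radius-limited KD-tree query (hypothetical helper).
    """
    pairs = []
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            pos_i, code_i = segments[i]
            pos_j, code_j = segments[j]
            close = math.dist(pos_i, pos_j) <= radius
            if close and hamming(code_i, code_j) <= max_mismatch:
                pairs.append((i, j))
    return pairs


# Toy data: (centroid, barcode). Segments 0, 1 and 2 share a barcode.
segments = [
    ((0.0, 0.0, 0.0), (1, 0, 1, 1, 0, 0)),
    ((5.0, 0.0, 0.0), (1, 0, 1, 1, 0, 0)),   # 5 units from segment 0
    ((40.0, 0.0, 0.0), (1, 0, 1, 1, 0, 0)),  # 40 units from segment 0
    ((3.0, 0.0, 0.0), (0, 1, 0, 0, 1, 1)),   # nearby, different barcode
]

local_pairs = match_within_radius(segments, radius=10.0)
global_pairs = match_within_radius(segments, radius=100.0)
```

With the small radius, segment 2 is never even considered as a partner for segments 0 and 1, however well its barcode matches. Adding more barcode bits cannot recover pairs that the spatial prefilter has already excluded, which would be one way to get a plateau.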

    2. Crucially, we showed automatic proofreading can bridge spatial gaps to reconnect neurite segments both locally and even across hundreds of microns, a major step towards addressing signal discontinuity challenges

      The barcode proofreading approach in Fig 4 is fantastic. However, I can't find empirical reconnections across hundreds of microns; apologies if I missed them. As you pointed out, though, ~10–30 µm is quite local. Block-wise execution would still be local, since it doesn't impose global consistency.

  3. Jun 2025
    1. We employ Qwen3 [33,34], an autoregressive Transformer-based LLM, initialized with its original pre-trained weights.

      Did you filter out from Qwen any biomedical corpus contamination (e.g., ClinVar abstracts) to avoid data leakage into downstream evaluation?

    2. structuring reasoning steps (e.g., <think>...</think>),

      The explicit traces are a really cool thing to include. Feels like it can be great for hypothesis generation and wet-lab follow-up.

    3. Genomic information, as DNA embeddings from fDNA, is integrated into the fLLM’s input by stacking these with embeddings of the user’s query QTEXT and special tokens such as <dna_start> and <dna_end>

      Stacking them at the token level instead of using adapters is pretty creative!

    4. frequency (MAF) > 5%). The data was split by chromosome (Chr 1–7, 9–22, X, Y for training; Chr 8 for testing)

      Nice, clever guard against leakage.
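
For readers who want to reproduce this kind of split, a minimal sketch follows. The record layout (a `chrom` field per variant) and the helper name are assumptions for illustration; only the choice of Chr 8 as the held-out chromosome comes from the quoted text.

```python
TEST_CHROMS = {"8"}  # held-out chromosome, per the quoted split


def normalize_chrom(name):
    """Strip an optional 'chr' prefix so 'chr8' and '8' compare equal."""
    name = str(name)
    if name.lower().startswith("chr"):
        name = name[3:]
    return name.upper()


def split_by_chromosome(variants, test_chroms=TEST_CHROMS):
    """Partition variant records into train/test by chromosome (hypothetical helper)."""
    train, test = [], []
    for variant in variants:
        if normalize_chrom(variant["chrom"]) in test_chroms:
            test.append(variant)
        else:
            train.append(variant)
    return train, test


# Toy records; the IDs are placeholders, not real variants.
variants = [
    {"id": "v1", "chrom": "chr1"},
    {"id": "v2", "chrom": "8"},
    {"id": "v3", "chrom": "chrX"},
    {"id": "v4", "chrom": "chr8"},
]
train, test = split_by_chromosome(variants)
```

Because no chromosome contributes to both sets, nearby variants in linkage with each other cannot straddle the train/test boundary.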

    5. If an input SDNA sequence, after tokenization by TDNA, exceeds a defined context length (e.g., 2048 DNA tokens), it is truncated.

      Could this truncation drop functional motifs (e.g., regulatory elements) that lie > 2 kbp from a locus? Are there checks you can do to see whether performance collapses on longer genes?
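
One simple version of such a check, sketched under assumptions: bucket evaluation examples by whether their tokenized length exceeds the context limit, then compare accuracy between the buckets. The `(n_tokens, correct)` record layout and the toy values are hypothetical; only the 2048-token limit comes from the quoted text.

```python
def accuracy_by_truncation(records, max_tokens=2048):
    """Compare accuracy for inputs that fit the context vs. those that get truncated.

    `records` is an iterable of (n_tokens, correct) pairs (assumed layout).
    Returns a dict mapping bucket name to accuracy, or None for an empty bucket.
    """
    buckets = {"within_limit": [], "truncated": []}
    for n_tokens, correct in records:
        key = "truncated" if n_tokens > max_tokens else "within_limit"
        buckets[key].append(bool(correct))
    return {
        name: (sum(hits) / len(hits) if hits else None)
        for name, hits in buckets.items()
    }


# Toy evaluation log, invented purely to exercise the function.
records = [(500, True), (1800, True), (2048, False), (3000, False), (5000, True)]
summary = accuracy_by_truncation(records)
```

A large gap between the two buckets would suggest that truncation is discarding informative sequence, though a proper analysis would also need to control for gene length correlating with task difficulty.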

  4. Apr 2025
    1. Multiple sequence alignment further demonstrated that the key catalytic site, cofactor binding site, and metal ion binding site were highly conserved in the long-chain Fe2+-containing ADH sequences, suggesting that Pinal is capable of designing enzyme sequences that retain critical catalytic activity sites based solely on natural language input (Figure. 7B)

      It would be interesting to design a sequence predicted to be inactive (e.g., by mutating the key catalytic residue or ion binding site) and then confirm its inactivity experimentally, to demonstrate that the model can distinguish functional from non-functional sequences. Similarly, it would be interesting to compare the ProTrek score (or any of the ranks/scores) against measured enzymatic activity to see whether the two correlate.

    2. These sequences ranked highest across all evaluation metrics.

      The ADH validation confirming enzyme-dependent activity is a promising proof-of-concept. It will be interesting to see how Pinal performs on proteins with more complex functions. I also think it would be interesting to test a predicted inactive mutant as an additional control.

    1. We considered only varicose projections when they exhibited more than two bulbous structures, discernible processes, and positive staining for either GFAP or S100β.

      This is helpful to know. Given the morphological variability of reactive astrocytes across different models (2D, 3D, postmortem) and species, you might consider automated 3D morphometry to supplement manual counting, or some kind of inter-rater reliability assessment. This would also help differentiate this phenotype from other reactive morphologies like astrocytic clasmatodendrosis.

    2. but rather a reactive phenotype induced under specific pathological conditions.

      I think the conclusion that VP astrocytes represent a reactive phenotype rather than a physiological subtype is very well-supported by the data presented in this work.

    1. The increased complexity of biobots could be a result of increased variability in the beating frequency of cilia in MCCs

      If I'm understanding correctly, this might be a typo and should read "neurobots".

    2. We speculate that this region is not truly empty and may be filled with extracellular matrix-like structures (ECM).

      This is intriguing. It would be nice to stain for fibronectin or collagen to see whether ECM proteins are actually there, or to try WFA labeling.

    3. Interestingly, we found that more than 54% of upregulated genes in neurobots fall into the two categories of most ancient genes (“All living organisms” and “Eukaryota”, Fig. 13a).

      The comparison to sham NBs helps here, but there are also existing Xenopus developmental transcriptome data (e.g., early neural induction stages vs. later stages) that could provide context for whether activating these genes is a general feature of neural differentiation. Discussing this could help clarify and strengthen this claim. Additionally, some kind of ontology analysis of these genes would be interesting.

    4. In addition to Cluster 1, Cluster 12 (Supp. Fig. 7c) also included genes related to eye, lens, and retina development, including genes found in major retinal cell types (ref. 42), i.e. retinal ganglion cells [bioRxiv preprint, this version posted April 19, 2025; https://doi.org/10.1101/2025.04.14.648732]

      I found the upregulation of retinal genes in the neurobots without specific induction to be particularly interesting. It makes me wonder whether this hints at the activation of highly conserved, intrinsic transcriptional programs for light sensing that are relatively accessible, or perhaps even a default pathway for neural precursors to explore during self-organization in such novel contexts. It seems reminiscent of observations in brain organoids, which have also demonstrated an intrinsic ability to self-organize primitive sensory structures, sometimes even with spatial constraints. Though these neurobots are different from typical organoids, the fact that both systems originate from neural precursors suggests these cells might retain a latent capacity to access fundamental sensory development routines when placed in a permissive environment.

  5. Jan 2025
    1. Surprisingly, intersectional analysis revealed that none of these DEGs were wave regulated (Fig. 4C), suggesting that spontaneous activity-mediated DSGC wiring does not occur by modulating cell-intrinsic genetic programs. While this result may also reflect the limited resolution of single-cell sequencing, which captures only about 15% of the cell’s transcriptome (79), it is also possible that retinal waves exert their influence through post-transcriptional or translational processes in DSGCs (80).

      This, to me, is one of the major thought-provoking findings of this paper, and it suggests that our understanding of how early activity shapes circuit development is worth expanding. I agree that one interpretation of the seemingly wave-independent DEGs is that there may be post-transcriptional modifications or local protein synthesis at synapses. But we are also assuming that major functional changes need to be reflected in transcriptional signatures, and it might be worth exploring the possibility that other mechanisms are at play, such as micro-circuit properties (e.g., synaptic organization), ion channel localization, or something else.

    1. Optimized seeds were selected based on several criteria: computed binding energy (ddG), shape complementarity, the number of interface hydrogen bonds, the number of buried unsatisfied polar atoms, and the number of atoms in contact with the small molecule.

      Could you consider including a table or brief flowchart showing the specific cutoff values (e.g., ddG < -X kcal/mol) and how many designs passed or failed at each step? This would help readers replicate your pipeline. A more detailed account of each criterion (and its relative weight in eliminating or selecting designs) would also be helpful.
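
As a rough illustration of the kind of funnel table I have in mind, here is a minimal sketch. The criteria names follow the quoted sentence, but every threshold, field name, and design record below is a placeholder invented for the example, not a value from the paper.

```python
# Sequential filters: (label, predicate). All cutoffs are hypothetical placeholders.
FILTERS = [
    ("ddG", lambda d: d["ddg"] < -30.0),
    ("shape complementarity", lambda d: d["sc"] > 0.6),
    ("interface H-bonds", lambda d: d["hbonds"] >= 2),
    ("buried unsatisfied polars", lambda d: d["buns"] <= 3),
    ("ligand contact atoms", lambda d: d["contacts"] >= 5),
]


def filter_funnel(designs, filters=FILTERS):
    """Apply filters in order and record how many designs survive each step."""
    rows = [("input", len(designs))]
    remaining = list(designs)
    for label, keep in filters:
        remaining = [d for d in remaining if keep(d)]
        rows.append((label, len(remaining)))
    return rows, remaining


# Toy designs, each constructed to fail at a different step (or pass all).
designs = [
    {"name": "a", "ddg": -40, "sc": 0.70, "hbonds": 3, "buns": 1, "contacts": 8},
    {"name": "b", "ddg": -20, "sc": 0.70, "hbonds": 3, "buns": 1, "contacts": 8},
    {"name": "c", "ddg": -35, "sc": 0.50, "hbonds": 3, "buns": 1, "contacts": 8},
    {"name": "d", "ddg": -45, "sc": 0.65, "hbonds": 1, "buns": 1, "contacts": 8},
    {"name": "e", "ddg": -33, "sc": 0.62, "hbonds": 2, "buns": 5, "contacts": 8},
]
rows, survivors = filter_funnel(designs)
for label, count in rows:
    print(f"{label:<28}{count}")
```

Reporting the per-step counts alongside the actual cutoffs would show which criterion does most of the pruning, which is hard to infer from the list of criteria alone.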

    2. Additionally, all scripts for executing the workflow on a large scale, defining unbound-state scores for binding sites, analyzing the results, and optimizing binders resulting from MaSIF-seed-search can be found at https://github.com/hamedkhakzad/SURFACE-Bind.

      Congratulations on this comprehensive work. The manuscript references a GitHub repository, a Zenodo data deposit, and the SURFACE-Bind database as key resources for replicating and extending the findings. However, the GitHub link appears inactive, the Zenodo record is unavailable, and the database link does not load a page. As these resources are essential for reproducibility and broader community use, please ensure that working links are provided or that readers are informed of the timeline for making them accessible.