    1. Reviewer #1 (Public review):

      Summary:

      The authors present a novel use of fluorescence lifetime imaging microscopy (FLIM) to measure NAD(P)H autofluorescence in the Drosophila brain, as a proxy for cellular metabolic/redox states. This new method relies on the fact that both NADH and NADPH are autofluorescent, with a different fluorescence lifetime depending on whether they are free (indicating glycolysis) or protein-bound (indicating oxidative phosphorylation). The authors successfully use this method in Drosophila to measure changes in metabolic activity across different areas of the fly brain, with a particular focus on the main center for associative memory: the mushroom body.

      Strengths:

      The authors have made a commendable effort to explain the technical aspects of the method in accessible language. This clarity will benefit both non-experts seeking to understand the methodology and researchers interested in applying FLIM to Drosophila in other contexts.

      Weaknesses:

      Despite being statistically significant, the learning-induced change in f-free in α/β Kenyon cells is minimal (a decrease from 0.76 to 0.73, with a high variability). It is unclear whether this small effect represents a meaningful shift in neuronal metabolic state.

      Whether this method can be valuable to examine the effects of long-term memory (after spaced or massed conditioning) remains to be established.

    2. Reviewer #2 (Public review):

      This revised manuscript presents a valuable application of NAD(P)H fluorescence lifetime imaging (FLIM) to study metabolic activity in the Drosophila brain. The authors reveal regional differences in oxidative and glycolytic metabolism, with particular emphasis on the mushroom body, a key center for associative learning and memory. They also report metabolic shifts in α/β Kenyon cells following classical conditioning, in line with their known role in energy-demanding memory processes.

      The study is well-executed and the authors have added more detailed methodological descriptions in this version, which strengthen the technical contribution. The analysis pipeline is rigorous, with careful curve fitting and appropriate controls. However, the metabolic shifts observed after conditioning are small and only weakly significant, raising questions about the sensitivity of FLIM for detecting subtle physiological changes. The authors acknowledge these limitations in the revised discussion, which helps place the findings in proper context.

      Despite this, the work provides a solid foundation for future applications of label-free FLIM in vivo and serves as a valuable technical resource for researchers interested in neural metabolism. Overall, this study represents a meaningful step toward integrating metabolic imaging with the study of neural activity and cognitive function.

    3. Reviewer #3 (Public review):

      This study investigates the characteristics of the autofluorescence signal excited by 740 nm 2-photon excitation, in the range of 420-500 nm, across the Drosophila brain. The fluorescence lifetime (FL) appears bi-exponential, with a short 0.4 ns time constant followed by a longer decay. The lifetime decay and the resulting parameter fits vary across the brain. The resulting maps reveal anatomical landmarks, which simultaneous imaging of genetically encoded fluorescent proteins helps identify. Past work has shown that the autofluorescence decay time course reflects the balance between the free form of the redox coenzyme NAD(P)H and its protein-bound form. The ratio of free to bound NAD(P)H is thought to indicate relative glycolysis vs. oxidative phosphorylation, and thus shifts in the free-to-bound ratio may indicate shifts in metabolic pathways. The basics of this measure have been demonstrated in other organisms, and this study is the first to use the FLIM module of the STELLARIS 8 FALCON microscope from Leica to measure autofluorescence lifetime in the brain of the fly. Methods include registering brains of different flies to a common template and masking out anatomical regions of interest using fluorescent proteins.

      The analysis relies on fitting a FL decay model with two free parameters, f_free and T_bound. f_free is the fraction of the normalized curve contributed by a decaying exponential with a time constant of 0.4 ns, thought to represent the FL of free NADPH or NADH, which apparently cannot be distinguished. T_bound is the time constant of the second exponential, with scalar amplitude = (1-f_free). The T_bound fit is thought to represent the decay time constant of protein-bound NAD(P)H, but can differ depending on the protein. The study shows that across the brain, T_bound can range from 0 to >5 ns, whereas f_free can range from 0.5 to 0.9 (Figure 1a). The paper beautifully lays out the analysis pipeline, providing a valuable resource. The full range of fits is reported, including maximum likelihood quality parameters, and can serve as a benchmark for future studies.
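
      The two-parameter decay model described above can be written out as a short sketch. This is only an illustration under stated assumptions, not the authors' fitting pipeline: the function name `biexp_decay` and the example value of 3.0 ns for T_bound are hypothetical, and a real fit would also have to account for the instrument response and photon noise.

```python
import math

TAU_FREE_NS = 0.4  # fixed short time constant quoted in the review (ns)


def biexp_decay(t_ns, f_free, t_bound_ns, tau_free_ns=TAU_FREE_NS):
    """Normalized bi-exponential decay, equal to 1 at t = 0.

    f_free     : fraction contributed by the fast (free NAD(P)H) component
    t_bound_ns : time constant of the slow (protein-bound) component, in ns
    """
    fast = f_free * math.exp(-t_ns / tau_free_ns)
    slow = (1.0 - f_free) * math.exp(-t_ns / t_bound_ns)
    return fast + slow


# Hypothetical comparison: same T_bound (assumed 3.0 ns), two f_free values.
for f in (0.76, 0.73):
    print(f, round(biexp_decay(1.0, f, t_bound_ns=3.0), 4))
```

      With T_bound held fixed, a lower f_free shifts weight onto the slow component, so the curve decays more slowly. The values 0.76 and 0.73 are the f-free figures quoted by Reviewer #1.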

      The authors measure properties of NAD(P)H-related autofluorescence of Kenyon Cells (KCs) of the fly mushroom body. The somata and calyx of mushroom bodies have a longer average T_bound than other regions (Figure 1e); the f_free fit is higher for the calyx (input synapses) region than for KC somata; and the across-fly average of f_free fits in alpha/beta KC somata decreases slightly following paired presentation of odor and shock, compared to unpaired presentation of the same stimuli. Though the change is slight, no comparable change is detected in gamma KCs, suggesting that distributions of f_free derived from FL may be sensitive enough to measure changes in metabolic pathways following conditioning.

      FLIM as a method is not yet widely prevalent in fly neuroscience, but recent demonstrations of its potential are likely to increase its use. Future efforts will benefit from the description of the properties of the autofluorescence signal to evaluate how autofluorescence may impact measures of FL of genetically engineered indicators.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript by Wang et al. reports the potential involvement of an asymmetric neurocircuit in the sympathetic control of liver glucose metabolism.

      Strengths:

      The concept that the contralateral brain-liver neurocircuit preferentially regulates each liver lobe may be interesting.

      Weaknesses:

      However, the experimental evidence presented did not support the study's central conclusion.

      (1) Pseudorabies virus (PRV) tracing experiment: The liver possesses not only sympathetic innervation but also vagal sensory innervation. The experimental setup failed to distinguish whether the PRV-labeling of LPGi (Lateral Paragigantocellular Nucleus) is derived from sympathetic or vagal sensory inputs to the liver.

      (2) Impact on pancreas: The celiac ganglia provide sympathetic innervation not only to the liver but also to the pancreas, the central endocrine organ for glucose metabolism. The chemogenetic manipulation of LPGi failed to consider a direct impact on the secretion of insulin and glucagon from the pancreas.

      (3) Neuroanatomy of the brain-liver neurocircuit: The current study and its conclusion are based on a speculative brain-liver sympathetic circuit without the necessary anatomical information downstream of LPGi.

      (4) Local manipulation of the celiac ganglia: The left and right ganglia of mice are not separate from each other but rather anatomically connected. The claim that local injection of AAV into the left or right ganglion does not affect the other side contradicts this basic anatomical feature.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Wang and colleagues aims to determine whether the left and right LPGi differentially regulate hepatic glucose metabolism and to reveal decussation of hepatic sympathetic nerves.

      The authors used tissue clearing to identify sympathetic fibers in the liver lobes, then injected PRV into the hepatic lobes. Five days post-injection, PRV-labeled neurons in the LPGi were identified. The results indicated contralateral dominance of premotor neurons and partial innervation of more than one lobe. Then the authors activated each side of the LPGi, resulting in a greater increase in blood glucose levels after right-sided activation than after left-sided activation, as well as changes in protein expression in the liver lobes. These data suggested modulation of HGP (hepatic glucose production) in a lobe-specific manner. Chemical denervation of a particular lobe did not affect glucose levels due to compensation by the other lobes. In addition, nerve bundles decussate in the hepatic portal region.

      Strengths:

      The manuscript is timely and relevant. It is important to understand the sympathetic regulation of the liver and the contribution of each lobe to hepatic glucose production. The authors use state-of-the-art methodology.

      Weaknesses:

      (1) The wording/terminology used in the manuscript is misleading, and it is not used in the proper context. For instance, the goal of the study is "to investigate whether cerebral hemispheres differentially regulate hepatic glucose metabolism..." (see abstract); however, the authors focus on the brainstem (a single structure without hemispheres). Similarly, symmetric is not the best word for the projections.

      (2) Sparse labeling of liver-related neurons was shown in the LPGi (Figure 1). It would be ideal to have lower magnification images to show the area. Higher quality images would be necessary, as it is difficult to identify brainstem areas. The low number of labeled neurons in the LPGi after five days of inoculation is surprising. Previous findings showed extensive labeling in the ventral brainstem at four days post-inoculation (Desmoulins et al., 2025). Unfortunately, it is not possible to compare the injection paradigm/methods because the PRV inoculation is missing from the methods section. If the PRV is different from the previously published viral tracers, time-dependent studies to determine the order of neurons and the time course of infection would be necessary.

      (3) Not all LPGi cells are liver-related. Was the entire LPGi population stimulated, or was it done in a cell-type-specific manner? What was the strain, sex, and age of the mice? What was the rationale for using the particular viral constructs?

      (4) The authors should consider the effect of stimulation of double-labeled neurons (innervating more than one lobe) and potential confounding effects regarding other physiological functions.

      (5) The authors state that "central projections directly descend along the sympathetic chain to the celiac-superior mesenteric ganglia". What they mean is unclear. Do the authors refer to pre-ganglionic neurons or premotor neurons? How does it fit with the previous literature?

      (6) How was the chemical denervation completed for the individual lobes?

      (7) The Western blot images look like they are from different blots, but no details are provided regarding protein amount (loading) or housekeeping proteins. What was the reason for switching between beta-actin and alpha-tubulin? In Figures 3F-G, the GS expression is not a good representative image. Were chemiluminescent or fluorescent antibodies used? Were the membranes reused?

      (8) Key references using PRV for liver innervation studies are missing (Stanley et al, 2010 [PMID: 20351287]; Torres et al., 2021 [PMID: 34231420]; Desmoulins et al., 2025 [PMID: 39647176]).

    3. Reviewer #3 (Public review):

      Summary:

      This study found a lobe-specific, lateralized control of hepatic glucose metabolism by the brain and provides anatomical evidence for sympathetic crossover at the porta hepatis. The findings are particularly insightful for researchers in the fields of liver metabolism, regeneration, and tumors.

      Strengths:

      Increasing evidence suggests spatial heterogeneity of the liver across many aspects of metabolism and regenerative capacity. The current study has provided interesting findings: neuronal innervation of the liver also shows anatomical differences across lobes. The findings could be particularly useful for understanding liver pathophysiology and treatment, such as metabolic interventions or transplantation.

      Weaknesses:

      Inclusion of detailed methods and Discussion:

      (1) Quantitative results for PRV-labeled neurons are presented; please include the specific quantification methods.

      (2) The Discussion can be expanded to include potential biological advantages of this complex lateralized innervation pattern.

    4. Reviewer #4 (Public review):

      Summary:

      The studies here are highly informative in terms of anatomical tracing and sympathetic nerve function in the liver related to glucose levels, but given that they are performed in a single species, it is challenging to translate them to humans, or to determine whether these neural circuits are evolutionarily conserved. Dual-labeling anatomical studies are elegant, and the addition of chemogenetic and optogenetic studies is mechanistically informative. Denervation studies lack appropriate controls, and the role of sensory innervation in the liver is overlooked.

      Specific Weaknesses - Major:

      (1) The species name should be included in the title.

      (2) Tyrosine hydroxylase was used to mark sympathetic fibers in the liver, but this marker also labels a portion of sensory fibers, which need to be ruled out in the whole-mount imaging data.

      (3) Chemogenetic and optogenetic data demonstrating hyperglycemia should be described in the context of prior work demonstrating liver nerve involvement in these processes. There is only a brief mention in the Discussion currently, but comparing methods and observations would be helpful.

      (4) Sympathetic denervation with 6-OHDA can drive compensatory increases to tissue sensory innervation, and this should be measured in the liver denervation studies to implicate potential crosstalk, especially given the increase in LPGi cFOS that may be due to afferent nerve activity. Compensatory sympathetic drive may not be the only culprit, though it is clearly assumed to be. The sensory or parasympathetic/vagal innervation of the liver is altogether ignored in this paper and could be better described in general.

    1. Reviewer #1 (Public review):

      Summary:

      In this paper, the authors investigate the effects of Miro1 on VSMC biology after injury. Using conditional knockout animals, they provide the important observation that Miro1 is required for neointima formation. They also confirm that Miro1 is expressed in human coronary arteries. Specifically, in conditions of coronary diseases, it is localized in both media and neointima and, in atherosclerotic plaque, Miro1 is expressed in proliferating cells.

      However, the role of Miro1 in VSMC in CV diseases is poorly studied and the data available are limited; therefore, the authors set out to explore this aspect in greater depth. The evidence that Miro1-/- VSMCs show impaired proliferation and an arrest in S phase is solid and further sustained by restoring Miro1 to control levels, normalizing proliferation. Miro1 also affects mitochondrial distribution, which is strikingly changed after Miro1 deletion. Both effects are associated with impaired energy metabolism due to the ability of Miro1 to participate in MICOS/MIB complex assembly, influencing mitochondrial cristae folding. Interestingly, the authors also show the interaction of Miro1 with NDUFA9, globally affecting supercomplex 2 assembly and complex I activity. Finally, these important findings also apply to human cells and can be partially replicated using a pharmacological approach, proposing Miro1 as a target for vasoproliferative diseases.

      Strengths:

      The discovery of Miro1 relevance in neointima formation is compelling, as is the evidence in VSMC that MIRO1 loss impairs mitochondrial cristae formation, expanding observations previously obtained in embryonic fibroblasts. The identification of MIRO1 interaction with NDUFA9 is novel and adds value to this paper. Similarly, the findings that VSMC proliferation requires mitochondrial ATP support the new idea that these cells do not rely mostly on glycolysis.

      The revised manuscript includes additional data supporting mitochondrial bioenergetic impairment in MIRO1 knockout VSMCs. Measurements of oxygen consumption rate (OCR), along with Complex I (ETC-CI) and Complex V activity, have been added and analyzed across multiple experimental conditions. Collectively, these findings provide a more comprehensive characterization of the mitochondrial functional state. Following revision, the association between MIRO1 deficiency and impaired Complex I activity is more robust.

      Although the precise molecular mechanism of action remains to be fully elucidated, in this updated version, experiments using a MIRO1 reducing agent are presented with improved clarity.

      Although some limitations remain, the authors have addressed nearly all the concerns raised, and the manuscript has substantially improved.

      Weaknesses:

      Figure 6: The authors do not address the concern regarding the cristae shape; however, characterization of the cristae phenotype with MIRO1 ΔTM would have strengthened the mechanistic link between MIRO1 and the MIB/MICOS complex.

      Although the authors clarified their reasoning, they did not explore in vivo validation of key biochemical findings, which represents a limitation of the current study. While their justification is acknowledged, at least a preliminary exploratory effort could have been made to reinforce the translational relevance of the study.

      Finally, in line with the explanations outlined in the rebuttal, the Discussion section should mention the limitations of MIRO1 reducer treatment.

    2. Reviewer #2 (Public review):

      Summary:

      This study identifies the outer‑mitochondrial GTPase MIRO1 as a central regulator of vascular smooth muscle cell (VSMC) proliferation and neointima formation after carotid injury in vivo and PDGF-stimulation ex vivo. Using smooth muscle-specific knockout male mice, complementary in vitro murine and human VSMC cell models, and analyses of mitochondrial positioning, cristae architecture and respirometry, the authors provide solid evidence that MIRO1 couples mitochondrial motility with ATP production to meet the energetic demands of the G1/S cell cycle transition. However, some of the metabolic analyses are suboptimal and would benefit from more robust methodologies. The work is valuable because it links mitochondrial dynamics to vascular remodelling and suggests MIRO1 as a therapeutic target for vasoproliferative diseases, although whether pharmacological targeting of MIRO1 in vivo can effectively reduce neointima after carotid injury has not been explored. This paper will be of interest to those working on VSMCs and mitochondrial biology.

      Strengths:

      The strength of the study lies in its comprehensive approach assessing the role of MIRO1 in VSMC proliferation in vivo, ex vivo and, importantly, in human cells. The study mechanistically links MIRO1-mediated regulation of mitochondrial mobility and optimal respiratory chain function to cell cycle progression and proliferation. Finally, the findings are potentially clinically relevant given the presence of MIRO1 in human atherosclerotic plaques and the available small-molecule MIRO1 reducer.

      Weaknesses:

      (1) High-resolution respirometry (Oroboros) to determine mitochondrial ETC activity in permeabilized VSMCs would be informative.

      (2) Therapeutic targeting of MIRO1 failed to prevent neointima formation; however, the technical difficulties of such an experiment are appreciated.

    3. Reviewer #3 (Public review):

      Summary:

      This study addresses the role of MIRO1 in vascular smooth muscle cell proliferation, proposing a link between MIRO1 loss and altered growth due to disrupted mitochondrial dynamics and function. While the findings are useful for understanding the importance of mitochondrial positioning and function in this specific cell type, the main bioenergetic and mechanistic claims are not strongly supported.

      Strengths:

      This study focuses on an important regulatory protein, MIRO1, and its role in vascular smooth muscle cell (VSMC) proliferation, a relatively underexplored context.

      This study explores the link between smooth muscle cell growth, mitochondrial dynamics, and bioenergetics, which is a significant area for both basic and translational biology.

      The use of both in vivo and in vitro systems provides a useful experimental framework to interrogate MIRO1 function in this context.

      Weaknesses:

      The proposed link between MIRO1 and respiratory supercomplex biogenesis or function is not clearly defined.

      The completeness and integration of the mitochondrial assays are marginal, undermining the strength of the conclusions regarding oxidative phosphorylation.

    1. Reviewer #1 (Public review):

      Summary:

      In the manuscript "Conformational Variability of HIV-1 Env Trimer and Viral Vulnerability", the authors study the fully glycosylated HIV-1 Env protein using an all-atom forcefield. It combines long all-atom simulations of Env in a realistic asymmetric bilayer with careful data analysis. This work clarifies how the CT domain modulates the overall conformation of the Env ectodomain and characterizes different MPER-TMD conformations. The authors also carefully analyze the accessibility of different antibodies to the Env protein.

      Strengths:

      This paper is state-of-the-art, given the scale of the system and the sophistication of the methods. The biological question is important, the methodology is rigorous, and the results will interest a broad audience.

      Weaknesses:

      The manuscript lacks a discussion of previous studies. The authors should consider addressing or comparing their work with the following points:

      (1) Tilting of the Env ectodomain has also been reported in previous experimental and theoretical work:

      https://doi.org/10.1101/2025.03.26.645577

      (2) A previous all-atom simulation study has characterized the conformational heterogeneity of the MPER-TMD domain:

      https://doi.org/10.1021/jacs.5c15421

      (3) Experimental studies have shown that MPER-directed antibodies recognize the prehairpin intermediate rather than the prefusion state:

      https://doi.org/10.1073/pnas.1807259115

      (4) How does the CT domain modulate the accessibility of these antibodies studied? The authors are in a strong position to compare their results with the following experimental study:

      https://doi.org/10.1126/science.aaa9804

    2. Reviewer #2 (Public review):

      (1) Summary

      In this work, the authors aim to elucidate how a viral surface protein behaves in a membrane environment and how its large-scale motions influence the exposure of antibody-binding sites. Using long-timescale, all-atom molecular dynamics simulations of a fully glycosylated, full-length protein embedded in a virus-like membrane, the study systematically examines the coupling between ectodomain motion, transmembrane orientation, membrane interactions, and epitope accessibility. By comparing multiple model variants that differ in cleavage state, initial transmembrane configuration, and presence of the cytoplasmic tail, the authors aim to identify general features of protein-membrane dynamics relevant to antibody recognition.

      (2) Strengths

      A major strength of this study is the scope and ambition of the simulations. The authors perform multiple microsecond-scale simulations of a highly complex, biologically realistic system that includes the full ectodomain, transmembrane region, cytoplasmic tail, glycans, and a heterogeneous membrane. Such simulations remain technically challenging, and the work represents a substantial computational and methodological effort.

      The analysis provides a clear and intuitive description of large-scale protein motions relative to the membrane, including ectodomain tilting and transmembrane orientation. The finding that the ectodomain explores a wide range of tilt angles while the transmembrane region remains more constrained, with limited correlation between the two, offers useful conceptual insight into how global motions may be accommodated without large rearrangements at the membrane anchor.

      Another strength is the explicit consideration of membrane and glycan steric effects on antibody accessibility. By evaluating multiple classes of antibodies targeting distinct regions of the protein, the study highlights how membrane proximity and glycan dynamics can differentially influence access to different epitopes. This comparative approach helps place the results in a broader immunological context and may be useful for readers interested in antibody recognition or vaccine design.

      Overall, the results are internally consistent across multiple simulations and model variants, and the conclusions are generally well aligned with the data presented.

      (3) Weaknesses

      The main limitations of the study relate to sampling and model dependence, which are inherent challenges for simulations of this size and complexity. Although the simulations are long by current standards, individual trajectories explore only portions of the available conformational space, and several conclusions rely on pooling data across a limited number of replicas. This makes it difficult to fully assess the robustness of some quantitative trends, particularly for rare events such as specific epitope accessibility states.

      In addition, several aspects of the model construction, including the treatment of missing regions, loop rebuilding, and initial configuration choices, are necessarily approximate. While these approaches are reasonable and well motivated, the extent to which some conclusions depend on these modeling choices is not always fully clear from the current presentation.

      Finally, the analysis of antibody accessibility is based on geometric and steric criteria, which provide a useful first-order approximation but do not capture potential conformational adaptations of antibodies or membrane remodeling during binding. As a result, the accessibility results should be interpreted primarily as model-based predictions rather than definitive statements about binding competence.

      Despite these limitations, the study provides a valuable and carefully executed contribution, and its datasets and analytical framework are likely to be useful to others interested in protein-membrane interactions and antibody recognition.

    3. Reviewer #3 (Public review):

      Summary:

      This study uses large-scale all-atom molecular dynamics simulations to examine the conformational plasticity of the HIV-1 envelope glycoprotein (Env) in a membrane context, with particular emphasis on how the transmembrane domain (TMD), cytoplasmic tail (CT), and membrane environment influence ectodomain orientation and antibody epitope exposure. By comparing Env constructs with and without the CT, explicitly modeling glycosylation, and embedding Env in an asymmetric lipid bilayer, the authors aim to provide an integrated view of how membrane-proximal regions and lipid interactions shape Env antigenicity, including epitopes targeted by MPER-directed antibodies.

      Strengths:

      A key strength of this work is the scope and realism of the simulation systems. The authors construct a very large, nearly complete Env-scale model that includes a glycosylated Env trimer embedded in an asymmetric bilayer, enabling analysis of membrane-protein interactions that are difficult to capture experimentally. The inclusion of specific glycans at reported sites, and the focus on constructs with and without the CT, are well motivated by existing biological and structural data.

      The simulations reveal substantial tilting motions of the ectodomain relative to the membrane, with angles spanning roughly 0-30° (and up to ~50° in some analyses), while the ectodomain itself remains relatively rigid. This framing, that much of Env's conformational variability arises from rigid-body tilting rather than large internal rearrangements, is an important conceptual contribution. The authors also provide interesting observations regarding asymmetric bilayer deformations, including localized thinning and altered lipid headgroup interactions near the TMD and CT, which suggest a reciprocal coupling between Env and the surrounding membrane.

      The analysis of antibody-relevant epitopes across the prefusion state, including the V1/V2 and V3 loops, the CD4 binding site, and the MPER, is another strength. The study makes effective use of existing experimental knowledge in this context, for example, by focusing on specific glycans known to occlude antibody binding, to motivate and interpret the simulations.

      Weaknesses:

      While the simulations are technically impressive, the manuscript would benefit from more explicit cross-validation against prior experimental and computational work throughout the Results and Discussion, and better framing in the introduction. Many of the reported behaviors, such as ectodomain tilting, TMD kinking, lipid interactions at helix boundaries, and aspects of membrane deformation, have been described previously in a range of MD studies of HIV Env and related constructs (e.g., PMC2730987, PMC2980712, PMC4254001, PMC4040535, PMC6035291, PMC12665260, PMID: 33882664, PMC11975376). Clearly situating the present results relative to these studies would strengthen the paper by clarifying where the simulations reproduce established behavior and where they extend it to more complete or realistic systems.

      A related limitation is that the work remains largely descriptive with respect to conformational coupling. Numerous experimental studies have demonstrated functional and conformational coupling between the TMD, CT, and the antigenic surface, with effects on Env stability, infectivity, and antibody binding (e.g., PMC4701381, PMC4304640, PMC5085267). In this context, the statement that ectodomain and TMD tilting motions are independent is a strong conclusion that is not fully supported by the analyses presented, particularly given the authors' acknowledgment that multiple independent simulations are required to adequately sample conformational space. More direct analyses of coupling, rather than correlations inferred from individual trajectories, would help align the simulations with the existing experimental literature. Given the scale of these simulations, a more thorough analysis of coupling could be this paper's most seminal contribution to the field.

      The choice of membrane composition also warrants deeper discussion. The manuscript states that it relies on a plasma membrane model derived from a prior simulation-based study, which itself is based on host plasma membrane (PMID: 35167752), but experimental analyses have shown that HIV virions differ substantially from host plasma membranes (e.g., PMC46679, PMC1413831, PMC10663554, PMC5039752, PMC6881329). In particular, virions are depleted in PC, PE, and PI, and enriched in phosphatidylserine, sphingomyelins, and cholesterol. These differences are likely to influence bilayer thickness, rigidity, and lipid-protein interactions and, therefore, may affect the generality of the conclusions regarding Env dynamics and antigenicity. Notably, the citation provided for membrane composition is a laboratory self-citation, a secondary source, rather than a primary experimental study on plasma membrane composition.

      Finally, there are pervasive issues with citation and methodological clarity. Several structural models are referred to only by PDB ID without citation, and in at least one case, a structure described as cryo-EM is in fact an NMR-derived model. Statements regarding residue flexibility, missing regions in structures, and comparisons to prior dynamics studies are often presented without appropriate references. The Methods section also lacks sufficient detail for a system of this size and complexity, limiting readers' ability to assess robustness or reproducibility.

      With stronger integration of prior experimental and computational literature, this work has the potential to serve as a valuable reference for how Env behaves in a realistic, glycosylated, membrane-embedded context. The simulation framework itself is well-suited for future studies incorporating mutations, strain variation, antibodies, inhibitors, or receptor and co-receptor engagement. In its current form, the primary contribution of the study is to consolidate and extend existing observations within a single, large-scale model, providing a useful platform for future mechanistic investigations.

    1. Joint Public Review:

      Quite obviously, the brain encodes "time", as we are able to tell if something happened before or after something else. How this is done, however, remains essentially not understood. In the context of Working Memory tasks, many experiments have shown that the neural activity during the retention period "encodes" time, besides the stimulus to be remembered; that is, the time elapsed from stimulus presentation can be reliably inferred from the recordings, even if time per se is not important for the task. This implies 'mixed selectivity', in the weak sense of neural activity varying with both stimulus identity and time elapsed (since presentation).

      In this paper, the authors investigate the implications of a specific form of such mixed selectivity, that is, conjunctive coding of what (stimulus) and when (time) at the single-neuron level, on the resulting dynamics of the population activity when 'viewed' through linear dimensionality-reduction techniques, essentially Principal Component Analysis (PCA). The theoretical/modeling results presented provide a useful guide to the interpretation of the experimental results; in particular, with respect to what can, or cannot, be rightfully inferred from those experimental results (using PCA-like techniques). The results are essentially theoretical in nature; there are, however, some conclusions that require a more precise justification, in my opinion. More generally, as the authors themselves discuss in the paper, it is not clear how to generalize this coding scheme to more complicated, but behaviorally and cognitively relevant, situations, such as multi-item WM or WM for sequences.

      (1) It is unclear to me how the conjunctive code that the authors use (i.e., Equation (3)) is constrained by the theoretical desiderata (i.e., compositionality) they list, or whether it is simply an ansatz, partly motivated by theoretical considerations and experimental observations.

      The "what" part: What the authors mean by "relationships" between stimuli is never clearly defined. From their argument (and from Figure 1b), it would seem that what they mean is "angles" between population vectors for all pairs of stimuli. If this is so, then the effect of elapsed time can only amount to a uniform rescaling of the components of the population vector (i.e., it must be a similarity transformation; rotations are excluded, if the linear-decoder vectors are to be time-independent); the scaling factor, then, must be a strictly monotonic function of time (increasing or decreasing), if one is to decode time. In other words, the "when" receptive fields must be the same for all neurons.

      The "when" part: The condition, \tau_3=\tau_1+\tau_2, does not appear to be used at all. In fact, it is unclear (to me at least) whether the model, as it is formulated, is able to represent time intervals between stimuli.

      (2) For the specific case considered, i.e., conjunctive coding, it would seem that one should be able to analytically work out the demixed PCA (see Kobak et al., 2016). More generally, it seems interesting to compare the results of the PCA and the demixed PCA in this specific case, even just using synthetic data.

      (3) In the Section "Dimensionality of neural trajectories...", there is some claim about how the dimensionality of the population activity goes up with the observation window T, backed up by numerical results that somehow mimic the results of Cueva et al. (2020) on experimental data. Is this a result that can be formally derived? Related to this point, it would be useful to provide a little more justification for Equation (17). Naively, one would think that the correlation matrix of the temporal component is always full-rank nominally, but that one can get excellent low-rank approximations (depending on T, following your argument).
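
      As a minimal numerical sketch of the question raised in point (3), and not a derivation of Equation (17): the code below assumes a hypothetical population of Gaussian temporal receptive fields with a common width (`WIDTH`) and evenly tiled centres (`CENTERS`), and uses the participation ratio of the uncentered time-by-time Gram matrix as a soft measure of dimensionality. These choices, and all names in the block, are illustrative assumptions; they are not the authors' model or the measure used by Cueva et al. (2020). Under these assumptions the Gram matrix is nominally full-rank, yet its effective dimensionality stays low and grows with the observation window T.

```python
import math

# Hypothetical population: Gaussian temporal receptive fields with a common
# width, with centres tiling an interval wider than any window tested below.
CENTERS = [-2.0 + 0.1 * k for k in range(81)]  # centres from -2.0 to 6.0
WIDTH = 0.35


def temporal_fields(t):
    """Activity of every model neuron at time t (assumed Gaussian tuning)."""
    return [math.exp(-((t - c) ** 2) / (2.0 * WIDTH ** 2)) for c in CENTERS]


def participation_ratio(T, dt=0.05):
    """Participation ratio (sum of eigenvalues)^2 / (sum of squared eigenvalues)
    of the uncentered Gram matrix K[t, t'] over the window [0, T].

    For a symmetric matrix, sum(eig) = trace(K) and sum(eig^2) = sum of all
    squared entries, so no eigendecomposition is needed.
    """
    times = [dt * k for k in range(int(round(T / dt)) + 1)]
    rates = [temporal_fields(t) for t in times]
    trace = 0.0
    trace_sq = 0.0
    for i, r_i in enumerate(rates):
        for j, r_j in enumerate(rates):
            k_ij = sum(a * b for a, b in zip(r_i, r_j))
            if i == j:
                trace += k_ij
            trace_sq += k_ij * k_ij
    return trace * trace / trace_sq


if __name__ == "__main__":
    for window in (1.0, 2.0, 4.0):
        print(window, participation_ratio(window))
```

      Whether the same growth holds for the temporal receptive fields actually used in the paper is exactly what a formal derivation would need to establish.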

    1. Reviewer #1 (Public review):

      Summary

      In this review paper, the authors describe the concept of neural correlates of consciousness (NCC) and explain how noninvasive neuroimaging methods fall short of being able to properly characterise an unconfounded NCC. They argue that intracranial research is a means to address this gap and provide a review of many intracranial neuroimaging studies that have sought to answer questions regarding the neural basis of perceptual consciousness.

      Strengths

      The authors have provided an in-depth, timely, and scholarly contribution to the study of NCCs. First and foremost, the review surveys a vast array of literature. The authors synthesise findings such that a coherent narrative of what invasive electrophysiology studies have revealed about the neural basis of consciousness can be easily grasped by the reader. The review is also, to the best of my knowledge, the first to specifically target intracranial approaches to consciousness and to describe their results in a single article. This is a credit to the authors: as it becomes ever harder to apply strict tests to theories of consciousness using methods such as fMRI and M/EEG, it is important to have informative resources describing the results of human intracranial research, so that theorists will have to constrain their theories further in accordance with such data. Insofar as the authors were aiming to provide a complete and coherent overview of intracranial approaches to the study of NCCs, I believe they have achieved their aim.

      Weaknesses

      Overall, I feel positive about this paper. However, there are a couple of aspects to the manuscript that I think could be improved.

      (1) Distinguishing NCCs from their prerequisites or consequences

      This section in the introduction was particularly confusing to me. Namely, in this section, the authors' aim is to explain how intracranial recordings can help distinguish 'pure' NCCs from their antecedents and consequences. However, the authors almost exclusively describe different tasks (e.g., no-report tasks) that have been used to help solve this problem, rather than elaborating on how intracranial recordings may resolve this issue. The authors claim that no-report designs rely on null findings, and invasive recordings can be more sensitive to smaller effects, which can help in such cases. However, this motivation pertains to the previous sub-section (limits of noninvasive methods), since it is primarily concerned with the lack of temporal and spatial resolution of fMRI and M/EEG. It is not, in and of itself, a means to distinguish NCCs from their confounds.

      As such, in its current formulation, I do not find the argument that intracranial recordings are better suited to identifying pure NCCs (i.e. separating them from pre- or post-processing) convincing. To me, this is a problem solved through novel paradigms and better-developed theories. As it stands, the paper justifies my position by highlighting task developments that help to distinguish NCCs from prerequisites and consequences, rather than giving a novel argument as to why intracranial recordings outperform noninvasive methods beyond the reasons they explained in the previous section. Again, this position is justified when, from lines 505-506, the authors describe how none of the reported single-cell studies were able to dissociate NCCs from post-perceptual processing. As such, it seems as if, even with intracranial recording, NCCs and their confounds cannot be disentangled without appropriate tasks.

      The section 'Towards Better Behavioural Paradigms' is a clear attempt to address these issues and, as such, I am sure the authors share the concerns I am raising. Still, I remain unconvinced that distinguishing NCCs from pre-/post-processing is a fair motivation for using intracranial over noninvasive measures.

      (2) Drawing misleading conclusions from certain studies

      There are passages of the manuscript where the authors draw conclusions from studies that are not necessarily warranted by the studies they cite. For instance:

      Lines 265 - 271: "The results of these two studies revealed a complex pattern: on the one hand, HGA in the lateral occipitotemporal cortex and the ventral visual cortex correlated with stimulus strength. On the other hand, it also correlated with another factor that does not appear to play a role in visibility (repetition suppression), and did not correlate with a non-sensory factor that affects visibility reports (prior exposure). These results suggest that activity in occipitotemporal cortex regions reflecting higher-order visual processing may be a precursor to the NCC but not an NCC proper."

      It's possible to imagine a theory that would predict HGA could correlate with stimulus strength and repetition suppression, or that it would not correlate with prior exposure (e.g. prior exposure could impact response bias without affecting subjective visibility itself). The authors describe this exact ambiguity in interpretation later in the article (line 664), but in its current form, at least in line 270 (when the study is most extensively discussed), the manuscript heavily implies that HGA is not an NCC proper. This generates a false impression that intracranial recordings have conclusively determined that occipitotemporal HGA is not a pure NCC, which is certainly a premature conclusion.

      Line 243: "Altogether, these early human intracranial studies indicate that early-latency visual processing steps, reflected in broadband and low gamma activity, occur irrespective of whether a stimulus is consciously perceived or not. They also identified a candidate NCC: later (>200 ms) activity in the occipitotemporal region responsible for higher-order visual processing."

      The authors claim in this section that later (>200ms) activity in occipitotemporal regions may be a candidate for an NCC. However, the Fisch et al. (2009) study they describe in support of this conclusion found that early (~150ms) activity could dissociate conscious and unconscious processing. This would suggest that it is early processing that lays claim to perceptual consciousness. The authors explicitly describe the Fisch et al. results as showing evidence for early markers of consciousness (line 240: '...exhibited an early...response following recognized vs unrecognised stimuli'). Yet only a few lines later they use this to support the conclusion that a candidate NCC is 'later (>200ms) activity in the occipitotemporal region' (line 245). As such, I am not sure what conclusion the authors want me to draw from these studies.

      This problem is repeated in lines 386-387: "Altogether, studies that investigated the cortical correlates of visual consciousness point to a role of neural responses starting ~250 ms after stimulus onset in the non-primary visual cortex and prefrontal cortex."

      This seems to be directly in conflict with the Fisch et al. results, which show that correlates of consciousness can begin ~100ms earlier than the authors state in this passage.

      (3) Justifying single-neuron cortical correlates of consciousness

      The purpose of the present manuscript is to highlight why and how intracortical measures of neural activity can help reveal the neural correlates of perceptual consciousness. As such, in the section 'Single-neuron cortical correlates of perceptual consciousness', I think the paper is lacking an argument as to why single-neuron research is useful when searching for the NCC. Most theories of consciousness are based around circuit or system-level analyses (e.g., global ignition, recurrent feedback, prefrontal indexing, etc.) and usually do not make predictions about single cells. Without any elaboration or argument as to why single-cell research is necessary for a science of consciousness, the research described in this section, although excellent and valuable in its own right, seems out of place in the broader discussion of NCCs. A particularly strong interpretation here could be that intracranial recordings mislead researchers into studying single cells simply because it is the finest level of analysis, rather than because it offers helpful insight into the NCCs.

      (4) No mention of combined fMRI-EEG research

      A minor point, but I was surprised that the authors did not mention any combined fMRI-EEG research when they were discussing the limits of noninvasive recordings. Intracortical recordings are one way to surpass the spatial and temporal resolution limits of M/EEG and fMRI respectively, but studies that combine fMRI and EEG are also an alternative means to solve this problem: by combining the spatial resolution of fMRI with the temporal resolution of EEG, researchers can - in theory - compare when and where certain activity patterns (be they univariate ERPs or multivariate patterns) arise. The authors do cite one paper (Dellert et al., 2021 JNeuro) that used this kind of setup, but they discuss it only with respect to the task and ignore the recording method. The argument for using intracranial recordings is weaker for not mentioning a viable, noninvasive alternative that resolves the same issues.

    2. Reviewer #2 (Public review):

      Summary:

      In this work, the authors review the study of the neural correlates of consciousness (NCCs). They discuss several of the difficulties that researchers must face when studying NCCs, and argue that several of these difficulties can be alleviated by using intracranial recordings in humans.

      They describe what constitutes an NCC, and the difficulty of distinguishing an NCC proper from the prerequisites and consequences of conscious processing.

      They also describe the two main types of experimental designs used to study NCCs. These are the contrastive approach (with its report and non-report variants), and the supraliminal approach, each with its own merits and pitfalls.

      They discuss the limitations of non-invasive methods, such as fMRI, EEG and MEG, as well as the limitations of the use of invasive recordings in non-human animals.

      After setting the stage in this way, the authors provide an extensive review of the knowledge acquired by using invasive recordings in humans. This included population-level measurements in vision and in other sensory modalities, as well as single-neuron level studies. The authors also discuss studies of subcortical NCCs.

      The second half of this work discusses the theoretical insights gained through the use of intracranial recordings, as well as their limitations, and a perspective for future work.

      Strengths:

      This work offers an impressive review, which will serve as a useful reference document, both for newcomers to the study of NCC and for experienced researchers. The inclusion of non-visual and subcortical NCCs is of particular merit, as these have been understudied.

      Besides serving as a review, this work includes a perspective, exploring several directions to pursue for the progress of the field.

      Weaknesses:

      The intention of the authors is to argue how some of the problems faced when studying NCCs are alleviated by the use of intracranial recordings in humans. But in some cases, the link between the problems related to the study of NCCs and the advantages of intracranial recordings over non-invasive methods is not clear.

      For example, the authors explain the difficulties in distinguishing true NCCs from their prerequisites and consequences. This constitutes a difficult conceptual problem that plagues all recording techniques. The authors don't provide a convincing explanation of how intracranial recordings offer advantages over EEG or MEG when dealing with this problem.

      For example, the authors explain how the use of non-report designs to rule out post-perceptual processing relies on null results, which, according to them, are harder to interpret given the low resolution of non-invasive methods. But the interpretation of null results is actually more complicated in the case of intracranial recordings. As the coverage achieved by the electrodes is sparse, if a null result is attested, it remains possible that a true effect was present in a nearby patch of cortex out of coverage.

      The authors argue that the spatial resolution of intracranial recordings is better than that of EEG and MEG. While this is technically true (especially compared to EEG), the true spatial scale of the NCCs is unknown. If NCCs' span is in the mm range, then the additional spatial resolution of intracranial recordings might not be an advantage.

      Another factor that should be taken into consideration when assessing the spatial resolution of intracranial recordings is that while the listening zone of individual intracranial contacts is small, coverage is sparse and defined by clinical criteria (something that the authors discuss). In practice, the activity recorded by contacts is usually attributed to anatomically defined ROIs with a scale in the cm range. Given the sparse and uneven (across regions and patients) coverage afforded by intracranial recordings, the advantage of intracranial recordings in terms of spatial resolution is overstated.

      Appraisal of whether the authors achieved their aims:

      In this work, the authors have gathered an impressive review and have discussed several important problems in the field of study of NCCs, as well as provided a perspective on how the field could move forward.

      What is less clear is how the use of intracranial recordings per se holds potential to overcome problems such as the distinction between true NCCs and the prerequisites and consequences of conscious processing.

      Discussion of the likely impact of the work on the field:

      This work has the potential of becoming a must-read for anyone working in the field of consciousness research.

    3. Reviewer #3 (Public review):

      Summary:

      This narrative review provides a clear, well-structured, and comprehensive synthesis of intracerebral recording work on the neural correlates of consciousness. It is written in an accessible manner that will be useful to a broad community of researchers, from those new to iEEG to specialists in the field.

      Strengths:

      The manuscript successfully integrates methodological and theoretical perspectives and offers a balanced overview of current, sometimes contradictory evidence. As such, the manuscript is important as it calls for a concerted and better exploration of NCCs using iEEG in the future.

      Weaknesses:

      The manuscript extensively discusses the use of "report" as a criterion for identifying conscious perception and its limitations for separating correlates of consciousness from post-consciousness processes, yet the term is not defined at the outset. The authors should specify what they mean by "report" (e.g., verbal report, nonverbal self-report, or any meta-cognitive indication of experience). Importantly, this definition should be explicitly linked to the theoretical landscape: whether the authors adopt an access-consciousness perspective in which (self) reportability is central, or whether the review also aims to address phenomenal consciousness. Making this conceptual grounding explicit at the beginning will help readers interpret the empirical work surveyed throughout the review.

      In addition, the review would benefit from an earlier introduction of the distinction between states and contents of consciousness. This distinction becomes important in the later section on anaesthesia, sleep, and epileptic seizures, where the focus shifts from content-specific NCCs to alterations in global states. Presenting these definitions upfront and briefly explaining how states and contents interact would strengthen the coherence of the manuscript.

      Overall, this is an excellent and timely review. With clearer initial theoretical definitions of consciousness, the manuscript will offer an even stronger conceptual framework for interpreting intracerebral studies of consciousness.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates how individuals with chronic temporomandibular disorder (TMD) learn from uncertain rewards, using a probabilistic three-armed bandit task and computational modelling. The authors aim to identify whether people living with chronic pain show altered learning under uncertainty and how such differences might relate to psychological symptoms.

      Strengths:

      The work addresses an important question about how chronic pain may influence cognition and motivation. The task design is appropriate for probing adaptive learning, and the modelling approach is novel. The findings of altered uncertainty updating in the TMD group are interesting.

      Weaknesses:

      Several aspects of the paper limit the strength of the conclusions. The group differences appear only in model-derived parameters, with no corresponding behavioural differences in task performance. Model parameters do not correlate with pain severity, making the proposed mechanistic link between pain and learning speculative. Some of the interpretations extend beyond what the data can directly support.

    2. Reviewer #2 (Public review):

      Summary:

      In this paper, the authors report on a case-control study in which participants with chronic pain (TMD) were compared to controls on performance of a three-option learning task. The authors find no difference in task behavior, but fit a model to this behavior and suggest that differences in the model-derived metrics (specifically, change in learning rate/estimated volatility/model estimated uncertainty) reveal a relevant between-group effect. They report a mediation effect suggesting that group differences on self-report apathy may be partially mediated by this uncertainty adaptation result.

      Strengths:

      The role of sensitivity to uncertainty in pathological states is an interesting question and is the focus of a reasonable amount of research at present. This paper provides a useful assessment of these processes in people with chronic pain.

      Weaknesses:

      (1) The interpretation of the model in the absence of any apparent behavioral effect is not convincing. The model is quite complex with a number of free parameters (what these parameters are is not well explained in the methods, although they seem to be presented in the supplement). These parameters are fitted to participant choice behavior - that is, they explain some sort of group difference in this choice behavior. The authors haven't been able to demonstrate what this difference is. The graphs of learning rate per group (Figure 2) suggest that the control group has a higher initial learning rate and a lower later learning rate. If this were actually the case, you would expect to see it reflected in the choice data (the control group should show higher lose-shift behavior earlier on, with this then declining over time, and the TMD group should show no change). This behavior is not apparent. The absence of a clear effect on behavior suggests that the model results are more likely to be spurious.

      (2) As far as I could see, the actual parameters of the model are not reported. The results (Figure 2) illustrate the trial-level model-estimated uncertainty, learning rate, etc., but these differ because the fitted model parameters differ. The graphs look like there are substantial differences in v0 (which was not well recovered), but presumably lambda, at least, also differs. The mean (SD) group values for these parameters should be reported, as should the correlations between them (it looks very much like they will be correlated).

      (3) The task used seems ill-suited to measuring the reported process. The authors report the performance of a restless bandit task and find an effect on uncertainty adaptation. The task does not manipulate uncertainty (there are no periods of high/low uncertainty) and so the only adaptation that occurs in the task is the change from what appears to be the participants' prior beliefs about uncertainty (which appear to be very different between groups - i.e. the lines in Figure 2a,b,c are very different at trial 0). If the authors are interested in measuring adaptation to uncertainty, it would clearly be more useful to present participants with periods of higher or lower uncertainty.

      (4) The main factor driving the better fit of the authors' preferred model over listed alternatives seems to be the inclusion of an additive uncertainty term in the softmax; this differentiates the chosen model from the other two Kalman filter-based models that perform less well. But a similar term is not included in the RW models. Given that the uncertainty of a binary outcome can be estimated as p(1-p), and the RW models are estimating p, this would seem relatively straightforward to do. It would be useful to know if the factor that actually drives better model fit is indeed in the decision stage (rather than the learning stage).
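
      The control model suggested in point (4) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the inverse temperature `beta`, and the uncertainty weight `phi` are hypothetical, and the only ingredients taken from the text are a Rescorla-Wagner estimate of p and the Bernoulli variance p(1-p) entering the softmax additively.

```python
import math


def rw_update(p, outcome, alpha):
    """Rescorla-Wagner update of the estimated reward probability p."""
    return p + alpha * (outcome - p)


def bernoulli_uncertainty(p):
    """Variance of a binary outcome with success probability p."""
    return p * (1.0 - p)


def softmax_with_uncertainty(ps, beta, phi):
    """Choice probabilities when each option's utility is its value estimate
    (weighted by beta, hypothetical) plus an additive uncertainty term
    (weighted by phi, hypothetical)."""
    utilities = [beta * p + phi * bernoulli_uncertainty(p) for p in ps]
    top = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    estimates = [0.2, 0.5, 0.8]
    print(softmax_with_uncertainty(estimates, beta=3.0, phi=0.0))
    print(softmax_with_uncertainty(estimates, beta=3.0, phi=2.0))
```

      Setting `phi` to zero recovers a standard softmax over RW estimates, so comparing fits with `phi` free versus fixed at zero would isolate the contribution of the decision-stage term.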

    3. Reviewer #3 (Public review):

      This paper applies a computational model to behavior in a probabilistic operant reward learning task (a 3-armed bandit) to uncover differences between individuals with temporomandibular disorder (TMD) and healthy controls. Integrating computational principles and models into pain research is an important direction, and the findings here suggest that TMD is associated with subtle changes in how uncertainty is represented over time as individuals learn to make choices that maximize reward. There are a number of strengths, including the comparison of a volatile Kalman filter (vKF) model to some standard base models (Rescorla-Wagner with 1 or 2 learning rates) and parameter recovery analyses suggesting that the combination of task and vKF model may be able to capture some properties of learning and decision-making under uncertainty that may be altered in those suffering from chronic pain-related conditions.

      I've focused my comments in four areas: (1) Questions about the patient population, (2) Questions about what the findings here mean in terms of underlying cognitive/motivational processes, (3) Questions about the broader implications for understanding individuals with TMD and other chronic pain-related disorders, and (4) Technical questions about the models and results.

      (1) Patient population

      This is a computational modelling study, so it is light on characterization of the population, but the patient characteristics could matter. The paper suggests they were hospitalized, but this is not a condition that requires hospitalization per se. It would be helpful to connect and compare the patient characteristics with large-scale studies of TMD, such as the OPPERA study led by Maixner, Fillingim, and Slade.

      (2) What cognitive/motivational processes are altered in TMD

      The study finds a pattern of alterations in TMD patients that seems clear in Figure 2. Healthy controls (HC) start the task with high estimates of volatility, uncertainty, and learning rate, which drop over the course of the task session. This is consistent with a learner that is initially uncertain about the structure of the environment (i.e., which options are rewarded and how the contingencies change over time) but learns that there is a fixed or slowly changing mean and stationary variance. The TMD patients start off with much lower volatility, uncertainty, and learning rate - which are actually all near 0 - and they remain stable over the course of learning. This is consistent with a learner who believes they know the structure of the environment and ignores new information.

      What is surprising is that this pattern of changes over time was found in spite of null group differences in a number of aspects of performance: (1) stay rate, (2) switch rate, (3) win-stay/lose-switch behaviors, (4) overall performance (corrected for chance level), (5) response times, (6) autocorrelation, (7) correlations between participants' choice probability and each option's average reward rate, (8) choice consistency (though how this was operationalized is not described), and (9) win-stay-lose-shift patterns over time. I'm curious about how the patterns in Figure 2 would emerge if standard aspects of performance are essentially similar across groups (though the study cannot provide evidence in favor of the null). It will be important to replicate these patterns in larger, independent samples with preregistered analyses.
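
      For concreteness, win-stay and lose-shift rates of the kind listed above, and their change over the session, might be computed along the following lines. This is only a sketch under assumptions: the function names and the block-wise split are hypothetical, and the paper's own operationalization may differ.

```python
def win_stay_lose_shift(choices, rewards):
    """Return (win-stay rate, lose-shift rate) for one trial sequence.

    choices: option chosen on each trial; rewards: truthy if that trial paid out.
    A rate is NaN when there were no preceding wins (or losses) to condition on.
    """
    win_n = win_stay = lose_n = lose_shift = 0
    for t in range(1, len(choices)):
        stayed = choices[t] == choices[t - 1]
        if rewards[t - 1]:
            win_n += 1
            win_stay += stayed
        else:
            lose_n += 1
            lose_shift += not stayed
    ws = win_stay / win_n if win_n else float("nan")
    ls = lose_shift / lose_n if lose_n else float("nan")
    return ws, ls


def blockwise_wsls(choices, rewards, block_size):
    """Rates within consecutive blocks of trials, to look at change over time.
    Transitions that straddle a block boundary are ignored."""
    return [
        win_stay_lose_shift(choices[i:i + block_size], rewards[i:i + block_size])
        for i in range(0, len(choices), block_size)
    ]
```

      If the control group's learning rate really falls over the session while the TMD group's stays flat, one would expect the block-wise lose-shift rate to decline in controls only.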

      The authors believe that this pattern of findings reveals that TMD patients "maintain a chronically heightened sensitivity to environmental changes" and relate the findings to predictive processing, a hallmark of which (in its simplest form) is precision-weighted updating of priors. They also state that the findings are not related to reduced overall attentiveness or failure to understand the task, but describe them as deficits or impairments in calibrating uncertainty.

      The pattern of differences could, in fact, result from differences in prior beliefs, conceptualization of the task, or learning. Unpacking these will be important steps for future work, along with direct measures of priors, cognitive processes during learning, and precision-weighted updating.

      (3) Implications for understanding chronic pain

      If the findings and conclusions of the paper are correct, individuals with TMD and perhaps other pain-related disorders may have fundamental alterations in the ways in which they make decisions about even simple monetary rewards. The broader questions for the field concern (1) how generalizable such alterations are across tasks, (2) how generalizable they are across patient groups and, conversely, how specific they are to TMD or chronic pain, (3) whether they are the result of neurological dysfunction, as opposed to (e.g.) adaptive strategies or assumptions about the environment/task structure.

      It will be important to understand which features of patients' and/or controls' cognition are driving the changes. For example, could the performance differences observed here be attributable to a reduced or altered understanding of the task instructions, more uncertainty about the rules of the game, different assumptions about environments (i.e., that they are more volatile/uncertain or less so), or reduced attention or interest in optimizing performance? Are the controls OVERconfident in their understanding of the environment?

      This set of questions will not be easy to answer and will be the work of many groups for many years to come. It is a judgment call how far any one paper must go to address them, but my view is that it is a collaborative effort. Start with a finding, replicate it across labs, take the replicable phenomena and work to unpack the underlying questions. The field must determine whether it is this particular task with this model that produces case-control differences (and why), or whether the findings generalize broadly. Would we see the same findings for monetary losses, sounds, and social rewards? Tasks with painful stimuli instead of rewards?

      Another set of questions concerns the space of computational models tested, and whether their parameters are identifiable. An alteration in estimated volatility or learning rate, for example, can come from multiple sources. In one model, it might appear as a learning rate change and in another as a confirmation bias. It would be interesting in this regard to compare the "mechanisms" (parameters) of other models used in pain neuroscience, e.g., models by Seymour, Mancini, Jepma, Petzschner, Smith, Chen, and others (just to name a few).

      One immediate next step here could be to formally compare the performance of both patients and controls to normatively optimal models of performance (e.g., Bayes optimal models under different assumptions). This could also help us understand whether the differences in patients reflect deficits and what further experiments we would need to pin that down.

      In addition, the volatility parameter in the computational model correlated with apathy. This is interesting. Is there a way to distinguish apathy as a particular clinical characteristic and feature of TMD from apathy in the sense of general disinterest in optimal performance that may characterize many groups?

      If we know this, what actionable steps does it lead us to take? Could we take steps to reduce apathy and thus help TMD patients better calibrate to environmental uncertainty in their lives? Or take steps to recalibrate uncertainty (i.e., increase uncertainty adaptation), with benefits on apathy? A hallmark of a finding that the field can build off of is the questions it raises.

      (4) Technical questions about the models and results

      Clarification of some technical points would help interpret the paper and findings further:

      (a) Was the reward probability truly random? Was the random walk different for each person, or constrained?

      (b) When were self-report measures administered, and how?

      (c) Pain assessments: What types of pain? Was a body map assessed? Widespreadness? Pain at the time of the test, or pain in general?

      (d) Parameter recovery: As you point out, r = 0.47 seems very low for recovery of the true quantity, but this depends on noise levels and on how the parameter space is sampled. Is this noise-free recovery, and is it robust to noise? Are the examples of true parameters drawn from the space of participants, or do they otherwise systematically sample the space of true parameters?

      (e) What are the covariances across parameter estimates and resultant confusability of parameter estimates (e.g., confusion matrix)?

      (f) It would be helpful to have a direct statistical comparison of controls and TMD on model parameter estimates.

      (g) Null statistical findings on differences in correlations should not be interpreted as a lack of a true effect. Bayes Factors could help, but an analysis of them will show that hundreds of people are needed before it is possible to say there are no differences with reasonable certainty. Some journals enforce rules around the kinds of language used to describe null statistical findings, and I think it would be helpful to adopt them more broadly.

      (h) What is normatively optimal in this task? Are TMD patients less so, or not? The paper states "aberrant precision (uncertainty) weighting and misestimation of environmental volatility". But: are they misestimates?

      (i) It's not clear how well the choice of prior variance for all parameters (6.25) is informed by previous research, as sensible values may be task- and context-dependent. Are the main findings robust to how priors are specified in the HBI model?
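
      The parameter-recovery question in (d) can be made concrete with a minimal sketch. It assumes a simple delta-rule learner with a softmax choice rule on a two-armed bandit, which is a stand-in and not the HBI model used in the paper. All names, the drift size, and the grid are hypothetical choices for illustration.

```python
import math
import random


def simulate_choices(alpha, beta, n_trials, rng):
    """Simulate a delta-rule learner on a two-armed bandit with drifting payout."""
    q = [0.5, 0.5]
    p_reward = 0.7  # chance that arm 0 pays out; arm 1 pays with 1 - p_reward
    choices, rewards = [], []
    for _ in range(n_trials):
        # Softmax (logistic) choice between the two arms
        p_choose0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        c = 0 if rng.random() < p_choose0 else 1
        p = p_reward if c == 0 else 1.0 - p_reward
        r = 1.0 if rng.random() < p else 0.0
        q[c] += alpha * (r - q[c])  # prediction-error update
        choices.append(c)
        rewards.append(r)
        # Bounded Gaussian random walk on the payout probability
        p_reward = min(0.9, max(0.1, p_reward + rng.gauss(0.0, 0.05)))
    return choices, rewards


def neg_log_lik(alpha, beta, choices, rewards):
    """Negative log-likelihood of the observed choices under the same model."""
    q = [0.5, 0.5]
    nll = 0.0
    for c, r in zip(choices, rewards):
        p0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        p_c = p0 if c == 0 else 1.0 - p0
        nll -= math.log(max(p_c, 1e-12))
        q[c] += alpha * (r - q[c])
    return nll


def fit_alpha(choices, rewards, beta, grid):
    """Grid-search maximum-likelihood estimate of the learning rate."""
    return min(grid, key=lambda a: neg_log_lik(a, beta, choices, rewards))


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0


def recovery_correlation(n_subjects=40, n_trials=200, beta=5.0, seed=0):
    """Correlate true and re-fitted learning rates across simulated agents."""
    rng = random.Random(seed)
    grid = [i / 50 for i in range(1, 50)]
    true_alphas = [rng.uniform(0.05, 0.95) for _ in range(n_subjects)]
    fitted = []
    for a in true_alphas:
        choices, rewards = simulate_choices(a, beta, n_trials, rng)
        fitted.append(fit_alpha(choices, rewards, beta, grid))
    return pearson_r(true_alphas, fitted)


if __name__ == "__main__":
    print(recovery_correlation())
```

      Varying `n_trials`, the softmax temperature, or the range from which true parameters are drawn would show how strongly the recovery correlation depends on noise and on how the parameter space is sampled.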

    1. Reviewer #1 (Public review):

      Summary:

      In the manuscript "Pathogen-Phage Geomapping to Overcome Resistance," Do et al. present an impressive demonstration of using geographical sampling and metagenomics to guide sample choice for enrichment in human-associated microbes and the pathogen of interest to increase the chances of success for isolating phages active against highly resistant bacterial strains. The authors document many notable successes (17!) with highly resistant bacterial isolates and share a thoughtfully structured phage discovery effort, potentially opening the door to similar geomapping efforts across the field. While the work is methodologically strong and valuable for the community, there are a few areas where additional clarification and analysis could better align the claims with the data presented.

      Strengths:

      (1) The manuscript describes a well-executed and transparent example of overcoming a major obstacle in therapeutic virus identification, providing a practical success story that will resonate with researchers in microbiology and medicine.

      (2) Many phage researchers have anecdotally experienced a similar phenomenon, that a particular wastewater treatment plant always seems to have the pathogens you need. Quantifying this with metagenomics modernizes and adds evidence to this phenomenon in a way that could help researchers reproduce this success in a methodical way.

      (3) The methodology of combining environmental sampling, viral screening, and host-range analysis is clearly articulated and reproducible, offering a valuable blueprint for others in the field.

      (4) The data are presented with appropriate analytical rigor, and the results include robust sequencing and metagenomic profiling that deepen understanding of local viral communities.

      (5) The 17 successes yielding 35 phages have a lot of phylogenetic novelty beyond what the Tailor labs have typically found with previous methods.

      (6) The work highlights a practical and innovative solution to an increasingly important clinical problem, supporting the development of personalized antiviral strategies.

      Weaknesses:

      (1) The central concept of geomapping as a broadly applicable strategy is well supported by the 17 successes documented in the paper. Although these successes are a clear strength, the study does not include a comparative analysis across multiple sites with varying sampling outcomes for different bacterial types, which would be necessary to validate this claim more generally.

      (2) Some elements, such as beta diversity comparisons and the metagenomics analysis of viral dark matter, would benefit from additional statistical analysis and clearer context.

      (3) Claims about therapeutic cocktails would be better framed as speculative and/or moved to the discussion section.

      (4) The manuscript could be strengthened by elaborating on the scope and composition of the phage and bacterial isolate collections, which are important for interpreting the broader significance of the findings.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Do and colleagues aims to develop a workflow for isolating and identifying bacteriophages with potential applications in phage therapy against antibiotic-resistant pathogens. The workflow integrates geΦmapping as a strategy to identify potential phage sources, ΦHD as a device for phage concentration, and RΦ as a phage library constructed from the initial sampling, resulting in the discovery of 36 new phages. The paper is overall interesting, and the proposed method appears robust and effective.

      Strengths:

      The proposed methods combine state-of-the-art strategies to address the ever-increasing problem of antibiotic resistance. The methods are robust, and the controls are appropriate. The integration of environmental sampling, concentration strategies, and downstream genomic characterization is a clear strength and provides a potentially scalable framework for identifying candidate therapeutic phages. The manuscript is clearly written overall, and the results support the main conclusions.

      Weaknesses:


      While the authors acknowledge several limitations, some aspects require clearer framing or additional clarification. The proposed workflow focuses exclusively on aquatic environments as sources of phages, which may limit the diversity of hosts and phage types recoverable using this approach. Some interpretations, particularly regarding taxonomic classification and sampling saturation, would benefit from more cautious wording given current limitations in viral taxonomy and the observed data.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors trained rats on a "figure 8" go/no-go odor discrimination task. Six odor cues (3 rewarded and 3 non-rewarded) were presented in a fixed temporal order and arranged into two alternating sequences that partially overlap (Sequence #1: 5⁺-0⁻-1⁻-2⁺; Sequence #2: 3⁺-0⁻-1⁻-4⁺), forming an abstract figure-8 structure of looping odor cues.

      This task is particularly well-suited for probing representations of hidden states, defined here as the animal's position within the task structure beyond superficial sensory features. Although the task can be solved without explicit sequence tracking, it affords the opportunity to generalize across functionally equivalent trials (or "positions") in different sequences, allowing the authors to examine how OFC representations collapse across latent task structure.

      Rats were first trained to criterion on the task and then underwent 15 days of self-administration of either intravenous cocaine (3 h/day) or sucrose. Following self-administration, electrodes were implanted in lateral OFC, and single-unit activity was recorded while rats performed the figure-8 task.

      Across a series of complementary analyses, the authors report several notable findings. In control animals, lOFC neurons exhibit representational compression across corresponding positions in the two sequences. This compression is observed not only in trials/positions involving an overlapping odor (e.g., Position 3 = odor 1 in sequence 1 vs sequence 2), but also in trials/positions involving distinct, sequence-specific odors (e.g., Position 4: odor 2 vs odor 4), indicating generalization across functionally equivalent task states. Ensemble decoding confirms that sequence identity is weakly decodable at these positions, consistent with the idea that OFC representations collapse incidental differences in sensory information into a common latent or hidden state representation. In contrast, cocaine-experienced rats show persistently stronger differentiation between sequences, including at overlapping odor positions.

      Strengths:

      Elegant behavioral design that affords the detection of hidden-state representations.

      Sophisticated and complementary analytical approaches (single-unit activity, population decoding, and tensor component analysis).

      Weaknesses:

      The number of subjects is small, so idiosyncratic, animal-specific effects cannot be fully ruled out.

      Comments

      (1) Emergence of sequence-dependent OFC representations across learning.

      A conceptual point that would benefit from further discussion concerns the emergence of sequence-dependent OFC activity at overlapping positions (e.g., position P3, odor 1). This implies knowledge of the broader task structure. Such representations are presumably absent early in learning, before rats have learned the sequence structure. While recordings were conducted only after rats were well trained, it would be informative if the authors could comment on how they envision these representations developing over learning. For example, does sequence differentiation initially emerge as animals learn the overall task structure, followed by progressive compression once animals learn that certain states are functionally equivalent? Clarifying this learning-stage interpretation would strengthen the theoretical framing of the results.

      (2) Reference to the 24-odor position task

      The reference to the previously published 24-odor position task is not well integrated into the current manuscript. Given that this task has already been published and is not central to the main analyses presented here, the authors may wish to a) better motivate its relevance to the current study or b) consider removing this supplemental figure entirely to maintain focus.

      (3) Missing behavioral comparison

      Line 117: the authors state that absolute differences between sequences differ between cocaine and sucrose groups across all three behavioral measures. However, Figure 1 includes only two corresponding comparisons (Fig. 1I-J). Please add the third measure (% correct) to Figure 1, and arrange these panels in an order consistent with Figure 1F-H (% correct, reaction time, poke latency).

      (4) Description of the TCA component

      Line 220: the authors wrote that the first TCA component exhibits low amplitude at positions P1 and P4 and high amplitude at positions P2 and P3. However, Figure 3 appears to show the opposite pattern (higher magnitude at P1 and P4 and lower magnitude at P2 and P3). Please check and clarify this apparent discrepancy. Alternatively, a clearer explanation of how to interpret the temporal dynamics and scaling of this component in the figure would help readers correctly understand the result.

      (5) Sucrose control

      Sucrose self-administration is a reasonable control for instrumental experience and reward exposure, but it means that this group also acquired an additional task involving the same reinforcer. This experience may itself influence OFC representations and could contribute to the generalization observed in control animals. A brief discussion of this possibility would help contextualize the interpretation of cocaine-related effects.

      (6) Acknowledge low N

      The number of rats per group is relatively low. Although the effects appear consistent across animals within each group, this sample size does not fully rule out idiosyncratic, animal-specific effects. This limitation should be explicitly acknowledged in the manuscript.

      (7) Figure 3E-F: The task positions here are ordered differently (P1, P4, P2, P3) than elsewhere in the paper. Please reorder them to match the rest of the paper.

    2. Reviewer #2 (Public review):

      In the current study, the authors use an odor-guided sequence learning task described as a "figure 8" task to probe neuronal differences in latent state encoding within the orbitofrontal cortex after cocaine (n = 3) vs sucrose (n = 3) self-administration. The task uses six unique odors which are divided into two sequences that run in series. For both sequences, the 2nd and 3rd odors are the same and predict that reward is not available at the reward port. The 1st and 4th odors are unique, and are followed by reward. Animals are well-trained before undergoing electrode implant and catheterization, and then retrained for two weeks prior to recording. The hypothesis under test is that cocaine-experienced animals will be less able to use the latent task structure to perform the task, and instead encode information about each unique sequence that is largely irrelevant. Behaviorally, both cocaine and sucrose-experienced rats show high levels of accuracy on task, with some group differences noted. When comparing reaction times and poke latencies between sequences, more variability was observed in the cocaine-treated group, implying animals treated these sequences somewhat differently. Analyses done at the single unit and ensemble level suggest that cocaine self-administration increased the encoding of sequence-specific information, but decreased generalization across sequences. For example, the ability to decode odor position and sequence from neuronal firing in cocaine-treated animals was greater than in controls. This pattern resembles that observed within the OFC of animals that had fewer training sessions. The authors then conducted tensor component analysis (TCA) to enable a more "hypothesis agnostic" evaluation of their data.

      Overall, the paper is well written and the authors do a good job of explaining quite complicated analyses so that the reader can follow their reasoning. I have the following comments.

      While well-written, the introduction mainly summarises the experimental design and results, rather than providing a summary of relevant literature that informed the experimental design. More details regarding the published effects of cocaine self-administration on OFC firing, and on tests of behavioral flexibility across species, would ground the paper more thoroughly in the literature and explain the need for the current experiment.

      For Fig 1F, it is hard to see the magnitude of the group difference with the graph showing 0-100%. Can the y-axis be adjusted to make this difference more obvious? It looks like the cocaine-treated animals were more accurate at P3. Is that right?

      The concluding section is quite brief. The authors suggest that the failure to generalize across sequences observed in the current study could explain why people who are addicted to cocaine do not use information learned e.g. in classrooms or treatment programs to curtail their drug use. They do not acknowledge the limitations of their study e.g. use of male rats exclusively, or discuss alternative explanations of their data.

      Is it a problem that neuronal encoding of the "positions" i.e. the specific odors was at or near chance throughout in controls? Could they be using a simpler strategy based on the fact that two successive trials are rewarded, then two successive trials are not rewarded, such that the odors are irrelevant?
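
      The simpler strategy raised above can be checked against the task structure as described in the reviews (Sequence #1: 5⁺-0⁻-1⁻-2⁺; Sequence #2: 3⁺-0⁻-1⁻-4⁺, alternating). The sketch below is illustrative only: the function names are hypothetical, and it assumes an idealized animal that never loses count of trials.

```python
# (odor, rewarded) pairs for the two alternating sequences described in the reviews
SEQ1 = [(5, True), (0, False), (1, False), (2, True)]
SEQ2 = [(3, True), (0, False), (1, False), (4, True)]


def trial_stream(n_cycles):
    """Concatenate the two sequences in alternation for n_cycles repetitions."""
    trials = []
    for _ in range(n_cycles):
        trials.extend(SEQ1)
        trials.extend(SEQ2)
    return trials


def odor_blind_prediction(trial_index):
    """Predict reward from trial count alone, ignoring odor identity."""
    # Within every block of four trials, the first and last are rewarded.
    return trial_index % 4 in (0, 3)


def accuracy(n_cycles=10):
    """Fraction of trials on which the odor-blind rule matches the outcome."""
    trials = trial_stream(n_cycles)
    hits = sum(
        odor_blind_prediction(i) == rewarded
        for i, (_odor, rewarded) in enumerate(trials)
    )
    return hits / len(trials)


if __name__ == "__main__":
    print(accuracy())
```

      Under these assumptions the counting rule predicts every outcome, so behavioral accuracy alone cannot distinguish it from odor-based responding. Whether rats can maintain such a count without error is a separate empirical question.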

      When looking at the RT and poke latency graphs, it seems the cocaine-experienced rats were faster to respond to rewarded odors, and also faster to poke after P3. Does this mean they were more motivated by the reward?

    1. Reviewer #1 (Public review):

      Summary:

      This study makes a significant and timely contribution to the field of attention research. By providing the first direct neuroimaging evidence for the integration-segregation theory of exogenous attention, it fills a critical gap in our understanding of the neural mechanisms underlying inhibition of return (IOR). The authors employ a carefully optimized cue-target paradigm combined with fMRI to elegantly dissociate the neural substrates of cue-target integration from those of segregation, thereby offering compelling support for the integration-segregation account. Beyond validating a key theoretical hypothesis, the study also uncovers an interaction between spatial orienting and cognitive conflict processing, suggesting that exogenous attention modulates conflict processing at both semantic and response levels. This finding sheds new light on the neural mechanisms that connect exogenous attentional orienting with cognitive control.

      Strengths:

      The experimental design is rigorous, the analyses are thorough, and the interpretation is well grounded in the literature. The manuscript is clearly written, logically structured, and addresses a theoretically important question. Overall, this is an excellent, high-impact study that advances both theoretical and neural models of attention.

      Weaknesses:

      While this study addresses an important theoretical question and presents compelling neuroimaging findings, a few additional details would help improve clarity and interpretation. Specifically, more information could be provided regarding the experimental conditions (SI and RI), the justification for the criteria used for excluding behavioral trials, and how the null condition was incorporated into the analyses. In addition, given the non-significant interaction effect in the behavioral results, the claim that the behavioral data "clearly isolated" distinct semantic and response conflict effects should be phrased more cautiously.

    2. Reviewer #2 (Public review):

      Summary:

      This study provides evidence for the integration-segregation theory of an attentional effect, widely cited as inhibition of return (IOR), from a neuroimaging perspective, and explores neural interactions between IOR and cognitive conflict, showing that conflict processing is potentially modulated by attentional orienting.

      Strengths:

      The integration-segregation theory was examined in a sophisticated experimental task that also accounted for cognitive conflict processing, which is phenomenologically related to IOR but "non-spatial" by nature. This study was carefully designed and executed. The behavioral and neuroimaging data were carefully analyzed and largely well presented.

      Weaknesses:

      The rationale for the experimental design was not clearly explained in the manuscript; more specifically, why the current ER-fMRI study would disentangle integration and segregation processes was not explained. The introduction of "cognitive conflict" into the present study was not well reasoned for a non-expert reader to follow.

      The presentation of the results can be further improved, especially the neuroimaging results. For instance, Figure 4 is challenging to interpret. If "deactivation" (or a reduction in activation) is regarded as a neural signature of IOR, this should be clearly stated in the manuscript.

    3. Reviewer #3 (Public review):

      Summary:

      This study aims to provide the first direct neuroimaging evidence relevant to the integration-segregation theory of exogenous attention - a framework that has shaped behavioral research for more than two decades but has lacked clear neural validation. By combining an inhibition-of-return (IOR) paradigm with a modified Stroop task in an optimized event-related fMRI design, the authors examine how attentional integration and segregation processes are implemented at the neural level and how these processes interact with semantic and response conflicts. The central goal is to map the distinct neural substrates associated with integration and segregation and to clarify how IOR influences conflict processing in the brain.

      Strengths:

      The study is well-motivated, addressing a theoretically important gap in the attention literature by directly testing a long-standing behavioral framework with neuroimaging methods. The experimental approach is creative: integrating IOR with a Stroop manipulation expands the theoretical relevance of the paradigm, and the use of a genetic-algorithm-optimized fMRI design ensures high efficiency. Methodologically, the study is sound, with rigorous preprocessing, appropriate modeling, and analyses that converge across multiple contrasts. The results are theoretically coherent, demonstrating plausible dissociations between integration-related activity in the fronto-parietal attention network (FEF, IPS, TPJ, dACC) and segregation-related activity in medial temporal regions (PHG, STG). The findings advance the field by supplying much-needed neural evidence for the integration-segregation framework and by clarifying how IOR modulates conflict processing.

      Weaknesses:

      Some interpretive aspects would benefit from clarification, particularly regarding the dual roles ascribed to dACC activation and the circumstances under which PHG and STG are treated as a single versus separate functional clusters. Reporting conventions are occasionally inconsistent (e.g., statistical formatting, abbreviation definitions), which may hinder readability. More detailed reporting of sample characteristics, exclusion criteria, and data-quality metrics, especially regarding the global-variance threshold, would improve transparency and reproducibility. Finally, some limitations of the study, including potential constraints on generalization, are not explicitly acknowledged and should be articulated to provide a more balanced interpretation.

    1. Reviewer #1 (Public review):

      Summary:

      Fahdan et al. present a study investigating the molecular programs underlying axon initial growth and regrowth in Drosophila mushroom body (MB) neurons. The authors leverage the fact that different Kenyon cell (KC) subtypes undergo distinct axonal events on the same developmental timeline: γ KCs prune and then regrow their axons during early pupation, whereas α/β KCs extend their axons for the first time during the same pupal period. Using bulk Smart-seq2 RNA sequencing across six developmental time points, the authors identify genes enriched during γ KC regrowth and α/β KC initial outgrowth, and subsequently perform an RNAi screen to determine which candidates are functionally required for these processes.

      Among these, they focus on Pmvk, a key enzyme in the mevalonate pathway. Both RNAi knockdown and a CRISPR-generated mutant produce strong γ KC regrowth defects. Knockdown of other mevalonate pathway components (Hmgcr, Mvk) partially recapitulates this phenotype. The authors propose that Pmvk promotes axonal regrowth through effects on the TOR pathway.

      Overall, this work identifies new molecular players in developmental axon remodeling and provides intriguing evidence connecting Pmvk to γ KC regrowth.

      While the Pmvk knockdown and loss-of-function data are compelling, the evidence that the mevalonate pathway broadly regulates γ KC axon regrowth is less clear. RNAi knockdown of enzymes upstream of Pmvk (Hmgcr, Mvk) produces only mild phenotypes, and knockdown of several downstream enzymes produces no phenotype. The authors attribute this discrepancy to the possibility of weak RNAi constructs, which is plausible but not fully demonstrated. It would be helpful for the authors to discuss alternative explanations, including non-canonical roles for Pmvk that may not require the full pathway, and clarify the extent to which the current data support the conclusion that the mevalonate pathway, rather than Pmvk specifically, is a core regulator of regrowth.

      It is not clear from the Methods whether γ KCs and α/β KCs were sorted from the same brains using orthogonal binary expression systems (e.g., Gal4 > reporter 1 and LexA > reporter 2), or isolated separately from different fly lines. If the latter, differences in genetic background, staging, or batch effects could influence transcriptional comparisons. This should be explicitly clarified in the Methods, and any associated limitations discussed in the manuscript.

      The authors have made important findings that contribute to our understanding of axon growth and regrowth. As written, some major claims are only partially supported, but these issues can be addressed through reframing and clarification. In particular, the manuscript would benefit from (1) a more cautious interpretation of the mevalonate pathway's role, potentially considering Pmvk non-canonical functions, and (2) addressing methodological ambiguities in the transcriptomic analysis.

    2. Reviewer #2 (Public review):

      Fahdan et al. set out to build upon their previous work outlining the genes involved in axon growth, targeting two axon growth states: initial growth and regrowth. They outline a debate in the field that axon regrowth (for instance, after injury or in the peripheral nervous system) is different from initial axon growth, for which the authors have previously demonstrated distinct mechanisms. The authors set out to directly compare the transcriptomes of initial axon growth and regrowth, specifically within the same neuronal environment and developmental time point. To this end, the authors used the well-characterized genetic tools available in Drosophila melanogaster (the fruit fly) to build a valuable dataset of genes involved at different time points in axon growth (alpha/beta Mushroom Body Kenyon cells) and regrowth (gamma Mushroom Body Kenyon cells). The authors then focus on genes that are upregulated during both initial axon growth and axon regrowth. Then, using this subset of genes, they screen for axonal growth and regrowth deficits by knocking down 300 of these genes. 12 genes are found to be phenotypically involved in both axon growth and regrowth based on RNAi gene-targeted knockdown in the Mushroom Body. Of these 12 genes, the authors focus on one gene, Pmvk, which is part of the mevalonate pathway. They then highlight other genes in this pathway. But these genes primarily affect axon regrowth, not initial axon growth, implicating metabolic pathways in axon regrowth. This comprehensive RNA-seq dataset will be a valuable resource for the field of axon growth and regrowth, as well as for other researchers studying the Mushroom Body.

      Strengths:

      This paper contains many strengths, including the in-depth sequencing of overlapping developmental time points during the alpha/beta KCs' initial axon growth and gamma KCs' regrowth. This produces a rich dataset of differentially expressed genes across different time points in either cell population during development. In addition, the authors characterized expression patterns at developmental time points for 30 Gal4 lines previously identified as alpha/beta KC-expressing. This is very helpful for Drosophila Mushroom Body researchers because the authors not only characterized alpha/beta expression but also alpha'/beta' expression, gamma expression, and non-MB expression. The authors comprehensively walked through identifying differentially expressed genes during alpha/beta axon growth, identifying a subset of overlapping upregulated genes between cell types, then systematically characterized whether knockdown of a subset of these genes produced an axonal growth defect, and finally selected 1 of 3 cell-autonomous genes important for gamma KCs regrowth to further study.

      The authors utilized the developing Mushroom Body in Drosophila melanogaster, which happens to have new neurons developing axons and neurons that have undergone pruning and are regrowing axons at the same developmental time. They are also in the same part of the brain (the Mushroom Body) and, in theory, since the authors implicate a metabolic pathway, they will have similar metabolic growth conditions.

      Identifying Pmvk and two other components of the mevalonate pathway in axon regrowth opens up novel avenues for future studies on the role this metabolic pathway may have in axon growth. The authors of this paper are also very upfront about their negative results, allowing researchers to avoid running redundant experiments and truly build on this work.

      Weaknesses:

      While the dataset produced in this study is a strength, certain aspects make it more challenging to interpret. For instance, the authors state that roughly equal numbers of males and females are used for sequencing, and this vagueness, coupled with only taking a subset of the GFP-labeled neurons during FACS sorting, can introduce confounds into the dataset. This may hold true in imaging studies as well, in which males and females were used interchangeably.

      Additionally, a rationale is needed to explain why random numbers of 1-7 were assigned to zero-expressing genes in the DESeq analysis. This does not conform to the way this analysis is usually performed. It can alter how genes across the dataset are normalized and requires further explanation.
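
      To illustrate why this matters for normalization, the sketch below re-implements median-of-ratios size factors in simplified form and applies them to a made-up count matrix before and after zeros are replaced with random integers from 1 to 7. It is not the DESeq2 API and not the authors' pipeline; the counts, function names, and seed are hypothetical.

```python
import math
import random
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors; counts[g][s] is gene g in sample s."""
    n_samples = len(counts[0])
    # Genes with any zero count have no finite log geometric mean and are skipped.
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo = [sum(math.log(c) for c in row) / n_samples for row in usable]
    return [
        math.exp(median(math.log(row[s]) - lg for row, lg in zip(usable, log_geo)))
        for s in range(n_samples)
    ]


def impute_zeros(counts, rng, low=1, high=7):
    """Replace every zero count with a random integer in [low, high]."""
    return [[c if c > 0 else rng.randint(low, high) for c in row] for row in counts]


# Made-up counts: three expressed genes, four genes with zero counts everywhere.
COUNTS = [
    [100, 200, 100, 200],
    [50, 100, 50, 100],
    [10, 20, 10, 20],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print("original:", size_factors(COUNTS))
    print("imputed: ", size_factors(impute_zeros(COUNTS, random.Random(0))))
```

      In the original matrix only the three expressed genes enter the size-factor calculation. After imputation the four formerly silent genes also contribute ratios that are pure noise, and each can show an apparent fold change of up to 7 between samples.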

      The display and discussion of the data set do not always align with the authors' stated goal of having a comprehensive description of the genes that dynamically change during axon growth and regrowth. Displaying more information about genes differentially expressed in the alpha/beta KCs, or any information about the genes differentially expressed in the gamma KCs when using the same criteria as the alpha/beta KCs, or the 676 overlapping upregulated genes, would significantly add to this paper. The authors previously performed a similar study across developmental time points for gamma KCs, and it is not clear whether any overlapping genes were identified. Also, more information on the genes consisting of PC1 and PC3 when showing the PCA analysis would be helpful. Within the text, there is a discussion of why certain genes or gene groups were omitted or selected, such as clusters 1 and 2, and then some of their subgroups based on expected genes. There is also some discussion of omitted gene groups, but this is not complete across the different clusters, nor is there a discussion of why PC2 was not selected or of which genes might exhibit greater variability than cell type. The authors would make a stronger case for the genes they pursued if they showed that groups of genes already known to be involved in axon growth clustered within the selected groups. Since we do not see the gene lists, this is unclear and adds to the sometimes arbitrary nature of the authors' choices about what to pursue in this paper. A larger set of descriptors, such as gene lists and Gene Ontology analysis beyond what is shown, would be very helpful in putting the results in context and determining whether this is a resource beneficial to others.

      While the Pmvk story is interesting, the authors appear to make some arbitrary decisions in what is shown or pursued in this paper. Visually, CadN and Twr appear to be more severe axon regrowth phenotypes, where the peduncle appears intact, and axons are not regrowing in Figures 3N and 3O. In contrast, Pmvk visually appears to lose neurons in Figure 3M. With a change of the Gal4 driver (Figure 4), Pmvk now produces a gamma axon regrowth phenotype similar to CadN and Twr in Figure 3. This difference in the use of Gal4 for characterizing axonal phenotypes is not discussed, making some interpretations more challenging due to differences in Gal4 expression strength. For instance, the sequencing work was done with a different Gal4 MB expressing line than the characterization of gene knockdowns. Further characterization of the Pmvk was performed in the same Gal4 lines as the sequencing (Figure 4), suggesting a potential difference in Gal4 strength that may play a role in their rescue experiments if they are using a slightly weaker Gal4 for gamma lobe expression. A broader discussion of this may make the selection of Pmvk less arbitrary if the phenotype is similar to those of CadN and Twr. Along the lines of the sometimes arbitrary nature of the genes chosen to pursue further, the authors state that they selected genes that showed differential expression at any time point. As they refine their list of genes to pursue further, they seem to prioritize genes that change at 18-21 APF. This appears to be the early period for axon growth in alpha/beta KCs and gamma KCs, based on Figure 1. A stronger case might be made at longer time points when the axon is growing or regrowing.

      The paper would benefit from scaling back the claim that the mevalonate pathway is involved. The authors identified only a subset of genes from the mevalonate pathway, all immediately upstream of Pmvk, with no effect on downstream genes. Along these lines, the paper would benefit from a discussion of non-canonical Pmvk signaling.

      While the ability to take neurons at the same developmental time and from the same brain region is a strength, they are still two different types of neurons. Although gamma neuron axon growth occurs very early in development, it would be interesting to know whether the same genes are involved in their initial growth. A caveat to the authors' conclusion is that these are two different cell types, and they might use different genetic programs or use overlapping ones at other times. The authors did not show that gamma KCs use these genes in their initial axon growth.

    1. Reviewer #1 (Public review):

      Summary:

      Here, the authors have addressed the recruitment and firing patterns of motor units (MUs) from the long and lateral heads of triceps in the mouse. They used their newly developed Myomatrix arrays to record from these muscles during treadmill locomotion at different speeds, and they used template-based spike sorting (Kilosort) to extract units. Between MUs from the two heads, the authors observe differences in their firing rates, recruitment probability, phase of activation within the locomotor cycle and interspike interval patterning. Examining different walking speeds, the authors find increases in both recruitment probability and firing rates as speed increases. The authors also observed differences in the relation between recruitment and the angle of elbow extension between motor units from each head. These differences indicate meaningful variation between motor units within and across motor pools, and may reflect the somewhat distinct joint actions of the two heads of triceps.

      Strengths:

      The extraction of MU spike timing for many individual units is an exciting new method that has great promise for exposing the fine detail in muscle activation and its control by the motor system. In particular, the methods developed by the authors for this purpose seem to be the only way to reliably resolve single MUs in the mouse, as the methods used previously in humans and in monkeys (e.g. Marshall et al. Nature Neuroscience, 2022) do not seem readily adaptable for use in rodents.

      The paper provides a number of interesting observations. There are signs of interesting differences in MU activation profiles for individual muscles here, consistent with those shown by Marshall et al. It is also nice to see fine scale differences in the activation of different muscle heads, which could relate to their partially distinct functions. The mouse offers greater opportunities for understanding the control of these distinct functions, compared to the other organisms in which functional differences between heads have previously been described.

      The Discussion is very thorough, providing a very nice recounting of a great deal of relevant previous results.

      Weaknesses:

      The findings are limited to one pair of muscle heads. While the findings are important in their own right, the lack of confirmation from analysis of other muscles acting at other joints leaves the generalization of these findings unclear.

      While differences between muscle heads with somewhat distinct functions are interesting and relevant to joint control, differences between MUs for individual muscles, like those in Marshall et al., are more striking because they cannot be attributed potentially to differences in each head's function. The present manuscript does show some signs of differences for MUs within individual heads (e.g. Figure 2C), but the manuscript falls short of providing a statistical basis for the existence of distinct subpopulations.

    2. Reviewer #2 (Public review):

      The present study, led by Thomas and collaborators, aims to characterise the firing activity of individual motor units in mice during locomotion. To achieve this, the team implanted small arrays of eight electrodes into two heads of the triceps and performed spike sorting using a custom implementation of Kilosort. Concurrently, they tracked the positions of the shoulder, elbow, and wrist using a single camera and a markerless motion capture algorithm (DeepLabCut). Repeated one-minute recordings were conducted in six mice across five speeds, ranging from 10 to 27.5 cm/s.

      From these data, the authors demonstrate that:

      - Their recording method and adapted spike-sorting algorithm enable robust decoding of motor unit activity during rapid movements.
      - Identified motor units tend to be recruited during a subset of strides, with recruitment probability increasing with speed.
      - Motor units within individual heads of the triceps likely receive common synaptic inputs that correlate their activity, whereas motor units from different heads exhibit distinct behaviour.

      The authors conclude that these differences arise from the distinct functional roles of the muscles and the task constraints (i.e., speed).

      Strengths:

      - The novel combination of electrode arrays for recording intramuscular electromyographic signals from a larger muscle volume, paired with an advanced spike-sorting pipeline capable of identifying motor unit populations.
      - The robustness of motor unit decoding during fast movements.

      Weaknesses:

      - The data do not clearly indicate which motor units were sampled from each pool, leaving uncertainty as to whether the sample is biased towards high-threshold motor units or representative of the entire pool.
      - The results largely confirm the classic physiological framework of motor unit recruitment and rate coding, offering limited new insights into motor unit physiology.

      I would like to thank the authors for their thorough and insightful revisions. I am particularly pleased with the inclusion of the new analyses demonstrating the robustness of motor unit decoding, as well as the improved transparency regarding spike-sorting yield for each muscle and animal. Additionally, the new analyses illustrating that recruitment within muscle heads is consistent with the presence of common synaptic inputs and orderly recruitment significantly strengthen the manuscript.

    3. Reviewer #3 (Public review):

      Summary:

      Using Myomatrix recordings, the authors report that 1) motor units are recruited differently in the two types of muscles, 2) individual units are probabilistically recruited during the locomotion strides, whereas the population bulk EMG has a more reliable representation of the muscle, and 3) the recruitment of units was proportional to walking speed.

      Strengths:

      The new technique provides a unique dataset, and the data analysis is convincing and well-executed.

      Weaknesses:

      After the revision, I no longer see any apparent weaknesses in the study.

    1. Reviewer #1 (Public review):

      Summary:

      By using an established NAFLD model, choline-deficient high-fat diet, Barros et al show that LPS challenge causes excessive IFN-γ production by hepatic NK cells which further induces recruitment and polarization of a PD-L1 positive neutrophil subset leading to massive TNFα production and increased host mortality. Genetic inhibition of IFN-γ or pharmacological blockade of PD-L1 decreases recruitment of these neutrophils and TNFα release, consequently preventing liver damage and decreasing host death.

      Since NAFLD is often accompanied by chronic, low-grade inflammation, it can lead to an overactive but dysfunctional immune response and increase the body's overall susceptibility to infections. This is therefore a very important research question.

      Strengths:

      The biggest strength of the manuscript is the vast number of mouse strains used.

      Weaknesses:

      After the review, there are still some open questions from my side:

      (1) I would like the authors to defend their choice of diet type since this has not been done in the review/response to authors. In case they cannot, we need additional proof (HFD or WD model).

      (2) Since the authors used the same control groups (chow and HFCD), as required by the animal ethics committee, they should provide a power analysis showing that the number of controls (and of animals in the other groups) is sufficient to detect the effect.
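      As an illustration of the kind of power analysis requested in point (2), the sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2((z_{1-α/2} + z_{power}) / d)^2, where d is the standardized effect size. The function name `n_per_group` and the effect sizes are hypothetical and are not taken from the manuscript. The approximation slightly underestimates the sample size an exact t-test calculation would give when groups are small.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate animals per group for a two-sided, two-sample comparison.

    effect_size is Cohen's d (difference in means divided by the pooled SD).
    Uses the normal approximation, so treat the result as a lower bound.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect sizes, chosen only to show how quickly n changes.
for d in (1.0, 1.5, 2.0):
    print(f"d = {d}: at least {n_per_group(d)} animals per group")
```

      Because the required group size scales with 1/d^2, halving the expected effect size roughly quadruples the number of animals needed, which is why the assumed effect size should be stated alongside the group sizes.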

    2. Reviewer #2 (Public review):

      Summary:

      This is an extremely interesting mouse study, trying to understand how sepsis is tolerated during obesity/NAFLD. The researchers combine a well-established model of NASH (Choline-deficiency with High Fat Diet) with a sepsis model (IP injection of 10 mg/kg LPS), leading to dramatic mortality in mice. Using this model, they characterize the complex contributions of immune cells. Specifically, they find that NK cells and neutrophils contribute the most to mortality in this model due to IFNG and PD-L1+ neutrophils.

      Strengths:

      The biggest strength of the manuscript is how clear the primary phenotypes/endpoints of their model are. Within 6 hours of LPS injection, there is a stark elevation of liver inflammation and damage, which is exacerbated by a High Fat/Choline-Deficient diet (HFCD). And after 1 day, almost all of the mice die. Using these endpoints, the authors were able to identify which cells were critical for mortality in the model and the specific mediators involved.

      Comments on revisions:

      I have no further comments.

    1. Reviewer #1 (Public review):

      Summary:

      Mazer & Yovel 2025 dissect the inverse problem of how echolocators in groups manage to navigate their surroundings despite intense jamming using computational simulations.

      The authors show that despite the 'noisy' sensory environments that echolocating groups present, agents can still access some amount of echo-related information and use it to navigate their local environment. It is known that echolocating bats have strong small and large-scale spatial memory that plays an important role for individuals. The results from this paper also point to the potential importance of an even lower-level, short-term role of memory in the form of echo 'integration' across multiple calls, despite the unpredictability of echo detection in groups. The paper generates a useful basis to think about the mechanisms in echolocating groups for experimental investigations too.

      Strengths:

      The paper builds on a biologically well-motivated and parametrised 2D acoustics and sensory simulation setup to investigate the various key parameters of interest.

      The 'null-model' of echolocators not being able to tell apart objects & conspecifics while echolocating still shows agents successfully emerge from groups - even though the probability of emergence drops severely in comparison to cognitively more 'capable' agents. This is nonetheless an important result showing the direction-of-arrival of a sound itself is the 'minimum' set of ingredients needed for echolocators navigating their environment.

      The results generate an important basis in unraveling how agents may navigate in sensorially noisy environments with a lot of irrelevant and very few relevant cues.

      The 2D simulation framework is simple and computationally tractable enough to perform multiple runs to investigate many variables - while also remaining true to the aim of the investigation.

    2. Reviewer #2 (Public review):

      This manuscript describes a detailed model for bats flying together through a fixed geometry. The model considers elements which are faithful to both bat biosonar production and reception and the acoustics governing how sound moves in air and interacts with obstacles. The model also incorporates behavioral patterns observed in bats, like one-dimensional feature following and temporal integration of cognitive maps. From a simulation study of the model and comparison of the results with the literature, the authors gain insight into how often bats may experience destructive interference of their acoustic signals and those of their peers, and how much such interference may actually negatively affect the group's ability to navigate effectively. The authors use generalized linear models to test the significance of the effects they observe.

      The work relies on a thoughtful and detailed model which faithfully incorporates salient features, such as acoustic elements like the filter for a biological receiver and temporal aggregation as a kind of memory in the system. At the same time, the authors abstract features that are complicating without being expected to give additional insights, as can be seen in the choice of a two-dimensional rather than three-dimensional system. I thought that the level of abstraction in the model was perfect, enough to demonstrate their results without needless details. The results are compelling and interesting, and the authors do a great job discussing them in the context of the biological literature.

      With respect to the first version of the manuscript, the authors have remedied all my outstanding questions or concerns in the current version. The new supplementary figure 5 is especially helpful in understanding the geometry.

    1. Reviewer #1 (Public review):

      The key discovery of the manuscript is that genetically wild type females descended from Khdc3 mutants show abnormal gene expression relating to hepatic metabolism, which persists over multiple generations and passes through both female and male lineages. The authors also find dysregulation of hepatically-metabolized molecules in the blood of these wild type mice with Khdc3 mutant ancestry. These data provide solid evidence that phenotypes can be transmitted across multiple generations without altering DNA sequence, supporting the involvement of epigenetic mechanisms. The authors further performed exploratory studies on the small RNA profiles in the oocytes of Khdc3-null females and their wild type descendants, suggesting that altered small RNA expression could contribute to the observed phenotype transmission, although this has not been functionally validated.

      Comments on revisions:

      My previous comments are addressed.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript aimed to investigate the non-genetic impact of Khdc3 mutation on liver metabolism. To do that, the authors analyzed the female liver transcriptome of genetically wild type mice descended from female ancestors with a mutation in the Khdc3 gene. They found that genetically wild type females descended from Khdc3 mutants have hepatic transcriptional dysregulation which persists over multiple generations in the progeny descended from female ancestors with a mutation in the Khdc3 gene. This transcriptomic deregulation was associated with dysregulation of hepatically-metabolized molecules in the blood of these wild type mice with female mutational ancestry. Furthermore, to determine whether small non-coding RNA could be involved in the maternal non-genetic transmission of the hepatic transcriptomic deregulation, they performed small RNA-seq of oocytes from Khdc3-/- mice and genetically wild type female mice descended from female ancestors with a Khdc3 mutation and claimed that oocytes of wild type female offspring from Khdc3-null females have dysregulation of multiple small RNAs.

      Finally, they claimed that their data demonstrate that ancestral mutation in Khdc3 can produce transgenerational inherited phenotypes.

      Comments on revisions:

      I thank the authors for their detailed response to my comments. I have nothing to add.

    1. Reviewer #1 (Public review):

      Summary:

      This paper describes a number of alterations in pulmonary surfactant recovered from bottlenose dolphins. Although the sample consists of only seven diseased and two control animals, due to the difficulty in obtaining these animals, this is considered adequate. However, conclusions must be considered in view of this small sample size. The authors employ a number of sophisticated techniques to show differences in the composition and in the structure of bilayers formed by these two surfactant samples.

      Strengths:

      The availability of these samples makes this study quite original. The authors apply mass spectrometry to observe an increase of an acidic phospholipid and in the level of plasmalogens in the diseased (i.e. pneumonia) aquatic animals. They suggest these increases contribute to hampered function in vivo. They show alterations in lipid bilayers formed from lipid extracts of these surfactants by electron microscopy, by atomic force microscopy and by small- and wide-angle X-ray scattering (SAXS/WAXS). They have previously shown that adding small amounts of cardiolipin to the clinical surfactant BLES results in altered bilayer structure, consistent with the current study.

      Weaknesses:

      It seems surprising to me that the small changes in cardiolipin can alter surfactant function, i.e., reducing surface tension to near zero. As it happens, no surfactant function tests monitoring the reduction in surface tension were conducted. This would add a great deal to the paper. Further, the paper would benefit greatly from the inclusion of a table listing the lipid composition of surfactant recovered from diseased and normal animals and comparing this to the composition of BLES, a clinical surfactant. Finally, there is a possibility that the minor lipid identified by mass spec is the lysosomal marker, bis(monoacylglycero)phosphate, rather than the mitochondrial marker, cardiolipin.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Porras-Gómez et al. analyse the lipid composition and biophysical properties of pulmonary surfactant obtained by bronchoalveolar lavage (BAL) from a group of bottlenose dolphins (Tursiops truncatus), including two healthy individuals and five affected by pneumonia. Through lipidomic analysis, the authors report an exacerbated presence of cardiolipin species in the BAL lipid extracts from diseased dolphins compared to healthy ones. Structural analyses using electron microscopy, atomic force microscopy, and X-ray scattering on rehydrated membrane samples reveal that lipids from diseased animals form membranes with a more pronounced Lβ phase and reduced fluidity. Moreover, the membranes from affected lungs appear more interconnected and less hydrated, as indicated by the X-ray scattering data. These findings provide valuable and convincing insights into how pulmonary disease alters the lipid composition and structural properties of surfactant in diving mammals, and may have broader implications for understanding surfactant dysfunction in marine mammals.

      Strengths:

      The study is well designed, and the experimental techniques were applied in a logical and coherent manner. The results are thoroughly analysed and discussed, and the manuscript is clearly written and well organized, making it both easy to follow and scientifically robust. Although the number of samples is limited, the rarity and logistical challenges of obtaining bronchoalveolar lavage material, particularly from animals affected by respiratory disease, make this study especially valuable and relevant.

      Weaknesses:

      In my opinion, the main issue lies in the treatment of the samples. Pulmonary surfactant is a lipoprotein complex produced by type II pneumocytes of the alveolar epithelium in the form of compact and highly dehydrated structures known as tubular myelin. Once secreted, these structures unfold and, upon contact with the air-liquid interface, form an interfacial monolayer connected to surfactant membranes in the subphase, thereby facilitating respiratory dynamics throughout the breathing cycle.

      When bronchoalveolar lavages are treated using the Bligh and Dyer method to extract the hydrophobic fraction of these samples, the structural complexity of the surfactant is disrupted, and this organization cannot be completely restored once the lipids are rehydrated. Although these extracts contain the hydrophobic proteins SP-B and SP-C, the hydrophilic protein SP-A may play an essential role in the formation of pulmonary surfactant structures. It is well established that SP-A is crucial for the formation of tubular myelin, an intermediate structure between the lamellar bodies newly secreted by type II cells and the interfacial surfactant layers.

      Moreover, and more importantly, bronchoalveolar lavage fluid may contain cells, tissue debris, and even bacteria that can alter the lipid composition of the samples used in the study after extraction by the Bligh and Dyer method. For this reason, most studies include a density gradient centrifugation step to isolate the surfactant membranes. Consequently, the samples used may be contaminated with phospholipids originating from other cells, such as macrophages, pneumocytes, or bacterial cells, particularly in lavages obtained from diseased animals.

      Although the techniques employed provide valuable information about the behaviour of surfactant membranes and allow certain inferences regarding their functionality, no functional studies of these samples have been conducted using methods such as the constrained drop surfactometer or the captive bubble surfactometer. The observed alterations do not necessarily demonstrate that surfactant modulates its properties, as claimed by the authors, but rather indicate that it is altered by the presence of other lipids.

      The spin-coating technique used to form lipid films for analysis by atomic force microscopy is not the most suitable approach to reproduce the structures generated by pulmonary surfactant. However, the results obtained may still provide valuable insights into the biophysical behaviour of its components. The analysis of lung tissue shown in Supplementary Figure S3 presents the same limitation, as the samples were embedded in a cutting compound, and the measurements may have been taken from different regions of the tissue. Therefore, it cannot be ensured that the analysed structures correspond to those generated by pulmonary surfactant.

      The finding that the structures formed in samples obtained from diseased animals are more tightly packed and dehydrated than those derived from the surfactant of healthy animals contrasts with the notion that the high efficiency of lamellar bodies in generating interfacial structures is related to their high degree of packing and dehydration. The formation of these structures involves the participation of the ABCA3 protein, which pumps phospholipids into the interior of lamellar bodies, and SP-B, which facilitates the formation of close membrane contacts.

      While the results are interesting from a comparative perspective, the implications for surfactant performance and respiratory dynamics should be interpreted with caution.

    3. Reviewer #3 (Public review):

      In this manuscript, the authors present data on the supposed composition of pulmonary surfactant obtained from bronchoalveolar lavages (BALs) of a small cohort of dolphins, a group of them suffering from pneumonia. The lipid compositional differences of the sample group are consistent with the different pathological situations of the specimens, suggesting that differences in surfactant composition are somehow associated (as a cause or as a consequence) with the particular pathophysiological contexts. It is particularly remarkable that an increase in cardiolipins and plasmalogens appears as an abnormal composition in pathological surfactants. The study is completed by analyzing the differences in membrane properties (order, packing, phase) of abnormal versus "control" membranes, concluding that pneumonia in dolphins is associated with a significant alteration of surfactant membranes that become more rigid, packed and thicker than those in surfactant from animals with no lung disease.

      In general terms, the data provided are of interest as they somehow offer a framework of effects that may extend what is known about alterations of composition, biophysical properties and functional performance of pulmonary surfactant as a consequence of respiratory pathologies. A collection of pertinent biophysical methodologies (fluorescence, X-ray scattering, AFM) have been applied to complete a full characterization of membrane properties in the different samples.

      However, the way the samples have been processed, i.e. by making organic extracts of hydrophobic (lipid and protein) components before surfactant membranes have been purified or at least separated from bulk lavage, opens the question of how much of the altered composition actually occurs in surfactant or comes from other membranes (from cells, bacteria) that have been completely intermixed as a consequence of the organic extraction. Without appropriate isolation of surfactant membranes, the results of the study should be taken with caution and await confirmation. Specific questions that need to be considered include:

      (1) As said, the direct organic extract of BAL samples ends in a full mix of lipid and protein components that in origin could be part of different membranes, either from different surfactant assemblies, or even from pulmonary cells or membrane debris, or microorganisms, collected within the lavage. Obtaining conclusions about the structure and properties of membranes artefactually reconstituted from such lipid and protein mixtures is far from correct.

      It is mentioned that "subsequentially" to the organic extraction, the samples were subjected to ultracentrifugation to separate debris and membrane cells. I do not see what the ultracentrifugation is going to change if it is done after the organic extraction. It should have been done before the extraction, for the organic solvents to solubilize exclusively the large, and relatively light, surfactant membrane complexes.

      On the other hand, the subsequent reconstitution of the obtained full lipid mixture surely ends in membrane assemblies whose compositional distribution and organization may differ significantly from those in the original membranes.

      Taking all this into account, statements such as "These aggregate forms reproduce the expected membrane microstructures observed in native alveolar hypophase" or "pulmonary membranes can be successfully extracted and reconstituted from BALs of Navy dolphins" are simply not true and should be rephrased.

      One can understand that the limitation of material may make it difficult to obtain first the purified surfactant membranes and then their organic extract. However, the limitation should be acknowledged to make clear to readers that the actual compositional effects caused in surfactant by pneumonia need confirmation.

      (2) In some of the experiments, i.e. in the AFM characterization, supported membranes were prepared by the spray-dry method applied to organic solutions. Again, the spray-dry of organic lipid solutions ends in a lipid dispersion that may be very far from the real organization of the lipids in actual surfactant membranes.

      (3) When it is stated that phospholipid concentrations are greater in BAL from pinnipeds than in humans, how has the actual concentration been determined? BAL volumes are typically subjected to large variations depending on the conditions used to obtain the lavage (including volume of saline instilled, level of atelectasis in the lung tissue, presence of inflammation and edema, etc). If total amounts of phospholipids in BAL are to be compared, certain normalization procedures should be applied, such as, for instance, with respect to the urea concentration in serum.

      (4) All the differences regarding membrane phase and lipid order/packing have been interpreted in terms of the potential coexistence of Lbeta (gel)/Lalpha (liquid crystalline) phases. However, it has been well established that in lipid systems containing cholesterol, such as pulmonary surfactant, phase coexistence can actually be of the type liquid-ordered (Lo)/liquid-disordered (Ld), very different in terms of mobility and true molecular order. Why do the authors consider that Lbeta is the phase observed in the surfactant membranes they have reconstituted? The presence of round-shaped domains seems to indicate that a liquid/liquid phase segregation is actually occurring.

      (5) In the same line as the previous comment, the authors state that SAXS shows that bovine-extracted pulmonary membranes exhibit a coexistence of two lamellar phases, one rich in unsaturated lipids and one in saturated lipids. SAXS and WAXS cannot provide compositional information, but structural parameters such as membrane thickness, or molecular order. This should be clarified.

      (6) It is mentioned that the surfactant monolayer at the air-liquid interface is interconnected to tubular membranous structures (tubular myelin, TM). It is true that TM, when present, appears interconnected with the interface. However, it is widely recognized that there are many other structures connected with the interfacial film, including multilamellar membrane arrays or reservoirs that have not been mentioned here. Furthermore, TM is not required for surfactant function, because it is absent, for instance, in mice lacking expression of surfactant protein SP-A, which can breathe perfectly.

      (7) In the Discussion, the authors mention that "...after squeeze-out, the excluded multilayers remain closely associated with the interfacial monolayer rather than escaping into the subphase". The authors may like to complete this discussion by specifying that the stable association of excluded assemblies with the interfacial film is actually possible thanks to the surfactant proteins.
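      The normalization raised in point (3) can be sketched as follows, assuming the commonly used urea dilution approach, in which urea is taken to be at the same concentration in serum and in the undiluted epithelial lining fluid, so that the ratio of serum urea to lavage urea estimates how much the lavage diluted the sample. The function name `elf_concentration` and all numerical values are hypothetical and are not taken from the manuscript.

```python
def elf_concentration(analyte_bal, urea_bal, urea_serum):
    """Estimate an analyte's concentration in epithelial lining fluid (ELF).

    analyte_bal : analyte concentration measured in the recovered lavage
    urea_bal    : urea concentration measured in the same lavage
    urea_serum  : urea concentration in serum sampled at the time of lavage

    All three must use consistent units (urea_bal and urea_serum in the
    same unit). The result has the units of analyte_bal.
    """
    if urea_bal <= 0 or urea_serum <= 0:
        raise ValueError("urea concentrations must be positive")
    dilution_factor = urea_serum / urea_bal  # how strongly saline diluted the ELF
    return analyte_bal * dilution_factor


# Hypothetical example: two lavages with the same raw phospholipid reading
# but different dilution, giving different corrected concentrations.
raw_phospholipid = 0.1  # mg/mL in lavage (made-up value)
print(elf_concentration(raw_phospholipid, urea_bal=0.05, urea_serum=5.0))
print(elf_concentration(raw_phospholipid, urea_bal=0.10, urea_serum=5.0))
```

      Two lavages with identical raw readings can thus correspond to quite different concentrations in the lung once dilution is accounted for, which is why raw BAL values from different species or procedures are hard to compare directly.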

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents an ambitious attempt to examine whether episodic memory traces ("engrams") of forgotten associations persist in the human brain and whether these traces continue to influence behavior implicitly. Using 7T fMRI, the authors track 96 one-shot face-object associations across learning, 30-minute retrieval, and 24-hour retrieval, complemented by a recognition test. Participants classify each memory as sure, unsure, or guess, enabling an operational dissociation between consciously accessible and inaccessible memories.

      Strengths:

      The study addresses a timely and theoretically important question arising from rodent engram research, i.e., whether forgotten human memories leave detectable neural signatures. The use of high-resolution 7T fMRI, representational similarity analysis (RSA), and gPPI connectivity analyses aims at a detailed systems-level perspective. The results suggest that correct guess responses (i.e., when participants believe they are guessing) are accompanied by hippocampal activity and connectivity patterns that correlate with behavioral performance, potentially pointing to residual memory traces. The study also presents evidence for divergent consolidation trajectories: consciously accessible memories become more neocortically distributed after sleep, whereas inaccessible memories exhibit strengthened hippocampal signatures.

      Weaknesses:

      Despite the methodological rigor, some interpretational issues merit caution. First, the reliance on participants' subjective "guess" reports to categorize trials as forgotten is problematic. Guess responses at the 30-minute retrieval were at chance level, whereas guess responses during recognition were above chance; interpreting both as "implicit episodic memory" may conflate different mechanisms (episodic retrieval, familiarity, associative priming).

      Second, several analyses raise concerns about circularity or insufficient independence, for example, when contrasting correct vs. incorrect guess trials to locate "engram" activity and then correlating that activity with guessing accuracy. Similarly, the behavioral analyses are fragmented (multiple t-tests across conditions) rather than using a factorial model that accounts for dependencies among confidence levels and timepoints.

      Third, the choice to include only "sure" and "guess" responses discards a substantial portion of trials ("unsure"), reducing power and complicating interpretation, especially given that unsure responses show above-chance performance.

      Finally, the study's two-scanner-sequence design (small-FOV vs. whole-brain) is challenging as it complicates comparisons across analyses, especially when some critical results (e.g., hippocampal reinstatement patterns) do not consistently replicate across sequences.

      Conclusion:

      Overall, the manuscript provides preliminary evidence that neural traces of forgotten episodic memories might persist in humans and could guide behavior in the absence of conscious awareness. While interpretational caution is warranted, especially regarding the nature of "guess"-based retrieval and the independence of neural contrasts, the study makes a valuable contribution to debates on engram persistence, systems consolidation, and the role of consciousness in episodic memory.

    2. Reviewer #2 (Public review):

      Summary:

      The goal of the experiment was to identify the fMRI neural correlates of persistence and recovery of forgotten memories. A forgotten memory was defined behaviorally as successful learning, followed by failure in a recall format task, followed by next-day success in a recognition format task. The comparison is to memories that were not forgotten at any stage of the task. Various univariate, connectivity, and multivariate analyses were used to identify neural correlates of forgotten memories that were recovered, that remained forgotten, and successful memory. Some claims are made about how activity of the "episodic memory network" predicts the persistence of forgotten memories.

      Strengths:

      Studies on the persistence of forgotten memories in rodent models have been used to make some novel claims about the potential properties of engrams. Attempting similar research in humans is a laudable goal.

      Patterns of behavioral responses are consistent across subjects.

      Weaknesses:

      I do not find that the fMRI results fit the narrative provided.

      A major issue is that primary results do not replicate across the two fMRI datasets that were collected using the same task. For example, hippocampal activity associated with correct responses (confident and guess) was identified in the group receiving the fMRI scan that used a small FOV, but not in the group that received an fMRI scan of the whole brain, for both 30-min and 24-hr delays (lines 202-217). This suggests that the main findings are not even replicable internally within the same experiment. There is no reasonable justification for this.

      Next, most of the reported fMRI findings do not meet reasonable thresholds for statistical significance. In many places, the authors acknowledge this in the text by saying that a difference in the fMRI metric "tended towards significant correlation" or that comparisons "revealed non-significant mean value comparisons". It is not clear why these non-significant findings are interpreted as though they are positive findings. Beyond that, many of the reported findings do not meet the threshold (e.g., p=0.058), without any acknowledgement that they are marginal. Moreover, the majority of comparisons that are interpreted in the main text are not significant based on the companion information provided in the supplementary tables. That is, they are totally non-significant when using FWE or FDR correction at either the cluster or peak levels.

      Beyond this, the supplementary tables indicate that "clusters identified solely within white matter regions have been excluded." The fact that there are any findings in white matter to ignore indicates that the statistical thresholds are inappropriate. It's tantamount to seeing activation in the brain of a dead fish.

      The overall picture based on these factors is that the statistical tests did not use sufficiently stringent safeguards against false positives given the multiple comparison problem that plagues fMRI. So, there are tons of false positives, which are being selectively interpreted to tell a particular story. That is, each comparison yields lots of findings in many brain areas, and those that do not fit the particular narrative are being ignored (including those in white matter). What's more, when the small FOV fMRI scan is done, the imaging volume is centered on the hippocampus and its close network, so all false positives appear to be exactly in those brain regions about which the authors want to make conclusions. When throwing darts, you will always hit a bullseye if that is all that exists. The fact that the same comparisons done in the companion whole-brain dataset do not yield the same results is telling: the analysis plan is not sufficiently rigorous to yield findings that are replicable.
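
      To illustrate the multiple-comparison concern, the sketch below applies the standard Benjamini-Hochberg FDR procedure to a handful of p-values. It is not a reanalysis of the manuscript's data: the p-values are invented for illustration (0.058 simply echoes the marginal value quoted above), and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test survives FDR control at level q.

    Standard Benjamini-Hochberg step-up procedure: sort the p-values, find the
    largest rank k with p_(k) <= (k / m) * q, and reject the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank whose p-value falls under its rank-dependent threshold.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Invented p-values (assumption, not taken from the manuscript).
example_p = [0.001, 0.058, 0.04, 0.20, 0.012]
uncorrected = [p < 0.05 for p in example_p]
corrected = benjamini_hochberg(example_p, q=0.05)
```

      In this toy set, the comparison with p = 0.04 passes an uncorrected 0.05 threshold but does not survive the correction, which is the kind of discrepancy at issue here.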

      Further, I think that it is highly debatable whether the task measures the recovery of forgotten memories at all. Forgotten memories are defined as those that fail when tested using a recollection format but succeed when tested using a recognition format. The well-characterized distinction between recollection and recognition is thus being construed as telling us something about the fate of engrams. I think the much more likely alternative is that "forgotten" memories are just relatively weak memories that don't meet whatever criteria subjects typically use when making recollection judgments, and not some special category of memory. In terms of brain activation, they seem for the most part to follow the pattern of stronger memory, but weaker.

      Finally, many hypotheses are used as though they are proven. For instance, fMRI activity patterns are called "engrams" even though there are no tests to determine whether they meet reasonable criteria that have been adopted in the engram literature (e.g., necessity, sufficiency). Whatever happens over the 24-hour delay is called "consolidation" even if there is no test that consolidation has occurred. Etc. It becomes hard to differentiate what is an assumption, versus a hypothesis, versus an inference/conclusion.

    1. Reviewer #1 (Public review):

      Summary:

      This study extends the short-term synaptic plasticity (STP)-based theory of activity-silent working memory (WM) by introducing a physiological mechanism for chunking that relies on synaptic augmentation (SA) and specialized chunking clusters. The model consists of a recurrent neural network comprising excitatory clusters representing individual items and a global inhibitory pool. The self-connections within each cluster dynamically evolve through the combined effects of STP and SA. When a chunking cue, such as a brief pause in a stimulus sequence, is presented, the chunking cluster transiently suppresses the activity of the item clusters, enabling the grouped items to be maintained as a coherent unit and subsequently reactivated in sequence. This mechanism allows the network to enhance its effective memory capacity without exceeding the number of simultaneously active clusters, which defines the basic capacity. They further derive a new upper limit of WM capacity, the new magic number. When the basic capacity is four, the upper bound for complete recall becomes eight, and the optimal hierarchical structure corresponds to a binary tree of two-item pairs forming four chunks that combine into two meta-chunks. Reanalysis of linguistic data and single-neuron recordings from human epilepsy patients (identifying boundary neurons) provides qualitative support for the model's predictions.

      Strengths:

      This study makes an important contribution to theoretical and computational neuroscience by proposing a physiologically grounded mechanism for chunking based on STP and SA. By embedding these processes in a recurrent neural network, the authors provide a unified account of how chunks can be formed, maintained, and sequentially retrieved through local circuit dynamics, rather than through top-down cognitive strategies. The work is conceptually original, analytically rigorous, and clearly presented, deriving a simple yet powerful capacity law that extends the classical magic number framework from four to eight items under hierarchical chunking. The modeling results are further supported by preliminary empirical evidence from linguistic data and single-neuron recordings in the human medial temporal lobe, lending credibility to the proposed mechanism. Overall, this is a well-designed and well-written study that offers novel insights into the neural basis of working-memory capacity and establishes a solid bridge between theoretical modeling and experimental findings.

      Weaknesses:

      This study is conceptually strong and provides an elegant theoretical framework, but several aspects limit its biological and empirical grounding.

      First, the control mechanism that triggers and suppresses chunking clusters remains only schematically defined. The model assumes that chunking events are initiated by pauses, prosodic cues, or internal control signals, but does not specify the underlying neural circuits (e.g., prefrontal-basal ganglia loops) that could mediate this gating in the brain. Clarifying where, when, and how the chunking clusters are turned on and off will be critical for establishing biological plausibility.

      Second, the network representation is simplified: item clusters are treated as non-overlapping and homogeneous, whereas real cortical circuits exhibit overlapping representations, distinct excitatory/inhibitory populations, and multiscale local and long-range connectivity. It remains unclear how robust the proposed dynamics and derived capacity limit would be under such biologically realistic conditions.

      Third, the model heavily relies on SA operating over a timescale of several seconds, yet in vivo, the time constants and prevalence of SA can vary widely across cortical regions and neuromodulatory states. The stability of the predicted "new magic number" under realistic noise levels and modulatory influences, therefore, needs to be systematically evaluated.

    2. Reviewer #2 (Public review):

      Summary:

      This work extends a previous recurrent neural network model of activity-silent working memory to account for well-established findings from psychology and neuroscience suggesting that working memory capacity constraints can be partially overcome when stimuli can be organized into chunks. This is accomplished via the introduction of specialized chunking clusters of neurons to the original model. When these chunking clusters are activated by a cue (such as a longer delay between stimuli), they rapidly suppress recently active stimulus clusters. This makes these stimulus clusters available for later retrieval via a synaptic augmentation mechanism, thereby expanding the network's overall effective capacity. Furthermore, these chunking clusters can be arranged in a hierarchical fashion, where chunking clusters are themselves chunked by higher-level chunking clusters, further expanding the network's overall effective capacity to a new "magic number", 2^{C-1} (where C is the basic capacity without chunking). In addition to illustrating the basic dynamics of the model with detailed simulations (Figures 1 and 2), the paper also utilizes qualitative predictions from the model to (re-)analyze data collected in previous experiments, including single-unit recordings from human medial temporal lobe as well as behavioral findings from a classic study of human memory.
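
      As a quick arithmetic check of the capacity law quoted above, the sketch below evaluates 2^{C-1} and builds the corresponding binary grouping for C = 4. It only restates the formula as described in this review; the function names `magic_number` and `pair_up` are hypothetical and are not taken from the manuscript's code.

```python
def magic_number(basic_capacity: int) -> int:
    """Upper bound on recallable items, 2^(C-1), as quoted in the review."""
    if basic_capacity < 1:
        raise ValueError("basic capacity must be a positive integer")
    return 2 ** (basic_capacity - 1)


def pair_up(units: list) -> list:
    """Group consecutive units into pairs (one level of binary chunking)."""
    return [tuple(units[i:i + 2]) for i in range(0, len(units), 2)]


items = list(range(magic_number(4)))  # eight items when C = 4
chunks = pair_up(items)               # four chunks of two items each
meta_chunks = pair_up(chunks)         # two meta-chunks of two chunks each
```

      For C = 4 this gives eight items, four chunks, and two meta-chunks, matching the binary-tree structure described in the reviews.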

      Strengths:

      The writing and figures are very clear, and the general topic is relevant to a broad interdisciplinary audience. The work is strongly theory-driven, but also makes some effort to engage with existing data from two empirical studies. The basic results showcasing how chunking can be achieved in an activity-silent working memory model via suppression and synaptic augmentation dynamics are interesting. Furthermore, we agree with the authors that the derivation of their new "magic number" is relatively general and could apply to other models, so those findings in particular may be of interest even to researchers using different modeling frameworks.

      Weaknesses:

      (1) Very important aspects of the model are assumed / hard-coded, raising the concern that it relies too much on an external controller, and that it would therefore be difficult to implement the same principles in a fully behaving model responsible for producing its own outputs from a sequence of stimuli (i.e., without a priori knowledge of the structure of incoming sequences).

      (i) One such aspect is the use of external chunking cues provided to the model at critical times to activate the chunking clusters. The simulations reported in the paper were conducted in a setting where signals to chunk are conveniently indicated by longer delays between stimuli. In this case, it is not difficult to imagine how an external component could detect the presence of such a delay and activate a chunking cluster in response. However, in order for the model to be more broadly applicable to different memory tasks that elicit chunking-related phenomena, a more general-purpose detector would be required (see further comments below and alternative models).

      (ii) Relatedly, and as the authors acknowledge in the discussion, the network relies on a pretty sophisticated external controller that decides when the individual chunking clusters are activated or deactivated during readout/retrieval. This seems especially complex in the hierarchical case. How might a network decide which chunking/meta-chunking clusters are activated/deactivated in which order? This was hard-coded in their simulations, but we imagine that it would be difficult to implement a general solution to this problem, especially in cases where there is ambiguity about which stimuli should be chunked, or where the structure of the incoming sequence is not known in advance.

      (iii) One of the central mechanisms of the model is the rapid synaptic plasticity in the inhibitory connections responsible for binding chunking clusters to their corresponding stimulus clusters. This mechanism again appears to have been hard-coded in the main simulations. Although we appreciate that the authors worked on one possible way that this could be implemented (Methods section D, Supplementary Figure S2), in the end, their solution seems to rely on precisely fine-tuning the timing with which stimuli are presented - a factor that seems unlikely to matter very much in humans/animals. This stands in contrast with models of working memory that rely on persistent activity, which are more robust to changes in timing. Note that we do not discount the possibility of activity-silent WM, and indeed it should be studied in its own right, but it is then even more important to highlight which of its features are dependent on the time constants, etc.

      (2) Another key shortcoming of this work is its limited direct engagement with empirical evidence and alternative computational accounts of chunking in WM. Although the efforts to re-analyze existing empirical results in light of the new predictions made by the model are commendable, in the end, we think they fall short of being convincing. As noted above, the model doesn't actually perform the same two tasks used in the human experiments, so direct quantitative comparisons between the model and human behavior or neural data are not possible. Instead, the authors rely on isolating two qualitative predictions of the model - the "dip" and "ramp" phenomena observed after a chunking cluster is activated (Figure 3), and the new magic number for effective capacity derived from the model in the case where stimuli are chunkable, which approximately converges with human recall performance in a memory study (Figure 4). Below, we highlight some specific issues related to these two sets of analyses, but the larger point is that if the model is making a commitment about how these neural mechanisms relate to behavioral phenomena, it would be important to test if the model can produce the behavioral patterns of data in experimental paradigms that have been extensively used to characterize those phenomena. For example, modern paradigms characterizing capacity limits have been more careful to isolate the contributions of WM per se (whereas the original magic number 7 is now thought to reflect a combination of episodic and working memory; see Cowan 2010). There are several existing models that more directly engage with this literature (e.g., Edin et al., 2009; Matthey et al., 2015; Nassar et al., 2018; Soni & Frank, 2025; Swan & Wyble, 2014; van den Berg et al., 2014; Wei et al., 2012), some of which also account for chunking-related phenomena (e.g., Wei et al., 2012; Nassar et al., 2018; Panichello et al., 2019; Soni & Frank, 2025).
      A number of related proposals suggest that WM capacity limits emerge from fundamentally different mechanisms than the one considered here - for example, content-related interference (Bays, 2014; Ma et al., 2014; Schurgin et al., 2020), or limitations in the number of content-independent pointers that can be deployed at a given time (Awh & Vogel, 2025), and/or the inherent difficulty of learning this binding problem (Soni & Frank, 2025). We think it would be worth discussing how these ideas could be considered complementary or alternatives to the ones presented here.

      (i) Single unit recordings. We found it odd that the authors chose to focus on evidence from single-unit recordings in the medial temporal lobe from a study focused on episodic memory. It was unclear how exactly these data are supposed to relate to their proposal. Is the suggestion that a mechanism similar to the boundary neurons might be operative in the case of working memory over shorter timescales in WM-related areas such as the prefrontal cortex, or that their chunking mechanism may relate not only to working memory but also to episodic memory in the medial temporal lobe?

      (ii) N-gram memory experiment. Our main complaint about the analysis of the behavioral data from the human memory study (Figure 4) is that the model clearly does not account for the main effect observed in that study - namely, the better recall observed for higher-order n-gram approximations to English. We acknowledge that this was perhaps not the main point of the analysis (which related more to the prediction about the absolute capacity limit M*), but it relates to a more general criticism that the model cannot account for chunking behavior associated with statistical learning or semantic similarity. Most of the examples used in the introduction and discussion are of this kind (e.g., expressions such as "Oh my God" or "Easier said than done", etc.). However, the chunking mechanism of the model should not have any preference for segmenting based on statistical regularities or semantic similarity - it should work just as well if statistical anomalies or semantic dissimilarity were used as external chunking cues. In our view, these kinds of effects are likely to relate to the brain's use of distributed representations that can capture semantic similarity and learn statistical regularities in the environment. Although these kinds of effects may be beyond the scope of this model, some effort could be made to highlight this in the discussion. But again, more generally, the paper would be more compelling if the model were challenged to simulate more modern experimental paradigms aimed at testing the nature of capacity limits in WM, or chunking, etc.

      (iii) There are a number of other empirical phenomena that we're not sure the model can explain. In particular, one of the hallmarks of WM capacity limits is a recency bias, where people are more likely to remember the most recent items at the expense of items presented prior to that (Oberauer et al., 2012). [There are also studies showing primacy effects in addition to recency effects, but the primacy effects are generally attributed to episodic rather than working memory - for example, introducing a distractor task abolishes the recency but not the primacy effect]. But the current model seems to make the opposite prediction: when the stimuli exceed its base capacity, it appears to forget the most recent stimuli rather than the earliest ones (Figure 1d). This seems to result from the number of representations that can be reactivated within a cycle and thus seems inherent to the dynamics of the model, but the authors can clarify if, instead, it depends on the particular values of certain parameters. (In contrast, this recency effect is captured in other models with chunking capabilities based on attractive dynamics and/or gating mechanisms - e.g., Boboeva et al., 2023; Soni & Frank, 2025). Relatedly, we're not sure if the model could account for the more recent finding that recall is specifically enhanced when chunks occur in early serial positions compared to later ones (Thalmann, Souza, & Oberauer, 2019).

    3. Reviewer #3 (Public review):

      The paper presents a synaptic mechanism for chunking in working memory, extending previous work of the last author by introducing specialized "chunking clusters", neural populations that can dynamically segment incoming items into chunks. The idea is that this enables hierarchical representations that increase the effective capacity of working memory. They also derive a theoretical bound for working memory capacity based on this idea, suggesting that hierarchical chunking expands the number of retrievable items beyond the basic WM capacity. Finally, they present neural and behavioral data related to their hypothesis.

      Strengths

      A major strength of the paper is its clear theoretical ambition of developing a mechanistic model of working memory chunking.

      Weaknesses

      Despite the inspiration in biophysical mechanisms (short-term synaptic plasticity with different time constants), the model is "cartoonish". It is unclear whether the proposed mechanism would work reliably in the presence of noise and non-zero background activity or in a more realistic implementation (e.g., a spiking network).

      As far as I know, there is no evidence for cyclic neural activation patterns, which are supposed to limit WM capacity (such as in Figure 1d). In fact, I believe there is no evidence for population bursts in WM, which are a crucial ingredient of the model. For example, Panichello et al. 2024 have found evidence for periods during which working memory decoding accuracy decreases, but no population bursts were observed in their data. In brief, my critique is that including some biophysical mechanism in an abstract model does not make the model plausible per se.

      It is claimed that "our proposed chunking mechanism applies to both the persistent-activity and periodic-activity regimes, with chunking clusters serving the same function in each", but this is not shown. If the results and model predictions are the same, irrespective of whether WM is activity-silent or persistent, I suggest highlighting this more and including the corresponding simulations.

      The empirical validations of the model are weak. The single-unit analysis is purely descriptive, without any statistical quantification of the apparent dip-ramp pattern. I agree that the dip-ramp pattern may be consistent with the proposed model, but I don't believe that this pattern is a specific prediction of the proposed model. It seems just to be an interesting observation that may be compatible with several network mechanisms involving some inhibition and a rebound.

      Moreover, the reanalyses of n-gram behavioral data do not constitute a mechanistic test of the model. The "new magic number" depends strongly on structural assumptions about how chunking operates, and it is unclear whether human working memory uses the specific hierarchical scheme required to achieve the predicted limit.

      The presentation of the modeling results is highly compressed in two figures and is rather hard to follow. Plotting the activity of different neural clusters in separate subplots or as heatmaps (x-axis time, y-axis neural population, color = firing rate) would help to clarify (Figure 1d). Also, control signals that activate the chunking clusters should be shown.

      Overall, the theoretical proposal is interesting, but its empirical grounding and biological plausibility need to be substantially reinforced.

    1. Reviewer #1 (Public review):

      Summary:

      This study addresses the encoding of forelimb movement parameters using a reach-to-grasp task in mice. The authors use a modified version of the water-reaching paradigm developed by Galinanes and Huber. Two-photon calcium imaging was then performed with GCaMP6f to measure activity across both the contralateral caudal forelimb area (CFA) and the forelimb portion of primary somatosensory cortex (fS1) as mice perform the reaching behavior. Established methods were used to extract the activity of imaged neurons in layer 2/3, including methods for deconvolving the calcium indicator's response function from fluorescence time series. Video-based limb tracking was performed to track the positions of several sites on the forelimb during reaching and extract numerous low-level (joint angle) and high-level (reach direction) parameters. The authors find substantial encoding of parameters for both the proximal and distal parts of the limb across both CFA and fS1, with individual neurons showing heterogeneous parameter encoding. Limb movement can be decoded similarly well from both CFA and fS1, though CFA activity enables decoding of reach direction earlier and for a more extended duration than fS1 activity. Collectively, these results indicate involvement of a broadly distributed sensorimotor region in mouse cortex in determining low-level features of limb movement during reach-to-grasp.

      Strengths:

      The technical approach is of very high quality. In particular, the decoding methods are well designed and rigorous. The use of partial correlations to distinguish correlation between cortical activity and either proximal or distal limb parameters or either low- or high-level movement parameters was very nice. The limb tracking was also of extremely high quality, and critical here to revealing the richness of distal limb movement during task performance.
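
      For readers unfamiliar with the idea, the sketch below shows a first-order partial correlation: the correlation between two variables after accounting for a third. It is a generic textbook formula on invented toy data, not the authors' analysis pipeline; the names `pearson` and `partial_correlation` and the variables `activity`, `proximal`, and `distal` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (assumption): both signals share a common component, `distal`,
# plus independent components that are orthogonal to it and to each other.
distal = [-1, 1, -1, 1]
activity = [-2, 0, 0, 2]   # distal + [-1, -1, 1, 1]
proximal = [0, 0, -2, 2]   # distal + [1, -1, -1, 1]

raw_r = pearson(activity, proximal)
partial_r = partial_correlation(activity, proximal, distal)
```

      In this toy case the raw correlation is 0.5, while the partial correlation is essentially zero, because the apparent relationship is carried entirely by the shared component.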

      The task itself also reflects an important extension of the original work by Galinanes and Huber. The demonstration of a clear, trackable grasp component in a paradigm where mice will perform hundreds of trials per day expands the experimental opportunities for the field. This is an exciting development.

      The findings here are important and the support for them is solid. The work represents an important step forward toward understanding the cortical origins of limb control signals. One can imagine numerous extensions of this work to address basic questions that have not been reachable in other model systems.

      Collectively, these strengths made this manuscript a pleasure to read and review.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Grier, Salimian, and Kaufman characterize the relationship between the activity of neurons in sensorimotor cortex and forelimb kinematics in mice performing a reach-to-grasp task. First, they train animals to reach to two cued targets to retrieve water reward, measure limb motion with high resolution, and characterize the stereotyped kinematics of the shoulder, elbow, wrist, and digits. Next, they find that inactivation of the caudal forelimb motor area severely impairs coordination of the limb and prevents successful performance of the task. They then use calcium imaging to measure the activity of neurons in motor and somatosensory cortex, and demonstrate that fine details of limb kinematics can be decoded with high fidelity from this activity. Finally, they show reach direction (left vs right target) can be decoded earlier in the trial from motor than from somatosensory cortex.

      Strengths:

      In my opinion, this manuscript is technically outstanding and really sets a new bar for motor systems neurophysiology in the mouse. The writing and figures are clear, and the claims are supported by the data. This study is timely, as there has been a recent trend towards recording large numbers of neurons across the brain in relatively uncontrolled tasks and inferring a widespread but coarse encoding of high-level task variables. The central finding here, that sensorimotor cortical activity reflects fine details of forelimb movement, argues against the resurgent idea of cortical equipotentiality, and in favor of a high degree of specificity in the responses of individual neurons and of the specialization of cortical areas.

      Comment on revised version:

      The authors addressed all my concerns, and in my opinion, the manuscript is suitable for publication of the Version of Record in its current form.

    1. Reviewer #1 (Public review):

      Wang et al. studied an old, still unresolved problem: Why are reaching movements often biased? Using data from a set of new experiments and from earlier studies, they identified how the bias in reach direction varies with movement direction and movement extent, and how this depends on factors such as the hand used, the presence of visual feedback, the size and location of the workspace, the visibility of the start position and implicit sensorimotor adaptation. They then examined whether a target bias, a proprioceptive bias, a bias in the transformation from visual to proprioceptive coordinates and/or biomechanical factors could explain the observed patterns of biases. The authors conclude that biases are best explained by a combination of transformation and target biases.

      A strength of this study is that it used a wide range of experimental conditions, with a high resolution of movement directions and large numbers of participants, which produced a much more complete picture of the factors determining movement biases than previous studies did. The study used an original, powerful and elegant method to distinguish between the various possible origins of motor bias, based on the number of peaks in the motor bias plotted as a function of movement direction. The biomechanical explanation of motor biases could not be tested in this way, but this explanation was excluded in a different way using data on implicit sensorimotor adaptation. This was also an elegant method as it allowed the authors to test biomechanical explanations without the need to commit to a certain biomechanical cost function.

      Overall, the authors have done a good job mapping out reaching biases in a wide range of conditions, revealing new patterns in one of the most basic tasks, and the evidence for the proposed origins is convincing. The study will likely have substantial impact on the field, as the approach taken is easily applicable to other experimental conditions. As such, the study can spark future research on the origin of reaching biases.

      Comments on revisions:

      The authors have addressed my concerns convincingly. The inclusion of the data on movement extent, and the comparison with the data and explanation of Gordon et al. (1994), has strengthened the paper, as it shows that the proposed model can also explain biases in movement extent. I also appreciate the addition of the mathematical analysis, although I suspect that this analysis can be developed further to yield more detailed insights into the conditions under which the 1-, 2- and 4-peaked patterns arise, but that is a more suitable question for follow-up work.

    2. Reviewer #2 (Public review):

      Summary:

      This work examines an important question in the planning and control of reaching movements: where do biases in our reaching movements arise, and what might this tell us about the planning process? The authors compare several different computational models to explain the results from a range of experiments, including those within the literature. Overall, they highlight that motor biases are primarily caused by errors in the transformation between eye and hand reference frames. One strength of the paper is the large number of participants studied across many experiments. However, one weakness is that most of the experiments follow a very similar planar reaching design, with slicing movements through targets rather than stopping within a target. This is partially addressed with Exp 4. This work provides a valuable insight into the biases that govern reaching movements. While the evidence is solid for planar reaching movements, further support in the manner of 3D reaching movements would help strengthen the findings.

      Strengths:

      The work uses a large number of participants both with studies in the laboratory which can be controlled well and a huge number of participants via online studies. In addition, they use a large number of reaching directions allowing careful comparison across models. Together these allow a clear comparison between models which is much stronger than would usually be performed.

      Comments on revisions:

      I thank the authors for all the additions to the manuscript, which have addressed my concerns.

    3. Reviewer #3 (Public review):

      This study makes excellent use of a uniquely large dataset of reaching movements collected over several decades to evaluate the origins of systematic motor biases. The analyses convincingly demonstrate that these biases are not explained by errors in sensed hand position or by biomechanical constraints, but instead arise from a misalignment between eye-centric and body-centric representations of position. By testing multiple computational models across diverse contexts, including different effectors and visible versus occluded start positions, the authors provide strong evidence for their transformation model. My earlier concerns have been addressed, and I find the work to be a significant and timely contribution that will be of broad interest to researchers studying visuomotor control, perception, and sensorimotor integration.

      Comments on revisions:

      None

    1. Reviewer #1 (Public review):

      In this study, the authors explore the implications of two types of rhythmic inhibition - "gamma" (30-80 Hz) and "beta" (13-30 Hz) - for synaptic integration. They study this in a multi-compartmental model of an L5 pyramidal neuron with Poisson excitation and rhythmic inhibition (16 Hz and 64 Hz), applied either to the perisomatic or apical tuft regions of the neuron. They find that 64 Hz inhibition applied to the cell body is effective in phasic modulation of AP generation, while 16 Hz inhibition applied to the apical tufts is effective in phasic modulation of dendritic spikes (in addition to APs). Switching the location of the two kinds of rhythmic inhibition reduces the overall excitability, but is not effective in phasic modulation of dendritic spikes and only weakly so for somatic APs.

      Strengths:

      The effect of the timescale of rhythmic inhibition on synaptic integration is an interesting question, since a) rhythmic spiking is most strongly evident in inhibitory populations, and b) rhythmic spiking is modulated by behavioral states and the sensory environment. The methods are clear and the data are well-presented. The study systematically explores the effect of two frequencies of rhythmic inhibition in a biophysically detailed model. The study considers not only idealized rhythmic inhibition but also the bursty kind that is observed in in-vivo conditions. Both distributed and clustered excitatory synaptic organization are simulated, which covers the two extremes of the spatial organization of excitatory inputs in-vivo.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript illustrates how the spatial targeting (perisomatic vs distal, apical and basal dendritic) and timing of inhibition are crucial to distinct effects on neuronal integration, and shows that beta and gamma oscillations differentially engage dendritic spiking mechanisms.

      Strengths:

      The strength of this study lies in the integrative biophysical modelling of a layer 5 pyramidal neuron, which brings together in vitro and in vivo observations.

      Weaknesses:

      The weaknesses are probably in some of the parameterization of inhibitory synaptic dynamics. A unitary peak conductance of 1 nS is very high for inhibitory synapses. This high value could invariably skew some of the network-level predictions. The authors could obtain specific parameters from the Neocortical Collaboration Portal (https://bbp.epfl.ch/nmc-portal/microcircuit.html), which comes across as an incredible resource for cortical neurons and synapses.

    1. Reviewer #1 (Public review):

      Summary:

      The authors investigate the determinants of population-level cell size variability, quantified via the coefficient of variation, in budding yeast populations. Using a combination of computational modeling and experimental readouts, they conclude that mother-daughter division asymmetry is the dominant factor shaping the coefficient of variation of cell size. In particular, through parameter sensitivity analysis of the Chandler-Brown model and empirical perturbations, the authors show that size-control mutations have limited effects on CV, whereas modulating mother-daughter asymmetry, by changing the growth environment, produces substantially larger shifts.

      Strengths:

      (1) The study addresses a fundamental question in biophysics, i.e., what are the mechanisms that produce and maintain population size heterogeneity?

      (2) It provides a conceptual reconciliation for previous observations that size-control mutants often alter mean size but not CV.

      (3) The modeling framework is clearly explained and compared to the data.

      (4) The parameter sensitivity analysis is thoughtfully performed and provides transparent intuition about which parameters influence variability.

      (5) The writing is clear, and the figures are well-organized.

      Weaknesses:

      (1) The work focuses on the Chandler-Brown model, so it is not clear to what extent the conclusions depend on it. A sensitivity or robustness check using an alternative model would strengthen generality.

      (2) CV is the sole descriptor used to quantify heterogeneity; while this is an efficient descriptor, it must be handled with care when used on experimental data, as it may vary due to differences in the chosen observables (e.g., if size is identified via cell volume, length, area, number of proteins, etc.) instead of real differences in the distribution.

      (3) The experimental validation using varied nutrient conditions is interesting; however, the statistical significance of the found correlations should be provided/discussed.

    2. Reviewer #2 (Public review):

      Summary:

      This paper provides a new framework for understanding how cell size variability arises in budding yeast populations. Whereas previous studies emphasized G1/S size control in daughter cells as the main regulator of size homeostasis, the authors show that perturbations to this control checkpoint have only modest effects on population-wide size variability.

      By extending a stochastic model of the yeast cell cycle to include both mother and daughter lineages, the authors demonstrate that division asymmetry, stemming from slower growth and longer post-Start phases in mother cells, is the key factor determining the population coefficient of variation (CV). As mothers grow larger and daughters smaller, the overall size distribution broadens. Experimental measurements across multiple mutants and conditions support the predicted correlation between asymmetry and CV.

      Strengths:

      The main conceptual advance of this study is to consider the full proliferating population, and in particular the dominant mother lineages, rather than single-cycle daughters, thereby offering a population-level explanation for size variability that is consistent with several previous but seemingly conflicting results.

      Weaknesses:

      Nevertheless, the modelling is described superficially and has notable limitations.

      (1) The extended Chandler-Brown model was originally parameterized only for daughter cells, and its generalization to mothers introduces several new assumptions that are not directly tested.

      (2) The model treats asymmetry phenomenologically, without a mechanistic basis, so while it correctly identifies correlations, causality remains uncertain.

      (3) Moreover, since population CVs emerge from steady-state lineage dynamics, they could be sensitive to parameter choices or growth-related details not fully explored in the current analysis.

      In summary, this study provides a useful conceptual synthesis and quantitative framework, but readers should interpret the modeling as heuristic. The central message, that division asymmetry dominates population size variability, remains interesting and well supported at the phenomenological level.

    3. Reviewer #3 (Public review):

      Summary:

      The article studies the origins of random cell size variability in budding yeast. Different strains with different average cell sizes have very similar noise, measured using the coefficient of variation, defined as the standard deviation over the mean. Manipulating the noise in key variables such as the duration of cell stages, the growth rate or the division strategy (adder, timer, sizer) was not enough to explain the observed noise in mutants. The proposed solution for the origin of most of the cell size noise is related to the asymmetry in the average cell size for cells with two different phenotypes: daughter cells (new cells that have not passed the first division) and mother cells (the rest). The origin of the cell size noise is mainly related to the fact that these phenotypes have different cell size distributions. The article includes simple statistical methods for hypothesis analysis and explanatory figures.

      Strengths:

      The article provides different approaches: experimental (mutants and different growth conditions) and computational (simulations) to explain and test the hypothesis. The methods are based on previous articles with simple conclusions and explanations easy to follow.

      The rigor level in both mathematical and biological approaches looks fair to me. The terms are well defined and consistent throughout the article. Authors use well-established analysis techniques.

      The proposed theoretical analysis is coarse-grained and therefore can explain different strains and mutations using mathematical tools (noise analysis), aiming to reach general (mathematically) claims. This approach strengthens the conclusions and provides a good language to set a bridge between the biological community and mathematicians (quantitative biologists).

      The concept that the population heterogeneity (mothers vs daughters) is a fundamental reason behind the cell size variability is not new, but this article presents a clear experimental justification for the development of complete models of cell size regulation. I consider this contribution very relevant to the community modelling cell size.

      Weaknesses:

      The central concept is that population heterogeneity (mothers and daughters with different cell size distributions) explains the observed size variability in a heterogeneous population. However, it is not clear how the population composition affects this heterogeneity. Intuitively, I would expect the fraction (number of daughters)/(number of mothers) to change across stages of the population expansion, because the mean duration of both stages can change in different growth conditions. I would suggest studying how different (or not) these fractions are in different conditions. The authors should acknowledge this effect and briefly discuss, using for instance simple models of random-variable addition (adding different fractions of individuals with different cell size distributions), in which cases (different fractions, or different means and noises in their respective distributions) their contribution is relevant. Finally, do different simulations (gradient or sizer, timer) predict different moments (mean and CV) in the distributions of both mother size and daughter size?

      Related to the previous comment, I would also include the fraction (number of daughters)/(number of mothers) or the percentage in different growth conditions with their respective size moments (mean and CV) to test whether the resultant cell size moments are related to the addition of two variables with different fractions with their respective moments.
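
      The "addition of two variables with different fractions" suggested above can be illustrated with the standard moments of a two-component mixture. The following is only a minimal sketch, not the authors' model: the function name, the parameterization by the fraction of daughters in the whole population (a ratio r of daughters to mothers corresponds to a fraction r/(1+r)), and the example numbers are all hypothetical.

```python
import math


def mixture_moments(frac_daughters, mean_d, cv_d, mean_m, cv_m):
    """Mean and CV of cell size in a population mixing daughters and mothers.

    Hypothetical helper: frac_daughters is the fraction of daughters in the
    whole population; each subpopulation is described only by its own mean
    and CV (no assumption about the shape of its distribution).
    """
    if not 0.0 <= frac_daughters <= 1.0:
        raise ValueError("frac_daughters must lie in [0, 1]")
    w_d = frac_daughters
    w_m = 1.0 - w_d
    # Standard deviations of each subpopulation from CV = sd / mean.
    sd_d = cv_d * mean_d
    sd_m = cv_m * mean_m
    # Mixture mean is the weighted average of the component means.
    mean = w_d * mean_d + w_m * mean_m
    # Mixture second moment is the weighted average of component second moments.
    second_moment = w_d * (sd_d**2 + mean_d**2) + w_m * (sd_m**2 + mean_m**2)
    variance = max(second_moment - mean**2, 0.0)  # guard against round-off
    return mean, math.sqrt(variance) / mean


# Illustrative (made-up) numbers: same within-group CV, different means.
for frac in (0.3, 0.5, 0.7):
    mean, cv = mixture_moments(frac, mean_d=30.0, cv_d=0.2, mean_m=60.0, cv_m=0.2)
    print(f"fraction daughters={frac:.1f}  mean={mean:.1f}  CV={cv:.3f}")
```

      In this sketch the population CV exceeds the within-group CV whenever the two means differ, and it depends on the mixing fraction, which is why reporting the daughter/mother fractions alongside the per-group moments would allow the comparison requested here.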

      It is interesting how the G1 timer and G1 Sizer are located in different quadrants of Figure 4D, while the studied mutants belong to the other quadrant. I expected them to be closer to the G1 timer, similar to that observed in Figure 4G. I think the authors should discuss this dissimilarity.

      Although the authors are working using a definite model, other models would predict different results, especially in synthetic data. For instance, the same models for obtaining sizers can predict different noise levels.

      Nieto, C. et al., 2024. npj Systems Biology and Applications, 10(1), p.61.

      Barber, Felix, et al., Frontiers in cell and developmental biology 5 (2017): 92.

      Teimouri, H. et al., 2020. The Journal of Physical Chemistry Letters, 11(20), pp.8777-8782.

      I would mention that the noise level also depends on whether the population has reached steady-state conditions. This would require multiple generations, and measure over at least a couple of thousand cells. Therefore, experiments with single-cell-derived colonies would present different levels of noise than the noise in steady conditions, especially if few cells were sampled. However, I acknowledge that the purpose of the article is not a detailed description of the system but rather the presentation of the concept and for that matter, this level of detail is not mandatory.

    1. Reviewer #1 (Public review):

      Summary:

      The Drosophila wing disc is an epithelial tissue whose study has provided many insights into the genetic regulation of organ patterning and growth. One fundamental aspect of wing development is the positioning of the wing primordia, which occurs at the confluence of two developmental boundaries, the anterior-posterior and the dorsal-ventral. The dorsal-ventral boundary is determined by the domain of expression of the gene apterous, which is set early in the development of the wing disc. For this reason, the regulation of apterous expression is a fundamental aspect of wing formation.

      In this manuscript the authors used state of the art genomic engineering and a bottom-up approach to analyze the contribution of a 463 base pair fragment of apterous regulatory DNA. They find compelling evidence about the inner structure of this regulatory DNA and the upstream transcription factors that likely bind to this DNA to regulate apterous early expression in the Drosophila wing disc.

      Strengths:

      This manuscript has several strengths concerning both the experimental techniques used to address a problem of gene regulation and the relevance of the subject. To identify the mode of operation of the 463 bp enhancer, the authors use a balanced combination of different experimental approaches. First, they use bioinformatic analysis (sequence conservation and identification of transcription factor binding sites) to identify individual modules within the 463 bp enhancer. Second, they identify the functional modules through genetic analysis by generating Drosophila strains with individual deletions. Each deletion is characterized by looking at the resulting adult phenotype and also by monitoring apterous expression in the mutant wing discs. They then use a clever method to interfere in a more dynamic manner with the function of the enhancer, by directing the expression of catalytically inactive Cas9 to specific regions of this DNA. Finally, they resort to a more classical genetic approach to uncover the relevance of candidate transcription factors, some of them previously known and others suggested by the bioinformatic analysis of the 463 bp sequence. This workflow is clearly reflected in the manuscript, and constitutes a great example of how to proceed experimentally in the analysis of regulatory DNA.

      Weaknesses:

      The previously noted weaknesses (vg expression, P compartment specific effects, early vs late analysis of ap expression in mutants) have been thoroughly and satisfactorily addressed by the authors.

    2. Reviewer #3 (Public review):

      In this manuscript, the authors use the Drosophila wing as a model system and employ state-of-the-art genetic engineering to identify and validate the molecular players mediating the activity of one of the cis-regulatory enhancers of the apterous gene involved in the regulation of its expression domain in the dorsal compartment of the wing primordium during larval development. The paper is subdivided into the following chapters/figures:

      (1) In the first couple of figures, the authors describe the methodology to genetically manipulate the apE enhancer (a cartoon summarizing all the previous work with this enhancer might help) and identify two well-conserved domains in the OR463 enhancer required for wing development (the m3 region, whose deletion phenocopies OR463 deletion: loss of wing, and the m1 region, whose deletion gives rise to AP identity changes in the P compartment).

      (2) In the following three figures, the authors characterize the m1 regulatory region, identify HOX and ETS binding sites, and functionally validate their role in wing development and the activity of the genes/proteins regulating their activity (e.g., Hth and Pointed) by their ability to phenocopy (when depleted) the m1 loss-of-function wing phenotype. The authors conclude that Hth and Pointed regulate apterous expression through the m1 region.

      (3) In the last few figures, the authors perform similar experiments with the m3 regulatory region to conclude that Grn and Antennapedia regulate apterous expression through the m3 enhancer.

      My comments:

      Technically sound: As stated in my previous review, the work is technically excellent (authors use state-of-the-art genetic engineering to manipulate the enhancer and combine it with genetic analysis through RNAi and CRISPR/Cas9 and phenotypic characterization to functionally validate their findings), figures are nicely done and cartoons are self-explanatory.

      Poor paper writing: The paper is too long and difficult to read/understand, many grammatical mistakes are found, and formatting is in some cases heterodox.

      Science:

      (1) The question of "who is locating the relative position of the AP and DV boundaries in the developing wing?" is not resolved. I would then change the intro or reduce the tone of this question. Having said that, I agree that these results shed light on the wing phenotypes of some apterous alleles related to AP identity and growth and, as such, I congratulate the authors.

      (2) Identification of two TFs (Grain and Antp) mediating the regulation of apterous expression is interesting but some contextualization might be required. Data on Antp is not as convincing as data on Grn. I wonder whether Antp data can be removed at all.

      (3) I am not sure whether the term hemizygous is used properly.

    1. Reviewer #1 (Public review):

      Aw et al. have proposed that utilizing stability analysis can be useful for fine-mapping of cross populations. In addition, the authors have performed extensive analyses to understand the cases where the top eQTL and stable eQTL are the same or different via functional data.

      Comments on revisions:

      The authors have answered all my concerns.

    2. Reviewer #2 (Public review):

      Aw et al. present a new stability-guided fine-mapping method by extending the previously proposed PICS method. They applied their stability-based method to fine-map cis-eQTLs in the GEUVADIS dataset and compared it against residualization-based approaches. They evaluated the performance of the proposed method using publicly available functional annotations and demonstrated that the variants identified by their stability-based method show enrichment for these functional annotations.

      The authors have substantially strengthened the manuscript by addressing the major concerns raised in the initial review. I acknowledge that they have conducted comprehensive simulation studies to show the performance of their proposed approach and that they have extended their approach to SuSiE ("Stable SuSiE") to demonstrate the broader applicability of the stability-guided principle beyond PICS.

      One remaining question is the interpretation of matching variants with very low stable posterior probabilities (~0), which the authors have analyzed in detail but without fully conclusive findings. I agree with the authors that this event is relatively rare and the current sample size is limited but this might be something to keep in mind for future studies.

    1. Reviewer #1 (Public review):

      Summary:

      I read the paper by Parrotta et al with great interest. The authors are asking an interesting and important question regarding pain perception, which is derived from predictive processing accounts of brain function. They ask: If the brain indeed integrates information coming from within the body (interoceptive information) to comprise predictions about the expected incoming input and how to respond to it, could we provide false interoceptive information to modulate its predictions, and subsequently alter the perception of such input? To test this question, they use pain as the input and the sounds of heartbeats (falsified or accurate) as the interoceptive signal.

      Strengths:

      I found the question well-established, interesting and important, with important implications and contributions for several fields, including neuroscience of prediction-perception and pain research. The study is clearly written, the methods are generally adequate, and the results indeed support the claim that false cardiac feedback modulates both pain perception and anticipatory cardiac frequency. Importantly, the authors include a control experiment using exteroceptive auditory feedback to test whether effects are specific to heartbeat-like cues. This addition substantially strengthens interpretability.

      Weaknesses:

      In my view, the authors' central interpretation, namely that the effects arise because the manipulation targets interoceptive rather than exteroceptive or high-level threat-related cues, cannot be fully supported by the current design. The evidence does not rule out the possibility that participants interpret increased heartbeat sounds as a generic danger/threat cue rather than as (manipulated) interoceptive input. I also disagree with several other choices, though they are less critical: for example, the use of specific comparisons without pre-registering them, the use of sensitivity analysis to justify sample size, and the intentional use of only 6 trials per participant.

      Conclusion:

      To conclude, the authors have shown in their findings that predictions about an upcoming aversive (pain) stimulus - and its subsequent subjective perception - can be altered not only by external expectations, or manipulating the pain cue, as was done in studies so far, but also by manipulating a cue that has fundamental importance to human physiological status, namely heartbeats. Whether this is a manipulation of actual interoception as sensed by the brain is, in my view, left to be proven.

      Even if the authors drop this claim, the paper has important implications in several fields of science, ranging from neuroscience prediction-perception research, to pain research, and may have implications for clinical disorders, as the authors propose. Furthermore, it may lead - either the authors or someone else - to further test this interesting question of manipulation of interoception in a different or more controlled manner.

      I salute the authors for coming up with this interesting question and encourage them to continue and explore ways to study it and related follow-up questions.

    2. Reviewer #3 (Public review):

      Parrotta et al provide a convincing and thorough revision of their manuscript "Exposure to false cardiac feedback alters pain perception and anticipatory cardiac frequency". The authors addressed my previous concerns regarding theoretical framing and methodological clarity. For example:

      They provided additional detail on the experimental design, procedure and statistical analyses.

      The predictive coding rationale for the hypotheses has been clarified.

      The limitations of the study are discussed comprehensively.

      Additional analyses were performed to investigate the role of learning effects and across-experiment effects.

      New supplementary figures allow a closer look at the feedback-related response patterns.

      In sum, the revisions improve the manuscript. However, some issues remain present.

      (1) Potential learning/habituation effects. In my first review of the manuscript, I raised the concern that learning effects may have contributed to the observed differences between interoceptive & exteroceptive cues. The authors argue that the small number of six trials per condition could limit aversive effects of differential learning between experiments. However, electric nociceptive stimuli are exceptionally potent in classical conditioning experiments and humans can develop conditioned responses to these types of stimuli after a single trial [1-2]. Therefore, six trials are sufficient to allow for associative or expectancy-based learning processes.

      However, the authors are also presenting additional analyses, i.e. LME models which included trial rank as a predictor. While these models do not show a statistically significant learning effect, they do indicate a noteworthy larger effect in earlier trials compared to later ones. However, in my reading, this speaks towards the presence of unspecific effects of attention or arousal. This pattern is compatible with early learning or, alternatively, with non-specific attentional or arousal responses that diminish across repetitions. This is potentially a limitation of the design: repetition-related effects (attention reduction, arousal habituation, early learning) may contribute to the results, and distinguishing between interoceptive inference and non-specific effects remains challenging within this paradigm.

      [1] Haesen K, Beckers T, Baeyens F, Vervliet B. One-trial overshadowing: Evidence for fast specific fear learning in humans. Behav Res Ther. 2017 Mar;90:16-24. doi: 10.1016/j.brat.2016.12.001. Epub 2016 Dec 8. PMID: 27960093.

      [2] Glenn CR, Lieberman L, Hajcak G. Comparing electric shock and a fearful screaming face as unconditioned stimuli for fear learning. Int J Psychophysiol. 2012 Dec;86(3):214-9. doi: 10.1016/j.ijpsycho.2012.09.006. Epub 2012 Sep 21. PMID: 23007035; PMCID: PMC3627354.

      (2) SESOI and power rationale. The authors elaborated on the sensitivity analyses and the rationale of reporting SESOI rather than traditional a-priori power analyses and included this information in the manuscript, which improves transparency.

      (3) Unspecific arousal/ attention mechanisms. The authors argue against unspecific arousal mechanisms based on the absence of main effects in pain ratings and heart rate. This reduces the likelihood of a purely unspecific arousal account, however, these unspecific effects may not need to manifest as main effects. Unspecific mechanisms are likely adding (at least residual) effects onto the results.

      Regarding attention-based mechanisms, the authors have clarified that in Experiment 2 (exteroceptive cue), the participants are instructed that the sound does not have any relation with their heart rate. If participants did not receive any instructions on the meaning of the knocking sounds, they may have simply ignored it - not unlikely, also because the exteroceptive feedback did not elicit any systematic effect on the outcome variables (minus the slowing of HR with slower exteroceptive feedback, which may reflect noise, altering, multiple comparisons?). Ultimately, how the participants did or did not process the exteroceptive cue is unclear.

      (4) The authors provided more context to their hypothesis and strengthened its theoretical motivation (increased pain intensity with incongruent-high cardiac feedback), rooting it in predictive coding accounts of interoception. For instance, their prior study shows that participants report an increased cardiac frequency while anticipating pain. The reasoning behind this study is hence that if pain shapes cardiac perception, cardiac perception should in turn shape pain perception. The introduction has been revised accordingly, adding more references on the interplay between cardiac feedback and pain and emotional responses. While this rooting within the predictive processing framework is now clearly developed, it also underscores a gap between the proposed theoretical mechanism and the current analytical approach. The hypothesis is formulated in a mechanistic, computational-level language, yet the statistical analysis remains primarily descriptive, at a group level, and does not directly test the predictive-coding account.

      New concerns introduced by the revision:

      (1) Some of the newly added paragraphs interrupt the narrative flow. For example, the justification of the supradiaphragmatic focus based on the BPQ questionnaire feels too long for this section and might fit more naturally in the theoretical background or introduction. Similarly, the predictive-coding paragraph appearing after the hypotheses seems better suited to the earlier conceptual framing rather than following the hypothesis statements. It would be better for the argumentative flow if hypotheses followed from theoretical considerations.

      (2) The authors now note that the administration of the BPQ questionnaire was exploratory, explaining the null-results in the methods section as resulting from an underpowered design. But if the design is not appropriate for discovering a connection between self-reported body awareness and pain ratings, why was it administered in the first place? The rationale here is unclear.

      (3) The discussion is longer than before and would benefit greatly from streamlining the arguments.

    1. Reviewer #1 (Public review):

      Summary:

      This study was designed to manipulate and analyze the effects of chemosensory cues on visuomotor control. The authors approach this by analyzing how eye-body coordination and brain-wide activity are altered with specific chemosensation in larval zebrafish. After analyzing the dynamics of coupled saccade-tail coordination sequences - directionally linked and typically coupled to body turns - the authors investigated how sensory cues shown to be either aversive or appetitive in freely swimming zebrafish affect eye-body coordination. Aversive chemicals lead to an increase in saccade-tail sequences in both number and dynamics, seemingly facilitating behaviors like escape. Brain-wide imaging led the authors to neurons in the telencephalic pallium as a target to study eye-body coordination. Pallium neuron activity correlated with both aversive chemicals and coupled saccade-tail movements.

      Recommendations for improvement are minimal. So much of the data is ultimately tabular, and the figures are an impenetrable wall of datapoints. 1c is an excellent example: three concentrations are presented, but it's rare for the three averages to trend appropriately. The key point, which is that aversive odors are repulsive and attractive odors (sometimes) attractive, just gets lost in showing the three concentrations individually; it also makes direct comparisons impossible. Similar challenges abound in the violin plots in 4e-4h, the error bars on the "fits" in 4i-4m, and so on. We recommend selecting an illustrative subset of data to present to permit interpretation and putting the rest in a supplemental table. (Presenting) less is more (effective).

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Sy SKH. et al. on pallium encoded chemosensory impact of eye-body coordination describes how the valence of chemosensory stimuli can affect the coordination of eye saccades with tail flips. They show that aversive valence stimuli can increase both the strength and frequency of tail flips through a pallium-mediated circuit.

      Overall, the manuscript is well-written and easy to follow, although the figures are quite dense. The methodology is mostly sound, and the improvement to the fish on chips system is very interesting. The methods description is thorough and welcome, making the experiments clear. The limited number of animals, and the spread between 5 and 6 dpf, is a concern, as most of the statistics seem to have been done on the individual events and not on the number of biological samples.

      The initial behavioural experiments are very promising. However, the conclusions surrounding the role of the pallium are a lot more speculative and not supported by the results.

      Comments:

      (1) The fish on chips 2.0 methods show a lot of promise for future studies of chemosensory stimuli, combined with whole-brain imaging. This will provide new avenues of research for zebrafish neuroscientists.

      (2) Chemosensory cues would have a very different timing than visual cues; timing is very important for multisensory integration. How do the authors suggest those are integrated? How would they differentiate between an integration of various cues or a different arousal state, as they describe in the introduction?

      (3) Studies have looked at chemosensation in Drosophila, including multisensory integration, which should be discussed by the authors (see the work of Mark Frye, amongst others).

      (4) In the brain imaging methods, there is a mention of robustly behaving larvae. Does that mean that an exclusion criterion was used to select only 5 larvae? If so, this should be stated clearly. The authors also do not mention how they avoid the switch to a passive state that one of the coauthors has observed in a closed-loop setup. The authors should comment on this point.

      (5) Were the statistics in Figure 2 done with an n of 5, or do they assume that each tail flip and saccade is an independent event? I would imagine the latter would inflate significance (yielding artificially small p-values) and should be avoided.

      (7) Page 7: Why do the authors think that the cumulative effect of these minor differences could lead to very different behavioural goals? Especially when comparing to actual startle responses, which are extremely strong and stereotypical. How do their observations compare to the thermosensory navigation of larval zebrafish observed by Martin Haesemeyer, for example, or the work of the RoLi lab?

      (8) Page 8: Figure 5, I am confused by the y-axis of g, in e and f, the values are capped at 2, whereas in g they go up to 6, with apparently a number of cells whose preference is out of the y-axis limit (especially in Q2). Having the number of cells in each quadrant would also help to assess if indeed there is some preference in the pallium towards Q1.

      (9) Figure 6: How is the onset of neuronal activity determined compared to the motor stimulus? Looking at Supplementary Figure 8, it is quite unclear how the pallium is different from the OB or subpallium. The label of onset delay is also confusing in this figure.

      (10) Page 9: I do not think that the small differences observed in the pallium are as clear-cut as the authors make them out to be, or that they provide such strong evidence of their importance. As there are no interventions showing any causality in the presence of these pallium responses and the sensorimotor responses, these could represent different arousal states rather than any integration of sensory information.
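
      The concern in comment (5) about the unit of analysis can be illustrated with a minimal sketch. It assumes event-level measurements are stored as (animal, value) pairs; the data, the animal labels, and the helper name `per_animal_means` are hypothetical and are not taken from the manuscript. Collapsing events to one value per larva makes the animal, not the event, the n that enters any subsequent test.

```python
from collections import defaultdict
from statistics import mean


def per_animal_means(events):
    """Collapse (animal_id, value) event records to one mean per animal."""
    grouped = defaultdict(list)
    for animal_id, value in events:
        grouped[animal_id].append(value)
    return {animal_id: mean(values) for animal_id, values in grouped.items()}


# Hypothetical event-level measurements (illustrative values, not real data).
events = [
    ("fish1", 1.0), ("fish1", 3.0),
    ("fish2", 2.0), ("fish2", 4.0), ("fish2", 6.0),
]

animal_means = per_animal_means(events)
n_events = len(events)         # number of events (not independent samples)
n_animals = len(animal_means)  # number of animals: the n for an animal-level test
```

      An alternative that keeps the event-level data would be a mixed-effects model with animal as a random effect; the aggregation above is only the simplest way to avoid treating events from the same larva as independent.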

    3. Reviewer #3 (Public review):

      The manuscript investigates the coupling of saccadic eye movements (S) with directed tail flips (T). The remarkable discovery is that tail flips that are preceded by a conjugate saccade (S-T) can be credibly classified as specific "volitional" turns that are distinguished from the standard tail movements that seem to be more of a spontaneous and "impulsive" nature.

      They show that 'turning intent', as indicated by a small increase in S, is elevated by aversive odors, while 'gliding intent', as indicated by a decrease in S and an increase in undulation cycles, is elevated by appetitive odors.

      This is a very important finding, which is backed up by a thorough behavioral analysis, and the identification of neural populations in the pallium and sub-pallium that clearly distinguish between these kinds of turns is very promising. Here they identify neuronal populations that are preferentially active during - and predictive of - coupled (S-T) versus isolated (T) tail flips.

      Especially the fact that S-T turns (but not T turns) can be predicted already by pre-event, ramping, pallial activity is intriguing.

      The authors then go on and demonstrate that the frequency of (S-T) turns is modulated in fish exposed to appetitive or aversive odors. Specifically, they quantify the aversiveness and appetitive-ness of several odors in a free swimming assay. They select a couple of these odors based on their valence, and they demonstrate that these odors induce moderate modulation in the frequency of eye saccades (S) and tail flips (T) and (S-T) turns.

      The study is rigorous and thorough, and the findings are informative and novel.

      In important controls, they confirm that brain-wide imaging can distinguish between appetitive and aversive contexts, and they show that pallial activation by aversive odors is consistent with neural activity in the rhombencephalon that correlates with turning activity, whereas sub-pallial activation by appetitive odors correlates with rhombencephalic activity related to gliding.

      Overall, this manuscript is very good.

    1. Reviewer #1 (Public review):

      Summary:

      Witte et al. examined whether canonical behavioral functions attributed to the cerebellum decline with age. To test this, they recruited younger, old, and older-old adults in a comprehensive battery of tasks previously identified as cerebellar-dependent in the literature. Remarkably, they found that cerebellar function is largely preserved across the lifespan, and in some cases even enhanced. Structural imaging confirmed that their older adult cohort was representative in terms of both cerebellar gray- and white-matter volume. Overall, this is an important study with strong theoretical implications and convincing evidence supporting the motor reserve hypothesis, demonstrating that cerebellar-dependent measures remain largely intact with aging.

      Strengths:

      (1) Relatively large sample size.

      (2) Most comprehensive behavioral battery to date assessing cerebellar-dependent behavior.

      (3) Structural MRI confirmation of age-related decline in cerebellar gray and white matter, ensuring representativeness of the sample.

      Weaknesses:

      (1) Although the authors note this was outside the study's scope, the absence of a voxel-based morphometry (VBM) analysis limits anatomical and functional specificity. Such an analysis would clarify which functions are cerebellar-dependent rather than solely inferring this from prior neuropsychological literature.

      (2) As acknowledged in the Discussion, task classification (cerebellar-dependent vs. general measures) remains somewhat ambiguous. Some "general" measures may still rely on cerebellar processes based on the paper's own criteria - for example, tasks in which individuals with cerebellar degeneration show impairments.

      (3) Cerebellar-dependent and general measures may inherently differ in measurement noise, potentially biasing results toward detecting effects in general measures but not in cerebellar-dependent ones.

    2. Reviewer #2 (Public review):

      Summary:

      The authors are investigating cerebellar-mediated motor behaviors in a large sample of adults, including 30 individuals over the age of 80 (a great strength of this work). They employed a large battery of motor tasks that are tied to cerebellar function, in addition to a cognitive task and motor tasks that are more general. They also evaluated cerebellar structure. Across their behavioral metrics, they found that even with cerebellar degeneration, cerebellar-mediated motor behavior remained intact relative to young adults. However, this was not the case for measures not directly tied to cerebellar function. The authors suggest that these functions are preserved and speak to the resiliency and redundancy of function in the cerebellum. They also speculate that cerebellar circuits may be especially good for preserving function in the face of structural change. The tasks are described very well, and their implementation is also well-done with consideration for rigor in the data collection and processing. The inclusion of Bayesian estimates is also particularly useful, given the theoretically important lack of age differences reported. This work is methodologically rigorous with respect to the behavior, and certainly thought-provoking.

      Strengths:

      The methodological rigor, inclusion of Bayesian statistics, and the larger sample of individuals over the age of 80 in particular are all great strengths of this work. Further, as noted in the text, the fact that all participants completed the full testing battery is of great benefit.

      Weaknesses:

      The suggestion of cerebellar reserve, given that at the group level there is a lack of difference for cerebellar-specific behavioral components, could be more robustly tested. That is, the authors suggest that this is a reserve given that the volume of cerebellar gray matter is smaller in the two older groups, though behavior is preserved. This implies volume and behavior are seemingly dissociated. However, there is seemingly a great deal of behavioral variability within each group and likewise with respect to cerebellar volume. Is poorer behavior associated with smaller volume? If so, this would still suggest that volume and behavior are linked, but rather than being age that is critical, it is volume. On the flip side, a lack of associations between behavior and volume would be quite compelling with respect to reserve. More generally, as explicated in the recommendations, there are analyses that could be conducted that, in my opinion, would more robustly support their arguments given the data that they have available. This is a well-executed and thought-provoking investigation, but there is also room for a bit more discussion.

    1. Reviewer #1 (Public review):

      Summary

      Wang et al. address the challenge of tracking goal-relevant visual signals amidst distractions, a fundamental aspect of adaptive visual information processing. By employing functional magnetic resonance spectroscopy (fMRS) during a visual tracking task, they quantify changes in both inhibitory (GABA) and excitatory (glutamate) neurotransmitter concentrations in the parietal and visual cortices. The results reveal that increases in GABA and glutamate in the parietal cortex are closely tied to the number of targets, and individual differences in GABAergic and glutamatergic responses within the parietal cortex predict tracking performance and distractor suppression. These findings underscore a neural mechanism in which GABAergic inhibition in the parietal cortex actively suppresses goal-irrelevant distractors, thereby facilitating goal-directed visual tracking and highlighting the dynamic role of these key metabolites in cognitive control during visual processing. I found the study to be well-written and thoughtful from an experimental standpoint, although it would benefit from some targeted revisions.

      Strengths

      (1) The study employs robust and validated fMRS methodology, allowing for real-time monitoring of metabolite changes during goal-directed tasks.

      (2) Simultaneous measurement of both GABA and Glx in parietal and visual cortices yields nuanced insights into the neurochemical correlates of visual attention.

      (3) The link between neurochemical changes and behavioral performance is clearly established, providing strong evidence for GABAergic involvement in distractor suppression.

      (4) Experimental protocols align with current standards for MEGA-PRESS, bolstering the technical reliability of the findings.

      Weaknesses

      (1) Certain aspects of terminology, methodological reporting, and confound management are inconsistently described throughout the manuscript.

      (2) Important confounding factors are not systematically reported or controlled.

      (3) Opportunities for additional analysis (e.g., behavioral dynamics, use of alternate fitting methods, more comprehensive quality metrics) have not been fully explored.

      (4) Open-access data and/or code for the analysis are not shared in the main manuscript.

    2. Reviewer #2 (Public review):

      Summary:

      This study investigates how the visual system is able to track target objects when these are presented in the visual field together with other irrelevant and distracting visual objects. The authors use functional Magnetic Resonance Spectroscopy to measure the two most important excitatory and inhibitory neurotransmitters, glutamate and GABA, in both the visual and parietal cortex.

      Strengths:

      (1) Well-designed functional challenge.

      (2) Number of subjects.

      (3) Good quality spectra and appropriate reporting of MRS methods and quality assurance.

      (4) Introduction and discussion are clear for non-experts in visual processing.

      Weaknesses:

      (1) Rejection of spectra based on high % CRLB may artificially remove data with the lowest metabolite concentration.

      (2) SN description as percentage does not make sense.
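
      Weakness (1) can be illustrated with a minimal sketch. It assumes, as a simplification, that every spectrum has the same absolute Cramér-Rao lower bound (CRLB), so the relative CRLB in percent grows as the estimated concentration shrinks. The concentrations, the absolute CRLB of 0.2, the 30% cutoff, and the function names are all hypothetical and are not the values or code used in the manuscript.

```python
from statistics import mean


def relative_crlb_percent(concentration, absolute_crlb):
    """Relative CRLB (%): absolute uncertainty divided by the estimate."""
    return 100.0 * absolute_crlb / concentration


def filter_by_relative_crlb(concentrations, absolute_crlb, max_percent):
    """Keep only estimates whose relative CRLB does not exceed the cutoff."""
    return [
        c for c in concentrations
        if relative_crlb_percent(c, absolute_crlb) <= max_percent
    ]


# Hypothetical metabolite estimates (arbitrary units), not real data.
concentrations = [0.5, 1.0, 1.5, 2.0]

# Assumed constant absolute CRLB of 0.2 and an assumed 30% rejection cutoff.
kept = filter_by_relative_crlb(concentrations, absolute_crlb=0.2, max_percent=30.0)

mean_all = mean(concentrations)  # mean over every spectrum
mean_kept = mean(kept)           # mean after %CRLB-based rejection
```

      In this toy case the lowest estimate is the only one rejected, so the mean of the retained data is higher than the mean of all data, which is the direction of bias the reviewer describes.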

    3. Reviewer #3 (Public review):

      Wang et al. report multiple experiments using functional magnetic resonance spectroscopy (fMRS) in a multiple object tracking (MOT) task to investigate the effect of experimentally manipulating a) the number of targets, b) object size, and c) total number of objects in the display on GABA and glutamate (Glx) concentrations in parietal and visual cortex. Data is analyzed in two orthogonal ways throughout: via condition differences in behavioral performance (inverse efficiency), GABA, and Glx concentrations and through correlations between changes in inverse efficiency and GABA or Glx. All three experimental manipulations affected inverse efficiency, with worse performance with more targets, smaller objects, and a larger total number of objects. However, only the manipulation of the target number produced a condition difference in GABA and Glx, with higher concentrations of both in the parietal VOI and only of Glx in the visual VOI with more targets ('high load'). Correlational analyses revealed that participants with a larger change in GABA in the parietal VOI with a higher number of targets showed a smaller drop in behavioral performance with more targets. The opposite direction of correlation was observed for Glx in both the visual and parietal VOI.

      In the two control experiments, correlations were only investigated in the parietal VOI. There was a negative correlation between change in Glx and change in inverse efficiency with manipulation of object size, i.e. participants exhibiting a positive change in Glx showed no or little difference in performance, but those with an increase in Glx with smaller targets showed a more pronounced drop in performance. There was no correlation with GABA for the manipulation of object size. For the manipulation of total object number, participants exhibiting an increasing GABA concentration with more objects showed a smaller drop in performance.

      The authors' main claim is that GABAergic suppression of goal-irrelevant distractors in parietal cortex is key to goal-directed visual information processing.

      The study is, to my knowledge, the first to employ fMRS in an MOT paradigm, and I read it with great interest. I am admittedly not an expert on the fMRS technique and have therefore refrained from commenting on the technical aspects of its use. Although the application of fMRS to MOT is novel and adds new knowledge to the field, I have some critiques and believe that a much more nuanced interpretation of the findings is warranted.

      Major

      (1) Especially the control experiments lean heavily on Bettencourt and Somers (2009) and adopt and to some extent exaggerate claims from that paper uncritically. This is obvious in referring to the manipulations of object size and object number as high/low enhancement and high/low suppression, as if the association of these physical manipulations of the stimulus display with attentional mechanisms were so obvious and beyond doubt that drawing any distinction between these manipulations and their supposed effects is entirely superfluous. This seems far beyond what is warranted to me. It may seem plausible that adding distractors engages distractor suppression more, but whether this is truly the case is an empirical question, and Bettencourt and Somers (2009) have no direct measure of distractor suppression to substantiate this claim. Their study is purely behavioral, and there is no attempt to assess distractor processing separately. The case for the 'target enhancement' manipulation is even weaker: objects are of a sufficient size and at maximum contrast (white on black screen, but exact details are omitted) to be clearly visible in either condition, so why would smaller objects require more enhancement? Although the present data shows a clear effect of manipulating object size, the corresponding size of the effect in Bettencourt and Somers (2009) is rather underwhelming and does not warrant such a strong conclusion. In summary, the link between the object number and object size manipulations with suppression and enhancement is very far from the 1:1 that the authors seem to assume. Accordingly, I believe that the manipulations should be labelled as object number and object size rather than their hypothesized effects, throughout and that there should be a much more critical discussion as to whether these manipulations are indeed related to these effects as expected.

      (2) The authors' interpretation of the results seems rather uncritical. What is observed (at least in the first experiment) is a change in GABA and Glx concentrations with changes in the number of tracked targets. Is the only conceivable way in which this could happen through target enhancement and distractor suppression? The processing of targets and distractors is not measured directly, so any claims are indirect, at best. The authors cite the recent 'Ten simple rules to study distractor suppression' paper (Wöstmann et al., 2022), which presents a consensus between leading researchers in the field. Neither Bettencourt & Somers (2009) nor the design of the current study live up to the rules established in that paper, so a much more nuanced interpretation and discussion of the current findings seems warranted. It is anything but obvious to me that the only activity in the parietal cortex that could possibly be suppressed by GABA is the representation of distractors. Indeed, cueing more targets (high load) decreases the number of distractors in the first experiment, so the need for distractor suppression in the high load condition is less than in the low load condition. So, shouldn't we observe lower GABA concentrations in the 'high load' condition?

      (3) It seems that the authors included data from both correctly tracked and incorrectly tracked trials in their fMRS analysis. In MOT, attending target objects is the task per se, so task errors indicate that participants did not actually track the targets. So when comparing conditions with different error levels, it is ambiguous whether changes in brain activity reflect the experimental manipulation as such, or rather the different mix of correctly tracked and incorrectly tracked trials that result from this physical manipulation. Are the correlations perhaps driven by the inclusion of different proportions of correctly tracked trials across participants? It seems that the authors may have to separate correct and error trials in the analysis to check for the possibility that effects are due to the inclusion of data from trials in which participants may have stopped tracking at least some of the target objects. Of course, such an analysis is somewhat limited by the fact that only one target was probed, yielding a 50% guessing chance (i.e. even if the response is correct, we do not know whether the other, unprobed, objects were tracked correctly on that trial).

      (4) The key findings from the control experiments are purely correlational. The supposed cause may be what the authors claim, but there is an infinity of alternative explanations. Correlational findings cannot simply be interpreted as if they resulted from an experimental manipulation (...although this is, unfortunately, by no means rare in the cognitive neuroscience literature). The authors should make a rigorous effort to consider the most plausible alternative explanations for these correlations and argue why or why not they believe that they can be discounted.

      (5) Related to the previous point: the experimental manipulations did not produce mean differences in GABA/Glx in the control experiments. Doesn't this speak against the authors' interpretation? They briefly acknowledge this in the discussion, but I think there is a deeper problem. The absence of these effects casts doubt on what these manipulations actually do, and therefore also on the interpretation of the correlations in these experiments. For example, the authors might also have concluded from the same data that the absence of increased GABA in the 'high suppression' condition refutes the very idea that GABA concentrations are related to distractor suppression.

      (6) 'Inverse Efficiency' is a highly unusual measure of MOT performance in the literature, and its use reduces the comparability of the findings with previous work. The standard is to assess the correctness ('accuracy') of responses with no focus on speed. This makes sense as responses are given after the object motion has stopped. At the same time, reaction time can be informative too (e.g., Störmer et al., 2013). I think the authors should justify their use of inverse efficiency as the dependent variable.

      (7) The choice of variable names is problematic: it is sometimes misleading and makes understanding the findings harder (see also points 1 and 6): obvious, unambiguous, and importantly, interpretation free names for conditions such as target number (2/4), object size (small/large), and total object number (8/12) become load (high/low), target enhancement (high/low) and distractor suppression (low/high). This reduces clarity and, especially in the case of enhancement and suppression, conflates the actual manipulation with its interpretation.
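
      For readers unfamiliar with the measure discussed in point (6), a minimal sketch of the commonly used definition of inverse efficiency follows: mean reaction time on correct trials divided by the proportion of correct responses. Whether the manuscript computes it exactly this way is an assumption, and the function name and the example numbers are hypothetical.

```python
from statistics import mean


def inverse_efficiency(correct_rts_ms, n_correct, n_trials):
    """Mean correct-trial RT (ms) divided by the proportion of correct trials.

    Higher values indicate worse performance (slower and/or less accurate).
    """
    if n_trials <= 0 or n_correct <= 0:
        raise ValueError("need at least one trial and one correct response")
    proportion_correct = n_correct / n_trials
    return mean(correct_rts_ms) / proportion_correct


# Hypothetical participant: 2 of 4 trials correct, with correct RTs of 500 and 700 ms.
score = inverse_efficiency([500.0, 700.0], n_correct=2, n_trials=4)
```

      Because the score folds speed and accuracy into one number, the same value can arise from slow but accurate responding or from fast but error-prone responding, which is one reason a justification for its use in MOT is requested.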

    1. Reviewer #1 (Public review):

      This is a highly original and impactful study that significantly advances our understanding of transcriptional regulation, in particular RNAPII pausing, during early heart development. The Chen lab has a long history of producing influential studies in cardiac morphogenesis, and this manuscript represents another thorough and mechanistically insightful contribution. The authors have thoroughly addressed this Reviewer's concerns and incorporated all of my suggestions in the revised manuscript. In addition, their responses to the other reviewer's comments are also very clear. As it is, this work is of great interest to the readership of eLife, as well as to the general scientific community.

      The authors reveal a fundamentally new role for Rtf1, a component of the PAF1 complex, in governing promoter-proximal RNAPII pausing in the context of myocardial lineage specification. While transcriptional pausing has been implicated in stress responses and inducible gene programs, its developmental relevance has remained poorly defined. This study fills that gap with rigorous in vivo evidence demonstrating that Rtf1-dependent pausing is indispensable for activating the cardiac gene program from the lateral plate mesoderm.

      Importantly, the study also provides compelling therapeutic implications. Showing that CDK9 inhibition, using either flavopiridol or targeted knockdown, can restore promoter-proximal pausing and rescue cardiomyocyte formation in Rtf1-deficient embryos suggests that modulation of pause-release kinetics may represent a new avenue for correcting transcriptionally driven congenital heart defects. Given that many CDK inhibitors are clinically approved or in active development, this connection significantly elevates the translational impact of the findings.

      In sum, this study is rigorous, innovative, and transformative in its implications for developmental biology and cardiac medicine. I strongly support its publication.

    2. Reviewer #2 (Public review):

      Summary:

      Langenbacher et al. examine the requirement of Rtf1, a component of the PAF1C complex, which regulates transcriptional pausing in cardiac development. The authors first confirm that newly generated rtf1 mutant alleles recapitulate the defects in cardiac progenitor differentiation found using morpholinos from their previous work. The authors then show that conditional loss of Rtf1 in mouse embryos and depletion in mouse ESCs both demonstrate a failure to turn on cardiac progenitor and differentiation marker genes, supporting conservation of Rtf1 in promoting vertebrate cardiac progenitor development. The authors then employ bulk RNA-seq on flow-sorted hand2:GFP+ cells and multiomic single-cell RNA-seq on whole Rtf1-depleted zebrafish embryos at the 10-12 somite stage. These experiments corroborate that gene expression associated with cardiac progenitor differentiation is lost. Furthermore, analysis of differentiation trajectories suggests that the expression of genes associated with cardiac, blood, and endothelial progenitor differentiation is not initiated within the anterior lateral plate mesoderm. Structure-function analysis supports that the Rtf1 Plus3 domain is necessary for its function in promoting cardiac progenitor differentiation. ChIP-seq for RNA Pol II on 10-12 somite stage zebrafish embryos supports that Rtf1 is required for proper promoter pausing at the transcriptional start site. The transcriptional promoter pausing defect and cardiac differentiation can partially be rescued in zebrafish rtf1 mutants through pharmacological inhibition and depletion of Cdk9, a kinase that promotes pause release and elongation. Thus, the authors have provided a clear analysis of the requirements and basic mechanism that Rtf1 employs in regulating cardiac progenitor development.

      Strengths and weaknesses:

      Overall, the data presented are strong and the message of the study is clear. The conclusions that Rtf1 is required for transcriptional pause release and promotes vertebrate cardiac progenitor differentiation are supported. Areas of strength include the complementary approaches in zebrafish and mouse embryos, and mouse embryonic stem cells, which together support the conserved requirement for Rtf1 in promoting cardiac differentiation. The bulk and single-cell RNA-sequencing analyses provide further support for this model via examining broader gene expression. In particular, the pseudotime analysis bolsters that there is a broader effect on differentiation of anterior lateral plate mesoderm derivatives. The structure-function analysis provides a relatively clean demonstration of the requirement of the Rtf1 Plus3 domain. The pharmacological and depletion epistasis of Cdk9 combined with the RNA Pol II ChIP-seq nicely support the mechanism implicating Cdk9 in the Rtf1-dependent RNA Pol II promoter pausing. Additionally, this is a revised manuscript. The authors were overall responsive to the previous critiques. The new analysis and revisions have helped to strengthen their hypothesis and improve the clarity of their study. While the revised manuscript is significantly improved, the lack of analysis from the multiomic analysis still represents a lost opportunity to provide further insight into Rtf1 mechanisms within this study. However, the authors have nevertheless achieved their goal for this study. The data sets reported will also be useful tools for further analysis and integration by the cardiovascular development community. Thus, the study will be of interest to scientists studying cardiovascular development and those broadly interested in epigenetic regulation controlling vertebrate development.

    1. Reviewer #1 (Public review):

      Summary:

      Here, the authors propose that miR-195, a microRNA that has been shown to bind and enhance degradation of mRNA targets in the regulation of cell processes, has a novel role in allowing the emergence of CD19+ cells in cells in which Ebf1, a critical B-cell transcription factor, has been genetically removed.

      Strengths:

      That over-expression of miR-195 can allow the emergence of CD19+ cells missing Ebf1 is somewhat novel.

      Their data perhaps support, to a degree, the emergence of a transcriptional network that may bypass the absence of Ebf1, including the FOXO1 transcription factor, but these data are not strong or definitive.

      Weaknesses:

      It is unclear whether this observation is in fact physiological. When the authors analyse a knockout model of miR-195, there is not much of a change in the B-cell phenotype. Their findings may therefore be an artefact of an overexpression system.

      The authors have provided insufficient data to allow a thorough appraisal of the step-wise molecular changes that could account for their observed phenotype.

      On review of the resubmitted manuscript, while I note the authors have attempted to address several of my comments, unfortunately, their resubmission is not sufficient to address several of the comments I had previously made.

      In particular, in the resubmitted data that include western blots for PAX5 and ERG in their EBF1-/- model (Supp Fig S3), the bands shown suggest that PAX5 and ERG expression is still readily detectable in their EBF1-/- early B-cell model. This should not be the case, as no expression of PAX5 or ERG should be seen, as has been shown in prior literature.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigate miRNA miR-195 in the context of B-cell development. They demonstrate that ectopic expression of miR-195 in hematopoietic progenitor cells can, to a considerable extent, override the consequences of deletion of Ebf1, a central B-lineage defining transcription factor, in vitro and upon short-term transplantation into immunodeficient mice in vivo. In addition, the authors demonstrate that the reverse experiment, genetic deletion of miR-195, has virtually no effect on B-cell development. Mechanistically, the authors identify Foxo1 phosphorylation as one pathway partially contributing to the rescue effect of miR-195. An additional analysis of epigenetics by ATACseq adds potential additional factors that might also contribute to the effect of ectopic expression of miR-195.

      Strengths:

      The authors employ a robust assay system, Ebf1-KO HPC, to test for B-lineage promoting factors. The manuscript overall takes on an interesting perspective rarely employed for analysis of miRNA by overexpressing the miRNA of interest. Ideally, this approach may reveal, if not the physiological function of this miRNA, the role of distinct pathways in developmental processes.

      Weaknesses:

      At the same time, this approach constitutes a major weakness: It does not reveal information on the physiological role of miR-195. In fact, the authors themselves demonstrate in their KO approach, that miR-195 has virtually no role in B-cell development, as has been demonstrated already in 2020 by Hutter and colleagues. While the authors cite this paper, unfortunately, they do so in a different context, hence omitting that their findings are not original.

      Conceptually, the authors stress that a predominant function of miRNA (in contrast to transcription factors, as the authors suggest) lies in fine-tuning. However, there appears to be a misconception. Misregulation of fine tuning of gene expression may result in substantial biological effects, especially in developmental processes. The authors want to highlight that miR-195 is somewhat an exception in that regard, but this is clearly not the case. In addition to miR-150, as referenced by the authors, also the miR-17-92 or miR-221/222 families play a significant role in B-cell development, their absence resulting in stage-specific developmental blocks, and other miRNAs, such as miR-155, miR-142, miR-181, and miR-223 are critical regulators of leukocyte development and function. Thus, while in many instances a single miRNA moderately affects gene expression at the level of an individual target, quite frequently targets converge in common pathways, hence controlling critical biological processes.

      The paper has some methodological weaknesses as well: For the most part, it lacks thorough statistical analysis and only representative FACS plots are provided. Many bar graphs are based on heavy normalization making the T-tests employed inapplicable. No details are provided regarding statistical analysis of microarrays. Generation of the miR-195-KO mice is insufficiently described and no validation of deletion is provided. Important controls are missing as well, the most important one being a direct rescue of Ebf1-KO cells by re-expression of Ebf1. This control is critical to quantify the extent of override of Ebf1-deficiency elicited by miR-195 and should essentially be included in all experiments. A quantitative comparison is essential to support the authors' main conclusion highlighted in the title of the manuscript. As the manuscript currently stands, only negative controls are provided, which, given the profound role of Ebf1, are insufficient, because many experiments, such as assessment of V(D)J recombination, IgM surface expression, or class-switch recombination, are completely negative in controls. In addition, the authors should also perform long-term reconstitution experiments. While it is somewhat surprising that the authors obtain splenic IgM+ B cells after just 10 days, these experiments would certainly be much more informative after longer periods of time. Using "classical" mixed bone marrow chimeras using a combination of B-cell defective (such as mb1/mb1) bone marrow and reconstituted Ebf1-KO progenitors would permit much more refined analyses.

      With regard to mechanism, the authors show that the Foxo1 phosphorylation pathway accounts for the rescue of CD19 expression, but not of other factors, as mentioned in the discussion. The authors then resort to epigenetic analysis, but their rationale remains somewhat vague. It remains unclear how miR-195 is linked to epigenetic changes.

    3. Reviewer #3 (Public review):

      Summary:

      In this study, Miyatake et al. present the interesting finding that ectopic expression of miR-195 in EBF1-deficient hematopoietic progenitor cells can partially rescue their developmental block and allow the cells to progress to a B220+CD19+ stage. Notably, this is accompanied by an upregulation of B cell specific genes and, correspondingly, a downregulation of T, myeloid and NK lineage-related genes, suggesting that miR-195 expression is at least in part equivalent to EBF1 activity in orchestrating the complex gene regulatory network underlying B cell development. Strengthening this point, ATAC sequencing of miR-195-expressing EBF1-deficient B220+CD19+ cells and a comparison of these data to public datasets of EBF1-deficient and -proficient cells suggest that miR-195 indirectly regulates gene expression and chromatin accessibility of some, but not all regions regulated by EBF1.

      Mechanistically, the authors identify a subset of potential target genes of miR-195 involved in MAPK and PI3K signalling. Dampening of these pathways has previously been demonstrated to activate FOXO1, a key transcription factor for early B cells downstream of EBF1. Accordingly, the authors hypothesize that miR-195 exerts its function through FOXO1. Supporting this claim, also exogenous FOXO1 expression is able to promote the development of EBF1-deficient cells to the B220+CD19+ stage and thus recapitulates the miR-195 phenotype.

      Strengths:

      The strength of the presented study is the detailed assessment of the altered chromatin accessibility in response to ectopic miR-195 expression. This provides insight into how miR-195 impacts on the gene regulatory network that governs B cell development and allows the formation of mechanistic hypotheses.

      Weaknesses:

      The key weakness of this study is that its findings are based on the artificial and ectopic expression of a miRNA out of its normal context, which in my opinion strongly limits the biological relevance of the presented work.

      While the authors performed qPCRs for miR-195 on different B cell populations and show that its relative expression peaks in early B cells, it remains unclear whether the absolute miR-195 expression is sufficiently high to have any meaningful biological activity. In fact, other miRNA expression data from immune cells (e.g. DOI 10.1182/blood-2010-10-316034 and DOI 10.1016/j.immuni.2010.05.009) suggest that miR-195 is only weakly, if at all, expressed in the hematopoietic system.
      Update to this part after revision: The authors now state in the discussion that their study does not aim to uncover and characterize a physiological role of miR-195 in lymphocyte development, but rather reveals "the potential of miR-195 to compensate for EBF1 deficiency". However, in my opinion, the absence of any physiological context still limits this study's relevance.

      The authors support their finding by a CRISPR-derived miR-195 knockout mouse model which displays mild but significant differences in the hematopoietic stem cell compartment and in B cell development. However, they fail to acknowledge and discuss a lymphocyte-specific miR-195 knockout mouse that does not show any B cell defects in the bone marrow or spleen and thus contradicts the authors' findings (DOI 10.1111/febs.15493). Of note, B-1 B cells in particular have been shown to be elevated upon loss of miR-15-16-1 and/or miR-15b-16-2, which contradicts the data presented here for loss of the family member miR-195.

      A second weakness is that some claims by the authors appear overstated or at least not fully backed up by the presented data. In particular, the findings that miR-195-expressing cells can undergo VDJ recombination, express the pre-BCR/BCR and can class switch need to be strengthened. It would be beneficial to include additional controls to these experiments, e.g. a RAG-deficient mouse as a reference/negative control for the ddPCR and the surface IgM staining, and cells deficient in class switching for the IgG1 flow cytometric staining.

      Moreover, the manuscript would be strengthened by a more thorough investigation of the hypothesis that miR-195 promotes the stabilization and activity of FOXO1, e.g. by comparing the authors' ATACseq data to the FOXO1 signature.

    1. Reviewer #1 (Public review):

      Summary:

      Rahmani et al. utilize the TurboID method to characterize global proteome changes in the worm's nervous system induced by a salt-based associative learning paradigm. Altogether, they uncover 706 proteins tagged by the TurboID method in worms that underwent the memory-inducing protocol. Next, the authors conduct a gene enrichment analysis that implicates specific molecular pathways in salt-associative learning, such as MAP kinase and cAMP-mediated pathways, as well as specific neuronal classes including pharyngeal neurons, and specific sensory neurons, interneurons, and motor neurons. The authors then screen a representative group of hits from the proteome analysis. They find that mutants of candidate genes from the MAP kinase pathway, namely dlk-1 and uev-3, do not affect performance in the learning paradigm. Instead, multiple acetylcholine signaling mutants, as well as a protein-kinase-A mutant, significantly affected performance in the associative memory assay (e.g., acc-1, acc-3, lgc-46, and kin-2). Finally, the authors demonstrate that protein-kinase-A mutants, as well as acetylcholine signaling mutants, do not exhibit a phenotype in a related but distinct conditioning paradigm (aversive salt conditioning), suggesting their effect is specific to appetitive salt conditioning.

      Overall, the authors addressed the concerns raised in the previous review round, including the statistics of the chemotaxis experiments and the systems-level analysis of the neuron class expression patterns of their hits. I also appreciate the further attempt to equalize the sample size of the chemotaxis experiments and the transparent reporting of the sample size and statistics in the figure captions and Table S9. The new results from the panneuronal overexpression of the kin-2 gain-of-function allele also contribute to the manuscript. Together, these make the paper more compelling.

    2. Reviewer #2 (Public review):

      Summary:

      In this study by Rahmani and colleagues, the authors sought to define the "learning proteome" for a gustatory associative learning paradigm in C. elegans. Using a cytoplasmic TurboID expressed under the control of a pan-neuronal promoter, the authors labeled proteins during the training portion of the paradigm, followed by proteomics analysis. This approach revealed hundreds of proteins potentially involved in learning, which the authors describe using gene ontology and pathway analysis. The authors performed functional characterization of over two dozen of these genes for their requirement in learning using the same paradigm. They also compared the requirement for these genes across various learning paradigms and found that most hits they characterized appear to be specifically required for the training paradigm used for generating the "learning proteome".

      Strengths:

      - The authors have thoughtfully and transparently designed and reported the results of their study. Controls are carefully thought-out, and hits are ranked as strong and weak. By combining their proteomics with behavioral analysis, the authors also highlight the biological significance of their proteomics findings, and support that even weak hits are meaningful.

      - The authors display a high degree of statistical rigor, incorporating normality tests into their behavioral data which is beyond the field standard.

      - The authors include pathway analysis that generates interesting hypotheses about processes involved in learning and memory.

      - The authors generally provide thoughtful interpretations for all of their results, both positive and negative, as well as any unexpected outcomes.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, the authors used a learning paradigm in C. elegans: when worms are fed on a saltless plate, their chemotaxis to salt is greatly reduced. To identify learning-related proteins, the authors employed nervous system-specific proteome analysis to compare whole proteins in neurons between high-salt-fed animals and saltless-fed animals. The authors identified "learning-specific proteins", which are observed only after saltless feeding. They categorized these proteins by GO analyses, pathway analyses and expression site analyses, and went on to test mutants of selected genes identified by the proteome analysis. They find several mutants that are defective or hyper-proficient for learning, including acc-1/3 and lgc-46 acetylcholine receptors, F46H5.3 putative arginine kinase, and kin-2, a cAMP pathway gene. These mutants were not previously reported to have abnormalities in the learning paradigm.

      Concerns:

      Upon revision, authors addressed all concerns of this reviewer, and the results are now presented in a way that facilitates objective evaluation. Authors' conclusions are supported by the results presented, and the strength of the proteomics approach is persuasively demonstrated.

      Significance:

      (1) Total neural proteome analysis has not been conducted before for learning-induced changes, though transcriptome analysis has been performed for odor learning (Lakhina et al., http://dx.doi.org/10.1016/j.neuron.2014.12.029). This warrants the novelty of this manuscript, because for some genes, protein levels may change even though mRNA levels remain the same. Although in a few reports TurboID has been used in C. elegans, this is the first report of a systematic analysis of tissue-specific differential proteomics.

      (2) The authors found five mutants that show abnormalities in salt learning. These genes have not previously been described as having such abnormalities, providing novel knowledge to readers, especially those who work on C. elegans behavioural plasticity. In particular, the involvement of acetylcholine neurotransmission has not been addressed before. Although transgenic rescue experiments have not been performed except for kin-2, and the site of action (neurons involved) has not been tested in this manuscript, the work will open an avenue to further determine the way in which acetylcholine receptors, the cAMP pathway etc. influence the learning process.

      [Editors' note: this version has been assessed without input from the reviewers.]

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Bisht et al. investigate the role of PPE2, a Mycobacterium tuberculosis (Mtb) secreted virulence factor, in adipose tissue physiology during tuberculosis (TB) infection. Previous work by this group established the significance of PPE proteins in Mtb virulence and their role in modulating the innate immune response. Here, the authors present compelling evidence that PPE2 regulates host cell adipogenesis and lipolysis, thereby establishing a link to the development of insulin resistance during TB infection. These fundamental findings demonstrate, for the first time, that a bacterial virulence factor is directly involved in the profound body fat loss, or "wasting," which is a long-established clinical symptom of active TB.

      Key Strengths:

      The confidence in the major findings of this study is significantly strengthened by the authors' comprehensive approach. They judiciously employ multiple experimental systems, including:

      (1) Purified PPE2 protein.

      (2) A non-pathogenic Mycobacterium strain engineered to express PPE2.

      (3) A pathogenic clinical Mtb strain (CDC1551) utilizing a targeted PPE2 deletion mutant.

      While the presence of Mtb in adipose tissues in human and animal models is well-documented, this study is groundbreaking in demonstrating that an Mtb virulence-associated factor actively modulates host fatty acid metabolism within the adipose tissue.

      Key Weakness:

      Although the manuscript provides solid evidence associating the presence of PPE2 with transcriptional changes in host fatty acid machinery within the adipose tissue, the underlying mechanistic details remain elusive. A focused, deep mechanistic follow-up study will be essential to fully appreciate the complex biological implications of the findings reported here.

    2. Reviewer #2 (Public review):

      Summary:

      In the manuscript entitled "The PPE2 protein of Mycobacterium tuberculosis is responsible for the development of hyperglycemia and insulin resistance during tuberculosis" the authors identify PPE2, a secretory protein of Mycobacterium tuberculosis, as a modulator of adipose function. They show that PPE2 treatment in mice causes fat loss, immune cell infiltration into adipose, reduced gene expression of PPAR-γ, C/EBP-α, and adiponectin, and glucose intolerance. Overall, the authors link PPE2 with adipose tissue perturbation and insulin resistance following infection with M. tuberculosis.

      Strengths:

      While it is known that M. tuberculosis persists in adipose, the mycobacterial factors contributing to adipose dysfunction are unknown. The study uses multiple approaches, including recombinant purified protein, non-pathogenic mycobacterium expressing PPE2, and a clinical strain of M. tuberculosis depleted of PPE2, to show that PPE2 may play an important role in causing fat loss, lipolysis, and insulin resistance following infection. The authors show that PPE2, through unknown mechanisms, decreases gene expression of proteins involved in adipogenesis. Although the mechanisms are unclear, this study advances the field as it is the first to identify a secreted factor (PPE2) from M. tuberculosis that plays a role in disrupting adipose tissue.

      Weaknesses:

      There is a lack of completeness amongst the figures that greatly diminishes the claims and impact of the manuscript. For example, in Figures 2 and 5, the authors measure adipocyte area in H&E-stained adipose tissue to show adipose hypertrophy. However, this was not completed in Figures 3 and 4 despite the authors claiming that treatment with rPPE2 induces adipose hypertrophy. It is unclear why the adipocyte area was not measured in these figures, and having this included would support the authors' claim and strengthen the manuscript. The same is true for immune cell infiltration, where the authors say there is increased immune cell infiltration following PPE2 treatment. This is based on H&E staining, but the data supporting this is limited. Although the authors measure CD3+ T cell infiltration in adipose tissue from mice infected with the clinical strain where PPE2 was depleted, staining was performed in only this experiment. Completing these experiments by showing data to support that PPE2 induces immune cell infiltration would greatly strengthen the manuscript.

      The authors state that a Student's t-test was performed to calculate the significance between two samples. However, there is no discussion of what statistical method was used when there were more than 2 groups, which occurs throughout the manuscript, such as in Figure 5, where 4 groups are analyzed. Having the appropriate statistical analysis is important for the impact of the manuscript.
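
      To illustrate the point about comparisons across more than two groups, the following is a minimal sketch of a one-way ANOVA F statistic for four groups. The measurements are hypothetical placeholders, not data from the manuscript, and the function name `one_way_anova_f` is introduced here for illustration only. It computes just the F statistic and its degrees of freedom; obtaining a p-value and running a post hoc multiple-comparison test would, in practice, be done with a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for four experimental groups (illustration only).
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
    [4.0, 5.0, 6.0],
]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

      A single omnibus test of this kind avoids running all six pairwise t-tests among four groups without any correction for multiple comparisons.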

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript titled "The PPE2 protein of Mycobacterium tuberculosis is responsible for the development of hyperglycemia and insulin resistance during tuberculosis", Bisht et al. describe that the PPE2 protein from Mtb is a key modulator of adipose tissue physiology that contributes to the development of insulin resistance. The authors have used 3T3-L1 preadipocyte cell lines, an M. smegmatis overexpression strain, a mouse model, and genetically modified Mtb deletion strains to demonstrate that PPE2 promotes persistence in adipose tissue and regulates glucose homeostasis. Using qPCR and RNA-seq experiments, the authors demonstrate that PPE2 regulates the expression of key genes involved in adipogenesis.

      Strengths:

      Using purified protein, the authors show that PPE2 regulates adipose tissue physiology, and this effect was neutralised in the presence of anti-PPE2. The expression of several adipogenic markers was also reduced in 3T3-L1 adipocytes treated with rPPE2 and in mice infected with M. smegmatis strains overexpressing PPE2. Using a mouse model of infection, the authors show that PPE2 contributes to enhanced mycobacterial survival within fat tissues. The authors also show infiltration of immune cells in the fat tissues of mice infected with wild-type and ppe2-complemented strains compared to the ppe2 KO strain. In order to gain a better mechanistic understanding of how PPE2 regulates adipogenesis, the authors employed an RNA-seq approach and identified 191 genes that were significantly differentially expressed in the fat tissues of mice infected with wild-type and ppe2 KO Mtb strains. The differentially expressed genes included transcripts encoding proteins involved in chemokine/cytokine signalling and the ER stress response. The expression of a few of these markers was also validated by qPCR and western blot analysis. Finally, the authors also show that PPE2 promotes lipolysis by reducing phosphodiesterase levels and activating PKA-HSL signalling. The experimental design is overall reasonable, and the methods used are reliable. Overall, the current study did provide some new information on the contribution of PPE2 in regulating adipose tissue physiology.

      Weaknesses:

      (1) The authors have used several methodologies to show that PPE2 regulates adipose tissue physiology and glucose homeostasis. But the exact mechanism is still not clear.

      (2) Mtb encodes several PE/PPE proteins. The authors have used PPE2 for their study. Will secretory PPE2 homologs also regulate similar cellular processes?

      (3) How do the authors rule out that the differences observed in the fat tissues of mice infected with wild-type and mutant strains are associated with reduced bacterial burdens? Is it possible to include another attenuated Mtb strain as a control in mouse experiments for a few critical experiments?

    1. Reviewer #1 (Public review):

      Summary:

      Syed et al. investigate the circuit underpinnings for leg grooming in the fruit fly. They identify two populations of local interneurons in the right front leg neuromere of the ventral nerve cord, i.e. 62 13A neurons and 64 13B neurons. Hierarchical clustering analysis identifies 10 morphological classes for each population. Connectome analysis reveals their circuit interactions: these GABAergic interneurons provide synaptic inhibition either between the two subpopulations, i.e. 13B onto 13A, or among each other, i.e. 13As onto other 13As, and/or onto leg motoneurons, i.e. 13As and 13Bs onto leg motoneurons. Interestingly, 13A interneurons fall into two categories, with one providing inhibition onto a broad group of motoneurons, being called "generalists", while others project to only a few motoneurons, being called "specialists". Optogenetic activation and silencing of both subsets strongly affects leg grooming. Activating or silencing subpopulations, i.e. 3 to 6 members of the 13A and 13B groups, also has marked effects on leg grooming, including frequency and joint positions, and can even interrupt leg grooming. The authors present a computational model with the four circuit motifs found, i.e. feed-forward inhibition, disinhibition, reciprocal inhibition and redundant inhibition. This model can reproduce relevant aspects of the grooming behavior.

      Strengths:

      The authors succeeded in providing evidence for neural circuits interacting by means of synaptic inhibition to play an important role in the generation of a fast rhythmic insect motor behavior, i.e. grooming of the body using legs. Two populations of local interneurons in the fruit fly VNC comprise four inhibitory circuit motifs of neural action and interaction: feed-forward inhibition, disinhibition, reciprocal inhibition and redundant inhibition. Connectome analysis identifies the similarities and differences between individual members of the two interneuron populations. Modulating the activity of small subsets of these interneuron populations markedly affects generation of grooming behavior thereby exemplifying their relevance. The authors carefully discuss strengths and limitations of their approaches and place their findings into the broader context of motor control.

      Weaknesses:

      Experiments modulating activity in the interneuron populations by means of optogenetics were conducted in the so-called "closed-loop" condition. This does not allow direct and secondary effects of the experimental modification in neural activity to be differentiated, as feedforward and feedback effects cannot be disentangled. To do so, open-loop experiments, e.g. in deafferented conditions, would be needed. Given that many members of the two populations of interneurons participate in not one but two or more circuit motifs, it remains to be disentangled which role each individual circuit motif plays in the generation of the motor behavior in intact animals.

      Comments on revisions:

      The authors have carefully revised the manuscript. I have no further suggestions or criticisms.

    2. Reviewer #3 (Public review):

      Summary:

      The authors set out to determine how GABAergic inhibitory premotor circuits contribute to the rhythmic alternation of leg flexion and extension during Drosophila grooming. To do this, they first mapped the ~120 13A and 13B hemilineage inhibitory neurons in the prothoracic segment of the VNC and clustered them by morphology and synaptic partners. They then tested the contribution of these cells to flexion and extension using optogenetic activation and inhibition and kinematic analyses of limb joints. Finally, they produced a computational model representing an abstract version of the circuit to determine how the connectivity identified in EM might relate to functional output. The study makes important contributions to the literature.

      The authors have identified an interesting question and use a strong set of complementary tools to address it:

      They analysed serial‐section TEM data to obtain reconstructions of every 13A and 13B neuron in the prothoracic segment. They manually proofread over 60 13A neurons and 64 13B neurons, then used automated synapse detection to build detailed connectivity maps and cluster neurons into functional motifs.

      They used optogenetic tools with a range of genetic driver lines in freely behaving flies to test the contribution of subsets of 13A and 13B neurons.

      They used a connectome-constrained computational model to determine how the mapped connectivity relates to the rhythmic output of the behavior.

      Comments on revisions:

      I appreciate that the authors have updated the GitHub repository to include the model and analysis code. Still lacking are an explicit separation of empirical findings from modelling inferences in the text and a supplemental table making clear which cell types are included. I should also point out that the code lacks the annotations necessary for the results to be reproduced and the model to be reused.

  2. Jan 2026
    1. Reviewer #1 (Public review):

      Summary:

      The authors assess the impact of E-cigarette smoke exposure on mouse lungs using single-cell RNA sequencing. Air was used as control and several flavors (fruit, menthol, tobacco) were tested. Differentially expressed genes (DEGs) were identified for each group and compared against the air control. Changes in gene expression in either myeloid or lymphoid cells were identified for each flavor and the results varied by sex. The scRNAseq dataset will be of interest to the lung immunity and e-cig research communities, and some of the observed effects could be important. Unfortunately, the revision did not address the reviewers' main concerns about low replicate numbers and lack of validations. The study remains preliminary and no solid conclusions could be drawn about the effects of E-cig exposure as a whole or any flavor-specific phenotypes.

      Strengths:

      The study is the first to use scRNAseq to systematically analyze the impact of e-cigarettes on the lung. The dataset will be of broad interest.

      Weaknesses:

      This study had only N=1 biological replicate for the single-cell sequencing data per sex per group, and some sex-dependent effects were observed. This could have been remedied by validating key observations from the study using traditional methods such as flow cytometry and qPCR, but the limited number of validation experiments did not support the conclusions of the scRNAseq analysis. An important control group (PG:VG) had extremely low cell numbers and therefore could not be used to derive meaningful conclusions. Statistical analysis is lacking in almost all figures. Overall, this is a preliminary study with some potentially interesting observations.

      (1) The only new validation experiment for this revision is the immunofluorescent staining of neutrophils in Figure 4. The images are very low resolution and low quality, and it is not clear which cells are neutrophils. S100A8 (calprotectin) is highly abundant in neutrophils but not strictly neutrophil-specific. It is hard to distinguish positive cells from autofluorescence in both the Ly6G and S100A8 channels. No statistical analysis is presented for the quantified data from this experiment.

      (2) The relevance of Fig. 3A and B is unclear, since these numbers only reflect the number of cells captured in the scRNAseq experiment and the biological meaning of this data is not explained. Flow cytometry quantification is presented as cell counts, but the percentage of cells from the CD45+ gate should be shown. No statistical analysis is shown, and flow cytometry results do not support the conclusions of scRNAseq data.
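
      The difference between reporting raw counts and reporting the percentage of the CD45+ gate can be sketched as follows. All numbers, sample names and the helper `percent_of_parent` are hypothetical and chosen only to show how the two readouts can point in opposite directions; they are not taken from the study.

```python
def percent_of_parent(subset_count, parent_count):
    """Express a gated subset as a percentage of its parent gate."""
    if parent_count <= 0:
        raise ValueError("parent gate must contain at least one event")
    return 100.0 * subset_count / parent_count


# Hypothetical event counts per sample (illustration only).
samples = {
    "air": {"cd45": 20000, "neutrophils": 2000},
    "exposed": {"cd45": 40000, "neutrophils": 3000},
}

percentages = {
    name: percent_of_parent(counts["neutrophils"], counts["cd45"])
    for name, counts in samples.items()
}

for name, pct in percentages.items():
    raw = samples[name]["neutrophils"]
    print(f"{name}: {raw} neutrophils, {pct:.1f}% of CD45+ cells")
```

      In this made-up case the raw neutrophil count is higher in the exposed sample, while the fraction of CD45+ cells is lower, because more CD45+ events were acquired overall.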

    2. Reviewer #3 (Public review):

      This work aims to establish cell-type-specific changes in gene expression upon exposure to different flavors of commercial e-cigarette aerosols compared to control or vehicle. Kaur et al. conclude that immune cells are most affected, with the greatest dysregulation found in myeloid cells exposed to tobacco-flavored e-cigs and lymphoid cells exposed to fruit-flavored e-cigs. The up- and down-regulated genes are heavily associated with innate immune response. The authors suggest that a Ly6G-deficient subset of neutrophils is found to be increased in abundance for the treatment groups, while gene expression remains consistent, which could indicate impaired function. Increased expression of CD4+ and CD8+ T cells along with their associated markers for proliferation and cytotoxicity is thought to be a result of activation following this decline in neutrophil-mediated immune response.

      Strengths:

      Single-cell sequencing data can be very valuable in identifying potential health risks and clinical pathologies of lung conditions associated with e-cigarettes considering they are still relatively new.

      Not many studies have been performed on cell-type-specific differential gene expression following exposure to e-cig aerosols.

      The assays performed address several factors of e-cig exposure such as metal concentration in the liquid and condensate, coil composition, cotinine/nicotine levels in serum and the product itself, cell types affected, which genes are up- or down-regulated and what pathways they control.

      Considerations were made to ensure clinical relevance such as selecting mice whose ages corresponded with human adolescents so that data collected was relevant.

      The discussion addresses the limitations of this study.

      Weaknesses:

      The exposure period of 1 hour a day for 5 days is not representative of chronic use and this time point may be too short to see a full response in all cell types. There is no gold standard in the field.

      Most findings are based on scRNA-seq alone, so interpretations should be made with care as some conclusions are observational.

      This paper provides a good foundation for future follow-up studies that will examine the effects of e-cig exposure on innate immunity.

    1. Reviewer #1 (Public review):

      IBEX Knowledge Database

      Here, Yanid Z. and colleagues present the IBEX knowledge base, a community tool developed to centralize knowledge and help its adoption by more users. The authors have done a fantastic job, and there is careful consideration of the many aspects of data management and the FAIR principles. The manuscript needs no further work, as it is very well written and has detailed descriptions of how to contribute data as well as of the KB itself. Overall, it is a great initiative, especially the aim to inform about negative data and non-recommended reagents, which will positively affect the user community and scientific reproducibility.

      This initiative will serve as groundwork for including technical details of other multiple immunofluorescence methods (such as immunoSABER, 4i, etc.). Including other methods would help the knowledge base itself and related methods to evolve and assist their communities in the future.

      Significant care has been taken to allow the report of negative data. While there might be limitations as to how this information is included, transparency and community usage will ensure the knowledge base offers a fair representation.

      There are two ways to contribute to the knowledge base. While the authors have contributed significantly to its creation, it will be the role of the maintainers to assist potential users and contributors. It is especially appreciated that a path to contribute is possible with no coding skills. I am keen to see how the KB evolves and helps disseminate the use of this and other great techniques.

    2. Reviewer #2 (Public review):

      Summary:

      The paper introduces the IBEX Knowledge-Base (KB), a shared online resource designed to help scientists working with immunofluorescence imaging. It acts as a central hub where researchers can find and share information about reagents, protocols, and imaging methods. The KB is not static like traditional publications; instead, it evolves as researchers contribute new findings and refinements. A key highlight is that it includes results of both successful and unsuccessful experiments, helping scientists avoid repeating failed experiments and saving time and resources. The platform is built on open-access tools ensuring that the information remains available to everyone. Overall, the KB aims to collaboratively accelerate research, improve reproducibility, and reduce wasted effort in imaging experiments.

      Strengths:

      (1) The IBEX KB is built entirely on open-source tools, ensuring accessibility and long-term sustainability. This approach aligns with FAIR data principles and ensures that the KB remains adaptable to future advancements.

      (2) The KB also follows strict data organization standards, ensuring that all information about reagents and protocols is clearly documented and easy to find with little ambiguity.

      (3) The KB allows scientists to report both positive and negative results, reducing duplication of effort and speeding up the research process.

      (4) The KB is helpful for all researchers, but even more so for scientists in resource-limited settings. It provides guidance on finding affordable alternatives to expensive or discontinued reagents, making it easier for researchers with fewer resources to perform high-quality experiments.

      (5) The KB includes a community discussion forum where scientists can ask for advice, share troubleshooting tips, and collaborate with others facing similar challenges.

      (6) The authors discuss plans for active maintenance of the database and also to incentivize higher participation from the community.

      (7) Even those unfamiliar with GitHub may contribute with the help of the database maintenance team.

      Note: The authors have addressed my comments on the previous version of the article and the current version has been strengthened as a result.

    3. Reviewer #3 (Public review):

      Summary:

      The authors have developed an interactive knowledge base that crowdsources information on antibodies and reagents for immunofluorescence imaging.

      Strengths:

      The authors provide an extremely relevant and much-needed interface for collaboration through a well-built platform. All the links on their website work, and the information provided (reagents, datasets, videos and protocols) is very informative. The instructions for community researchers to contribute are clear, and the authors provide detailed instructions on how to proceed technically. Additionally, the interface has been refined to enable contributions regardless of the computational expertise of the researcher.

      Weaknesses:

      The Knowledge-Base relies on community contributions without mandatory, standardized metadata and validation criteria. Whilst this enhances the contributions, it limits the reliability of the database.

    1. Reviewer #1 (Public review):

      Summary:

      A central function of glial cells is the ensheathment of axons. Wrapping of larger-diameter axons involves myelin-forming glial classes (such as oligodendrocytes), whereas smaller axons are covered by non-myelin-forming glial processes (such as olfactory ensheathing glia). While we have some insights into the underlying molecular mechanisms orchestrating myelination, our understanding of the signaling pathways at work in non-myelinating glia remains limited. As non-myelinating glial ensheathment of axons is highly conserved in both vertebrates and invertebrates, the nervous system of Drosophila melanogaster, and in particular the larval peripheral nerves, has emerged as a powerful model to elucidate the regulation of axon ensheathment by a class of glia called wrapping glia. This study seeks to address which molecular mechanisms contribute to the regulation of the extent of glial ensheathment, focusing on the interaction of wrapping glia with axons.

      Strengths and Weaknesses:

      For this purpose, the study combines state-of-the-art genetic approaches with high-resolution imaging, including classic electron microscopy. The genetic methods involve RNAi-mediated knockdown, acute CRISPR-Cas9 knock-outs and genetic epistasis approaches to manipulate gene function with the help of cell-type-specific drivers. The successful use of acute CRISPR-Cas9-mediated knockout tools (which required the generation of new genetic reagents for this study) will be of general interest to the Drosophila community.

      The authors set out to identify new molecular determinants mediating the extent of axon wrapping in the peripheral nerves of third instar wandering Drosophila larvae. They could show that over-expressing a constitutively active version of the Fibroblast growth factor receptor Heartless (Htl) causes an increase of wrapping glial branching, leading to the formation of swellings in nerves close to the cell body (named bulges). To identify new determinants involved in axon wrapping acting downstream of Htl, the authors next conducted an impressive large-scale genetic interaction screen (which has become rare, but remains a very powerful approach), and identified Uninflatable (Uif) in this way. Uif is a large single-pass transmembrane protein which contains a whole series of extracellular domains, including Epidermal growth factor-like domains. Linking this protein to glial branch formation is novel, as it has so far been mostly studied in the context of tracheal maturation and growth. Intriguingly, a knock-down or knock-out of uif reduces branch complexity and also suppresses htl over-expression defects. Importantly, uif over-expression causes the formation of excessive membrane stacks. Together, these observations are in line with the notion that htl may act upstream of uif.

      Further epistasis experiments using this model implicated also the Notch signaling pathway as a crucial regulator of glial wrapping: reduction in Notch signaling reduces wrapping, whereas over-activation of the pathway increases axonal wrapping (but does not cause the formation of bulges). Importantly, defects caused by over-expression of uif can be suppressed by activated Notch signaling. Knock-down experiments in neurons suggest further that neither Delta nor Serrate act as neuronal ligands to activate Notch signaling in wrapping glia, whereas knock-down of Contactin, a GPI anchored Immunoglobulin domain containing protein led to reduced axon wrapping by glia, and thus could act as an activating ligand in this context.

      Based on these results, the authors put forward a model proposing that Uif normally suppresses Notch signaling, and that activation of Notch by Contactin leads to suppression of Htl, to trigger the ensheathment of axons. While these are intriguing propositions, future experiments will need to conclusively address whether and how Uif could "stabilize" a specific membrane domain capable of interacting with specific axons.

      Moreover, to obtain evidence for Uif suppression by Notch to inhibit "precocious" axon wrapping and for a "gradual increase" of Notch signaling that silences uif and htl, (1) reporters for N and Htl signaling in larvae, (2) monitoring of different stages at a time point when branch extension begins, and (3) a reagent enabling the visualization of Uif expression could be important next tools/approaches. Considering the qualitatively different phenotypes of reduced branching, compared to excessive membrane stacks close to cell bodies, it would perhaps be worthwhile to explore more deeply how membrane formation in wrapping glia is orchestrated at the subcellular level by Uif.

      However, the points raised above remain technically difficult to address at present because of the lack of appropriate genetic reagents. Also, more detailed electron microscopy analyses of early developmental stages and comparisons of effects on cell bodies compared to branches will be very labor-intensive, and indeed may represent a new study.

      In summary, in light of the importance of correct ensheathment of axons by glia for neuronal function, the proposed model for the interactions between Htl, Uif and N to control the correct extent of neuron and glial contacts will be of general interest to the glial biology community.

      Comments on revisions:

      The authors have addressed all my comments. However, the sgRNAs in the STAR Methods table are still all for cleavage just before the transmembrane domain, while the Supplemental figure suggests different locations.

    2. Reviewer #2 (Public review):

      The FGF receptor Heartless has previously been implicated in Drosophila peripheral glial growth and axonal wrapping. Here, the authors performed a large-scale screen of over 2,600 RNAi lines to identify factors regulating the downstream signaling of this process. They identified the transmembrane protein Uninflatable (Uif) as essential for the formation of plasma membrane domains. Furthermore, they found that Notch, a regulatory target of Uif, is required for glial wrapping. Interestingly, additional evidence implies that Notch reciprocally regulates uif and htl, suggesting a feedback loop. Consequently, the authors propose that Uif functions as a 'switch' to regulate the balance between glial growth and axonal wrapping.

      Little is known about how glial cell properties are coordinated with axons, and the identification of Uif provides essential insight into this orchestration. The manuscript is well-written, and the experiments are generally well-controlled. The electron microscopy studies, in particular, are of outstanding quality and help mechanistically dissect the consequences of Uif and Notch signaling in the regulation of glial processes. Together, this important study provides convincing evidence of a new player coordinating the glial wrapping of axons.

      Comments on revisions:

      Overall, the authors have done an excellent job of responding to my substantive concerns in this significantly improved manuscript. In particular, the authors have provided important additional details about the design, prioritization, and outcomes of their screen, and relayed changes that strengthen and extend the impact of their study. I have revised my assessment accordingly, and I expect this study to be of high interest to a variety of researchers in the field.

    1. Reviewer #2 (Public review):

      Summary:

      This paper aims to elucidate the gene regulatory network governing the development of cone photoreceptors, the light-sensing neurons responsible for high acuity and color vision in humans. The authors provide a comprehensive analysis through stage-matched comparisons of gene expression and chromatin accessibility using scRNA-seq and scATAC-seq from the cone-dominant 13-lined ground squirrel (13LGS) retina and the rod-dominant mouse retina. The abundance of cones in the 13LGS retina arises from a dominant trajectory from late retinal progenitor cells (RPCs) to photoreceptor precursors and then to cones, whereas only a small proportion of rods are generated from these precursors.

      Strengths:

      The paper presents intriguing insights into the gene regulatory network involved in 13LGS cone development. In particular, the authors highlight the expression of cone-promoting transcription factors such as Onecut2, Pou2f1, and Zic3 in late-stage neurogenic progenitors, which may be driven by 13LGS-specific cis-regulatory elements. The authors also characterize candidate cone-promoting genes Zic3 and Mef2C, which have been previously understudied. Overall, I found that the across-species analysis presented by this study is a useful resource for the field.

      Comments on Revision:

      The authors have addressed my questions, and the revised text now presents their findings more clearly.

    2. Reviewer #3 (Public review):

      Summary:

      The authors perform deep transcriptomic and epigenetic comparisons between mouse and 13-lined ground squirrel (13LGS) to identify mechanisms that drive rod- vs cone-rich retina development. Through cross-species analysis, the authors find extended cone generation in 13LGS, gene expression within progenitor/photoreceptor precursor cells consistent with a lengthened cone window, and differential regulatory element usage. Two of the transcription factors, Mef2c and Zic3, were subsequently validated using OE and KO mouse lines to verify the role of these genes in regulating competence to generate cone photoreceptors.

      Strengths:

      Overall, this is an impactful manuscript with broad implications toward our understanding of retinal development, cell fate specification, and TF network dynamics across evolution and with the potential to influence our future ability to treat vision loss in human patients. The generation of this rich new dataset profiling the transcriptome and epigenome of the 13LGS is a tremendous addition to the field that assuredly will be useful for numerous other investigations and questions of a variety of interests. In this manuscript, the authors use this dataset and compare to data they previously generated for mouse retinal development to identify 2 new regulators of cone generation and shed insights onto their regulation and their integration into the network of regulatory elements within the 13LGS compared to mouse.

      The authors have done considerable work to address reviewer concerns from the first draft. The current version of the manuscript is strong and supports the claims.

    1. Reviewer #2 (Public review):

      Summary:

      Sugimoto et al. explore the relationship between glucose dynamics (specifically value, variability, and autocorrelation) and coronary plaque vulnerability in patients with varying glucose tolerance levels. The study identifies three independent predictive factors for %NC and emphasizes the use of continuous glucose monitoring (CGM)-derived indices for coronary artery disease (CAD) risk assessment. By employing robust statistical methods and validating findings across datasets from Japan, America, and China, the authors highlight the limitations of conventional markers while proposing CGM as a novel approach for risk prediction. The study has the potential to reshape CAD risk assessment by emphasizing CGM-derived indices, aligning well with personalized medicine trends.

      Further, the revised version includes expanded biological interpretation, improved statistical justification, and a new web-based calculator for clinical translation. Together, these updates make the study an important contribution to precision risk assessment in diabetes and cardiovascular research.

      Strengths:

      The introduction of autocorrelation as a predictive factor for plaque vulnerability adds a novel dimension to glucose dynamic analysis.

      Inclusion of datasets from diverse regions enhances generalizability.

      The use of a well-characterized cohort with controlled cholesterol and blood pressure levels strengthens the findings.

      The focus on CGM-derived indices aligns with personalized medicine trends, showcasing potential for CAD risk stratification.

      The benchmarking of CGM-derived measures against established CAD risk models (e.g., Framingham Risk Score) enhances interpretability and significance.

      The addition of a web-based computational tool makes the proposed indices accessible for potential clinical and research use.

      Weaknesses:

      The biological mechanism linking glucose autocorrelation to plaque vulnerability, although plausibly associated with insulin clearance pathways, remains largely theoretical.

      The primary cohort size is still modest, and while supported by power analysis and external datasets, broader prospective validation will be important.

      Strict participant selection criteria as employed by the study may reduce applicability to broader populations.

      CGM-derived indices like AC_Var and ADRR may be too complex for routine clinical use without simplified models or guidelines.

      Comments on revised version:

      The authors have thoroughly addressed previous concerns and produced a much stronger manuscript. The study now provides a coherent, validated, and well-reasoned argument for including autocorrelation as a third major dimension of glucose dynamics. It offers both conceptual novelty and translational potential and will likely stimulate further research on temporal glucose metrics in metabolic and cardiovascular risk assessment.

    2. Reviewer #3 (Public review):

      Summary:

      This is a retrospective analysis of 53 individuals over 26 features (12 clinical phenotypes, 12 CGM features, and 2 autocorrelation features) to examine which features were most informative in predicting percent necrotic core (%NC) as a parameter for coronary plaque vulnerability. Multiple regression analysis demonstrated a better ability to predict %NC from 3 selected CGM-derived features than from 3 selected clinical phenotypes. LASSO regularization and partial least squares (PLS) with VIP scores were used to identify 4 CGM features that contribute most to the prediction of %NC. Using factor analysis, the authors identify 3 components that have CGM-related features: value (relating to the value of blood glucose), variability (relating to glucose variability), and autocorrelation (composed of the two autocorrelation features). These three groupings appeared in the 3 validation cohorts and when performing hierarchical clustering. To demonstrate how these three features change, a simulation was created to allow the user to examine these features under different conditions.
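
      To make the three components concrete, the following is a minimal sketch of how a value, a variability, and an autocorrelation summary could be computed from one CGM trace. It is not the authors' pipeline. The choice of mean and standard deviation, the lag range, the names `ac_mean` and `ac_var`, and the example readings are all assumptions made for illustration.

```python
from statistics import mean, pstdev, pvariance


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer `lag`."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum((series[t] - mu) * (series[t + lag] - mu)
              for t in range(n - lag))
    return num / denom


def glucose_components(series, max_lag=3):
    """Summarize a glucose trace by value, variability and autocorrelation."""
    acs = [autocorrelation(series, k) for k in range(1, max_lag + 1)]
    return {
        "value": mean(series),         # average glucose level
        "variability": pstdev(series), # spread around the average
        "ac_mean": mean(acs),          # mean autocorrelation over the lags
        "ac_var": pvariance(acs),      # variance of autocorrelation over the lags
    }


# Hypothetical readings in mg/dL at equal time intervals (not study data).
readings = [100, 110, 125, 140, 150, 145, 130, 115, 105, 100]
summary = glucose_components(readings)
print(summary)
```

      Two traces can share the same mean and standard deviation yet differ in how quickly readings decorrelate over time, which is the information the autocorrelation summaries are meant to capture.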

      Summary of Revision 1. This is a Valuable study supported by Solid evidence. The revisions meaningfully strengthen the manuscript by clarifying methods, improving transparency, and refining presentation. The work provides useful conceptual and methodological advances for understanding CGM-derived glucose dynamics and their possible relationship to cardiovascular pathology.

      Strengths:

      The authors have provided a much clearer exposition of how each glycemic component was defined and validated across cohorts. The revised manuscript now includes explicit pairwise correlations, clarified p- and q-value reporting, and better visualization of key associations between CGM indices and %NC. The justification for LASSO and PLS use is now well explained, and additional details on cohort timing relative to PCI, validation dataset structure, and statistical robustness (e.g., VIP stability with covariates) address prior concerns. The inclusion of precise factor definitions and clearer graphics notably improves interpretability.

      Limitations:

      Some limitations remain inherent to the study design, including the modest primary sample size, reliance on retrospective data, and differences between validation datasets in outcome ascertainment. However, these are now acknowledged more openly.

    1. Reviewer #1 (Public review):

      Summary:

      The paper by Ilbay et al. describes a screen in C. elegans for loss-of-function of factors that are presumed to constitutively downregulate heat shock or stress genes regulated by HSF-1. The hypothesis posits an active mechanism of downregulation of these genes under non-stressed conditions. The screen robustly identified ZNF-236, a multi-zinc-finger-containing protein, whose loss upregulates heat-shock and stress-induced prion-like protein genes, but which does not appear to act in cis at the relevant promoters. The authors speculate that ZNF-236 acts indirectly on chromatin or chromatin domains to repress hs genes under non-stressed conditions.

      Strengths:

      The screen is clever, well-controlled and quite straightforward. I am convinced that ZNF-236 has something to do with keeping heat shock and other stress transcripts low. The mapping of potential binding sites of ZNF-236 is negative, despite the development of a new method to monitor binding sites. I am not sure whether this assay has a detection/sensitivity threshold limit, as it is not widely used. Up to this point, the data are solid, and the logic is easy to follow.

      Weaknesses:

      While the primary observations are well-documented, the mode of action of ZNF-236 is inadequately explored. Multi-Zn-finger proteins often bind RNA (TFIIIA is a classic example), and the following paper addresses multivalent functions of Zn finger proteins in RNA stability and processing: Mol Cell 2024 Oct 3;84(19):3826-3842.e8. doi: 10.1016/j.molcel.2024.08.010. I see no evidence that would point to a role for ZNF-236 in nuclear organization, yet this is the authors' favorite hypothesis. In my opinion, this proposed mechanism is poorly justified, and certainly should not be posited without first testing whether ZNF-236 acts post-transcriptionally, directly down-regulating the relevant mRNAs in some way. It could regulate RNA stability, splicing, export or translation of the relevant RNAs rather than their transcription rates. This can be tested by monitoring whether or not ZNF-236 alters run-on transcription rates. If nascent RNA synthesis rates are not altered but co- and/or post-transcriptional events are, and if ZNF-236 is shown to bind RNA (which is likely), the paper could still postulate that the protein plays a role in downregulating stress and heat shock proteins. However, the authors could then rule out that it acts on the promoter by altering RNA Pol II engagement. Another option that should be tested is that ZNF-236 acts by nucleating an H3K9me domain that might shift the affected genes to the nuclear envelope, sequestering them in a zone of low-level transcription. That is also easily tested by tracking the position of an affected gene in the presence and absence of ZNF-236. This latter mechanism is also right in line with known modes of action for Zn finger proteins (in mammals, acting through KAP1 and SETDB1). A role for nucleating H3K9me could be easily tested in worms by screening MET-2 or SET-25 knockouts for heat shock or stress mRNA levels. These data sets are already published.

      Without testing these two obvious pathways of action (through RNA or through H3K9me deposition), this paper is too preliminary.

      Appraisal:

      The authors achieved their initial aim with the screen, and the paper is of interest to the field. However, they do not adequately address the likely modes of action. Indeed, I think their results fail to support the conclusion or speculation that ZNF-236 acts on long-range chromatin organization. No solid evidence is presented to support this claim.

      Impact:

      If the paper were to address and/or rule out likely modes of action, the paper would be of major value to the field of heat shock and stress mRNA control.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript reports the identification of ZNF-236 as a key regulator that maintains quiescence of heat shock inducible genes in C. elegans. Using a forward genetic screen for constitutive activation of an endogenous hsp-16.41 reporter, the authors show that loss of znf-236 leads to widespread, HSF-1-dependent expression of inducible heat shock proteins (iHSPs) and a subset of prion-like stress-responsive genes, in the absence of proteotoxic stress. Transcriptomic analysis reveals that znf-236 mutants partially overlap with the canonical heat shock response, selectively activating highly inducible iHSPs rather than the full HSR program. iHSP transgenes integrated throughout the genome generally become de-repressed in znf-236 mutants, whereas the same constructs on extrachromosomal arrays or inserted into the rDNA locus are insensitive to znf-236 loss. Using a newly developed method, Transcription Factor Deaminase Sequencing (TFD-seq), the authors show that ZNF-236 binds sparsely across the genome and does not associate with iHSP promoters, supporting an indirect mode of regulation. Physiologically, znf-236 mutants exhibit increased thermotolerance and maintain iHSP expression during aging.

      Strengths:

      This is a carefully executed and internally consistent study that identifies a new regulator of stress-induced gene quiescence in C. elegans. The genetics are clean and the phenotypes are robust.

      Weaknesses:

      The manuscript is largely descriptive. It would be substantially strengthened by deeper mechanistic insight into what ZNF-236 does beyond being required for default silencing.

    3. Reviewer #3 (Public review):

      Summary:

      The researchers performed a genetic screen to identify a protein, ZNF-236, which belongs to the zinc finger family, and is required for repression of heat shock inducible genes. The researchers applied a new method to map the binding sites of ZNF-236, and based on the data, suggested that the protein does not repress genes by directly binding to their regulatory regions targeted by HSF1. Insertion of a reporter in multiple genomic regions indicates that repression is not needed in repetitive genomic contexts. Together, this work identifies ZNF-236, a protein that is important to repress heat-shock-responsive genes in the absence of heat shock.

      Strengths:

      A hit from a productive genetic screen was validated, and followed up by a series of well-designed experiments to characterize how the repression occurs. The evidence that the identified protein is required for the repression of heat shock response genes is strong.

      Weaknesses:

      The researchers propose and discuss one model of repression based on protein binding data, which depends on a new technique and data that are not fully characterized.

      Major Comments:

      (1) The phrase "results from a shift in genome organization" in the abstract lacks strong evidence. This interpretation heavily relies on the protein binding technique, using ELT-2 as a positive and an imperfect negative control. If we assume that the binding is a red herring, the interpretation would require some other indirect regulation mechanism. Is it possible that ZNF-236 binds to the RNA of a protein that is required to limit HSF-1 and potentially other transcription factors' activation function? In the extrachromosomal array/rDNA context, perhaps other repressive mechanisms are redundant, and thus active repression by ZNF-236 is not required. This possibility is mentioned in one sentence in the discussion, but most of the other interpretations rely on the ZNF-236 binding data being correct. Given that there is other evidence for a transcriptional role for ZNF-236, and no negative control (e.g. deletion of the zinc fingers, or a control akin to those done for ChIP-seq, like a null mutant or knockdown), a stronger foundation is needed for the presented model for genome organization.

      (2) Continuing along the same line, the study assumes that ZNF-236 function is transcriptional. Is it possible to tag a protein and look at localization? If it is in the nucleus, it could be additional evidence that this is true.

      (3) I suggest that the authors analyze the genomic data further. A MEME analysis for ZNF-236 can be done to test if the motif occurrences are enriched at the binding sites. Binding site locations in the genome with respect to genes (exon, intron, promoter, enhancer?) can be analyzed and compared to existing data, such as ATAC-seq. The authors also propose that this protein could be similar to CTCF. There are numerous high-quality and high-resolution Hi-C data in C. elegans larvae, and so the authors can readily compare their binding peak locations to the insulation scores to test their hypothesis.

      (4) The researchers suggest that ZNF-236 is important for some genomic context. Based on the transcriptomic data, can they find a clue as to what that context may be? Are the ZNF-236-repressed genes enriched for non-expressed genes in regions surrounded by highly expressed genes?

    1. Reviewer #1 (Public review):

      Summary:

      Authors explore how sex-peptide (SP) affects post-mating behaviours in adult females, such as receptivity and egg laying. This study identifies different neurons in the adult brain and the VNC that become activated by SP, largely by using an intersectional gene expression approach (split-GAL4) to narrow down the specific neurons involved. They confirm that SP binds to the well-known Sex Peptide Receptor (SPR), initiating a cascade of physiological and behavioural changes related to receptivity and egg laying.

      Comments on revised version:

      The authors have substantially strengthened the manuscript in response to our main concerns.

      In particular, they now explicitly test multiple established PMR nodes (including SAG/SPSN as well as pC1, OviDN/OviEN/OviIN and vpoDN), which helps separate direct SP targets from downstream PMR circuitry and supports their interpretation that some of these known nodes can affect receptivity without necessarily inducing oviposition. They also addressed key technical/clarity points: the requested head/trunk expression controls are provided (Suppl Fig S1), and the VT003280 annotation is corrected (now FD6 rather than "SAG driver"). Overall, these additions make the central conclusion, that distinct CNS neuron subsets ("SPRINz") are sufficient to elicit PMR components, more convincing, and the added comparisons with genital tract expressing lines further argue against a simple "periphery only" explanation.

    2. Reviewer #2 (Public review):

      Sex peptide (SP) transferred during mating from male to female induces various physiological responses in the receiving female. Among those, the increase in oviposition and decrease in sexual receptivity are very remarkable. Naturally, a long-standing and significant question is the identity of the sex peptide target neurons that express the SP receptor and underlie these responses. Identification of these neurons will eventually lead to the identification of the underlying neuronal circuitry.

      The Soller lab has addressed this important question already several years ago (Haussmann et al. 2013), using relevant GAL4-lines and membrane-tethered SP. The results already showed that the action of SP on receptivity and oviposition is mediated by different neuronal subsets and hence can be separated. The GAL4-lines used at that time were, however, broad, and the individual identity of the relevant neurons remained unclear.

      In the present paper, Nallasivan and colleagues carried this analysis a significant step further, using new intersectional approaches and transsynaptic tracing.

      Strength:

      The intersectional approach is appropriate and state-of-the-art. The analysis is a very comprehensive tour-de-force and experiments are carefully performed to a high standard. The authors also produced a useful new transgenic line (UAS-FRTstopFRT mSP). The finding that neurons in the brain (head) mediate the SP effect on receptivity, while neurons in the abdomen and thorax (ventral nerve cord or peripheral neurons) mediate the SP effect on oviposition, is a significant step forward in the endeavour to identify the underlying neuronal networks and hence a mechanistic understanding of SP action. The analysis identifies a small set of neurons underlying SP responses. Some are part of the post-mating circuitry and influence receptivity, while others are likely involved in higher order sensory processing. Though these results are not entirely unexpected, they are novel and represent a significant step forwards as the analysis is at a much higher resolution than previous work.

      Weakness:

      Though the analysis is at a much higher resolution than previous work on SP targets, it does not yet reach the resolution of single neuronal cell types. The last paragraph in the discussion rightfully speculates about the neurochemical identity of some of the intersection neurons (e.g. dopaminergic P1 neurons, NPF neurons). These suggested identities could have been confirmed by straightforward immunostainings against NPF or TH, for which antisera are available. Moreover, specific GAL4 lines for NPF or P1 or at least TH neurons are available which could be used to express mSP to test whether SP activation of those neurons is sufficient to trigger the SP effect. Moreover, the conclusion that SP target neurons operate as key integrators of sensory information for decision of behavioural outputs needs further experimental confirmation.

    3. Reviewer #3 (Public review):

      Summary:

      This paper reports new findings regarding neuronal circuitries responsible for female post-mating responses (PMRs) in Drosophila. The PMRs are induced by sex peptide (SP) transferred from males during mating. The authors sought to identify SP target neurons using a membrane-tethered SP (mSP) and a collection of GAL4 lines, each containing a fragment derived from the regulatory regions of the SPR, fru, and dsx genes involved in PMR. They identified several lines that induced PMR upon expression of mSP. Using split-GAL4 lines, they identified distinct SP-sensing neurons in the central brain and ventral nerve cord. Analyses of pre- and post-synaptic connection using retro- and trans-Tango placed SP target neurons at the interface of sensory processing interneurons that connect to two common post-synaptic processing neuronal populations in the brain. The authors proposed that SP interferes with the processing of sensory inputs from multiple modalities.

      Strengths:

      Besides the main results described in the summary above, the authors discovered the following:

      (1) Reduction of receptivity and induction of egg-laying are separable by restricting the expression of membrane-tethered SP (mSP): head-specific expression of mSP induces reduction of receptivity only, whereas trunk-specific expression of mSP induces oviposition only. Also, they identified a GAL4 line (SPR12) that induced egg laying but did not reduce receptivity.

      (2) Expression of mSP in the genital tract sensory neurons does not induce PMR. The authors identified three GAL4 drivers (SPR3, SPR21, and fru9), which robustly expressed mSP in genital tract sensory neurons but did not induce PMRs. Also, SPR12 does not express in genital tract neurons but induces egg laying by expressing mSP.

  3. Dec 2025
    1. Reviewer #1 (Public review):

      The manuscript by Shan et al seeks to define the role of the CHI3L1 protein in macrophages during the progression of MASH. The authors argue that the Chil1 gene is expressed highly in hepatic macrophages. Subsequently, they use Chil1 flx mice crossed to Clec4F-Cre or LysM-Cre to assess the role of this factor in the progression of MASH using a high-fat, high-fructose diet (HFFC). They found that loss of Chil1 in KCs (Clec4F-Cre) leads to enhanced KC death and worsened hepatic steatosis. Using scRNA-seq, they also provide evidence that loss of this factor promotes gene programs related to cell death. From a mechanistic perspective, they provide evidence that CHI3L1 serves as a glucose sink and thus loss of this molecule enhances macrophage glucose uptake and susceptibility to cell death. Using a bone marrow macrophage system and KCs, they demonstrate that cell death induced by palmitic acid is attenuated by the addition of rCHI3L1. While the article is well written and potentially highlights a new mechanism of macrophage dysfunction in MASH, and the authors have addressed some of my concerns, there are some concerns about the current data that continue to limit my enthusiasm for the study. Please see my specific comments below.

      Major:

      (1) The authors' interpretation of the results from the KC (Clec4F) and MdM KO (LysM-Cre) experiments is flawed. The authors have added new data that suggest LysM-Cre only leads to a 40% reduction of Chil1 in KCs and that this explains the difference in the phenotype compared to the Clec4F-Cre. However, this claim would be made stronger using flow-sorted TIM4hi KCs, as the plating method can lead to heterogeneous populations and thus an underestimation of knockdown by qPCR. Moreover, in the supplemental data the authors show that Clec4f-Cre x Chil1flx leads to a significant knockdown of this gene in BMDMs. As BMDMs do not express Clec4f, this calls into question the rigor of the data. I am still concerned that the phenotype differences between Clec4f-Cre and LysM-Cre are not related to the degree of knockdown in KCs but rather to some other aspect of the model (microbiota etc). It would be more convincing if the authors could show the CHI3L1 reduction via IF in the tissue of these mice.

      (2) Figure 4 suggests that KC death is increased with KO of Chil1. The authors have added new data with TIM4 that better characterize this phenotype. The lack of TIM4 low, F4/80 hi cells further supports that their diet model is not producing any signs of the inflammatory changes that occur with MASLD and MASH. This is also supported by the lack of meaningful changes in the CD11b hi, F4/80 int cells (predominantly monocytes and early MdMs). It is also concerning that loss of KCs does not lead to an increase in Mo-KCs as has been demonstrated in several studies (PMID: 37639126, PMID: 33997821). This would suggest that the degree of resident KC loss is trivial.

      (3) The authors demonstrated that Clec4f-Cre itself was not responsible for the observed phenotype, which mitigates my concerns about this influencing their model.

      (4) I remain somewhat concerned about the conclusion that Chil1 is highly expressed in liver macrophages. The authors agree that mRNA levels of this gene are hard to see in the datasets; however, they argue that IF demonstrates clear evidence of the protein, CHI3L1. The IF in the paper only shows a high-power view of one KC. I would like to see what percentage of KCs express CHI3L1 and how this changes with HFHC diet. In addition, showing the knockout IF would further validate the IF staining patterns.

      Minor:

      (1) The authors have answered my question about liver fibrosis. In line with their macrophage data their diet model does not appear to induce even mild MASH.

    2. Reviewer #2 (Public review):

      In the revised version of the manuscript, the authors have attempted to address my questions, however, a number of my original concerns still remain.

      Firstly, I had asked for a validation of the different CRE lines used (LysM and Clec4f). The authors have now looked at BMDMs and KCs (steady state) from these animals. They conclude LysM only targets BMDMs not KCs, while CLEC4F targets both KCs and BMDMs. This I do not understand: BMDMs do not express CLEC4F, so why are they targeted with this CRE? Additionally, BMDMs are not the correct control here; rather, the authors should look at the incoming moMFs in the livers of these mice in the MASLD setting. Similarly, the KO in the MASLD KCs should be verified.

      Then I had asked for validation of macrophage expression of Chil1 in other MASLD human and mouse datasets. The authors have looked into this, but the data provided do not suggest it is highly expressed by these cells either in the other mouse models or in the human. Nevertheless, they include a statement suggesting a similar expression pattern (although also being expressed by other cells). This is not an accurate discussion of the data and hence must be revised. This also prompted me to take another look at their data and this has left me querying the data in Figure 1D. Is the percent expressed 1%? In Figure 1C the scale goes from 0-100 but here 0-1. If we are talking about expression in 1% of cells which would fit with the additional public mouse data now analysed then how relevant are any of these claims? How sure are the authors that the effects seen are through KCs/moMFs? In figure 1D all cells profiled by scRNA-seq should be shown not just MFs to get a better sense of this data. What is macrophage expression of Chil1 compared with all other liver cells?

      I had also previously been concerned about the cell death, namely that 40-60% of KCs were TUNEL +ve. I do not understand how 60% are +ve at 8 weeks but then they have more or less the same number of TIM4+ cells at 16 weeks. How can this be? Why do the TUNEL +ve cells not die? This concern remains, as I don't understand how they reached these numbers given the images. Additionally, larger images were not provided to confirm that the images in the figure are representative. Now, in the images provided, there are clearly TIM4+ cells where the TUNEL signal does not overlap; likely it is in an LSEC or other neighbouring cell. Indeed, also taking Fig S11b as an example, there are ~7 KCs and at best 1 is TUNEL +ve, so how do they get to 60%?

    3. Reviewer #3 (Public review):

      This paper investigates the role of Chi3l1 in regulating the fate of liver macrophages in the context of metabolic dysfunction leading to the development of MASLD. I do see value in this work, but some issues exist that should be addressed as well as possible.

      Here are my comments:

      (1) Chi3l1 has been linked to macrophage functions in MASLD/MASH, acute liver injury, and fibrosis models before (e.g., PMID: 37166517), which limits the novelty of the current work. It has even been linked to macrophage cell death/survival (PMID: 31250532) in the context of fibrosis, which is a main observation from the current study.

      (2) The LysCre-experiments differ from experiments conducted by Ariel Feldstein's team (PMID: 37166517). What is the explanation for this difference? - The LysCre system is neither specific to macrophages (it also depletes in neutrophils, etc), nor is this system necessarily efficient in all myeloid cells (e.g., Kupffer cells vs other macrophages). The authors need to show the efficacy and specificity of the conditional KO regarding Chi3l1 in the different myeloid populations in the liver and the circulation.

      (3) The conclusions are exclusively based on one MASLD model. I recommend confirming the key findings in a second, ideally a more fibrotic, MASH model.

      (4) Very few human data are being provided (e.g., no work with own human liver samples, work with primary human cells). Thus, the translational relevance of the observations remains unclear.

      Comments on revisions:

      The authors have done a thorough job addressing my comments. However, I am not convinced about the MCD diet model, which is somewhat hidden in the Supplementary Files. MASH does not seem to differ, nor are any fibrosis data shown to support the conclusions. I am not satisfied with this part of the revised manuscript, and I do not agree that the second MASH model would support the conclusions.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors investigate the role of deubiquitinases (DUBs) in modulating the efficacy of PROTAC-mediated degradation of the cell-cycle kinase AURKA. Using a focused siRNA screen of 97 human DUBs, they identify UCHL5 and OTUD6A as negative regulators of AURKA degradation by PROTACs. They further offer a mechanistic explanation of enhanced AURKA degradation in the nucleus via OTUD6A expression being restricted to the cytosol, thereby protecting the cytoplasmic pool of AURKA. These findings provide important insight into how subcellular localization and DUB activity influence the efficiency of targeted protein degradation strategies, which could have implications for therapy.

      Strengths:

      The manuscript is well-structured, with clearly defined objectives and well-supported conclusions.

      The study employs a broad range of well-validated techniques, including live-cell imaging, proximity ligation assays, HiBiT reporter systems, and ubiquitin pulldowns, to dissect the regulation of PROTAC activity.

      The authors use informative experimental controls, including assessment of cell-cycle progression effects, rescue experiments with siRNA-resistant constructs to confirm specificity, and the application of both AURKA-targeting PROTACs with different warheads and orthogonal degrader systems (e.g., dTAG-13 and dTAGv-1) to differentiate between target- and ligase-specific effects.

      The identification of OTUD6A as a cytosol-restricted DUB that protects cytoplasmic but not nuclear AURKA is novel and may have therapeutic relevance for selectively targeting oncogenic nuclear AURKA pools.

      Weaknesses:

      Although UCHL5 and OTUD6A are shown to limit AURKA degradation, direct physical interaction was not assessed.

      While the authors suggest that combining PROTACs with DUB inhibition could enhance degradation, this was not experimentally tested.

      The authors acknowledge the apparent discrepancy between the enhanced degradation observed with CRBN-recruiting PROTACs and the lack of change in ubiquitination following UCHL5 knockdown, yet they do not propose any mechanistic explanation.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors present a screening approach to identify deubiquitylases that may impact PROTAC efficacy/potency, specifically in this case using a previously reported AURKA PROTAC as an initial model. The authors claim that UCHL5 is able to control the level of degradation of both AURKA and dTAG when using CRBN-mediated PROTACs, but that VHL-mediated degradation is not impacted by UCHL5 activity. They additionally claim that OTUD6A is able to control the extent of AURKA degradation in a target protein-specific manner and that this effect is specific to cytoplasm-located AURKA.

      Overall, the endeavour is of interest and important. Some of the claims made were overly generalised, and, in the main, the effects observed when knocking down the respective DUBs were small. In addition, the systems used are highly artificial, and the data are not presented in a way that makes absolute (rather than relative) changes easy to understand.

      Strengths:

      The topic is of high interest and relevance and explores an underappreciated and understudied area of the PROTAC mechanism of action. If further supported and understood, they would certainly bring value to the field.

      Weaknesses:

      The overall effects observed are sometimes limited in real terms. The data provided often omit the absolute changes in protein abundance observed. Data on endogenous/less engineered systems and/or with higher resolution read-outs would greatly strengthen some conclusions.

    1. Reviewer #1 (Public review):

      Summary:

      Wang, Po-Kai et al., utilized the de novo polarization of MDCK cells cultured in Matrigel to assess the interdependence between polarity protein localization, centrosome positioning and apical membrane formation. They show that the inhibition of Plk4 with Centrinone does not prevent apical membrane formation, but does result in its delay, a phenotype the authors attribute to the loss of centrosomes due to the inhibition of centriole duplication. However, the targeted mutagenesis of specific centrosome proteins implicated in the positioning of centrosomes in other cell types (CEP164, ODF2, PCNT and CEP120), as well as the use of dominant negative constructs to inhibit centrosomal microtubule nucleation did not affect centrosome positioning in 3D cultured MDCK cells. A screen of proteins previously implicated in MDCK polarization revealed that the polarity protein Par-3 was upstream of centrosome positioning, similar to other cell types.

      Strengths:

      The investigation into the temporal requirement and interdependence of previously proposed regulators of cell polarization and lumen formation is valuable. The authors have provided a detailed analysis of many of these components at defined stages of polarity establishment, and convincingly demonstrate that centrosomes are not necessary for apical polarity formation, but are involved in the efficient establishment of the apical membrane.

      Weaknesses:

      Key questions remain regarding the structure of the intracellular cytoskeleton following depletion of centrosomes, centrosome proteins, or abrogation of centrosome microtubule nucleation. The authors strengthen their model that centrosomes are positioned independently of microtubule nucleation using dominant negative Cdk5RAP2 and NEDD-1 constructs; however, the structure of the intracellular microtubule network remains unresolved and will be an important avenue for future investigation.

    2. Reviewer #3 (Public review):

      Here, Wang et al. resubmit their manuscript describing the events in the establishment of polarity in MDCK cells cultured in vitro. As with the original version, the description is thorough and is important to the field to report as it establishes a hierarchy of events in polarization, placing Par3 upstream of centrosome positioning and apical membrane component trafficking. Unfortunately, in the revised version, the authors addressed almost none of my points. They did a cursory job of responding in the rebuttal letter but made little attempt to actually address what was being asked or to incorporate any of my suggestions into the manuscript. The particularly egregious examples are cited below:

      Comments on revisions:

      (1) My original main experimental concern was not addressed: I had originally asked what role microtubules play in the process of polarization (either centrosomal or non-centrosomal). An obvious model is that Gp135, Rab11, etc. are delivered to the AMIS on centrosomal microtubules. Centrosomes might also be pulled to the AMIS via cortically derived microtubules, as is the case in the C. elegans intestine, where the centrosome moves apically on apical microtubules via dynein-directed transport to the cortically anchored minus ends. The authors do not explore the role of microtubules in the revision, citing that it was not possible to observe the microtubules directly or to perform nocodazole experiments during polarization. Instead, the authors use a relatively new genetic tool to disrupt centrosomal microtubules. They appear to succeed in displacing centrosomal γ-tubulin using this tool, but without being able to observe microtubules, a remaining caveat of this experiment is that it is still unclear whether the authors have removed centrosomal microtubules. Compounding this issue is that this tool has never been used in MDCK cells. The authors conclude "we found that cells lacking centrosomal microtubules were still able to polarize and position the centrioles apically.", but they have not shown this; instead, the data suggest this conclusion, and the authors should acknowledge the caveat that they do not know whether centrosomal microtubules are abolished. Similarly, the authors also state: "Additionally, although PCNT knockout cells show reduced microtubule nucleation ability, they still recruit a small amount of γ-tubulin". Where are the data that show that microtubule nucleation is reduced in these PCNT knockout cells?

      (2) Many of my comments were addressed in the rebuttal, but not in the text.

      The non-centrosomal GP135 in Figure 2 is not acknowledged or explained.

      That the polarity index does not actually measure polarity, but nuclear-centrosome distance is not acknowledged or explained in the paper.

      I still don't believe that the quantification in Figure 3D matches the images I am being shown in Figure 3A. In the centrinone treatment condition, there is certainly an enrichment of GP135 at the AMIS that is not detected in the quantification. The method described in the rebuttal might miss this enrichment if it is offset from the line drawn between the centroids of the two nuclei.

      Cell height changes in the centrosome-depleted cysts are still referenced in the text ("the cell heights of the centrosome-depleted cysts are less uniform"), but no specific data or image is called out. Currently, Figure 3G is referenced, but that is a graph of GP135 intensity.

      In my original review, I called on the authors to comment on the striking similarity of the mechanisms they documented in MDCK cells to what has been shown in in vivo systems. The authors did not do this, instead restating in the rebuttal some features of what they found. But the mechanisms shown here are remarkably similar to the polarization of primordia that generate tubular organs in vivo. Perhaps most striking is the similarity to the C. elegans intestine, where Par3 localizes to the cortex at the site of an apical MTOC that pulls the centrosome to the apical surface via dynein (Feldman and Priess, 2012). Instead of discussing this similarity, the authors state: "Par3 is likely to regulate centrosome positioning through some intermediate molecules or mechanisms, but its specific mechanism is still unclear and requires further investigation." Given the acetylated tubulin signal emanating from the Par3-positive patch in Figure 5E and F, I suspect mechanisms similar to those in the C. elegans intestine are at play here. Such a parallel should be noted in the Discussion.

      I had originally commented that "I find the results in Figure 6G puzzling. Why is ECM signaling required for Gp135 recruitment to the centrosome. Could the authors discuss what this means?" The authors responded that "The data in Figure 6G do not indicate that ECM signaling is required for the recruitment of Gp135 to the centrosome". In Figure 6G, the localization of GP135 to the centrosome appears significantly delayed compared to its localization to the centrosome in images where cells were cultured in Matrigel. Indeed, the authors argue that the centrosomal localization precedes and contributes to its localization to the AMIS. In the absence of ECM, GP135 localizes to the membrane before it localizes to the centrosome and its localization to the centrosome appears significantly reduced. Thus, my original and current interpretation is that ECM signaling is somehow required for the centrosomal targeting of GP135. One could make a competition argument, i.e. that the cortex in the absence of ECM is somehow a more desirable place to localize than the centrosome, but this experiment also argues that the centrosome does not need to be a source of this material in order for it to end up on the cortex.

      (3) There needs to be precision in the language used in many places:

      I don't understand this line in the abstract: "When cultured in Matrigel, de novo polarization of a single epithelial cell is often coupled with mitosis." If a cell has divided, it is no longer a single cell.

      The authors state in the Introduction "Because of its strong ability to nucleate microtubules, the centrosome functions as the primary microtubule organizing center", but then state "In polarized epithelial cells, the centrosome is localized at the apical region during interphase, which contributes to the construction of an asymmetric microtubule network conducive to polarized vesicle trafficking". In the latter statement, I assume the authors are describing the well-characterized apical microtubule network in epithelial cells that is non-centrosomal. Thus, the latter sentence is at odds with the former.

      The authors continually refer to Par3 as a tight junction protein. "Par3, which controls tight junction assembly to partition the apical surface from the basolateral surface". To my knowledge, PARD3 is an apical protein with similar localization to C. elegans PAR-3 and Drosophila Bazooka. PARD3B is a junctional protein. I assume that the antibody that the authors are using is to PARD3 and not PARD3B? Can the authors please clarify this in the text?

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Chengjian Zhao et al. focused on the interactions between vascular, biliary, and neural networks in the liver microenvironment, addressing the critical bottleneck that the lack of high-resolution 3D visualization has hindered understanding of these interactions in liver disease.

      Strengths:

      This study developed a high-resolution multiplex 3D imaging method that integrates multicolor metallic compound nanoparticle (MCNP) perfusion with optimized CUBIC tissue clearing. This method enables the simultaneous 3D visualization of spatial networks of the portal vein, hepatic artery, bile ducts, and central vein in the mouse liver. The authors reported a perivascular structure termed the Periportal Lamellar Complex (PLC), which is identified along the portal vein axis. This study clarifies that the PLC comprises CD34⁺Sca-1⁺ dual-positive endothelial cells with a distinct gene expression profile, and reveals its colocalization with terminal bile duct branches and sympathetic nerve fibers under physiological conditions.

      Comments on revisions:

      The authors very nicely addressed all concerns from this reviewer. There are no further concerns or comments.

    2. Reviewer #2 (Public review):

      Summary:

      The present manuscript of Xu et al. reports a novel clearing and imaging method focusing on the liver. The Authors simultaneously visualized the portal vein, hepatic artery, central vein, and bile duct systems by injecting metal compound nanoparticles (MCNPs) with different colors into the portal vein, left ventricle of the heart, inferior vena cava, and the extrahepatic bile duct, respectively. The method involves: trans-cardiac perfusion with 4% PFA, the injection of MCNPs with different colors, clearing with the modified CUBIC method, cutting 200-micrometer-thick slices by vibratome, and then microscopic imaging. The Authors also perform various immunostainings (DAB or TSA signal amplification methods) on the tissue slices from MCNP-perfused tissue blocks. With the application of this methodical approach, the Authors report dense and very fine vascular branches along the portal vein. The authors name them the 'periportal lamellar complex (PLC)' and report that PLC fine branches are directly connected to the sinusoids. The authors also claim that these structures co-localize with terminal bile duct branches and sympathetic nerve fibers and contain endothelial cells with a distinct gene expression profile. Finally, the authors claim that PLCs proliferate in liver fibrosis (CCl4 model) and act as a scaffold for proliferating bile ducts in ductular reaction and for ectopic parenchymal sympathetic nerve sprouting.

      Strengths:

      The simultaneous visualization of different hepatic vascular compartments and their combination with immunostaining is a potentially interesting novel methodological approach.

      Weaknesses:

      This reviewer has some concerns about the validity of the microscopic/morphological findings as well as the transcriptomics results, and suggests that the conclusions of the paper be viewed critically. Namely, at this point, it is still not fully clear whether the 'periportal lamellar complex (PLC)' that the Authors describe really exists as a distinct anatomical or functional unit or whether these are fine portal branches that connect the larger portal veins to the adjacent sinusoids. Also, in my opinion, the only way to identify the molecular characteristics of such small and spatially highly organized structures as those fine radial portal branches is to perform high-resolution spatial transcriptomics (instead of data mining in an existing liver single-cell database and performing Venn diagram intersection analysis in hepatic endothelial subpopulations). Yet, the existence of such structures with a distinct molecular profile cannot be excluded. Further research with advanced imaging and omics techniques (such as high-resolution volume imaging and spatial transcriptomics/proteomics) is needed to reproduce these initial findings.

    3. Reviewer #3 (Public review):

      Summary:

      In the revised version of the manuscript, the authors addressed multiple comments, clarifying especially the methodological part of their work and PLC identification as a novel morphological feature of the adult liver portal veins. The text is now also much clearer and has better flow.

      The additional assessment of the smartSeq2 data from Pietilä et al., 2025 strengthens the transcriptomic profiling of the CD34+Sca1+ cells and the discussion of the possible implications for liver homeostasis and injury response. While it may suffer from a similar bias as other scRNA-seq datasets (multiple cell fate signatures arising from mRNA contamination from proximal cells during dissociation), it is less likely that such contamination would yield such similar results.

      Nevertheless, a more thorough assessment by functional experimental approaches is needed to decipher the functional molecules and definite protein markers before establishing the PLC as the key hub governing the activity of biliary, arterial, and neuronal liver systems.

      The work does bring a clear new insight into the liver structure and functional units and greatly improves the methodological toolbox to study it even further, and thus fully deserves the attention of the eLife readers.

      Strengths:

      The authors clearly demonstrate an improved technique tailored to the visualization of the liver vasculo-biliary architecture in unprecedented resolution.

      This work proposes a new morphological feature of adult liver facilitating interaction between the portal vein, hepatic arteries, biliary tree, and intrahepatic innervation, centered at previously underappreciated protrusions of the portal veins - the Periportal Lamellar Complexes (PLCs).

      Weaknesses:

      The importance of CD34+Sca1+ endothelial cell subpopulation for PLC formation and function was not tested and warrants further validation.

    1. Reviewer #1 (Public review):

      Domínguez-Rodrigo and colleagues make a largely convincing case for habitual elephant butchery by Early Pleistocene hominins at Olduvai Gorge (Tanzania), ca. 1.8-1.7 million years ago. They present this at a site scale (the EAK locality, which they excavated), as well as across the penecontemporaneous landscape, analyzing a series of findspots that contain stone tools and large-mammal bones. The latter are primarily elephants, but giraffids and bovids were also butchered in a few localities.

      The authors claim that this is the earliest well-documented evidence for elephant butchery; doing so requires debunking other purported cases of elephant butchery in the literature, or in one case, reinterpreting elephant bone manipulation as being nutritional (fracturing to obtain marrow) rather than technological (to make bone tools). The authors' critical discussion of these cases may not be consensual, but it surely advances the scientific discourse. The authors conclude by suggesting that an evolutionary threshold was achieved at ca. 1.8 Ma, whereby regular elephant consumption rich in fats and perhaps food surplus, more advanced extractive technology (the Acheulian toolkit), and larger human group size had coincided.

      The fieldwork and spatial statistics methods are presented in detail and are solid and helpful, especially the excellent description (all too rare in zooarchaeology papers) of bone conservation and preservation procedures. The results are detailed and clearly presented.

      The authors achieved their aims, showcasing recurring elephant butchery in 1.8-1.7 million-year-old archaeological contexts. The authors cautiously emphasize the temporal and spatial correlation of 1) elephant butchery, 2) Acheulian toolkits, and 3) larger sites, and discuss how these elements may be causally related.

      Overall, this is an interesting manuscript of broad interest that presents original data and interpretations from the Early Pleistocene archaeology of Olduvai Gorge. These observations and the authors' critical review of previously published evidence are an important contribution that will form the basis for building models of Early Pleistocene hominin adaptation.

    2. Reviewer #2 (Public review):

      The manuscript makes a valuable contribution to the Olduvai Gorge record, offering a detailed description of the EAK faunal assemblage. In particular, the paper provides a high-resolution record of a juvenile Elephas recki carcass, associated lithic artifacts, and several green-broken bone specimens. These data are inherently valuable and will be of significant interest to researchers studying Early Pleistocene taphonomy. My concerns do not relate to the quality or importance of the data themselves, but rather to the interpretive inferences drawn from these data, particularly regarding the strength of the claim for unambiguous proboscidean butchery.

      This review follows the authors' response to an earlier round of reviewer feedback and addresses points raised in that exchange. In their rebuttal, the authors state that some of my initial concerns reflect misunderstandings of their analysis, but after carefully re-reading both the manuscript and their responses, I do not believe this is the case.

      In their response, the authors state that they do not treat the EAK evidence as decisive, yet the manuscript repeatedly characterizes the assemblage in very definitive terms. For example, EAK is described as "the oldest unambiguous proboscidean butchery site at Olduvai" and as "the oldest secure proboscidean butchery evidence." These phrases communicate a high level of confidence that does not align with the more qualified position articulated in the rebuttal and extends beyond what the documented evidence securely supports.

      I appreciate the authors' clarification regarding the fracture features, and I agree that these are well-established outcomes of dynamic hammerstone percussion. At the same time, several of these traits have been documented in non-anthropogenic contexts, including helicoidal spiral fractures resulting from trampling and carnivore activity (Haynes 1983), adjacent or flake-like scars created by carnivore gnawing (Villa and Bartram 1996), hackled break surfaces produced by heavy passive breakage such as trampling or sediment pressure (Haynes 1983), and impact-related bone flakes observed in carnivore-modified assemblages (Coil et al. 2020). One of the biggest issues is that there is no quantitative data or images of the bone fracture features that the authors refer to as the main diagnostic criteria at EAK. The only figures that show EAK specimens (S21, S22, S23) illustrate general green-bone fracture morphology but none of the specific traits listed in the text. In contrast, clear examples of similar features come from other Olduvai assemblages, which may be misleading to readers if they mistakenly interpret those as images from EAK. The manuscript also states that these traits "co-occur," but it is not defined whether this refers to multiple features on the same fragment or within the broader assemblage. Without images or counts that document these traits on EAK fossils, readers cannot evaluate the strength of the interpretation. Including that information would substantially strengthen the manuscript.

      Regarding the statement that "natural elephant long limb breaks have been documented only in pre or peri-mortem stages when an elephant breaks a leg, and only in femora (Haynes et al., 2021)," it is not entirely clear what this example is intended to illustrate in relation to the EAK assemblage. My understanding is that the authors are suggesting that naturally produced green bone fractures in elephants are very limited, perhaps occurring only in pre or peri-mortem broken leg cases, and that fractures on other elements should therefore be attributed to hominin activity. If that is not the intended argument, I would encourage clarifying this point. This appears to conflate pre-mortem injury with the broader issue of equifinality. My original comment was not referring to pre-mortem breaks but to the range of natural (i.e., non-hominin) and post-mortem processes that can generate spiral or green bone fractures similar to those described by the authors.

      I fully understand the spatial analyses, and I realize that the association between bones and lithics is statistically significant. My original concern was not about whether the correlation exists, but about how that correlation is interpreted. That point still stands. Statistical co-occurrence cannot distinguish among the multiple depositional and post-depositional processes that can generate similar spatial patterns. However, I agree that the spatial correlation is intriguing, particularly when viewed alongside the possible butchery evidence. The pattern is notable and worthy of publication, even if the behavioral interpretation requires caution.

      Finally, in considering the authors' response on the Nyayanga material, I still find the basis for their dismissal of that evidence difficult to follow, and the contrasting treatment of the Nyayanga and EAK evidence raises concerns about interpretive consistency. Plummer et al. (2023) specify that bone surface modifications were examined using low-power magnification (10×-40×) and strong light sources to identify modifications and that they attributed agency (e.g., hominin, carnivore) to modifications only after excluding possible alternatives. The rebuttal does not engage with the procedures reported. The existence of newer analytical techniques does not diminish the validity of long-standing methods that have been applied across many studies. It is also unclear why abrasion is presented as a more likely explanation than stone tool cutmarks. The authors dismiss the Nyayanga images as "blurry," but this is irrelevant to the interpretation, since the analysis was based on the fossils, not the photographs. The Nyayanga dataset is dismissed without thorough engagement, while the EAK material, despite similar uncertainties and potential for alternative explanations, is treated as definitive.

      These concerns do not diminish the significance of the EAK assemblage, and addressing them would allow the interpretations to more fully reflect the scope of the available data.

      Literature Cited:

      Coil, R., Yezzi-Woodley, K., & Tappen, M. (2020). Comparisons of impact flakes derived from hyena and hammerstone long bone breakage. Journal of Archaeological Science, 120, 105167.

      Haynes, G. (1983). A guide for differentiating mammalian carnivore taxa responsible for gnaw damage to herbivore limb bones. Paleobiology, 9(2), 164-172.

      Haynes, G., Krasinski, K., & Wojtal, P. (2021). A study of fractured proboscidean bones in recent and fossil assemblages. Journal of Archaeological Method and Theory, 28(3), 956-1025.

      Plummer, T. W., et al. (2023). Expanded geographic distribution and dietary strategies of the earliest Oldowan hominins and Paranthropus. Science, 379(6632), 561-566.

      Villa, P., & Bartram, L. (1996). Flaked bone from a hyena den. Paléo, Revue d'Archéologie Préhistorique, 8(1), 143-159.

    1. Reviewer #1 (Public review):

      Summary:

      This paper focuses on understanding how covalent inhibitors of peroxisome proliferator-activated receptor-gamma (PPARg) show improved inverse agonist activities. This work is important because PPARg plays essential roles in metabolic regulation, insulin sensitization, and adipogenesis. Like other nuclear receptors, PPARg is a ligand-responsive transcriptional regulator. Its important role, coupled with its ligand-sensitive transcriptional activities, makes it an attractive therapeutic target for diabetes, inflammation, fibrosis, and cancer. Traditional non-covalent ligands like thiazolidinediones (TZDs) show clinical benefit in metabolic diseases, but their utility is limited by off-target effects and transient receptor engagement. In previous studies, the authors characterized and developed covalent PPARg inhibitors with improved inverse agonist activities. They also showed that these molecules engage unique PPARg ligand binding domain (LBD) conformations whereby the C-terminal helix 12 penetrates into the orthosteric binding pocket to stabilize a repressive state. In the nuclear receptor superclass of proteins, helix 12 is an allosteric switch that governs pharmacologic responses, and this new conformation was highly novel. In this study, the authors did a more thorough analysis of how two covalent inhibitors, SR33065 and SR36708, influence the structural dynamics of the PPARg LBD.

      Strengths:

      (1) The authors employed a compelling integrated biochemical and biophysical approach.

      (2) The cobinding studies are unique for the field of nuclear receptor structural biology, and I'm not aware of any similar structural mechanism described for this class of proteins.

      (3) Overall, the results support their conclusions.

      (4) The results open up exciting possibilities for the development of new ligands that exploit the potential bidirectional relationship between the covalent versus non-covalent ligands studied here.

      Weaknesses:

      All weaknesses were addressed by the Authors in revision.

    2. Reviewer #2 (Public review):

      Summary:

      The authors use ligands (inverse agonists, partial agonists) for PPAR, and coactivators and corepressors, to investigate how ligands and cofactors interact in a complex manner to achieve functional outcomes (repressive vs. activating).

      Strengths:

      The data (mostly biophysical data) are compelling from well-designed experiments. Figures are clearly illustrated. The conclusions are supported by these compelling data. These results contribute to our fundamental understanding of the complex ligand-cofactor-receptor interactions.

      Weaknesses:

      Breaking down a complex system into a simpler model system, when possible, provides a unique lens with which to probe systems with mechanistic insight. While it works well in this particular paper, in general, caution should be taken when using simplified models to study a complex system.

    1. Reviewer #1 (Public review):

      Summary:

      This study aims to explore how different forms of "fragile nucleosomes" facilitate RNA Polymerase II (Pol II) transcription along gene bodies in human cells. The authors propose that pan-acetylated, pan-phosphorylated, tailless, and combined acetylated/phosphorylated nucleosomes represent distinct fragile states that enable efficient transcription elongation. Using CUT&Tag-seq, RNA-seq, and DRB inhibition assays in HEK293T cells, they report a genome-wide correlation between histone pan-acetylation/phosphorylation and active Pol II occupancy, concluding that these modifications are essential for Pol II elongation.

      Strengths:

      (1) The manuscript tackles an important and long-standing question about how Pol II overcomes nucleosomal barriers during transcription.

      (2) The use of genome-wide CUT&Tag-seq for multiple histone marks (H3K9ac, H4K12ac, H3S10ph, H4S1ph) alongside active Pol II mapping provides a valuable dataset for the community.

      (3) The integration of inhibition (DRB) and recovery experiments offers insight into the coupling between Pol II activity and chromatin modifications.

      (4) The concept of "fragile nucleosomes" as a unifying framework is potentially appealing and could stimulate further mechanistic studies.

      Weaknesses:

      (1) Misrepresentation of prior literature

      The introduction incorrectly describes findings from Bintu et al., 2012. The cited work demonstrated that pan-acetylated or tailless nucleosomes reduce the nucleosomal barrier for Pol II passage, rather than showing no improvement. This misstatement undermines the rationale for the current study and should be corrected to accurately reflect prior evidence.

      (2) Incorrect statement regarding hexasome fragility

      The authors claim that hexasome nucleosomes "are not fragile," citing older in vitro work. However, recent studies clearly showed that hexasomes exist in cells (e.g., PMID 35597239) and that they markedly reduce the barrier to Pol II (e.g., PMID 40412388). These studies need to be acknowledged and discussed.

      (3) Inaccurate mechanistic interpretation of DRB

      The Results section states that DRB causes a "complete shutdown of transcription initiation (Ser5-CTD phosphorylation)." DRB is primarily a CDK9 inhibitor that blocks Pol II release from promoter-proximal pausing. While recent work (PMID: 40315851) suggests that CDK9 can contribute to CTD Ser5/Ser2 di-phosphorylation, the manuscript's claim of initiation shutdown by DRB should be revised to better align with the literature. The data in Figure 4A indicate that 1 µM DRB fully inhibits Pol II activity, yet much higher concentrations (10-100×) are needed to alter H3K9ac and H4K12ac levels. The authors should address this discrepancy by discussing the differential sensitivities of CTD phosphorylation versus histone modification turnover.

      (4) Insufficient resolution of genome-wide correlations

      Figure 1 presents only low-resolution maps, which are insufficient to determine whether pan-acetylation and pan-phosphorylation correlate with Pol II at promoters or gene bodies. The authors should provide normalized metagene plots (from TSS to TTS) across different subgroups to visualize modification patterns at higher resolution. In addition, the genome-wide distribution of another histone PTM with a different localization pattern should be included as a negative control.

      (5) Conceptual framing

      The manuscript frequently extrapolates correlative genome-wide data to mechanistic conclusions (e.g., that pan-acetylation/phosphorylation "generate" fragile nucleosomes). Without direct biochemical or structural evidence, such causality statements should be toned down.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors use various genomics approaches to examine nucleosome acetylation, phosphorylation, and PolII-CTD phosphorylation marks. The results are synthesized into a hypothesis that 'fragile' nucleosomes are associated with active regions of PolII transcription.

      Strengths:

      The manuscript contains a lot of genome-wide analyses of histone acetylation, histone phosphorylation, and PolII-CTD phosphorylation.

      Weaknesses:

      This reviewer's main research expertise is in the in vitro study of transcription and its regulation in purified, reconstituted systems. I am not an expert at the genomics approaches and their interpretation, and overall, I had a very hard time understanding and interpreting the data that are presented in this manuscript. I believe this is due to a problem with the manuscript, in that the presentation of the data is not explained in a way that's understandable and interpretable to a non-expert. For example:

      (1) Figure 1 shows genome-wide distributions of H3K9ac, H4K12ac, Ser2ph-PolII, mRNA, H3S10ph, and H4S1ph, but does not demonstrate correlations/coupling - it is not clear from these data that pan-acetylation and pan-phosphorylation are coupled with Pol II transcription.

      (2) Figure 2 - It's not clear to me what Figure 2 is supposed to be showing.

      (A) Needs better explanation - what is the meaning of the labels at the top of the gel lanes?

      (B) This reviewer is not familiar with this technique, its visualization, or its interpretation - more explanation is needed. What is the meaning of the quantitation graphs shown at the top? How were these calculated (what is on the y-axis)?

      (3) To my knowledge, the initial observation of DRB effects on RNA synthesis also concluded that DRB inhibited initiation of RNA chains (pmid:982026) - this needs to be acknowledged.

      (4) Again, Figures 4B, 4C, 5, and 6 are very difficult to understand - what is shown in these heat maps, and what is shown in the quantitation graphs on top?

    3. Reviewer #3 (Public review):

      Summary:

      Li et al. investigated the prevalence of acetylated and phosphorylated histones (using H3K9ac, H4K12ac, H3S10ph & H4S1ph as representative examples) across the gene body of human HEK293T cells, as well as mapping elongating Pol II and mRNA. They found that histone acetylation and phosphorylation were dominant in gene bodies of actively transcribing genes. Genes with acetylation/phosphorylation restricted to the promoter region were also observed. Furthermore, they investigated and reported a correlation between histone modifications and Pol II activity, finding that inhibition of Pol II activity reduced acetylation/phosphorylation levels, while resuming Pol II activity restored them. The authors then proposed a model in which pan-acetylation or pan-phosphorylation of histones generates fragile nucleosomes; the first round of transcription is accompanied by pan-acetylation, while subsequent rounds are accompanied by pan-phosphorylation.

      Strengths:

      This study addresses a highly significant problem in gene regulation. The author provided riveting evidence that certain histone acetylation and/or phosphorylation within the gene body is correlated with Pol II transcription. The author furthermore made a compelling case that such transcriptionally correlated histone modification is dynamic and can be regulated by Pol II activity. This work has provided a clearer view of the connection between epigenetics and Pol II transcription.

      Weaknesses:

      The title of the manuscript, "Fragile nucleosomes are essential for RNA Polymerase II to transcribe in eukaryotes", suggests that fragile nucleosomes lead to transcription. While this study shows a correlation between histone modifications in gene bodies and transcription elongation, a causal relationship between the two has not been demonstrated.

    1. Reviewer #1 (Public review):

      This study by Vitar et al. probes the molecular identity and functional specialization of pH-sensing channels in cerebrospinal fluid-contacting neurons (CSFcNs). Combining patch-clamp electrophysiology, laser-based local acidification, immunohistochemistry, and confocal imaging, the authors propose that PKD2L1 channels localized to the apical protrusion (ApPr) function as the predominant dual-mode pH sensor in these cells.

      The work establishes a compelling spatial-physiological link between channel localization and chemosensory behavior. The integration of optical and electrical approaches is technically strong, and the separation of phasic and sustained response modes offers a useful conceptual advance for understanding how CSF composition is monitored.

      Several aspects of data interpretation, however, require clarification or reanalysis, most notably the single-channel analyses (event counts, Po metrics, and mixed parameters), the statistical treatment, and the interpretation of purported "OFF currents." Additional issues include PKD2L1-TRPP3 nomenclature consistency, kinetic comparison with ASICs, and the physiological relevance of the extreme acidification paradigm. Addressing these points will substantially improve reproducibility and mechanistic depth.

      Overall, this is a scientifically important and technically sophisticated study that advances our understanding of CSF sensing, provided that the analytical and interpretative weaknesses are satisfactorily corrected.

      (1) The authors should re-analyze electrophysiological data, focusing on macroscopic currents rather than statistically unreliable Po calculations. Remove or revise the Po analysis, which currently conflates current amplitude and open probability.

      (2) PKD2L1-TRPP3 nomenclature should be clarified and all figure labels, legends, and text should use consistent terminology throughout.

      (3) The authors should reinterpret the so-called OFF currents as pH-dependent recovery or relaxation phenomena, not as distinct current species. Remove the term "OFF response" from the manuscript.

      (4) Evidence for physiological relevance should be provided, including data from milder acidification (pH 6.5-6.8) and, where appropriate, comparisons with ASIC-mediated currents to place PKD2L1 activity in context.

      (5) Terminology and data presentation should be unified, adopting consistent use of "predominant" (instead of "exclusive") and "sustained" (instead of "tonic"), and all statistical formats and units should be standardized.

      (6) The Discussion should be expanded to address potential Ca²⁺-dependent signaling mechanisms downstream of PKD2L1 activation and their possible roles in CSF flow regulation and central chemoreception.

    2. Reviewer #2 (Public review):

      Summary:

      Cerebrospinal fluid contacting neurons (CSF-cNs) are GABAergic cells surrounding the spinal cord central canal (CC). In mammals, their soma lies sub-ependymally, with a dendritic-like apical extension (AP) terminating as a bulb inside the CC.

      How this anatomy (soma and AP in distinct extracellular environments) relates to their multimodal CSF-sensing function remains unclear.

      The authors confirm that in GATA3:GFP mice, where these cells are labeled, CSFcNs exhibit prominent spontaneous electrical activity mediated by PKD2L1 (TRPP2) channels, non-selective cation channels with ~200 pS conductance modulated by protons and mechanical forces.

      They investigated PKD2L1 pH sensitivity and its effects on CSFcN excitability. They uncovered that PKD2L1 generates both phasic and tonic currents, bidirectionally modulated by pH with high sensitivity near physiological values.

      Combining electrophysiology (intact and isolated AP recordings) with elegant laser-photolysis, they show that functional PKD2L1 channels localize specifically to the apical extension (AP).

      This spatial segregation, coupled with PKD2L1's biophysical properties (high conductance, pH sensitivity) and the AP's unique features (very high input resistance), renders CSFcN excitability highly sensitive to PKD2L1 modulation. Their findings reveal how the AP's properties are optimised for its sensory role.

      Strengths:

      This is a very convincing demonstration that combines elegant and challenging approaches (uncaging, outside-out patch of the AP) to form a complete understanding of how these sensory cells can detect changes of pH in the CSF so finely.

      Weaknesses:

      The following do not constitute weaknesses; rather, they are minor requests that this reviewer considers would complete this beautiful study.

      (1) It would be nice to quantify further the relation in spontaneous as well as in acidic or basic pH between the effects observed on channel opening and holding current: do they always vary together and in a linear way?

      (2) Since CSF-cNs also respond to changes in osmolarity (Orts Dell Immagine 2013) & mechanosensory stimulations in a PKD2L1 dependent manner (Sternberg NC 2018), it would be nice to test whether the same results hold true for the role of PKD2L1 in the AP when changes in osmolarity are delivered by pressure application.

      In mice, like in fish (Sternberg et al, NC 2018), we can observe throughout the figures that a large fraction of the channel activity occurs with partial and very fast openings of the PKD2L1 channel. I recommend the authors analyse the points below:

      a) To what extent do these partial openings of the channel contribute to the changes in holding current and resting potential?

      b) In the trace from the outside-out AP patch, it looks like the partial transient openings are gone. Can the authors verify whether these partial openings are only present in somatic recordings?

      (3) Previous studies have observed expression of metabotropic Glutamate receptors in CSF-cNs (transcriptome from Prendergast et al CB 2023). The authors only used blockers for ionotropic glutamate receptors in their recordings: could it be that these metabotropic receptors influence the response to uncaging of MNI-Glu when glutamate is co-released with a proton?

      (4) In the outside-out patch of the AP, PKD2L1 unitary currents appear rare. Could it be that the disruption of the cilium or underlying actin/myosin cytoskeleton drastically alters the open probability of the channel?

      (5) Could the authors use drugs against ASIC to specify which ASIC channels contribute to the pH response in the soma?

      (6) This is out of the scope of this study, but we did observe in fish a very rarely-opening channel in the PKD2L1KO mutant. I wonder if the authors have similar observations in the conditions where PKD2L1 is mainly in the closed state.

    1. Reviewer #1 (Public review):

      Summary:

      How the regenerative capacity of the heart varies among different species has been a long-standing question. Within teleosts, zebrafish can regenerate their hearts, while medaka and cavefish cannot. The authors examined heart regeneration in two livebearers, platyfish and swordtails. Interestingly, they found that these two fish species lack the compact myocardium layer that contains coronary vessels. Furthermore, these fish form a "pseudoaneurysm" after cryoinjury without initial deposition of fibrotic tissues. However, delayed leukocyte infiltration and prolonged inflammation lead to permanent scar tissue in the injured heart. Although their cardiomyocytes can also proliferate, platyfish and swordtails can only regenerate partially. The authors argue that the restorative mechanism of platyfish and swordtails likely reflects "evolutionary innovations in the ventricle type and the immune system".

      Strengths:

      The authors took advantage of the annotated genome of platyfish to perform transcriptomic analyses. The histological analyses and immunostaining are beautifully done.

      Minor Weaknesses:

      Transcriptomic analysis was only done for one time point. Additional time points could be included to check whether some processes occur at other stages. But this can be done in the future for more detailed studies.

    2. Reviewer #2 (Public review):

      This manuscript by Hisler, Rees, and colleagues examines the cardiac regenerative ability of two livebearer species, the platyfish and swordtail. Unlike zebrafish, these species lack cortical myocardium and coronary vasculature. Cryoinjury to their hearts caused persistent scarring at 60 and 90 days post-injury and prevented most of the myocardium from regenerating. Although the wound size progressively shrinks and fibronectin content decreases, the myocardial wall does not recover. Transcriptomic profiling at 7 dpi revealed significant differences between zebrafish and platyfish, including alterations in ECM deposition, immune regulation, and signaling pathways involved in regeneration, such as TGFβ, mTOR, and Erbb2. Platyfish exhibit a delayed but chronic immune response, and although some cardiomyocyte proliferation is observed, it does not appear to contribute to myocardial recovery significantly.

      Overall, this is an excellent manuscript that tackles a crucial question: do different fish lineages have the ability to regenerate hearts, or is this capability limited to a few groups? Therefore, this work is relevant to the fields of cardiac regeneration and comparative regenerative biology for a broad audience. I am very enthusiastic about expanding the list of species tested for their heart regeneration abilities, and this study is detailed and rigorous, providing a solid foundation for future comparative research. However, there are several aspects where additional work could significantly strengthen the manuscript.

      Major comments

      (1) Title selection

      The title the authors chose suggests that platyfish and swordtails "partially regenerate," but I do wonder how much these animals truly regenerate. This may be a semantic discussion and a matter of personal preference. Still, based on other significant work on regenerative capacity (see, for example, the landmark cavefish regeneration paper PMID: 30462998 or work on medaka PMID: 24947076), the persistence of such a prominent fibrotic scar would be considered a minimal regenerative capacity. Measuring this "partial regeneration" more precisely by comparing zebrafish with platyfish and swordtails would also greatly strengthen the comparisons made here - see below.

      The same can be said about lines 152-153 - do these hearts "regenerate" with deformation and partial scarring, or would it be fairer to say that they are "healed" or "repaired" with a process that involves fibrosis?

      (2) Cross-species comparisons

      Having two species of livebearers strengthens the findings of this paper, but the presentation of results from both species is inconsistent. For example, the reader should not be asked to assume that the architecture of the swordtail ventricle is similar to that of the platyfish (line 125). The same applies to the presence or absence of coronary vessels (Figure 1), the reduction in wound area over time (Figure 3), and the immune system's response (Figure 5). Most importantly, the authors miss an opportunity to move from qualitative observations to quantifying the "partial regeneration" phenotype they observe. Specifically, providing a side-by-side comparison between these new species and zebrafish would help define the extent of differences in regeneration potential. For instance, in Figure 6, while the authors provide excellent quantification of PCNA staining in platyfish, these data are less meaningful without a direct comparison with zebrafish results. The same applies to Figures 6E and 6F - although differences are noted, quantifying these results would enable a more rigorous assessment of the process.

      (3) Lack of coronary vasculature

      There is a growing body of evidence highlighting the importance of the coronary vessels during zebrafish heart regeneration (PMIDs: 27647901, 31743664). Surprisingly, this finding has not been integrated or discussed in the context of this literature.

      The results of the alkaline phosphatase assay and anti-podocalyxin-2 staining appear inconsistent. Specifically, in Supplementary Figure 1L-M, we can see some vessels covering the bulbus arteriosus and also what appears to be a signal in the ventricle. However, in Figures 1K and 1L, we cannot see any vessels, even in the bulbus. The authors should also be more rigorous and add a description of how many animals were analyzed, their ages, and sizes. In zebrafish, the formation of the coronary arteries appears to depend on animal size and age. With the data provided, we cannot say whether this is a one-time observation or a consistent finding across many animals at different ages and across both species.

      The link between livebearers' responses and pseudoaneurysms is overstated. This work is already extremely relevant without trying to make it medically oriented.

    1. Reviewer #1 (Public review):

      Mitochondrial staining difference is convincing, but the status of the mitos, fused vs fragmented, elongated vs spherical, does not seem convincing. Given the density of mito staining in CySC, it is difficult to tell what is an elongated or fused mito vs the overlap of several smaller mitos.

      I'm afraid the quantification and conclusions about the gstD1 staining in CySCs vs. GSCs are just not convincing: I cannot see how they were able to distinguish the relevant signals to quantify one cell type vs the other.

      The overall increase in gstD1 staining with the CySC SOD KD looks nice, but again I can't distinguish different cell types. This experiment would have been more convincing if the SOD KD was mosaic, so that individual samples would show changes in only some of the cells. Still, it seems that KD of SOD in the CySC does have an effect on the germline, which is interesting.

      The effect of SOD KD on the number of less differentiated somatic cells seems clear. However, the effect on the germline is less clear and is somewhat confusing. Normally, a tumor of CySC or less differentiated Cyst cells, such as with activated JAK/STAT, also leads to a large increase in undifferentiated germ cells, not a decrease in germline as they conclude they observe here. The images do not appear to show a reduced number of GSCs. If they counted GSCs at the niche, then that is the correct way to do it, but it's odd that they chose images that do not show the phenotype. In addition, a lower number of GSCs could also be caused by "too many CySCs" which can kick out GSCs from the niche, rather than any effect on GSC redox state. Further, their conclusion of reduced germline overall, e.g. by vasa staining, does not appear to be true in the images they present, and their indication that lower vasa equals fewer GSCs is invalid since all the early germline expresses Vasa.

      The effect of somatic SOD KD is perhaps most striking in the observation of Eya+ cyst cells closer to the niche. The combination of increased Zfh1+ cells with many also being Eya+ demonstrates a strong effect on cyst cell differentiation, but one that is also confusing because they observe increases in both early cyst cells (Zfh1+) as well as late cyst cells (Eya+) or perhaps just an increase in the Zfh1/Eya double-positive state that is not normally common. The effects on the RTK and Hh pathways may also reflect this disturbed state of the Cyst cells.

      However, the effect on germline differentiation is less clear: the images shown do not really demonstrate any change in BAM expression that I can tell, which is even more confusing given the clear effect on cyst cell differentiation.

      For the last figure, any effect of SOD OE in the germline on the germline itself is apparently very subtle and is within the range observed between different "wt" genetic backgrounds.

      Comments on revisions:

      Upon re-re-review, the manuscript is improved but retains many of the flaws outlined in the first reviews.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents a comprehensive single-cell atlas of mouse anterior segment development, focusing on the trabecular meshwork and Schlemm's canal. The authors profiled ~130,000 cells across seven postnatal stages, providing detailed and solid characterization of cell types, developmental trajectories, and molecular programs.

      Strengths:

      The manuscript is well-written, with a clear structure and thorough introduction of previous literature, providing a strong context for the study. The characterization of cell types is detailed and robust, supported by both established and novel marker genes as well as experimental validation. The developmental model proposed is intriguing and well supported by the evidence. The study will serve as a valuable reference for researchers investigating anterior segment developmental mechanisms. Additionally, the discussion effectively situates the findings within the broader field, emphasizing their significance and potential impact for developmental biologists studying the visual system.

      Weaknesses:

      The weaknesses of the study are minor and addressable. As the study focuses on the mouse anterior segment, a brief discussion of potential human relevance would strengthen the work by relating the findings to human anterior segment cell types, developmental mechanisms, and possible implications for human eye disease. Data availability is currently limited, which restricts immediate use by the community. Similarly, the analysis code is not yet accessible, limiting the ability to reproduce and validate the computational analyses presented in the study.

    2. Reviewer #2 (Public review):

      Summary:

      This study presents a detailed single-cell transcriptomic analysis of the postnatal development of mouse anterior chamber tissues. Analysis focused on the development of cells that comprise Schlemm's Canal (SC) and trabecular meshwork (TM).

      Strengths:

      This developmental atlas represents a valuable resource for the research community. The dataset is robust, consisting of ~130,000 cells collected across seven time points from early post-natal development to adulthood. Analyses reveal developmental dynamics of SC and TM populations and describe the developmental expression patterns of genes associated with glaucoma.

      Weaknesses:

      (1) Throughout the paper, the authors place significant weight on the spatial relationships of UMAP clusters, which can be misleading (see Chari and Pachter, PLoS Comput Biol 2023). This is perhaps most evident in the assessment of vascular progenitors (VP) into BEC and SEC types (Figures 4 and 5). In the text, VPs are described as a common progenitor for these types; however, the trajectory analysis in Figure 5 denotes a path of PEC -> BEC -> VP -> SEC. These two findings are incongruous and should be reconciled. The limitations of inferring relationships based on UMAP spatial positions should be noted.

      (2) Figure 2d does not include P60. It is also noted that technical variation resulted in fewer TM3 cells at P21; was this due to challenges in isolation? What is the expected proportion of TM3 cells at this stage?

      (3) In Figures 3a and b it is difficult to discern the morphological changes described in the text. Could features of the image be quantified or annotated to highlight morphological features?

      (4) Given the limited number of markers available to identify SC and TM populations during development, it would be useful to provide a table describing potential new markers identified in this study.

      (5) The paper introduces developmental glaucoma (DG), namely Axenfeld-Rieger syndrome and Peters Anomaly, but the expression analysis (Figure S20) does not annotate which genes are associated with DG.

    1. Reviewer #1 (Public review):

      Summary:

      Matsen et al. describe an approach for training an antibody language model that explicitly tries to remove effects of "neutral mutation" from the language model training task, e.g. learning the codon table, which they claim results in biased functional predictions. They do so by modeling empirical sequence-derived likelihoods through a combination of a "mutation" model and a "selection" model; the mutation model is a non-neural Thrifty model previously developed by the authors, and the selection model is a small Transformer that is trained via gradient descent. The sequence likelihoods themselves are obtained from analyzing parent-child relationships in natural SHM datasets. The authors validate their method on several standard benchmark datasets and demonstrate its favorable computational cost. They discuss how deep learning models explicitly designed to capture selection and not mutation, trained on parent-child pairs, could potentially apply to other domains such as viral evolution or protein evolution at large.

      Strengths:

      Overall, we think the idea behind this manuscript is really clever and shows promising empirical results. Two aspects of the study are conceptually interesting: the first is factorizing the training likelihood objective to learn properties that are not explained by simple neutral mutation rules, and the second is training not on self-supervised sequence statistics but on the differences between sequences along an antibody evolutionary trajectory. If this approach generalizes to other domains of life, it could offer a new paradigm for training sequence-to-fitness models that is less biased by phylogeny or other aspects of the underlying mutation process.

      Weaknesses:

      Some claims made in the paper are weakly or indirectly supported by the data. In particular, the claim that learning the codon table contributes to biased functional effect predictions may be true, but requires more justification. Additionally, the paper could benefit from additional benchmarking and comparison to enhanced versions of existing methods, such as AbLang plus a multi-hit correction. Further descriptions of model components and validation metrics could help make the manuscript more readable.

    2. Reviewer #2 (Public review):

      Summary:

      Endowing protein language models with the ability to predict the function of antibodies would open a world of translational possibilities. However, antibody language models have yet to achieve the breakthrough success that large language models have achieved for the understanding and generation of natural language. This paper elegantly demonstrates how training objectives imported from natural language applications lead antibody language models astray on function prediction tasks. Training models to predict masked amino acids teaches models to exploit biases of nucleotide-level mutational processes, rather than protein biophysics. Taking the underlying biology of antibody diversification and selection seriously allows for disentangling these processes through what the authors call deep amino acid selection models. These models extend previous work by the authors (Matsen MBE 2025) by providing predictions not only for the selection strength at individual sites, but also for individual amino acid substitutions. This represents a practically important advance.

      Strengths:

      The paper is based on a deep conceptual insight, the existence of a multitude of biological processes that affect antibody maturation trajectories. The figures and writing are very clear, which should help make the broader field aware of this important but sometimes overlooked insight. The paper adds to a growing literature proposing biology-informed tweaks for training protein language models, and should thus be of interest to a wide readership interested in the application of machine learning to protein sequence understanding and design.

      Weaknesses:

      Proponents of the state-of-the-art protein language models might counter the claims of the paper by appealing to the ability of fine-tuning to deconvolve selection and mutation-related signatures in their high-dimensional representation spaces. Leaving the exercise of assessing this claim entirely to future work somewhat diminishes the heft of the (otherwise good!) argument. In the context of predicting antibody binding affinity, the modeling strategy only allows prediction of mutations that improve affinity on average, but not those which improve binding to specific epitopes.

    3. Reviewer #3 (Public review):

      Summary:

      This work proposes DASM, a new transformer-based approach to learning the distribution of antibody sequences which outperforms current foundational models at the task of predicting mutation propensities under selected phenotypes, such as protein expression levels and target binding affinity. The key ingredient is the disentanglement, by construction, of selection-induced mutational effects and biases intrinsic to the somatic hypermutation process (which are embedded in a pre-trained model).

      Strengths:

      The approach is benchmarked on a variety of available datasets and for two different phenotypes (expression and binding affinity). The biologically informed logic for model construction implemented is compelling, and the advantage, in terms of mutational effects prediction, is clearly demonstrated via comparisons to state-of-the-art models.

      Weaknesses:

      The gain in interpretability is only mentioned but not really elaborated upon or leveraged for gaining insight. The following aspects could have been better documented: the hyperparametric search to establish the optimal model; the predictive performance of baseline approaches, to fully showcase the gain yielded by DASM.

    1. Reviewer #1 (Public review):

      This manuscript proposes that phosphorylation of a conserved Hsp70 residue (human T495 / yeast Ssa1 T492) is a BER-triggered, DDR-dependent phospho-switch that acts as a conserved brake on G1/S cell-cycle progression in response to DNA damage.

      Although the topic is interesting and potentially useful, the strength of evidence for the mechanistic and "conserved checkpoint" claims is inadequate, and the premise that this site is directly activated by DNA damage is fundamentally incorrect. The work requires extensive additional experimentation and substantial tempering of conclusions.

      Specific comments:

      (1) Activation of T495:

      (a) The authors' premise for the site being activated by DNA damage is Albuquerque et al, where PTMs on MMS-treated yeast are analyzed. T492 (the yeast equivalent of human T495) is observed as phosphorylated. However, the authors fail to note that there is no untreated sample analysis in this study, and it is likely that T492 phosphorylation is also present in untreated cells. This is also backed up by later evidence from the same lab (Smolka et al), where they do not identify T492 as being dependent on Mec1/Tel1/Rad53 kinases.

      (b) The kinase(s) directly responsible for T495 phosphorylation are not identified. Instead, the authors show that knockdown or pharmacological inhibition of DNA-PKcs, ATM, Chk2, and CK1 attenuate pHsp70.

      (c) ATM siRNA knockdown has no effect, while ATM inhibitors do, which the authors acknowledge but do not resolve. This discrepancy raises concerns about off-target drug effects.

      (d) No in vitro kinase assays, motif analysis, or phosphosite mapping confirming these kinases as direct T495 kinases are presented. Thus, the proposed signaling cascade remains speculative.

      (e) Smolka and many other labs characterized DDR sites as SQ/TQ motifs, and T492 doesn't fit that motif.

      (f) No genetic tests in yeast (e.g., BER mutants) are used to connect Ssa1 T492 phosphorylation to BER in that system, despite the strong BER-centric model.

      (g) Overexpression of MPG gives only a modest increase in pHsp70, while APE1 overexpression has no effect, and Polβ overexpression does not decrease pHsp70. These mixed results weaken the central claim that Hsp70 phosphorylation is a tuned sensor of BER burden.

      (h) A major concern is that pHsp70 is only convincingly detected after very high, prolonged MMS (10 mM, 5 h) or 0.5 mM arsenite treatments. Other DNA-damaging agents (bleomycin, camptothecin, hydroxyurea) that robustly activate DDR kinases do not induce pHsp70. This suggests to me that the authors are observing a side effect of proteotoxic stress. This is likely (see Paull et al, PMID: 34116476).

      (i) A recent study in Nature Communications (Omkar et al., 2025) demonstrates rapid phosphorylation of yeast T492 in a pkc1-dependent manner, diminishing the impact of these findings.

      (2) Downstream Effects of T492/T495:

      (a) The manuscript's central conceptual advance is that pHsp70 is a cell-cycle-regulated brake on G1/S. Yet in mammalian cells, the authors show only that pHsp70 appears late, after cells have traversed mitosis, and that blocking CDK1 (G2/M) prevents its accumulation.

      (b) There is no functional test in human cells: no knockdown/rescue experiments with T495A or T495E, no cell-cycle profiling upon altering Hsp70 phosphorylation state, and no demonstration that pHsp70 actually causes any delay in S-phase entry, rather than simply correlating with late damage responses. The strong conclusion that pT495 "stalls cell cycle progression" (e.g., Figure 6 model) is therefore not supported in the human system.

      (c) All functional conclusions rely on T492A/E point mutants at the endogenous SSA1 locus, usually in an ssa2Δ background, in a family of highly redundant Hsp70s. Without showing that this site is actually modified during their MMS treatments, the assignment of phenotypes to loss of a physiological phospho-switch is premature. The authors need to repeat their studies in an Ssa1-4 background, as in https://pubmed.ncbi.nlm.nih.gov/32205407/.

      (d) The authors infer that T495E "locks" Hsc70 in a pseudo-open state based on reduced J-protein-stimulated ATPase activity, unchanged ATP binding, altered trypsin sensitivity, and retained tau binding. However, there is no direct comparison of phosphorylated vs T495E protein (e.g., via in vitro phosphorylation with LegK4 followed by side-by-side biochemical assays, or structural analysis). Thus, it remains unclear to what extent the glutamate substitution mimics a phosphate at this position.

      (e) No client release kinetics, co-chaperone binding assays, or in vivo chaperone function tests are provided, yet the discussion builds a detailed model of a "pseudo-open" state that simultaneously resembles ATP-bound conformation and allows persistent substrate engagement.

    2. Reviewer #2 (Public review):

      Summary:

      This paper follows a clue provided by an earlier paper from the same lab, that the pathogen Legionella pneumophila translocates into its host cell a kinase LegK4 that phosphorylates the cytosolic Hsp70 on threonine 495. The consequences of modification of this conserved Hsp70 residue, whether by LegK4-phosphorylation in the cytosol (of infected cells) or by FICD-mediated AMPylation in the ER (under conditions of low ER stress) are to lock the chaperone in a JDP-refractory state, thus functionally inactivating it.

      Here, the claim is to have discovered an endogenous phosphorylation event targeting the same residue in cells in which DNA damage base-excision repair is overburdened.

      Strengths:

      The suggestion of physiological modulation of chaperone activity by covalent modification is an interesting area of cell physiology. Specifically, the claim for discovery of a discrete phosphorylation event of an Hsp70 chaperone, one with a well-defined biochemical consequence, is this paper's strength.

      Weaknesses:

      The kinase(s) responsible for the phosphorylation have not been identified (and hence remain inaccessible to experimental i.e., genetic or pharmacological manipulation). The mechanistic links to DNA damage repair and the fitness benefits of this proposed adaptation remain obscure. Of greater concern, the data provided in the paper fail to exclude the trivial possibility that the phosphorylation event described (and characterised through biochemical proxies) is biologically neutral, reflecting nothing more than a bystander event in which kinase(s) activated by application of high concentrations of a powerful alkylating agent (MMS) phosphorylate, at meaninglessly low stoichiometry, an abundant protein (Hsp70) on a surface exposed residue. Failure to exclude this (plausible) scenario is this paper's weakness.

    3. Reviewer #3 (Public review):

      In this manuscript, Moss et al. demonstrate that Hsp70 phosphorylation at a conserved threonine residue integrates DNA damage responses with cell-cycle control. The authors present unbiased biochemical, cell-based, and yeast genetic analyses showing that phosphorylation of human Hsp70 at T495 (and the analogous Ssa1 T492 in yeast) is triggered by base-excision-repair intermediates and downstream DDR kinase activity, leading to delayed G1/S progression after DNA damage. They used orthogonal approaches such as ATPase assays, phospho-specific detection, kinase-inhibition studies, synchronization experiments, and phenotypic analyses of phosphomutants. They presented robust data that collectively supported the conclusion that dynamic Hsp70 phosphorylation functions as a conserved "molecular brake" to prevent inappropriate S-phase entry under genotoxic stress. However, there are a few minor questions and clarifications that the authors are well-positioned to address.

    1. Reviewer #1 (Public review):

      Summary:

      The authors aim to demonstrate that PGLYRP1 plays a dual role in host responses to B. pertussis infection. PGLYRP1 signaling is known to activate bactericidal responses due to recognition of peptidoglycan. Through NOD1 activation and TREM-1 engagement, it appears PGLYRP1 also has immunomodulatory activities. The authors present mouse knockout studies and gene expression data to illustrate the role of PGLYRP1 in relation to B. pertussis peptidoglycan. Mice lacking PGLYRP1 had slightly lower pathology scores. When TCT peptidoglycan was removed from the bacteria, surprisingly IL23A, IL6, IL1B, and other pro-inflammatory genes encoding cytokines increased. The relationship to TCT and PGLYRP1 suggests the pathogen uses this strategy to decrease immune activation. The authors went on to show the relationship between PGLYRP1 and TREM-1 as mediated by PGN using various versions of peptidoglycan. The study presents multiple angles of data to back up its findings and demonstrates an interesting strategy used by B. pertussis to downregulate innate responses to its presence during infection.

      Strengths:

      The use of knockout mice lacking the key factor under consideration, paired with isogenic B. pertussis strains, reveals the mechanism of immune modulation that benefits the bacteria. The authors used in vivo gene expression paired with in vivo assays to establish each aspect of the mechanism.

      Weaknesses:

      The main focus was on innate responses, and some analysis of antigen-specific antibody responses could improve the impact of the findings.

    2. Reviewer #2 (Public review):

      Since its original discovery, the mechanistic basis for TCT-mediated pathogenesis of Bordetella pertussis has been a moving target and difficult to uncouple from confounding variables. The current study provides some exciting data that suggest PGLYRP-1 modulates host responses upon 'activation' by TCT. While there are some strengths associated with the unbiased approaches and collective data to support the claims associated with TCT and PGLYRP-1's function in this system, caution should be used when interpreting and extrapolating some of the information provided. For instance, the amount and purity of TCT used in the studies are unclear, and the in vitro activity of PGLYRP1 on B. pertussis is questionable. Different mouse backgrounds are used for various assays throughout, and it is known that the PRRs vary in these systems, so the confounding variables are difficult to uncouple. Additional concerns include the types of statistical tests being performed to support some of the claims and the relevance of using whole, intact PG sacculi from other species for comparative studies with a fragment of released PG (i.e., TCT).

    3. Reviewer #3 (Public review):

      Summary:

      This study evaluates the contributions of the mammalian PG-binding protein PGLYRP1 to Bordetella infection. The authors find potential roles for PGLYRP1 in both bacterial killing (canonical) and regulation of inflammation (non-canonical). While these findings are interesting, as is the idea that PG fragment release has differential impacts on infection depending on fragment structure, the study is limited by the lack of connection between the in vivo and in vitro experiments, and determining the precise mechanism of how PGLYRP1 regulates host responses and bacterial fitness during infection requires further study.

      Strengths:

      (1) The combination of scRNAseq with in vitro and in vivo assays provides complementary views of PGLYRP1 function during infection.

      (2) The use of TCT-deficient B. pertussis provides a useful control and perturbation in the in vitro assays.

      Weaknesses:

      (1) The study does not ultimately resolve the initial early versus late phenotype divergence. While the in vitro assays suggest explanations for their in vivo observations, further mechanistic links are lacking and necessary for the authors' conclusions throughout. To state one example, what is the early and late infection phenotype of TCT- Bp in mice lacking PGLYRP1? RNAseq data are reported from these mice, but there are no burden or pathology studies. Furthermore, what are the neutrophil phenotypes (NOD-1/TREM-1 activation) in vivo? And are they dependent on PGLYRP1 and/or TCT?

      (2) It is unclear whether or how the NOD1 and TREM-1 pathways interact.

      (3) Many of the study's conclusions rely on the use of HEK293 reporter lines in the absence of bacterial infection, which may not be physiologically representative.

      (4) The methods lack detail overall, and the experimental procedures should be described more concretely, especially for the scRNAseq datasets.

    1. Reviewer #1 (Public review):

      Summary:

      The aim of this work is to directly image collagen in tissue using a new MRI method with positive contrast. The work presents a new MRI method that allows very short, powerful radio frequency (RF) pulses and very short switching times between transmission and reception of radio frequency signals.

      Strengths:

      The experiments with and without the removal of 1H hydrogen, which is not firmly bound to collagen, on tissue samples from tendons and bones, are very well suited to prove the detection of direct hydrogen signals from collagen. The new method has great potential value in medicine, as it allows for better investigation of ageing processes and many degenerative diseases in which functional tissue is replaced by connective tissue (collagen).

      Weaknesses:

      It is clear that, due to the relatively long time intervals between RF excitation and signal readout, standard hardware in whole-body MRI systems can only be used to examine surrounding water and not hydrogen bound to collagen molecules.

    2. Reviewer #2 (Public review):

      Summary:

      This work presents direct magnetic resonance imaging (MRI) of collagen, which is not possible with conventional MRI or other tomographic imaging modalities.

      Strengths:

      The experimental work is impressive, and the presentation of results is clear and convincing. Through a series of thoughtfully prepared experiments, I found the evidence that the images reflect direct measurements of collagen to be highly compelling.

      Due to the technical demands, direct collagen imaging is unlikely to become widespread for routine clinical work, at least not anytime soon. That said, this work is nonetheless transformative and will likely be highly significant for research and perhaps clinical trials.

    3. Reviewer #3 (Public review):

      The paper is well written and well presented. The topic is important, and its significance is explained succinctly and accurately. I am only capable of reviewing the clinical aspects of this work, which is very largely technical in nature. Several clinical points are worth considering:

      (1) Tendons typically display large magic angle effects as a result of their highly ordered collagen structure (cortical bone much less so), and so it would have been of interest to know what orientation the tendons had to B0 (in vitro and in vivo). This could affect the signal level at the longer echo time and thus the signal on the subtracted images.

      (2) The in vivo transverse image looks about mid-forearm, where tendons are not prominent. A transverse image of the lower forearm, where there is an abundance of tendons, might have been preferable.

      (3) The in vivo images show the interosseous membrane as a high signal on both the shorter and longer TE images. The structure contains ordered collagen with fibres at different oblique angles to the radius and ulna, and thus potentially to B0. Collagen fibres may have been at an orientation towards the magic angle, and this may account for the high signal on the longer TE image and the low signal on the subtracted image.

      (4) Some of the signals attributed to the muscle may be from an attachment of the muscle to the aponeurosis.

      (5) There is significant collagen in subcutaneous tissues, so the designation "skin" may more correctly be "skin and subcutaneous tissue".

      (6) Cortical bone is very heterogeneous, with boundaries between hard bone and soft tissue with significant susceptibility differences between the two across a small distance. This might be another mechanism for ultrashort T2* tissue values in addition to the presence of collagen. The two effects might be distinguished by also including a longer TE spin echo acquisition.

      Solid cortical bone may also have an ultrashort T2* in its own right.

      (7) It may be worth noting that in disease T2* may be increased. As a result, the subtraction image may make abnormal tissue less obvious than normal tissue. Magic angle effects may also produce this appearance.

      (8) It may be worth distinguishing fibrous connective tissue (loose or dense), which may be normal or abnormal, from fibrosis, which is an abnormal accumulation of fibrous connective tissue in damaged tissue. Fibrosis typically has a longer T2 initially and decreases its T2* over time. In places, the context suggests that fibrous connective tissue may be more appropriate than fibrosis.
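      Points (1) and (3) rest on the orientation dependence of the magic angle effect. As a minimal sketch, assuming the standard dipolar orientation factor (3cos²θ − 1) for a fibre at angle θ to B0, the factor vanishes near 54.7°, which is where T2 lengthening would be expected. The function and constant names are illustrative and not taken from the manuscript.

```python
import math


def dipolar_factor(theta_deg: float) -> float:
    """Orientation factor (3*cos^2(theta) - 1) for a fibre at angle theta to B0."""
    c = math.cos(math.radians(theta_deg))
    return 3.0 * c * c - 1.0


# Magic angle: the angle at which the factor vanishes, arccos(1/sqrt(3)).
MAGIC_ANGLE_DEG = math.degrees(math.acos(1.0 / math.sqrt(3.0)))

if __name__ == "__main__":
    # Compare a fibre parallel to B0, oblique, at the magic angle, and perpendicular.
    for theta in (0.0, 30.0, MAGIC_ANGLE_DEG, 90.0):
        print(f"theta = {theta:6.2f} deg  factor = {dipolar_factor(theta):+.3f}")
```

      Reporting tendon and membrane orientation relative to B0 would let readers judge how close each structure was to this angle.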

      Overall, the paper appears very well constructed and describes thoughtful and important work.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript analyzes a large dataset of [NiFe]-CODHs with a focus on genomic context and operon organization. Beyond earlier phylogenetic and biochemical studies, it addresses CODH-HCP co-occurrence, clade-specific gene neighborhoods, and operon-level variation, offering new perspectives on functional diversification and adaptation.

      Strengths:

      The study has a valuable approach.

      Weaknesses:

      Several points should be addressed.

      (1) The rationale for excluding clades G and H should be clarified. Inoue et al. (Extremophiles 26:9, 2022) defined [NiFe]-CODH phylogenetic clades A-H. In the present manuscript, clades A-H are depicted, yet the analyses and discussion focus only on clades A-F. If clades G and H were deliberately excluded (e.g., due to limited sequence data or lack of biochemical evidence), the rationale should be clearly stated. Providing even a brief explanation of their status or the reason for omission would help readers understand the scope and limitations of the study. In addition, although Figure 1 shows clades A-H and cites Inoue et al. (2022), the manuscript does not explicitly state how these clades are defined. An explicit acknowledgement of the clade framework would improve clarity and ensure that readers fully understand the basis for subsequent analyses.

      (2) The co-occurrence data would benefit from clearer presentation in the supplementary material. At present, the supplementary data largely consist of raw values, making interpretation difficult. For example, in Figure 3b, the co-occurrence frequencies are hard to reconcile with the text: clade A shows no co-occurrence with clade B and even lower tendencies than clades E or F, while clade E appears relatively high. Similarly, the claim that clades C and D "more often co-occur, especially with A, E, and F" does not align with the numerical trends, where D and E show stronger co-occurrence but C does not. A concise, well-organized summary table would greatly improve clarity and prevent such misunderstandings.

      (3) The rationale for analyzing gene neighborhoods at the single-operon level needs clarification. Many microorganisms encode more than one CODH operon, yet the analysis was carried out at the level of individual operons. The authors should clarify the biological rationale for this choice and discuss how focusing on single operons rather than considering the full complement per organism might affect the interpretation of genomic context.

    2. Reviewer #2 (Public review):

      The authors present a comparative genomic and phylogenetic analysis aimed at elucidating the functions of nickel-dependent carbon monoxide dehydrogenases (Ni-CODHs) and hybrid-cluster proteins (HCPs). By examining gene neighborhoods, phylogenetic relationships, and co-occurrence patterns, they propose functional hypotheses for different CODH clades and highlight those with the greatest potential for biotechnological applications.

      A major strength of this work lies in its systematic and conceptually clear approach, which provides a rapid and low-cost framework for predicting the functional potential of newly identified CODHs based on sequence data and genomic context. The analysis is careful in minimizing false positives and offers valuable insights into the diversity and distribution of CODH enzyme clades.

      However, several limitations should be considered when interpreting the findings. The use of incomplete genome assemblies may lead to the exclusion of relevant genes or operonic regions. Clade H was omitted due to a lack of information on its host, and the number of class II HCPs included is limited. Although the genomic window analyzed is relatively broad, it may still miss functionally relevant neighboring genes. The study assumes that the pathways associated with CODHs are encoded near the enzyme loci, but these could also occur elsewhere in the genome or on the complementary strand. The authors acknowledge these and other limitations clearly and thoughtfully, which strengthens the transparency and credibility of their analysis.

      Given the high evolutionary diversity of CODHs, both across and within clades, phenotypic predictions derived solely from sequence and neighborhood data should be interpreted with caution. Sequence-based searches, while specific, may have limited sensitivity, and structural homology searches could further enrich the dataset. Additionally, the visual inspection used to filter out non-CODH sequences is not described in detail, leaving uncertainty about reproducibility. The generalization of enzymatic activity or inactivity from a few characterized examples to entire clades should also be regarded as tentative.

      Despite these limitations, the study presents a solid and valuable methodological framework that can aid in the rapid functional screening of novel CODH enzymes and may inspire broader applications in enzyme discovery and metabolic annotation.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates how the brain processes facial expressions across development by analyzing intracranial EEG (iEEG) data from children (ages 5-10) and post-childhood individuals (ages 13-55). The researchers used a short film containing emotional facial expressions and applied AI-based models to decode brain responses to facial emotions. They found that in children, facial emotion information is represented primarily in the posterior superior temporal cortex (pSTC), a sensory processing area, but not in the dorsolateral prefrontal cortex (DLPFC), which is involved in higher-level social cognition. In contrast, post-childhood individuals showed emotion encoding in both regions. Importantly, the complexity of emotions encoded in the pSTC increased with age, particularly for socially nuanced emotions like embarrassment, guilt, and pride. The authors claim that these findings suggest that emotion recognition matures through increasing involvement of the prefrontal cortex, supporting a developmental trajectory where top-down modulation enhances understanding of complex emotions as children grow older.

      Strengths:

      (1) The inclusion of pediatric iEEG makes this study uniquely positioned to offer high-resolution temporal and spatial insights into neural development compared to non-invasive approaches, e.g., fMRI, scalp EEG, etc.

      (2) Using a naturalistic film paradigm enhances ecological validity compared to static image tasks often used in emotion studies.

      (3) The idea of using state-of-the-art AI models to extract facial emotion features allows for high-dimensional and dynamic emotion labeling in real time.

      Weaknesses:

      (1) The study has notable limitations that constrain the generalizability and depth of its conclusions. The sample size was very small, with only nine children included and just two having sufficient electrode coverage in the posterior superior temporal cortex (pSTC), which weakens the reliability and statistical power of the findings, especially for analyses involving age. The authors pointed out that a similar sample size has been used in previous iEEG studies, but the cited works focus on adults and do not take a developmental perspective. Similar work looking at developmental changes in iEEG signals usually includes many more subjects (e.g., n = 101 children from Cross ZR et al., Nature Human Behavior, 2025) to account for inter-subject variability.

      (2) Electrode coverage was also uneven across brain regions, with not all participants having electrodes in both the dorsolateral prefrontal cortex (DLPFC) and pSTC, making the conclusion regarding the different developmental changes between DLPFC and pSTC hard to interpret (related to point 3 below). It is understood that it is rare to have such iEEG data collected in this age group, and the electrode location is only determined by clinical needs. However, the scientific rigor should not be compromised by the limited data access. It's the authors' decision whether such an approach is valid and appropriate to address the scientific questions, here the developmental changes in the brain, given all the advantages and constraints of the data modality.

      (3) The developmental differences observed were based on cross-sectional comparisons rather than longitudinal data, reducing the ability to draw causal conclusions about developmental trajectories. Also, see comments in point 2.

      (4) Moreover, the analysis focused narrowly on DLPFC, neglecting other relevant prefrontal areas such as the orbitofrontal cortex (OFC) and anterior cingulate cortex (ACC), which play key roles in emotion and social processing. I agree that this might be beyond the scope of this paper, but addressing it in the discussion section might be insightful.

      (5) Although the use of a naturalistic film stimulus enhances ecological validity, it comes at the cost of experimental control, with no behavioral confirmation of the emotions perceived by participants and uncertain model validity for complex emotional expressions in children. A non-facial music block that could have served as a control was available but not analyzed. The validity of the AI model's emotional output needs to be tested. It is understood that these behavioral data cannot be collected retrospectively from the recorded subjects. Post-hoc experiments and analyses may still be possible, e.g., collecting behavioral emotion-perception data from age-matched healthy subjects.

      (6) Generalizability is further limited by the fact that all participants were neurosurgical patients, potentially with neurological conditions such as epilepsy that may influence brain responses. At least some behavioral measures between the patient population and the healthy groups should be done to ensure the perception of emotions is similar.

      (7) Additionally, the high temporal resolution of intracranial EEG was not fully utilized, as data were downsampled and averaged in 500-ms windows. It seems that the authors compromised the iEEG analyses to match the AI model's output resolution, which is 2 Hz. It is then unclear why fMRI, which is non-invasive and seems to meet the needs here already, was not used directly. The advantages of using iEEG in this study are not made clear.

      (8) Finally, the absence of behavioral measures or eye-tracking data makes it difficult to directly link neural activity to emotional understanding or determine which facial features participants attended to. Related to point 5 as well.
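      To make the concern in point (7) concrete, the following minimal sketch shows what averaging in non-overlapping 500-ms windows does to a signal: whatever the original sampling rate, the output has one value per half second (2 Hz). The 1000 Hz rate in the usage example is an assumption for illustration; the review does not state the recording rate, and `window_average` is a hypothetical helper, not the authors' pipeline.

```python
from statistics import fmean


def window_average(signal, fs_hz, window_s=0.5):
    """Average a 1-D signal in non-overlapping windows of window_s seconds.

    Trailing samples that do not fill a whole window are dropped.
    """
    n = int(round(fs_hz * window_s))  # samples per window
    if n < 1:
        raise ValueError("window shorter than one sample")
    usable = len(signal) - len(signal) % n
    return [fmean(signal[i:i + n]) for i in range(0, usable, n)]


if __name__ == "__main__":
    fs = 1000  # assumed sampling rate in Hz (hypothetical)
    two_seconds = list(range(2 * fs))  # a simple ramp as stand-in data
    print(window_average(two_seconds, fs))  # four values for two seconds
```

      Any dynamics faster than the window length are averaged out, which is the temporal detail the reviewer argues iEEG could otherwise provide.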

      Comments on revisions:

      A behavioral measurement would help address many of these questions. If data collection continues, additional subjects with both iEEG recordings and behavioral measurements would be valuable.

    2. Reviewer #2 (Public review):

      Summary:

      In this paper, Fan et al. aim to characterize how neural representations of facial emotions evolve from childhood to adulthood. Using intracranial EEG recordings from participants aged 5 to 55, the authors assess the encoding of emotional content in high-level cortical regions. They report that while both the posterior superior temporal cortex (pSTC) and dorsolateral prefrontal cortex (DLPFC) are involved in representing facial emotions in older individuals, only the pSTC shows significant encoding in children. Moreover, the encoding of complex emotions in the pSTC appears to strengthen with age. These findings lead the authors to suggest that young children rely more on low-level sensory areas and propose a developmental shift from reliance on lower-level sensory areas in early childhood to increased top-down modulation by the prefrontal cortex as individuals mature.

      Strengths:

      (1) Rare and valuable dataset: The use of intracranial EEG recordings in a developmental sample is highly unusual and provides a unique opportunity to investigate neural dynamics with both high spatial and temporal resolution.

      (2) Developmentally relevant design: The broad age range and cross-sectional design are well-suited to explore age-related changes in neural representations.

      (3) Ecological validity: The use of naturalistic stimuli (movie clips) increases the ecological relevance of the findings.

      (4) Feature-based analysis: The authors employ AI-based tools to extract emotion-related features from naturalistic stimuli, which enables a data-driven approach to decoding neural representations of emotional content. This method allows for a more fine-grained analysis of emotion processing beyond traditional categorical labels.

      Weaknesses:

      (1) While the authors leverage Hume AI, a tool pre-trained on a large dataset, its specific performance on the stimuli used in this study remains unverified. To strengthen the foundation of the analysis, it would be important to confirm that Hume AI's emotional classifications align with human perception for these particular videos. A straightforward way to address this would be to recruit human raters to evaluate the emotional content of the stimuli and compare their ratings to the model's outputs.

      (2) Although the study includes data from four children with pSTC coverage-an increase from the initial submission-the sample size remains modest compared to recent iEEG studies in the field.

      (3) The "post-childhood" group (ages 13-55) conflates several distinct neurodevelopmental periods, including adolescence, young adulthood, and middle adulthood. As a finer age stratification is likely not feasible with the current sample size, I would suggest that the authors temper their developmental conclusions.

      (4) The analysis of DLPFC-pSTC directional connectivity would be significantly strengthened by modeling it as a continuous function of age across all participants, rather than relying on an unbalanced comparison between a single child and a post-childhood group (N=7). This continuous approach would provide a more powerful and nuanced view of the developmental trajectory. I would also suggest including the result in the main text.

    1. Reviewer #1 (Public review):

      In this well-written and timely manuscript, Rieger et al. introduce Squidly, a new deep learning framework for catalytic residue prediction. The novelty of the work lies in integrating per-residue embeddings from large protein language models (ESM2) with a biology-informed contrastive learning scheme that leverages enzyme class information to rationally mine hard positive/negative pairs. Importantly, the method avoids reliance on predicted 3D structures, enabling scalability, speed, and broad applicability. The authors show that Squidly outperforms existing ML-based tools and even BLAST in certain settings, while an ensemble with BLAST achieves state-of-the-art performance across multiple benchmarks. Additionally, the introduction of the CataloDB benchmark, designed to test generalization at low sequence and structural identity, represents another important contribution of this work.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aim to develop Squidly, a sequence-only catalytic residue prediction method. By combining protein language model (ESM2) embeddings with a biologically inspired contrastive learning pairing strategy, they achieve efficient and scalable predictions without relying on three-dimensional structure. Overall, the authors largely achieved their stated objectives, and the results generally support their conclusions. This research has the potential to advance the fields of enzyme functional annotation and protein design, particularly in the context of screening large-scale sequence databases and unstructured data. However, the data and methods are still limited by the biases of current public databases, so the interpretation of predictions requires specific biological context and experimental validation.

      Strengths:

      The strengths of this work include the innovative methodological incorporation of EC classification information for "reaction-informed" sample pairing, thereby enhancing the discriminative power of contrastive learning. Results demonstrate that Squidly outperforms existing machine learning methods on multiple benchmarks and is significantly faster than structure prediction tools, demonstrating its practicality.

    1. Reviewer #1 (Public review):

      Summary:

      This study used explicit-solvent simulations and coarse-grained models to identify the mechanistic features that allow for unidirectional motion of SMC on DNA. Shorter explicit-solvent models provide a description of relevant hydrogen bond energetics, which was then encoded in a coarse-grained structure-based model. In the structure-based model, the authors mimic chemical reactions as signaling changes in the energy landscape of the assembly. By cycling through the chemical cycle repeatedly, the authors show how these time-dependent energetic shifts naturally lead SMC to undergo translocation steps along DNA that are on a length scale that has been identified.

      Strengths:

      Simulating large-scale conformational changes in complex assemblies is extremely challenging. This study utilizes highly detailed models to parameterize a coarse-grained model, thereby allowing the simulations to connect the dynamics of precise atomistic-level interactions with a large-scale conformational rearrangement. This study serves as an excellent example of this overall methodology, and future studies may further extend this approach to investigate any number of complex molecular assemblies.

      Comments on revisions:

      No additional recommendations. I removed the weakness description in the summary, since the authors have addressed that concern.

    2. Reviewer #2 (Public review):

      Summary:

      The authors perform coarse-grained and all-atom simulations to provide a mechanism for loop extrusion, which is involved in genome compaction.

      Strengths:

      The simulations are very thoughtful. They provide insights into the translocation process, which is only one of the mechanisms. Much of the analysis is very good. Overall, the study advances the use of simulations in these complicated systems.

      Weaknesses:

      Even the authors point out several limitations, which cannot be easily overcome in this paper because of the paucity of experimental data. Nevertheless, the authors could have done more to illustrate the main assertion that loop extrusion occurs by the motor translocating on DNA. They should mention more clearly that there are alternative theories that have accounted for a range of experimental data.

      Comments on revisions:

      The authors have adequately addressed my concerns.

    1. Reviewer #1 (Public review):

      Summary:

      Press et al. test, in three experiments, whether responses in a speeded response task reflect people's expectations, and whether these expectations are best explained by the objective statistics of the experimental context (e.g., stimulus probabilities) or by participants' mental representation of these probabilities. The studies use a classical response time and accuracy task, in which people are (1) asked to make a response (with one hand), this response then (2) triggers the presentation of one of several stimuli (with different probabilities depending on the response), and participants (3) then make a speeded response to identify this stimulus (with the other hand). In Experiment 1, participants are asked to rate, after the experiment, the subjective probabilities of the different stimuli. In Experiments 2 and 3, they rated, after each trial, to what extent the stimulus was expected (Experiment 2), or whether they were surprised by the stimulus (Experiment 3). The authors test (using linear models) whether the subjective ratings in each experiment predict stimulus identification times and accuracies better than objective stimulus probabilities (Experiment 1), or than their objective probability derived from a Rescorla-Wagner model of prior stimulus history (Experiments 2 and 3). Across all three experiments, the results are identical. Response times are best described by contributions from both subjective and objective probabilities. Accuracy is best described by subjective probability.

      Strengths:

      This is an exciting series of studies that tests an assumption that is implicit in predictive theories of response preparation (i.e., that response speed/accuracy tracks subjective expectancies), but has not been properly tested so far, to my knowledge. I find the idea of measuring subjective expectancy and surprise in the same trials as the response very clever. The manuscript is extremely well written. The experiments are well thought-out, preregistered, and the results seem highly robust and replicable across studies.

      Weaknesses:

      In my assessment, this is a well-designed, implemented, and analysed series of studies. I have one substantial concern that I would like to see addressed, and two more minor ones.

      (1) The key measure of the relationship between subjective ratings and response times/accuracy is inherently correlational. The causal relationship between both variables is therefore by definition ambiguous. I worry that the results don't reveal an influence of subjective expectancy on response times/accuracies, but the reverse: an influence of response times/accuracies on subjective expectancy ratings.

      This potential issue is most prominent in Experiments 2 and 3, where people rate their expectations in a given trial directly after they made their response. We can assume that participants have at least some insight into whether their response in the current trial was correct/erroneous or fast/slow. I therefore wonder if the pattern of results can simply be explained by participants noticing they made an error (or that they responded very slowly) and subsequently being more inclined to rate that they did not expect this stimulus (in Experiment 2) or that they were surprised by it (in Experiment 3).

      The specific pattern across the two response measures might support this interpretation. Typically, participants are more aware of the errors they make than of their response speed. From the above perspective, it would therefore be not surprising that all experiments show stronger associations between accuracy and subjective ratings than between response times and subjective ratings -- exactly as the three studies found.

      I acknowledge that this problem is less strong in Experiment 1, where participants do not rate expectancy or surprise after each response, but make subjective estimates of stimulus probabilities after the experiment. Still, even here, the flow of information might be opposite to what the authors suggest. Participants might not have made more errors for stimuli that they thought of as least likely, but instead might have used the number of their responses identifying a given stimulus as a proxy for rating its likelihood. For example, if they identify a square as a square 25% of the time, even though 5% of these responses were in error, it is perhaps no surprise if their rating of the stimulus likelihood better tracks the times they identified it as a square (25%) than the actual stimulus likelihoods (20%).

      This potential reverse direction of effects would need to be ruled out to fully support the authors' claims.

      (2) My second, more minor concern, is whether the Rescorla-Wagner model is truly the best approximation of objective stimulus statistics. It is traditionally a model of how people learn. Isn't it, therefore, already a model of subjective stimulus statistics, derived from the trial history, instead of objective ones? If this is correct, my interpretation of Experiments 2 and 3 would be (given my point 1 above is resolved) that subjective expectancy ratings predict responses better than this particular model of learning, meaning that it is not a good model of learning in this task. Comparing results against Rescorla-Wagner may even seem like a stronger test than comparing them against objective stimulus statistics - i.e., they show that subjective ratings capture expectancies better even than this model of learning. The authors already touch upon this point in the General Discussion, but I would like to see this expanded, and - ideally - comparisons against objective stimulus statistics (perhaps up to the current trial) to be included, so that the authors can truly support the claim that it is not the objective stimulus statistics that determine response speed and accuracy.

      (3) There is a long history of research trying to link response times to subjective expectancies. For example, Simon and Craft (1989, Memory & Cognition) reported that stimuli of equal probability were identified more rapidly when participants had explicitly indicated they expect this stimulus to occur in the given trial, and there's similar more recent work trying to dissociate stimulus statistics and explicit expectations (e.g., Umbach et al., 2012, Frontiers; for a somewhat recent review, see Gaschler et al., 2014, Neuroscience & Biobehavioral Reviews). It has not become clear to me how the current results relate to this literature base. How do they impact this discussion, and how do they differ from what is already known?

    2. Reviewer #2 (Public review):

      Summary:

      This work by Clarke, Rittershofer, and colleagues used categorization and discrimination tasks with subjective reports of task regularities. In three behavioral experiments, they found that these subjective reports explain task accuracy and response times at least as well and sometimes better than objective measures. They conclude that subjective experience may play a role in predicting processing.

      Strengths:

      This set of behavioral studies addresses an important question. The results are replicated three times with a different experimental design, which strengthens the claims. The design is preregistered, which further strengthens the results. The findings could inspire many studies in decision-making.

      Weaknesses:

      It seems to me that it is important, but difficult to distinguish whether the objective and subjective measures stem from reasonably different mechanisms contributing to behavior, or whether they are simply two noisy proxies to the same mechanism, in which case it is not so surprising that both contribute to the explained variance. The authors acknowledge in the discussion that the type of objective measure that is chosen is crucial.

      For instance, the RW model's learning rates were not fitted to participants but to the sequence of stimuli, so they represent the optimal parameter values, not the true ones that participants are using. Is the subjective measure just a readout of the RW model's true state when using the participants' parameters? Relatedly, would the authors consider the RW predictions from participants using a sub-optimal alpha to be a subjective or an objective measure? Do the results truly show the importance of subjective measures, or is it another way of saying that humans are sub-optimal (Rahnev & Denison, 2018, BBS) ... or optimal for other goals. I see the difficulty of avoiding double-dipping on accuracy, but this seems essential to address. This relates to a more general question about the underlying mechanisms of subjective versus objective measures, which is alluded to in the discussion but could be interesting to develop a bit further.

      In terms of methods, I did not fully understand the 'RW model expectedness' objective metric in Experiments 2 and 3. VT is defined as the 'model's expectation for the given tone T'. A (signed?) prediction error is defined for the expectation update, but it seems that the RW model expectedness used in the figures and statistical models is VT, sign-inverted for unexpected stimuli. So how do we interpret negative values, and how often do they occur? Shouldn't it be the unsigned value that is taken as objective surprise? This could be explained in a bit more detail. Could this be related to the quadratic effect that one can see in Figures 4E and 5E, which is not taken into account in the statistical model? Figures 4A and 5A also seem to show a combination of linear and quadratic effects. A more complete description of the objective measure could help determine whether this is a serious issue or just noise in the data.
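      The questions raised about the RW metric can be illustrated with a minimal sketch of the textbook Rescorla-Wagner update for a binary outcome, alongside a running relative frequency as one possible "objective statistics up to the current trial" comparator and an unsigned surprise measure. This is not the authors' implementation: the learning rate `alpha`, the starting value `v0`, and all function names are assumptions chosen for illustration.

```python
def rescorla_wagner(outcomes, alpha=0.1, v0=0.5):
    """Return the expectation V held *before* each trial for a binary outcome."""
    v = v0
    expectations = []
    for outcome in outcomes:  # outcome: 1 if the stimulus occurred, else 0
        expectations.append(v)
        v += alpha * (outcome - v)  # prediction error scaled by the learning rate
    return expectations


def running_frequency(outcomes, prior=0.5):
    """Empirical probability of the stimulus, using only trials before the current one."""
    freqs, count = [], 0
    for n, outcome in enumerate(outcomes):
        freqs.append(count / n if n else prior)
        count += outcome
    return freqs


def unsigned_surprise(expectation, outcome):
    """|outcome - V|: large when the presented stimulus was not expected."""
    return abs(outcome - expectation)


if __name__ == "__main__":
    trials = [1, 1, 0, 1, 1, 1, 0, 1]  # made-up sequence for illustration
    for t, (v, f) in enumerate(zip(rescorla_wagner(trials), running_frequency(trials))):
        print(t, round(v, 3), round(f, 3), round(unsigned_surprise(v, trials[t]), 3))
```

      With a starting value and outcomes in [0, 1] and a learning rate between 0 and 1, V stays within [0, 1] in this formulation, so negative values would have to come from a later sign inversion rather than from the update itself.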

      Gabor patches in Experiments 2 and 3 seemed to have been presented at quite a sharp contrast (I did not find this info), and accuracy seems to saturate at 100%. What was the distribution of error rates, i.e., how many participants were so close to 100% that there was no point in including them in the analysis?

      In the second preregistration, the authors announced that BIC comparisons between the full model and the objective model will test whether subjective measures capture additional variance [...] beyond objective prediction error. This is also the conclusion reached in sections 3.3 and 4.3. The model comparison, however, is performed by selecting the best of three models, excluding the null model. It seems that the full model still wins over the objective model, but sometimes quite marginally. Could the authors not test the significance of the model comparison since models are nested?
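      The suggested significance test for nested models can be sketched as a likelihood-ratio test, shown here next to BIC. The sketch assumes both models were fit by maximum likelihood to the same data and that the full model adds exactly one parameter (the subjective predictor), so the statistic is referred to a chi-square distribution with one degree of freedom. The log-likelihood values in the usage example are made up for illustration.

```python
import math


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_lik


def lrt_one_df(log_lik_reduced, log_lik_full):
    """Likelihood-ratio test for nested models differing by one parameter.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(statistic / 2)).
    """
    stat = 2.0 * (log_lik_full - log_lik_reduced)
    if stat < 0:
        raise ValueError("full model should not fit worse than the nested model")
    return stat, math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    # Hypothetical log-likelihoods: objective-only model vs. full model.
    ll_objective, ll_full, n_obs = -100.0, -98.0, 200
    print("BIC objective:", round(bic(ll_objective, 3, n_obs), 2))
    print("BIC full:     ", round(bic(ll_full, 4, n_obs), 2))
    print("LRT:", lrt_one_df(ll_objective, ll_full))
```

      In this made-up case the two criteria can disagree: BIC penalizes the extra parameter by ln(n), so a marginal BIC difference and a significant likelihood-ratio test are not contradictory.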

    3. Reviewer #3 (Public review):

      Summary:

      Clarke et al. investigate the role of subjective representations of task-based statistical structure on choice accuracy and reaction times during perceptual decision-making. Subjective representations of objective statistical structure are often overlooked in studies of predictive processing and, consequently, little is known about their role in predictive phenomena. By gauging the subjective experience of stimulus probability, expectedness, and surprise in tasks with fixed cue-stimulus contingencies, the authors aimed to separate subjective and objective (task-induced) contributions to predictive effects on behaviour.

      Across three different experiments, subjective and objective contributions to predictions were found to explain unique portions of variance in reaction time data. In addition, choice accuracy was best predicted by subjective representations of statistical structure in isolation. These findings reveal that the subjective experience of statistical regularities may play a key role in the predictive processes that shape perception and cognition.

      Strengths:

      This study combines careful and thorough behavioral experimentation with an innovative focus on subjective experience in predictive processing. By collecting three independent datasets with different perceptual decision-making paradigms, the authors provide converging evidence that subjective representations of statistical structure explain unique variance in behavior beyond objective task structure. The analysis strategy, which directly contrasts the contributions of subjective and objective predictors, is conceptually rigorous and allows clear insight into how subjective and objective influences shape behavior. The methods are consistently applied across all three datasets and produce coherent results, lending strong support to the authors' conclusions. The study emphasizes the critical role of subjective experience in predictive processing, with implications for understanding learning, perception, and decision-making.

      Weaknesses:

      Despite these strengths, there are several conceptual and technical issues that should be addressed. In Experiments 2 and 3, the authors use a Rescorla-Wagner (RW) learning model to estimate trialwise expectedness (Experiment 2) and surprise (Experiment 3). While the RW model is a well-established model for explaining learning behaviour, it does not represent the objective 'ground truth' statistical structure of the environment, and treating RW trajectories as such imposes assumptions about learning that may not match participants' actual behavior. This assumption could strongly affect the comparison between subjective and 'objective' predictors. It would strengthen the primary conclusions of the manuscript if other implementations of the objective statistical structure, such as the true task-defined probabilities (i.e., 25% or 75%), were considered to provide a complementary 'ground truth' perspective.
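      To make the distinction concrete, a minimal sketch of a Rescorla-Wagner update is shown next to the fixed task-defined probability it is meant to approximate. The learning rate, starting value, and outcome sequence are assumptions chosen for illustration; they are not the authors' fitted values.

```python
def rescorla_wagner(outcomes, alpha, v0=0.5):
    """Return the expectation held *before* each trial's outcome is seen.

    outcomes: sequence of 0/1 outcomes (1 = cued stimulus appeared)
    alpha:    learning rate between 0 and 1 (assumed value)
    v0:       initial expectation (assumed value)
    """
    v = v0
    expectations = []
    for outcome in outcomes:
        expectations.append(v)
        # Move the expectation toward the outcome by a fraction alpha
        # of the prediction error.
        v += alpha * (outcome - v)
    return expectations


outcomes = [1, 1, 0, 1]  # hypothetical trial outcomes
rw_expectation = rescorla_wagner(outcomes, alpha=0.5)
# "Ground truth" alternative: the task-defined probability is constant.
task_probability = [0.75] * len(outcomes)
```

      The RW expectation fluctuates trial by trial with the learning rate, whereas the task-defined probability does not, so the two can yield different "objective" predictors for the same trial sequence.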

      Additionally, because objective statistical structure was predictive of subjective ratings in all three experiments, these predictors are likely collinear in the full model. Collinearity can lead to inflated standard errors and unstable coefficient estimates, even if the models converge. Currently, this potentially critical problem with the applied statistical models is not assessed, reported on, or controlled for (e.g., by residualizing predictors). RW trajectories are also not reported in the manuscript, limiting the ability to assess how the model evolves over time and whether it maps onto the task-induced probabilities in a sensible way. This is particularly relevant because participants' subjective estimates of the task-induced probabilities seem to converge to the ground truth after just a few trials, especially for the 75% stimuli (Figure 3C).
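      One way the collinearity concern could be assessed is with variance inflation factors (VIFs), with residualization as one possible control. The sketch below uses simulated predictors; the variable names and the strength of the simulated correlation are assumptions, not values from the manuscript.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X (n_samples, n_predictors)."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress predictor j on the remaining predictors plus an intercept.
        A = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r_squared = 1.0 - resid.var() / y.var()
        factors.append(1.0 / (1.0 - r_squared))
    return np.array(factors)


def residualize(target, predictor):
    """Remove the part of `target` that is linearly explained by `predictor`."""
    A = np.column_stack([np.ones(len(predictor)), predictor])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return target - A @ coef


# Simulated, correlated "objective" and "subjective" predictors.
rng = np.random.default_rng(0)
objective = rng.normal(size=500)
subjective = 0.8 * objective + 0.6 * rng.normal(size=500)

vifs = vif(np.column_stack([objective, subjective]))
subjective_unique = residualize(subjective, objective)
```

      After residualization, `subjective_unique` is uncorrelated with `objective`, so its coefficient reflects only variance not shared with the objective predictor.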

    1. Reviewer #1 (Public review):

      This is an interesting manuscript aimed at improving the transcriptome characterization of 52 C. elegans neuron classes. Previous single-cell RNA seq studies already uncovered transcriptomes for these, but the data are incomplete, with a bias against genes with lower expression levels. Here, the authors use cell-specific reporter combinations to FACS purify neurons and use bulk RNA sequencing to obtain better sequencing depth. This reveals more rare transcripts, as well as non-coding RNAs, pseudo genes, etc. The authors develop computational approaches to combine the bulk and scRNA transcriptome results to obtain more definitive gene lists for the neurons examined.

      To ultimately understand features of any cell, from morphology to function, an understanding of the full complement of the genes it expresses is a pre-requisite. This paper gets us a step closer to this goal, assembling a current "definitive list" of genes for a large proportion of C. elegans neurons. The computational approaches used to generate the list are based on reasonable assumptions, the data appear to have been treated appropriately statistically, and the conclusions are generally warranted. I have a few issues that the authors may choose to address:

      (1) As part of getting rid of cross contamination in the bulk data, the authors model the scRNA data, extrapolate it to the bulk data and subtract out "contaminant" cell types. One wonders, however, given that lowly expressed genes are not represented in the scRNA data, whether the assignment of a gene to one or another cell type can really be made definitive. Indeed, it's possible that a gene is expressed at low levels in one cell, and at high levels in another, and would therefore be considered a contaminant. The result would be to throw out genes that actually are expressed in a given cell type. The definitive list would therefore be a conservative estimate, and not necessarily the correct estimate.

      (2) It would be quite useful to have tested some genes with lower expression levels using in vivo gene-fusion reporters to assess whether the expression assignments hold up as predicted. Such tests would provide another, non-computational avenue of experimentation to confirm that the decontamination algorithm works.

      (3) In many cases, each cell class would be composed of at least 2 if not more neurons. Is it possible that differences between members of a single class would be missed by applying the cleanup algorithms? Such transcripts would be represented only in a fraction of the cells isolated by scRNAseq, and might then be considered not real?

      (4) I didn't quite catch whether the precise staging of animals was matched between the bulk and scRNAseq datasets. Importantly, there are many genes whose expression is highly stage specific or age specific, so that even slight temporal differences might yield different sets of expressed genes.

      (5) To what extent does FACS sorting affect gene expression? Can the authors provide some controls?

      Comments on revisions:

      The authors have made reasonable arguments in response to my questions, and have done some additional experiments. I believe that, although they did not do so, they could have generated additional reporters for the lowly expressed genes, which would have validated their method of data integration. Nonetheless, I think the paper is rigorous and will be of use to the community.

    2. Reviewer #2 (Public review):

      Summary:

      This study from the CeNGEN consortium addresses several limitations of single-cell RNA (scRNA) and bulk RNA sequencing in C. elegans with a focus on cells in the nervous system. scRNA datasets can give very specific expression profiles, but detecting rare and non-polyA transcripts is difficult. In contrast, bulk RNA sequencing on isolated cells can be sequenced to high depth to identify rare and non-polyA transcripts but frequently suffers from RNA contamination from other cell types. In this study, the authors generate a comprehensive set of bulk RNA datasets from 53 individual neurons isolated by fluorescence activated cell sorting (FACS). The authors combine these datasets with a previously published scRNA dataset (Taylor et al., 2021) to develop a novel method, called LittleBites, to estimate and subtract contamination from the bulk RNA data. The authors validate the method by comparing detected transcripts against gold-standard datasets on neuron-specific and non-neuronal transcripts. The authors generate an "integrated" list of protein-coding expression profiles for the 53 neuron sub-types, with fewer but higher confidence genes compared to expression profiles based only on scRNA. Also, the authors identify putative novel pan-neuronal and cell-type specific non-coding RNAs based on the bulk RNA data. LittleBites should be generally useful for extracting higher confidence data from bulk RNA-seq data in organisms where extensive scRNA datasets are available. The additional confidence in neuron-specific expression and non-coding RNA expands the already great utility of the neuronal expression reference atlas generated by the CeNGEN consortium.

      Strengths:

      The study generates and analyzes a very comprehensive set of bulk RNA datasets from individual fluorescently tagged transgenic strains. These datasets are technically challenging to generate and significantly expand our knowledge of gene expression, particularly in cells that were poorly represented in the initial scRNA-seq datasets. Additionally, all transgenic strains are made available as a resource from the Caenorhabditis Genetics Center (CGC).

      The study uses the authors' extensive experience with neuronal expression to benchmark their method for reducing contamination utilizing a set of gold-standard validated neuronal and non-neuronal genes. These gold-standard genes will be helpful for benchmarking any C. elegans gene expression study.
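      Benchmarking against gold-standard genes of this kind is typically summarized with an AUROC. A minimal sketch follows, assuming a vector of expression scores and binary gold-standard labels; both are invented here for illustration and do not come from the study.

```python
import numpy as np


def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen gold-standard positive
    scores higher than a randomly chosen negative (ties count as 0.5).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    # Compare every positive with every negative.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / diff.size)


# Hypothetical expression scores and gold-standard labels (1 = truly expressed).
example_auroc = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

      Because the AUROC is computed over all possible thresholds, it does not depend on the particular expression cutoff chosen to call a gene "expressed".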

      Weaknesses:

      The bulk RNA-seq data collected by the authors has high levels of contamination and, in some cases, is based on very few cells. The methodology to remove contamination partly makes up for this shortcoming, but the high background levels of contaminating RNA in the FACS-isolated neurons limit the confidence in cell-specific transcripts.

      The study does not experimentally validate any of the refined gene expression predictions, which was one of the main strengths of the initial CeNGEN publication (Taylor et al, 2021). No validation experiments (e.g., fluorescence reporters or single molecule FISH) were performed for protein-coding or non-coding genes, which makes it difficult for the reader to assess how much gene predictions are improved, other than for the gold standard set, which may have specific characteristics (e.g., bias toward high expression as they were primarily identified in fluorescence reporter experiments).

      The study notes that bulk RNA-seq data, in contrast to scRNA-seq data, can be used to identify which isoforms are expressed in a given cell. Although not included in this manuscript, two bioRxiv papers have used the generous openness of the CeNGEN consortium to study alternative splicing in C. elegans neurons [bioRxiv, 2024.2005.2016.594567 (2024) and bioRxiv, 2024.2005.2016.594572 (2024)], nicely showing the strengths of the data.

      Comments on revisions: I agree that the paper is improved.

    3. Reviewer #3 (Public review):

      Summary

      This study aims to overcome key limitations of single-cell RNA-seq in C. elegans neurons (especially the under-detection of lowly expressed and non-polyadenylated transcripts, and residual contamination) by integrating bulk RNA-seq from FACS-isolated neuron types with an existing scRNA-seq atlas. The authors introduce LittleBites, an iterative, reference-guided decontamination algorithm that uses a single-cell reference together with ground-truth reporter datasets to optimize subtraction of contaminating signal from bulk profiles. They then generate an "Integrated" dataset that combines the sensitivity of bulk data with the specificity of scRNA-seq and use it to call neuron-specific expression for protein-coding genes, "rescued" genes not detected in scRNA-seq, and multiple classes of non-coding RNAs across 53 neuron classes. All data, code, and thresholded matrices are made publicly available to enable community reuse.

      Strengths

      (1) Conceptual advance and useful resource. The work demonstrates in a concrete way how bulk and single-cell datasets can be combined to overcome the weaknesses of each approach, and delivers a high-resolution transcriptomic resource for a substantial fraction of C. elegans neuron classes. The integrated matrices, thresholded expression calls, and non-coding RNA catalog will be useful both for basic neurobiology and for method developers.

      (2) Careful benchmarking and transparency. The revised manuscript includes extensive benchmarking of LittleBites and the Integrated dataset against multiple independent "ground-truth" sets: neuron-specific reporter lines, curated non-neuronal markers, and ubiquitous genes. The authors evaluate AUROCs over a wide range of thresholds, explain ROC/AUROC metrics for non-specialists, and quantify how integration affects both sensitivity and specificity relative to scRNA-seq alone.

      (3) Improved methodological clarity. In response to review, the authors now provide a much more intuitive description of the LittleBites algorithm, including a stepwise explanation of (1) contamination estimation via NNLS using single-cell references, (2) weighted subtraction tuned by a learning-rate parameter, and (3) performance optimization based on AUROC against ground-truth genes. This makes the approach accessible to readers who are not computational specialists and will facilitate re-implementation.

      (4) Systematic analysis of reference dependence. The authors explicitly address the concern that LittleBites depends on the completeness and accuracy of the scRNA-seq reference. They examine how performance varies with cluster size and by simulated degradation of the reference (e.g., reducing the number of cells per cluster), and show that AUROCs remain robust, but that gene-level assignments are more variable for clusters represented by fewer cells. This is an important and honest characterization of when the method is reliable and when users should be cautious.

      (5) Additional biological context. The manuscript now more clearly situates the dataset in the context of previous and ongoing work. In particular, the authors highlight that other groups have already used these bulk data to discover and validate cell-type-specific alternative splicing events, strengthening the case that the data are biologically meaningful beyond the immediate analyses presented here. The expanded analysis of non-coding RNAs and GPCR pseudogenes also adds biological interest.

      (6) Improved handling and documentation of "unexpressed" genes. The authors have trimmed the original list of 4,440 genes called "unexpressed" in scRNA-seq to a higher-confidence subset and provide new supplementary tables that include gene identities and tissue annotations. They also use a curated set of non-neuronal markers to estimate residual contamination and show that most such markers are not detected in the integrated data, with only a small number of apparent false positives remaining.
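      The first two steps described under Strength (3), NNLS-based contamination estimation followed by weighted subtraction, can be sketched as below. This is not the authors' LittleBites implementation: the function name, the toy reference matrix, and the way the learning rate scales the subtraction are assumptions made for illustration, and the AUROC-based tuning step is omitted.

```python
import numpy as np
from scipy.optimize import nnls


def subtract_contamination(bulk, reference, target_index, learning_rate=1.0):
    """One illustrative round of reference-guided decontamination.

    bulk:          expression vector of the bulk sample (n_genes,)
    reference:     single-cell reference profiles (n_genes, n_cell_types)
    target_index:  column of the cell type the bulk sample is meant to contain
    learning_rate: fraction of the estimated contamination to subtract
    """
    # Step 1: estimate non-negative mixing weights of the reference profiles.
    weights, _ = nnls(reference, bulk)
    # Keep only the contribution of non-target cell types.
    contaminant_weights = weights.copy()
    contaminant_weights[target_index] = 0.0
    estimated_contamination = reference @ contaminant_weights
    # Step 2: weighted subtraction, clipped so expression stays non-negative.
    cleaned = np.clip(bulk - learning_rate * estimated_contamination, 0.0, None)
    return cleaned, weights


# Toy reference: 3 genes x 2 cell types (column 0 = target, column 1 = contaminant).
reference = np.array([[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]])
# Toy bulk sample: pure target profile plus 20% of the contaminant profile.
bulk = reference @ np.array([1.0, 0.2])
cleaned, weights = subtract_contamination(bulk, reference, target_index=0)
```

      In this toy case the cleaned profile recovers the target column exactly; with real data the reference is incomplete and noisy, which is why the reviewers' points about reference dependence and residual contamination matter.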

      Weaknesses

      (1) Novel assignments remain predictive rather than experimentally validated. Although the authors have strengthened their benchmarking and refer to external work that validates some splicing patterns from these data, the large sets of newly assigned lowly expressed genes and non-coding RNAs, particularly those rescued from the "unexpressed" gene pool, are still inferred from computational criteria (thresholding plus correlation-based decontamination) rather than direct orthogonal assays (e.g., smFISH, in situ hybridization, or reporter lines). This is understandable given scale and cost, but it means that many of these calls should be interpreted as well-supported predictions, not definitive expression maps. The revised manuscript acknowledges this, and a dedicated "Limitations of this study" subsection will further clarify this point for readers.

      (2) Reduced stability for neuron types with sparse single-cell representation. The authors' new analyses show that while integration improves overall correlation and AUROC across a wide range of neuron types, gene-level assignments are less stable for neuron classes represented by relatively few cells in the scRNA-seq reference. For such neuron types, both false negatives and false positives are more likely, and users should be cautious when interpreting cell-type-specific expression differences based solely on these calls.

      (3) Residual contamination and misclassification are not completely eliminated. Despite the careful design of LittleBites and the additional correlation-based decontamination of "unexpressed" genes, the authors' benchmarking against curated non-neuronal markers shows that a small fraction of putative non-neuronal genes remains detectable even at stricter thresholds, and some bona fide neuronal genes are removed as likely contaminants. The new supplementary tables documenting "unexpressed" genes and their tissue annotations, together with explicit statements about residual error rates and the predictive nature of these classifications, help users to judge the reliability of specific genes, but they also underscore that the dataset is not a perfect ground truth.

      (4) Scope and coverage remain incomplete. As the authors note, the dataset covers 53 neuron classes and does not fully represent all 302 neurons or all known neuron subtypes. In addition, bulk samples represent pools of neurons, and so the approach cannot resolve within-class heterogeneity or subtype-specific expression within those pools. These are inherent limitations of the current experimental design rather than flaws in the analysis, but they are important for readers to keep in mind when using the resource.

      Overall, the revised manuscript presents solid evidence for the main methodological and resource claims, with clearly articulated limitations. The work is likely to have valuable impact on the C. elegans community and provides a template for integrating bulk and single-cell data in other systems.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript Lu & Cui et al. observe that adult male zebrafish are more resistant to infection and disease following exposure to Spring Viremia of Carp Virus (SVCV) than female fish. The authors then attempt to identify some of the molecular underpinnings of this apparent sexual dimorphism and focus their investigations on a gene called cytochrome P450, family 17, subfamily A, polypeptide 2 (cyp17a2) because it was among genes that they found to be more highly expressed in kidney tissue from males than in females. Their investigations lead them to propose a direct connection between cyp17a2 and modulation of interferon signaling as the key underlying driver of difference between male and female susceptibility to SVCV.

      Strengths:

      Strengths of this study include the interesting observation of a substantial difference between adult male and female zebrafish in their susceptibility to SVCV, and also the breadth of experiments that were performed linking cyp17a2 to infection phenotypes and molecularly to the stability of host and virus proteins in cell lines. The authors place the infection phenotype in an interesting and complex context of many other sexual dimorphisms in infection phenotypes in vertebrates. This study succeeds in highlighting an unexpected factor involved in antiviral immunity that will be an important subject for future investigations of infection, metabolism, and other contexts.

      Weaknesses:

      Weaknesses of this study include a proposed mechanism underlying the sexual dimorphism phenotype based on experimentation in only males, and widespread reliance on over-expression when investigating protein-protein interaction and localization.

    2. Reviewer #2 (Public review):

      This study conducted by Lu et al. explores the molecular underpinnings of sexual dimorphism in antiviral immunity in zebrafish, with a particular emphasis on the male-biased gene cyp17a2. The authors demonstrate that male zebrafish exhibit stronger antiviral responses than females, and they identify a teleost-specific gene cyp17a2 as a key regulator of this dimorphism. Utilizing a combination of in vivo and in vitro methodologies, they demonstrate that Cyp17a2 potentiates IFN responses by stabilizing STING via K33-linked polyubiquitination and directly degrades the viral P protein via USP8-mediated deubiquitination. The work challenges conventional views of sex-based immunity and proposes a novel, hormone- and sex chromosome-independent mechanism.

      Strengths:

      (1) The concept that sexual dimorphism in immunity can be driven by an autosomal gene, rather than by sex chromosomes or hormones, is novel and represents a significant advance in the field, offering a more comprehensive understanding of immune evolution.

      (2) The present study provides a comprehensive molecular pathway, from gene expression to protein-protein interactions and post-translational modifications, thereby establishing a link between Cyp17a2 and both host immune enhancement (via STING) and direct antiviral activity (via viral protein degradation).

      (3) In order to substantiate their claims, the authors utilize a wide range of techniques, including transcriptomics, Co-IP, ubiquitination assays, confocal microscopy, and knockout models.

      (4) The choice of model organism is well suited to the question. Zebrafish, which lack sex chromosomes, offer a clear genetic background for the dissection of autosomal contributions to sexual dimorphism.

    1. Reviewer #1 (Public review):

      Summary:

      This study employed a saccade-shifting sequential working memory paradigm, manipulating whether a saccade occurred after each memory array to directly compare retinotopic and transsaccadic working memory for both spatial location and color. Across four participant groups (young and older healthy adults, and patients with Parkinson's disease and Alzheimer's disease), the authors found a consistent saccade-related cost specifically for spatial memory - but not for color - regardless of differences in memory precision. Using computational modeling, they demonstrate that data from healthy participants are best explained by a complex saccade-based updating model that incorporates distractor interference. Applying this model to the patient groups further elucidates the sources of spatial memory deficits in PD and AD. The authors then extend the model to explain copying deficits in these patient groups, providing evidence for the ecological validity of the proposed saccade-updating retinotopic mechanism.

      Strengths:

      Overall, the manuscript is well written, and the experimental design is both novel and appropriate for addressing the authors' key research questions. I found the study to be particularly comprehensive: it first characterizes saccade-related costs in healthy young adults, then replicates these findings in healthy older adults, demonstrating how this "remapping" cost in spatial working memory is age-independent. After establishing and validating the best-fitting model using data from both healthy groups, the authors apply this model to clinical populations to identify potential mechanisms underlying their spatial memory impairments. The computational modeling results offer a clearer framework for interpreting ambiguities between allocentric and retinotopic spatial representations, providing valuable insight into how the brain represents and updates visual information across saccades. Moreover, the findings from the older adult and patient groups highlight factors that may contribute to spatial working memory deficits in aging and neurological disease, underscoring the broader translational significance of this work.

      Weaknesses:

      Several concerns should be addressed to enhance the clarity of the manuscript:

      (1) Relevance of the figure-copy results (pp. 13-15).

      Is it necessary to include the figure-copy task results within the main text? The manuscript already presents a clear and coherent narrative without this section. The figure-copy task represents a substantial shift from the LOCUS paradigm to an entirely different task that does not measure the same construct. Moreover, the ROCF findings are not fully consistent with the LOCUS results, which introduces confusion and weakens the manuscript's coherence. While I understand the authors' intention to assess the ecological validity of their model, this section does not effectively strengthen the manuscript and may be better removed or placed in the Supplementary Materials.

      (2) Model fitting across age groups (p. 9).

      It is unclear whether it is appropriate to fit healthy young and healthy elderly participants' data to the same model simultaneously. If the goal of the model fitting is to account for behavioral performance across all conditions, combining these groups may be problematic, as the groups differ significantly in overall performance despite showing similar remapping costs. This suggests that model performance might differ meaningfully between age groups. For example, in Figure 4A, participants 22-42 (presumably the elderly group) show the best fit for the Dual (Saccade) model, implying that the Interference component may contribute less to explaining elderly performance.

      Furthermore, although the most complex model emerges as the best-fitting model, the manuscript should explain how model complexity is penalized or balanced in the model comparison procedure. Additionally, are Fixation Decay and Saccade Update necessarily alternative mechanisms? Could both contribute simultaneously to spatial memory representation? A model that includes both mechanisms, e.g., Dual (Fixation) + Dual (Saccade) + Interference, could be tested to determine whether it outperforms Model 7, to rule out the sole contribution of complexity.

      Minor point: On p. 9, line 336, Figure 4A does not appear to include the red dashed vertical line that is mentioned as separating the age groups.

      (3) Clarification of conceptual terminology.

      Some conceptual distinctions are unclear. For example, the relationship between "retinal memory" and "transsaccadic memory," as well as between "allocentric map" and "retinotopic representation," is not fully explained. Are these constructs related or distinct? Additionally, the manuscript uses terms such as "allocentric map," "retinotopic representation," and "reference frame" interchangeably, which creates ambiguity. It would be helpful for the authors to clarify the relationships among these terms and apply them consistently.

      (4) Rationale for the selective disruption hypothesis (p. 4, lines 153-154).

      The authors hypothesize that "saccades would selectively disrupt location memory while leaving colour memory intact." Providing theoretical or empirical justification for this prediction would strengthen the argument.

      (5) Relationship between saccade cost and individual memory performance (p. 4, last paragraph).

      The authors report that larger saccades were associated with greater spatial memory disruption. It would be informative to examine whether individual differences in the magnitude of saccade cost correlate with participants' overall/baseline memory performance (e.g. their memory precision in the no-saccade condition). Such analyses might offer insights into how memory capacity/ability relates to resilience against saccade-induced updating.

      (6) Model fitting for the healthy elderly group to reveal memory-deficit factors (pp. 11-12).

      The manuscript discusses model-based insights into components that contribute to spatial memory deficits in AD and PD, but does not discuss components that contribute to spatial memory deficits in the healthy elderly group. Given that the EC group also shows impairments in certain parameters, explaining and discussing these outcomes of the EC group could provide additional insights into age-related memory decline, which would strengthen the study's broader conclusions.

      (7) Presentation of saccade conditions in Figure 5 (p. 11).

      In Figure 5, it may be clearer to group the four saccade conditions together within each patient group. Since the main point is that saccadic interference on spatial memory remains robust across patient groups, grouping conditions by patient type rather than intermixing conditions would emphasize this interpretation.

    2. Reviewer #2 (Public review):

      Summary:

      Zhao et al investigate how object location and colour are degraded across saccadic eye movements. They employ an eye-tracking task that requires participants to remember two sequentially presented items and subsequently report the colour and position of either one of these. Through counterbalancing of the presence or absence of saccades across items, the authors endeavour to dissect the impact of saccades independently on item location or colour. These behavioural findings form the basis of generative models designed to test competing, nested accounts of how information is stored and updated across saccades.

      Strengths:

      The combination of eye-tracking and generative modelling is a strength of the paper, which opens new perspectives into the impact of Alzheimer's and Parkinson's disease on the performance of visuospatial cognitive tests. The finding that the model parameters covary with clinical performance on the ROCF test is a nice example of a "computational assay" of disease.

      Weaknesses:

      I have a number of substantial and minor concerns for the authors to consider in a revision:

    3. Reviewer #3 (Public review):

      Summary:

      The manuscript introduces a visual paradigm aimed at studying trans-saccadic memory.

      The authors observe how memory of object location is selectively impaired across eye movements, whereas object colour memory is relatively immune to intervening eye movements. Results are reported for young and elderly healthy controls, as well as PD and AD participants.

      A computational model is introduced to account for these results, indicating how early differences in memory encoding and decay (but not trans-saccadic updating per se) can account for the observed differences between healthy controls and clinical groups.

      Strengths:

      The data presented encompass young and elderly healthy controls, as well as clinical groups.

      The authors introduce an interesting modelling strategy, aimed at isolating and identifying the main components behind the observed pattern of results.

      Weaknesses:

      The models tested differ in terms of the number of parameters. In general, a larger number of parameters leads to a better goodness of fit. It is not clear how the difference in the number of parameters between the models was taken into account.

      It is not clear whether the modelling results could be influenced by overfitting (it is not clear how well the model can generalize to new observations).
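      The two concerns above (parameter count and overfitting) are commonly addressed with penalized fit criteria such as AIC and BIC. The sketch below shows how such penalties work; the log-likelihoods, parameter counts, and trial number are hypothetical and are not taken from the manuscript, which may use a different comparison procedure.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: lower is better.

    The penalty per parameter grows with the number of observations.
    """
    return n_params * math.log(n_obs) - 2 * loglik


# Hypothetical fits: the complex model fits slightly better but has more parameters.
simple_model = {"loglik": -100.0, "n_params": 3}
complex_model = {"loglik": -98.0, "n_params": 7}
n_trials = 100

aic_simple = aic(**simple_model)
aic_complex = aic(**complex_model)
bic_simple = bic(n_obs=n_trials, **simple_model)
bic_complex = bic(n_obs=n_trials, **complex_model)
```

      In this example the complex model has the higher likelihood but loses on both criteria, illustrating that a better raw fit alone does not justify additional parameters. Out-of-sample checks such as cross-validation would address generalization more directly.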

      Results specificity: it is not clear how specific the modelling results are with respect to constructional ability (measured via the Rey-Osterrieth Complex Figure test). As with any cognitive test, performance can also be influenced by general, non-specific abilities that contribute broadly to test success.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript describes the use of computational tools to design a mimetic of the interleukin-7 (IL-7) cytokine with superior stability and receptor binding activity compared to the naturally occurring molecule. The authors focused their engineering efforts on the loop regions to preserve receptor interfaces while remediating structural irregularities that destabilize the protein. They demonstrated the enhanced thermostability, production yield, and bioactivity of the resulting molecule through biophysical and functional studies. Overall, the manuscript is well written, novel, and of high interest to the fields of molecular engineering, immunology, biophysics, and protein therapeutic design. The experimental methodologies used are convincing; however, the article would benefit from more quantitative comparisons of bioactivity through titrations.

      Comments on revisions:

      All comments have been sufficiently addressed, with the exception of comment 24 from Reviewer 1. The authors need to modify the manuscript abstract, introduction, and/or discussion to clarify which limitations of IL-7 were addressed by their molecule and to note the limitations of their approach in terms of mitigating toxicity or enhancing half-life.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript reports a prospective longitudinal study examining whether infants with high likelihood (HL) for autism differ from low-likelihood (LL) infants in two levels of word learning: brain-to-speech cortical entrainment and implicit word segmentation. The authors report reduced syllable tracking and post-learning word recognition in the HL group relative to the LL group. Importantly, both the syllable-tracking entrainment measure and the word recognition ERP measure are positively associated with verbal outcomes at 18-20 months, as indexed by the Mullen Verbal Developmental Quotient. Overall, I found this to be a thoughtfully designed and carefully executed study that tackles a difficult and important set of questions. With some clarifications and modest additional analyses or discussion on the points below, the manuscript has strong potential to make a substantial contribution to the literature on early language development and autism.

      Strengths:

      This is an important study that addresses a central question in developmental cognitive neuroscience: what mechanisms underlie variability in language learning, and what are the early neural correlates of these individual differences? While language development has a relatively well-defined sensitive period in typical development, the mechanisms of variability - particularly in the context of neurodevelopmental conditions - remain poorly understood, in part because longitudinal work in very young infants and toddlers is rare. The present study makes a valuable contribution by directly targeting this gap and by grounding the work in a strong theoretical tradition on statistical learning as a foundational mechanism for early language acquisition.

      I especially appreciate the authors' meticulous approach to data quality and their clear, transparent description of the methods. The choice of partial least squares correlation (PLS-c) is well motivated, given the multidimensional nature of the data and collinearity among variables, and the manuscript does a commendable job explaining this technique to readers who may be less familiar with it.

      The results reveal interesting developmental changes in syllable tracking and word segmentation from birth to 2 years in both HL and LL infants. Simply mapping these trajectories in both groups is highly valuable. Moreover, the associations between neural indices of brain-to-speech entrainment and word segmentation with later verbal outcomes in the LL group support a critical role for speech perception and statistical learning in early language development, with clear implications for understanding autism. Overall, this is a rich dataset with substantial potential to inform theory.

      Weaknesses:

      (1) Clarifying longitudinal vs. concurrent associations

      Because the current analytical approach incorporates all time points, including the final visit, it is challenging to determine to what extent the brain-language associations are driven by longitudinal relationships vs. concurrent correlations at the last time point. This does not undermine the main findings, but clarifying this issue could significantly enhance the impact of the individual-differences results. If feasible, the authors might consider (a) showing that a model excluding the final visit still predicts verbal outcomes at the last visit in a similar way, or (b) more explicitly acknowledging in the discussion that the observed associations may be partly or largely driven by concurrent correlations. Either approach would help readers interpret the strength and nature of the longitudinal claims.

      (2) Incorporating sleep status into longitudinal models

      Sleep status changes systematically across developmental stages in this cohort. Given that some of the papers cited to justify the paradigm also note limitations in speech entrainment and word segmentation during sleep or in patients with impaired consciousness, it would be helpful to account for sleep more directly. Including sleep status as a factor or covariate in the longitudinal models, or at least elaborating more fully on its potential role and limitations, would further strengthen the conclusions and reassure readers that these effects are not primarily driven by differences in sleep-wake state.

      (3) Use of PLS-c and potential group × condition interactions

      I am relatively new to PLS-c. One question that arose is whether PLS-c could be extended to handle a two-way interaction between group and condition contrasts (STR vs. RND). If so, some of the more complex supplementary models testing developmental trajectories within each group (Page 8, Lines 258-265) might be more directly captured within a single, unified framework. Even a brief comment in the methods or discussion about the feasibility (or limitations) of modeling such interactions within PLS-c would be informative for readers and could streamline the analytic narrative.

      (4) STR-only analyses and the role of RND

      Page 8, Lines 241-245: This analysis is conducted only within the STR condition. The lack of group difference observed here appears consistent with the lack of group difference in word-level entrainment (Page 9, Lines 292-294), suggesting that HL and LL groups may not differ in statistical learning per se, but rather in syllabic-level entrainment. As a useful sanity check and potential extension, it might be informative to explore whether syllable-level entrainment in the RND condition differs between groups to a similar extent as in Figure 2C-D. In other work (e.g., adults vs. children; Moreau et al., 2022), group differences can be more pronounced for syllable-level than for word-level entrainment. Figure S6 seems to hint that a similar pattern may exist here. If feasible, including or briefly reporting such an analysis could help clarify the asymmetry between the two learning measures and further support the interpretation of syllabic-level differences.

      (5) Multi-speaker input and voice perception (Page 15, Lines 475-483)

      The multi-speaker nature of the speech input is an interesting and ecologically relevant feature of the design, but it does add interpretive complexity. The literature on voice perception in autism is still mixed: for example, Boucher et al. (2000) reported no differences in voice recognition and discrimination between children with autism and language-matched non-autistic peers, whereas behavioral work in autistic adults suggests atypical voice perception (e.g., Schelinski et al., 2016; Lin et al., 2015). I found the current interpretation in this paragraph somewhat difficult to follow, partly because the data do not directly test how HL and LL infants integrate or suppress voice information. I think the authors could strengthen this section by slightly softening and clarifying the claims.

      (6) Asymmetry between EEG learning measures

      The text on Page 16, Lines 502-507 touches on the asymmetry between the two EEG learning measures but leaves some questions for the reader. The presence of word recognition ERPs in the LL group suggests that a failure to suppress voice information during learning did not prevent successful word learning. At the same time, there is an interesting complementary pattern in the HL group, who show LL-like word-level entrainment but do not exhibit robust word recognition. Explicitly discussing this asymmetry - why HL infants might show relatively preserved word-level entrainment yet reduced word recognition ERPs, whereas LL infants show both - would enrich the theoretical contribution of the manuscript.

      References:

      (1) Moreau, C. N., Joanisse, M. F., Mulgrew, J., & Batterink, L. J. (2022). No statistical learning advantage in children over adults: Evidence from behaviour and neural entrainment. Developmental Cognitive Neuroscience, 57, 101154. https://doi.org/10.1016/j.dcn.2022.101154

      (2) Boucher, J., Lewis, V., & Collis, G. M. (2000). Voice processing abilities in children with autism, children with specific language impairments, and young typically developing children. Journal of Child Psychology and Psychiatry, 41(7), 847-857. https://doi.org/10.1111/1469-7610.00672

      (3) Schelinski, S., Borowiak, K., & von Kriegstein, K. (2016). Temporal voice areas exist in autism spectrum disorder but are dysfunctional for voice identity recognition. Social Cognitive and Affective Neuroscience, 11(11), 1812-1822. https://doi.org/10.1093/scan/nsw089

      (4) Lin, I.-F., Yamada, T., Komine, Y., Kato, N., Kato, M., & Kashino, M. (2015). Vocal identity recognition in autism spectrum disorder. PLOS ONE, 10(6), e0129451. https://doi.org/10.1371/journal.pone.0129451

    2. Reviewer #2 (Public review):

      Summary:

      This article looks at differences in how the brain entrains to, or tracks, the rhythmic presentation of syllables and words in speech in infants at increased likelihood versus low likelihood for autism. The authors first sought to characterize how brain responses are modulated by learning the statistical probability of a given syllable following the one before it over the first two years of life. They then sought to identify at which stages of word learning infants with increased likelihood of autism showed difficulties, and whether those difficulties worsened over time. Finally, they sought to indicate whether infants' statistical learning and word learning abilities could predict later verbal skills. The authors found similar developmental trajectories of neural entrainment to syllables in infants at high and low likelihood for autism, but infants at high likelihood for autism had overall weaker syllable-level entrainment. Infants at high versus low likelihood for autism showed different developmental trajectories for word entrainment. Lower syllable entrainment in high-likelihood infants corresponded with poorer verbal outcomes, but word entrainment was not associated with verbal outcomes. Event-related potential responses to words and part-words, however, were positively associated with verbal outcomes, but only in low-likelihood infants.

      Strengths:

      Overall, the article provides rigorous statistical analysis of longitudinal EEG data and strong support for the claim that neural entrainment to syllable and word features of speech may be a useful marker for language development difficulties, particularly in infants at increased likelihood for neurodevelopmental disorders. The EEG data collection and preprocessing procedures are well within standards in the field. Readers should take care to note that the authors indexed neural entrainment to speech using phase-locking values instead of spectral power.
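
      To illustrate why the distinction between phase-locking values and spectral power matters, the sketch below contrasts the two on simulated single-frequency trials. It assumes the generic definition of a phase-locking value as the length of the mean unit phasor across trials. The manuscript's exact computation is not described here, and all names and numbers in the code are hypothetical.

```python
import cmath
import math
import random


def phase_locking_value(phases):
    """Length of the mean unit phasor across trials.

    Returns a value near 0 for random phases and near 1 for phases that
    are consistent from trial to trial.
    """
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


def mean_power(amplitudes):
    """Average squared amplitude across trials (phase is ignored)."""
    return sum(a * a for a in amplitudes) / len(amplitudes)


rng = random.Random(0)  # fixed seed so the simulation is reproducible
n_trials = 200

# Both simulated conditions use the same amplitude on every trial, so
# their spectral power at the frequency of interest is identical.
amplitudes = [1.0] * n_trials

# Condition A: phases cluster around a common value (entrained response).
locked_phases = [0.3 + rng.gauss(0.0, 0.2) for _ in range(n_trials)]
# Condition B: phases are uniformly random (no entrainment).
random_phases = [rng.uniform(-math.pi, math.pi) for _ in range(n_trials)]

plv_locked = phase_locking_value(locked_phases)
plv_random = phase_locking_value(random_phases)
power_both = mean_power(amplitudes)

print(f"power (both conditions): {power_both:.2f}")
print(f"PLV locked: {plv_locked:.2f}, PLV random: {plv_random:.2f}")
```

      In this toy case the two conditions are indistinguishable in power but clearly separated by phase consistency, which is why results based on one measure should not be assumed to hold for the other.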

      Weaknesses:

      While the statistical analyses are rigorous, a few of the components of the models are not clearly defined, and some corrections and thresholds for significance warrant further justification. Further, a few stimulus and participant details that could influence results are not specified. It is not clear whether all participants came from majority French-speaking families; differences in the amount of French language exposure (compared to other languages that may be spoken by a participant's family) could influence results. The standardized volume of the stimuli is also not reported. As a result, readers can reasonably regard neural entrainment to speech features as a likely useful mechanism for explaining differences in language development, but should treat this interpretation with some caution.

    1. Reviewer #1 (Public review):

      Summary:

      The authors present a novel investigation of the movement vigor of individuals completing a synchronous extension-flexion task. Participants were placed into groups of two (so-called "dyads") and asked to complete shared movements (connected via a virtual loaded spring) to targets placed at varying amplitudes. The authors attempted to quantify what, if any, adjustments in movement vigor individual participants made during the dyadic movements, given the combined or co-dependent nature of the task. This is a novel, timely question of interest within the broader field of human sensorimotor control.

      Participants from each dyad were labeled as "slow" (low vigor) or "fast" (high vigor), and their respective contributions to the combined movement metrics were assessed. The authors presented four candidate models for dyad interactions: (a) independent motor plans (i.e., co-activity hypothesis), (b) individual-led motor plans (i.e., leader-follower hypothesis), (c) generalization to a weighted average motor plan (i.e., weighted adaptation hypothesis), and (d) an uncertainty-based model of dynamic partner-partner interaction (i.e., interactive adaptation hypothesis). The final model allowed for dynamic changes in individual motor plans (and therefore, movement vigor) based on partner-partner interactions and observations. After detailed observations of interaction torque and movement duration (or vigor), the authors concluded that the interactive adaptation model provided the best explanation of human-human interaction during self-paced dyadic movements.

      Strengths:

      The experimental setup (simultaneous wrist extension-flexion movements) has been thoroughly vetted. The task was designed particularly well, with adequate block pseudo-randomization to ensure general validity of the results. The analyses of torque interaction, movement kinematics, and vigor are sound, as are the statistical measures used to assess significance. The authors structured the work via a helpful comparison of several candidate models of human-human interaction dynamics, and how well said models explained variance in the vigor of solo and combined movements. The research question is timely and extends current neuroscientific understanding of sensorimotor control, particularly in social contexts.

      Weaknesses:

      (1) My chief concern about the study as it currently stands is the relatively low number of data points (n=10). The authors recruited 20 participants, but the primary conclusions are based on dyad-specific interactions (i.e., analyses of "fast" vs "slow" participants in each pair). Some of these analyses would benefit greatly, in terms of power, from the addition of more data points.

      1a) The distribution of delta-vigor (Fast group vs Slow group) is highly skewed (see Figures 3D, S6D), with over half of the dyads exhibiting delta-vigor less than 0.2 (i.e., less than 20% of unit vigor). Given the relatively low number of dyads, it would be helpful for the authors to provide explicit listings of VigorFast, VigorSlow, and VigorCombined for each of the 10 separate dyads or pairings.

      1b) The authors concluded that the interactive adaptation hypothesis provided the best summary of the combined movement dynamics in the study. If this is indeed the case, then the relative degree of difference in vigor between the fast and slow participants in a dyad should matter. How well did the interactive adaptation model explain variance in the dyads with relatively low delta-vigor (e.g., less than 0.2) vs relatively high delta-vigor?

      (2) The authors shared the results of one analysis of reaction time, showing that the reaction times of the slow partners and the fast partners did not differ during the initial passive block. Did the authors observe any changes in RT of either the slow or fast partner during the combined (primary task) blocks (KL, KH, etc.)? If the pairs of participants did indeed employ a form of interactive adaptation, then it is certainly plausible that this interaction would manifest in the initial movement planning phase (i.e., RT) in addition to the vigor and smoothness of the movements themselves.

    2. Reviewer #2 (Public review):

      Summary:

      This study examines how individual movement vigor is integrated into a shared, dyadic vigor when two individuals are physically coupled. Participants performed wrist-reaching movements toward targets at different distances while mechanically linked via a virtual elastic band, and dyads were formed by pairing participants with different baseline vigor profiles. Under interaction conditions, movements converged to coordinated patterns that could not be explained by simple averaging, indicating that each dyad behaved as a single functional unit. Notably, under coupling, movement durations for both partners were shorter than in the solo condition, arguing against the view that each individual simply executed an independent movement plan. Furthermore, dyadic vigor was primarily predicted by the slower partner's vigor rather than by the faster partner's, suggesting that neither a leader-follower strategy nor a weighted averaging account fully explains the observed behavior. The authors propose a computational model in which both partners adapt to the emerging interaction dynamics ("interactive adaptation strategy"), providing a coherent explanation of the behavioral observations.

      Strengths:

      The study is carefully designed and addresses an important question about how individual movement vigor is integrated during joint action. The experimental paradigm allows systematic manipulation of interaction strength and partner asymmetry. The behavioral results show clear and robust patterns, particularly the shortening of movement durations under elastic coupling (KL and KH conditions) and the asymmetrical contribution of the slower partner's vigor to dyadic vigor. The computational model captures the main behavioral patterns well and provides a principled framework for interpreting dyadic vigor not as a simple combination of two independent motor plans, but as an emergent property arising from mutual adaptation. Conceptually, the study is notable in extending the notion of vigor from an individual attribute to a dyad-level construct, opening a new perspective on coordinated movement and motor decision-making.

      Weaknesses:

      A key conceptual issue concerns the apparent asymmetry between partners in the computational framework. While dyadic vigor is empirically better predicted by the slower partner's vigor, the model formulation appears to emphasize the faster partner's time-related cost and interaction forces. Although the cost function includes an uncertainty-related component associated with the slower partner, it remains unclear from the current formulation and description how dyadic vigor is formally derived from the slower partner's control policy within the same modeling framework. This raises an important question regarding whether the model offers a symmetric account of dyadic vigor formation for both partners or whether it is effectively anchored to the faster partner's control architecture.

      A second conceptual issue concerns the interpretation of the term "motor plan." It remains unclear whether this term refers primarily to movement-related characteristics such as speed or duration, or more broadly to the underlying optimization structure that governs these variables. This distinction is theoretically important, as it determines whether the reported interaction effects should be understood as adjustments in movement characteristics or as changes in the structure of the control policy itself.

    3. Reviewer #3 (Public review):

      Summary:

      This study provides novel insights into how individuals regulate the speed of their movements both alone and in pairs, highlighting consistent differences in movement vigor across people and showing that these differences can adapt in dyadic contexts. The findings are significant because they reveal stable individual patterns of action that are flexible when interacting with others, and they suggest that multiple factors, beyond reward sensitivity, may contribute to these idiosyncrasies. The evidence is generally strong, supported by careful behavioral measurements and appropriate modeling, though clarifying some statistical choices and including additional measures of accuracy and smoothness would further strengthen the support for the conclusions.

      Major Comments:

      (1) Given the idiosyncrasies in individual vigor, would linear mixed models (LMMs) be more appropriate than ANOVAs in some analyses (e.g., in the section "Solo session"), as they can account for random intercepts and slopes on vigor measures? Some figures (e.g., Figure 2.B and 3.E) indeed seem to show that some aspects of behaviour may present variability in slopes and intercepts across participants. In fact, I now realize that LMMs are used in the "Emergence of dyadic vigor from the partners' individual vigor" section, so could the authors clarify why different statistical approaches were applied depending on the sections?

      (2) If I understand correctly, the introduction suggests that idiosyncrasies in movement vigor may be driven by inter-individual differences in reward sensitivity. However, the current task does not involve any explicit rewards, yet the authors still observe idiosyncrasies in vigor, which is interesting. Could this indicate that other factors contribute to these consistent individual differences? For example, could sensitivity to temporal costs or physical effort explain the slow versus fast subgrouping? Specifically, might individuals more sensitive to temporal costs move faster to minimize opportunity costs, and might those less sensitive to effort costs also move faster? Along the same lines, could the two subgroups (slow vs. fast) be characterized in terms of underlying computational "phenotypes," such as their sensitivities to time and effort? If this is not feasible with the current dataset, it would still be valuable to discuss whether these factors could plausibly account for the observed patterns, based on existing literature.

      (3) The observation that dyads did not lose accuracy or smoothness despite changes in vigor is interesting and suggests a shift in the speed-accuracy tradeoff. Could the authors include accuracy and smoothness measures in the main figures rather than only in supplementary materials? I think it would make the manuscript more complete.

      (4) It is a bit unclear to me whether the variance assumptions for ANOVAs were checked, for instance, in Figure 3H.

    1. Reviewer #1 (Public review):

      Summary

      The strength of this manuscript lies in the behavior: mice use a continuous auditory background (pink vs brown noise) to set a rule for interpreting an identical single-whisker deflection (lick in W+ and withhold in W− contexts) while always licking to a brief 10 kHz tone. Behaviorally, animals acquire the rule and switch rapidly at block transitions and take a few trials to fully integrate the context cue. What's nice about this behavior is the separate auditory cue, which shows the animals remain engaged in the task, so it's not just that the mice check out (i.e., become disengaged in the W- context). The authors then use optical tools, combining cortex-wide optogenetic inactivation (using localized inhibition in a grid-like fashion) with widefield calcium imaging to map what regions are necessary for the task and what the local and global dynamics are. Classic whisker sensorimotor nodes (wS1/wS2/wM/ALM) behave as expected with silencing reducing whisker-evoked licking. Retrosplenial cortex (RSC) emerges as a somewhat unexpected, context-specific node: silencing RSC (and tjS1) increases licking selectively in W−, arguing that these regions contribute to applying the "don't lick" policy in that context. I say somewhat because work from the Delamater group points to this possibility, albeit in a Pavlovian conditioning task and without neural data. I would still recommend the authors of the current manuscript review that work to see whether there is a relevant framework or concept (Castiello, Zhang, Delamater, 'The retrosplenial cortex as a possible 'sensory integration' area: a neural network modeling approach of the differential outcomes effect of negative patterning', 2021, Neurobiology of Learning and Memory).

      The widefield imaging shows that RSC is the earliest dorsal cortical area to show W+ vs W− divergence after the whisker stimulus, preceding whisker motor cortex, consistent with RSC injecting context into the sensorimotor flow. A "Context Off" control (continuous white noise; same block structure) impairs context discrimination, indicating the continuous background is actually used to set the rule (an important addition!). Pre-stimulus functional-connectivity analyses suggest that some activity correlations map onto the context, presumably because of the continuous background auditory cue. Simultaneous opto+imaging projects perturbations into a low-dimensional subspace that separates lick vs no-lick trajectories in an interpretable way.

      In my view, this is a clear, rigorous systems-level study that identifies an important role for RSC in context-dependent sensorimotor transformation, thereby expanding RSC's involvement beyond navigation/memory into active sensing and action selection. The behavioral paradigm is thoughtfully designed, the claims related to the imaging are well defended, and the causal mapping is strong. I have a few suggestions for clarity that may require a bit of data analysis. I also outline one key limitation that should be discussed, but is likely beyond the scope of this manuscript.

      Major strengths

      (1) The task is a major strength. It asks the animal to generate differential motor output to the same sensory stimulus, does so in a block-based manner, and the Context-Off condition convincingly shows that the continuous contextual cue is necessary. The auditory tone control ensures this is more than a 'motivational' context but is decision-related. In fact, the slightly higher bias to lick on the catch trials in the W+ context is further evidence for this.

      (2) The dorsal-cortex optogenetic grid avoids a 'look-where-we-expect' approach and lets RSC fall out as a key node. The authors then follow this up with pharmacology and latency analyses to rule out simple motor confounds. Overall, this is rigorous and thoughtfully done.

      (3) While the mesoscale imaging doesn't allow for cellular resolution, it allows for mapping of the flow of information. It places RSC early in the context-specific divergence after whisker onset, a valuable piece that complements prior work.

      (4) The baseline (pre-stim) functional connectivity and the opto-perturbation projections into a task subspace increase the significance of the work by moving beyond local correlates.

      Key limitation

      The current optogenetic window begins ~10 ms before the sensory cue and extends 1 s after, which is ideal for perturbing within-trial dynamics but cannot isolate whether RSC is required to maintain the context-specific rule during the baseline. Because context is continuously available, I wonder whether RSC is the locus maintaining or, instead, gating the context signal. The paper's results are fully consistent with that possibility, but causality in the pre-stimulus window remains an open question. (As a pointer for future work, pre-stimulus-only inactivation, silencing around block switches, or context-omission probe trials (e.g., removing the background noise unexpectedly within a W+ or W− context block) could help separate 'holding' from 'gating' of the rule. I am not suggesting these are needed for this manuscript, but they would be interesting for future studies.)

    2. Reviewer #2 (Public review):

      Summary:

      The authors aim to understand the neural basis of context-dependent sensory processing and decision-making.

      Strengths:

      They used an innovative behavioral paradigm where the action-outcome association changes independent of the sensory stimulus. This theoretically allows the authors to disentangle the effect of behavioral context on sensory processing. Using this approach combined with optogenetic silencing, they discover that RSC activity is necessary for suppressing a lick response when the stimulus switches to the unrewarded context.

      Weaknesses:

      Sensory processing appears to be entangled with jaw/tongue movement initiation. Activity in M1 and RSC during auditory-evoked lick responses appears to be identical to activity during whisker-evoked lick responses, indicating that movement initiation is the main driver of M1/RSC activity, rather than changes in the flow of sensory information. If sensory information were the main driver of the initial M1/RSC response, then auditory evoked responses should have a longer latency. Perhaps this is beyond the resolution of the calcium indicator or imaging frame rate. It is not clear from the data shown if differences in S1 activity when comparing W+ and W- stimulation are caused by context-sensitive sensory processing or whisker movement following whisker deflection.

    1. Reviewer #1 (Public review):

      The authors address a set of important and challenging questions at the interface of (developmental) neuroscience, genetics, and computation. They ask how complex neural circuits could emerge from compact genomic information, and they outline a bold vision in which this process might eventually be harnessed to design synthetic biological intelligence through genetic control of synaptogenesis. These are significant and stimulating ideas that merit rigorous theoretical and experimental exploration.

      However, the present work does not convincingly engage with these questions at a mechanistic level. Most of the circuit formation aspects appear to be adopted from prior models, and it is not clear how the main methodological modifications (introducing synaptic conductance and stochastic formalisms) provide new conceptual insight into genomic specification of neural circuitry. The manuscript does not include significant biological data or validation to support the proposed framework, and the results provided instead use artificial reinforcement learning benchmarks, which do not appear informative with respect to the biological claims.

      Overall, while the manuscript raises intriguing themes and ambitions, the proposed model is conceptually disconnected from the biological problem it purports to address. The strength of evidence does not support the strong interpretative or translational claims, and substantial rethinking of the modeling framework, in particular its validation strategy, would be required for the work to match the claims of our improved understanding of the genomic basis of neural circuit formation and our ability to engineer it.

    2. Reviewer #2 (Public review):

      In this manuscript, the authors built upon the Connectome Model literature and proposed SynaptoGen, a differentiable model that explicitly takes into account multiplicity and conductance in neural connectivity. The authors evaluated SynaptoGen through simulated reinforcement learning tasks and established its performance as often superior to two considered baselines. This work is a valuable addition to the field, supported by a solid methodology with some details and limitations missing.

      Major points:

      (1) The genetic features in the X and Y matrices in the CM were originally introduced as combinatorial gene expression patterns that correspond to the presence and even absence of a subset of genes. The authors oversimplify this original scope by only considering single-gene expression features. While this was arguably a reasonable first approximation for a case study of gap junctions in C. elegans, it is by no means expected to be a plausible expectation for chemical synapses. As the authors appear to motivate their model by chemical synapses that have polarities, they should either consider combinatorial rules in the model or at least present this explicitly as a key limitation of the model. Omitting combinatorial effects also renders the presented "bioplausible" baseline much less bioplausible, likely calling for a different name.

      (2) It is not fully explained how Equation (11) is obtained, even conceptually. It is unclear why \bar{B} and \bar{G} should be element-wise multiplied together, both already being expected values. Moreover, the authors acknowledged in lines 147-149 that the components of \bar{G} actually depend on gene expression X, which is a component in \bar{B}, so the logic here seems circular.

      (3) The authors considered two baselines, namely SNES and a bioplausible control. However, it would be of interest to also investigate: a) Vanilla DQN with the same size trained on the same MLP, to judge whether the biological insights behind SynaptoGen parameterization add value to performance. b) Using Equation (7) instead of Equation (11) to construct the weight matrices, to judge whether incorporating the conductance adds value to performance.
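
      The concern raised in point (2) can be stated compactly. As a sketch only, assume (this notation is not taken from the manuscript) that B and G denote the random multiplicity and conductance matrices whose expectations are \bar{B} and \bar{G}, and that Equation (11) is meant as an expected synaptic weight. The standard identity for the expectation of a product then gives, entry by entry:

```latex
\mathbb{E}\left[B_{ij}\,G_{ij}\right]
  = \bar{B}_{ij}\,\bar{G}_{ij} + \operatorname{Cov}\left(B_{ij}, G_{ij}\right),
\qquad\text{so}\qquad
\mathbb{E}\left[B \odot G\right] = \bar{B} \odot \bar{G}
\iff \operatorname{Cov}\left(B_{ij}, G_{ij}\right) = 0 \ \text{for all } i, j.
```

      Under this reading, multiplying the two expected values element-wise is exact only when multiplicity and conductance are uncorrelated for every pair of neurons, which is difficult to reconcile with both quantities depending on the same gene expression X.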

    3. Reviewer #3 (Public review):

      Summary

      Boccato et al. present an ambitious and thoughtfully developed framework, SynaptoGen, which proposes a differentiable model of synaptogenesis grounded in gene-expression vectors, protein interaction probabilities, and conductance rules. The authors aim to bridge the gap between computational connectomics and synthetic biological intelligence by enabling gradient-based optimization of genetically encoded circuit architectures. They support this goal with mathematical derivations, simulation experiments across several RL benchmarks, and a biologically grounded validation using C. elegans adhesion-molecule co-expression data. The paper is timely and conceptually compelling, offering a unified formulation of synaptic multiplicity and synaptic weight formation that can be integrated directly into learning systems.

      Strengths

      (1) Well-motivated framework with clear conceptual contributions.

      (2) Rigorous mathematical development.

      (3) Compelling empirical validation.

      (4) Excellent framing and discussion of future impact.

      Weaknesses

      (1) Overstated claims in the abstract and discussion.

      (2) Ambiguity in "first of its kind" assertions.

    1. Reviewer #1 (Public review):

      Summary:

      The study examined the extent to which children's word recognition skill improves across early development, becoming faster, more accurate and less variable, and the extent to which word recognition skill is related to children's concurrent and later vocabulary knowledge.

      Strengths:

      The main strength of the study comes from the dataset, which recycles previously collected data from 24 studies to examine the development of word recognition skill using data from 1963 children. This maximizes the impact of previously collected data while also allowing the study to reliably ask big-picture questions on the development of word recognition skill and its relation to chronological age and vocabulary knowledge. Data analysis is rigorous, thought through and very clearly described. Data and code necessary to reproduce the manuscript are shared on the project's GitHub.

      Weaknesses:

      The limitations of the study are acknowledged to some extent, but this acknowledgment needs to be strengthened and carried throughout the manuscript. Thus, in the discussion, the authors note that the approach is observational and exploratory, and highlight what is for me a key alternative explanation of the findings, namely that faster children could be faster due to their larger vocabulary, rather than faster children learning more words. Indeed, the latter explanation for the relationship is called into question, given that growth in speed was not related to growth in vocabulary. Here, the authors note that the null result may be related to the fact that they do not have sufficiently precise estimates of growth slopes, rather than taking seriously the alternative explanation that there may not be as causal a link between being a faster word learner and a better word learner (i.e., learning more words). This is especially so since (correct me if I'm wrong here) current vocabulary size is not taken into consideration in the model examining vocabulary growth. Given the increasing number of studies showing that current vocabulary knowledge predicts vocabulary growth (Laing, Kalinowski et al, Siew & Vitevitch), one simple alternative explanation is that current vocabulary knowledge predicts both current word recognition skill and later vocabulary knowledge. Is there anything in the data speaking against this hypothesis?

      Equally, while the SEM examines vocabulary growth controlling for age, I wonder about the other way around. What would happen to the effect of age on word recognition skill (in the LME model, S8) if one were to add concurrent vocabulary size? So does chronological age explain word recognition skill or vocabulary knowledge? Right now, the manuscript describes this effect purely related to chronological age, but is it age per se or other cognitive abilities, including a key change across development, namely, vocabulary size? Thus, the presentation of the skill learning hypothesis suggests that age is a proxy for experience, while you actually have here a very nice proxy for experience in terms of children's vocabulary size.

      Critically, while the discussion is more nuanced, the way the abstract is concluded and the way the Introduction is phrased suggest that the study is able to answer a causal question, which, as the authors themselves note, is not possible. The abstract, for instance, states that word recognition becomes faster, more accurate and less variable...consistent with a process of skill learning. And also that this skill plays a role in supporting early language learning, which is very causal language. I don't think you can really claim that you are testing the two hypotheses you suggest here. The work is definitely embedded in the context of these hypotheses, but are you really able to test them? My worry is that, while the discussion is more nuanced, this study will then be cited down the line as showing that children learn more words because they are faster at recognizing words; anything that you can do to temper such interpretations would be good for the literature. For me, this should not just be relegated to the discussion but should be touched upon in the abstract and Introduction.

      Finally, it would help to talk more about the mechanisms at work in any relationship between word recognition and language learning. It seems to me that this would rely on some predictive processing framework, given the description on page 4, and it would be good to make this clear (the faster and more accurately you can recognize a ball, the better you can use this evidence to infer the speaker's intended meaning). Equally, when referring to word recognition, it would be good to clarify what this refers to: how well a child knows what a word refers to (and in the context of LWL, what it does not refer to) or how quickly the child directs attention to what is referred to.

      With regards to the data, I wonder if there is a clustering of kids past 24 months that is happening here, looking at Figures 1 and 2, where it seems like there is less change past the 24-month point. Is there any way to look at whether the effect of age or vocabulary on word recognition is not linear but asymptotic?
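
      One way to probe the linear-versus-asymptotic question is to compare a straight-line fit with a saturating curve on the same data. The following is a minimal sketch under stated assumptions: the data are synthetic (not Peekbank data), the function names `ols` and `fit_asymptotic` are hypothetical, and the saturating form y = a + b*exp(-k*age) is only one of several possible choices. A real analysis would presumably keep the mixed-effects structure of the original models and compare fits with a criterion that penalizes the extra parameter.

```python
import math

def ols(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, rss)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def fit_asymptotic(age, y, rates):
    """Fit y = a + b*exp(-k*age): grid search over k, OLS for a and b."""
    best = None
    for k in rates:
        z = [math.exp(-k * t) for t in age]  # transformed predictor
        a, b, rss = ols(z, y)
        if best is None or rss < best[3]:
            best = (a, b, k, rss)
    return best

# Synthetic illustration only: accuracy that saturates with age (months)
age = list(range(12, 49, 3))
accuracy = [0.9 - 0.6 * math.exp(-0.1 * t) for t in age]

_, _, rss_linear = ols(age, accuracy)
a, b, k, rss_asym = fit_asymptotic(age, accuracy, [0.02 * i for i in range(1, 16)])
print(f"linear RSS = {rss_linear:.5f}, asymptotic RSS = {rss_asym:.5f}, k = {k:.2f}")
```

      If the residual error of the saturating fit is clearly lower than that of the linear fit, that would support the impression from Figures 1 and 2 that change slows after about 24 months.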

    2. Reviewer #2 (Public review):

      Summary:

      This paper presents a series of analyses of a large dataset combining many prior studies of early word recognition (Peekbank). The analyses demonstrate that the speed, accuracy and consistency of word learning improve with age. Moreover, the speed of word learning early in development was related to vocabulary growth over time.

      Strengths:

      A key strength of the paper is the use of a large multi-study dataset. This is particularly valuable in the field of early cognitive development, which has (due to practical limitations) often been based on small-scale studies that necessarily provide a shaky foundation for conclusions. The analyses are also well-motivated.

      Weaknesses:

      The weaknesses I saw are primarily in some aspects of the conceptual motivation for the research.

      First, I wasn't entirely clear about what the authors meant by "word recognition ability". For much of the manuscript (including the use of the term "word recognition ability" itself), this comes across as an intrinsic ability or skill that improves with development. Alternatively, the speed and accuracy metrics taken from studies in Peekbank might capture children's increasing knowledge of the common, concrete words typically used in these studies. To me, this is a somewhat different construct from a general skill at recognizing words. It would be helpful if the authors could clarify which construct they intend to capture, or if it is not possible to distinguish between these constructs from the Peekbank data.

      Second, and relatedly, if the source of the age-related improvements is increasing experience with the common concrete words used in the Peekbank studies, then one might expect word recognition and improvements with age to be related to word frequency, given that more frequent words are experienced more often. Word frequency predicts word knowledge when assessed using CDI data. Can effects of frequency be detected in Peekbank word recognition metrics? If not, why? Similarly, is the speed and accuracy of word recognition in Peekbank data related to CDI-derived word age of acquisition, and again, if not, why?

      Finally, there is a bit of a risk of the main findings of this paper coming across as a foregone conclusion. I.e., how could it be otherwise that word recognition improves with development?

    1. Reviewer #1 (Public review):

      Summary:

      Wu and colleagues aimed to explain previous findings that adolescents, compared to adults, show reduced cooperation following cooperative behaviour from a partner in several social scenarios. The authors analysed behavioural data from adolescents and adults performing a zero-sum Prisoner's Dilemma task and compared a range of social and non-social reinforcement learning models to identify potential algorithmic differences. Their findings suggest that adolescents' lower cooperation is best explained by a reduced learning rate for cooperative outcomes, rather than differences in prior expectations about the cooperativeness of a partner. The authors situate their results within the broader literature, proposing that adolescents' behaviour reflects a stronger preference for self-interest rather than a deficit in mentalising.

      Strengths:

      The work as a whole suggests that, in line with past work, adolescents prioritise value accumulation, and this can be, in part, explained by algorithmic differences in weighted value learning. The authors situate their work very clearly in past literature, and make clear the gap they are testing and trying to explain. The work also includes social contexts which move the field beyond non-social value accumulation in adolescents. The authors compare a series of formal approaches that might explain the results and establish generative and model-comparison procedures to demonstrate the validity of their winning model and individual parameters. The writing was clear, and the presentation of the results was logical and well-structured.

      Weaknesses:

      I had some concerns about the methods used to fit and approximate parameters of interest. Namely, the authors use maximum likelihood rather than hierarchical methods to fit models at the individual level; hierarchical fitting may reduce some of the outliers noted in the supplement and may also improve model identifiability.

      There was also little discussion of the structure of the Prisoner's Dilemma and the strategy of the game (that defection is always dominant), meaning that the preferences of the adolescents cannot necessarily be distinguished from the incentives of the game, i.e. they may seem less cooperative simply because they want to play the dominant strategy, rather than because of a lower preference for cooperation, all else being equal.

      The authors have now addressed my comments and concerns in their revised version.

      Appraisal & Discussion:

      Overall, I believe this work has the potential to make a meaningful contribution to the field. Its impact would be strengthened by more rigorous modelling checks and fitting procedures, as well as by framing the findings in terms of the specific game-theoretic context, rather than general cooperation.

      Comments on revisions:

      Thank you to the authors for addressing my comments and concerns.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript investigates age-related differences in cooperative behavior by comparing adolescents and adults in a repeated Prisoner's Dilemma Game (rPDG). The authors find that adolescents exhibit lower levels of cooperation than adults. Specifically, adolescents reciprocate partners' cooperation to a lesser degree than adults do. Through computational modeling, they show that this relatively low cooperation rate is not due to impaired expectations or mentalizing deficits, but rather a diminished intrinsic reward for reciprocity. A social reinforcement learning model with asymmetric learning rate best captured these dynamics, revealing age-related differences in how positive and negative outcomes drive behavioral updates. These findings contribute to understanding the developmental trajectory of cooperation and highlight adolescence as a period marked by heightened sensitivity to immediate rewards at the expense of long-term prosocial gains.
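
      The idea of an asymmetric learning rate can be made concrete with a small simulation. The sketch below is a generic Rescorla-Wagner update with separate learning rates for positive and negative prediction errors; it is not the authors' winning model, and the function names, parameter values, and partner sequence are all hypothetical.

```python
def update_expectation(value, outcome, alpha_pos, alpha_neg):
    """One Rescorla-Wagner step; the learning rate depends on the sign
    of the prediction error (outcome minus current expectation)."""
    delta = outcome - value
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return value + alpha * delta

def simulate(outcomes, alpha_pos, alpha_neg, initial=0.5):
    """Track the expected cooperativeness of a partner across trials."""
    value = initial
    trace = []
    for outcome in outcomes:
        value = update_expectation(value, outcome, alpha_pos, alpha_neg)
        trace.append(value)
    return trace

# Hypothetical partner behaviour: 1 = partner cooperated, 0 = partner defected
partner = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]

# Same sensitivity to defection, different sensitivity to cooperation
low_coop_learning = simulate(partner, alpha_pos=0.1, alpha_neg=0.4)
high_coop_learning = simulate(partner, alpha_pos=0.4, alpha_neg=0.4)
print(round(low_coop_learning[-1], 3), round(high_coop_learning[-1], 3))
```

      With an identical partner, the learner with the smaller learning rate for cooperative outcomes ends with a markedly lower expectation of cooperation, which is the kind of pattern the summary attributes to adolescents.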

      Strengths:

      (1) Rigorous model comparison and parameter recovery procedure.

      (2) Conceptually comprehensive model space.

      (3) Well-powered samples.

      Weaknesses:

      A key conceptual distinction between learning from non-human agents (e.g., bandit machines) and human partners is that the latter are typically assumed to possess stable behavioral dispositions or moral traits. When a non-human source abruptly shifts behavior (e.g., from 80% to 20% reward), learners may simply update their expectations. In contrast, a sudden behavioral shift by a previously cooperative human partner can prompt higher-order inferences about the partner's trustworthiness or the integrity of the experimental setup (e.g., whether the partner is truly interactive or human). The authors may consider whether their modeling framework captures such higher-order social inferences. Specifically, trait-based models, such as those explored in Hackel et al. (2015, Nature Neuroscience), suggest that learners form enduring beliefs about others' moral dispositions, which then modulate trial-by-trial learning. A learner who believes their partner is inherently cooperative may update less in response to a surprising defection, effectively showing a trait-based dampening of learning rate.

      This asymmetry in belief updating has been observed in prior work (e.g., Siegel et al., 2018, Nature Human Behaviour) and could be captured using a dynamic or belief-weighted learning rate. Models incorporating such mechanisms (e.g., dynamic learning rate models as in Jian Li et al., 2011, Nature Neuroscience) could better account for flexible adjustments in response to surprising behavior, particularly in the social domain.
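
      A dynamic learning rate of the kind mentioned here is often written in a Pearce-Hall style, where an "associability" term grows after surprising outcomes and decays after expected ones. The sketch below shows that generic form only; it is not claimed to reproduce the exact model of any cited paper, and the names `eta`, `kappa`, and `pearce_hall_step` as well as all parameter values are assumptions made for illustration.

```python
def pearce_hall_step(value, associability, outcome, eta, kappa):
    """One update of a Pearce-Hall style learner.

    value         : current expectation of partner cooperation
    associability : current dynamic learning rate
    eta           : how quickly associability tracks recent surprise
    kappa         : fixed scaling of the value update
    """
    delta = outcome - value
    new_value = value + kappa * associability * delta
    new_associability = eta * abs(delta) + (1 - eta) * associability
    return new_value, new_associability

# Hypothetical sequence: ten cooperative choices followed by one defection
outcomes = [1] * 10 + [0]
value, associability = 0.5, 0.5
history = []
for outcome in outcomes:
    value, associability = pearce_hall_step(value, associability, outcome,
                                            eta=0.3, kappa=0.5)
    history.append((value, associability))

before_defection = history[-2][1]
after_defection = history[-1][1]
print(round(before_defection, 3), round(after_defection, 3))
```

      In this toy run the learning rate shrinks while the partner behaves predictably and jumps after the unexpected defection, so subsequent outcomes are weighted more heavily. A trait-based dampening would instead require an additional belief term, which this sketch does not include.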

      Second, the developmental interpretation of the observed effects would be strengthened by considering possible non-linear relationships between age and model parameters. For instance, certain cognitive or affective traits relevant to social learning, such as sensitivity to reciprocity or reward updating, may follow non-monotonic trajectories, peaking in late adolescence or early adulthood. Fitting age as a continuous variable, possibly with quadratic or spline terms, may yield more nuanced developmental insights.

      Finally, the two age groups compared, adolescents (high school students) and adults (university students), differ not only in age but also in sociocultural and economic backgrounds. High school students are likely more homogeneous in regional background (e.g., Beijing locals), while university students may be drawn from a broader geographic and socioeconomic pool. Additionally, differences in financial independence, family structure (e.g., single-child status), and social network complexity may systematically affect cooperative behavior and valuation of rewards. Although these factors are difficult to control fully, the authors should more explicitly address the extent to which their findings reflect biological development versus social and contextual influences.

      Comments on revisions:

      The authors have adequately addressed my previous comments.

    1. Reviewer #1 (Public review):

      Summary:

      Zhang and colleagues examine neural representations underlying abstract navigation in entorhinal cortex (EC) and hippocampus (HC) using fMRI. This paper replicates a previously identified hexagonal modulation of navigation vectors in abstract space in EC in a novel task involving navigating in a conceptual Greeble space. In HC, the authors identify a three-fold signal of the navigation angle. They also use a novel analysis technique (spectral analysis) to look at spatial patterns in these two areas and identify phase coupling between HC and EC. Interestingly, the three-fold pattern identified in the hippocampus explains quirks in participants' behavior where navigation performance follows a three-fold periodicity. Finally, the authors propose an EC-HPC PhaseSync Model to understand how the EC and HC construct cognitive maps. The wide array and creativity of the techniques used are impressive, but because of their unique nature, the paper would benefit from more details on how some of these techniques were implemented.

      Comments on revisions:

      Most of my concerns were adequately addressed, and I believe the paper is greatly improved. I have two more points. I noticed that the legend for Figure 4 still refers to some components of the previous figure version; this should be updated to reflect the current version of the figure. I also think the paper would benefit from more details regarding some of the analyses. Specifically, the phase-amplitude coupling analysis should have a section in the methods, which should clarify how the BOLD signals were reconstructed.

    2. Reviewer #2 (Public review):

      The authors report results from behavioral data, fMRI recordings, and computer simulations during a conceptual navigation task. They report 3-fold symmetry in behavioral and simulated model performance, 3-fold symmetry in hippocampal activity, and 6-fold symmetry in entorhinal activity (all as a function of movement directions in conceptual space). The analyses seem thoroughly done, and the results and simulations are very interesting.
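
      For readers unfamiliar with n-fold symmetry analyses, the core idea is to ask how much of a signal varies as cos(n*theta), where theta is the movement direction. The sketch below is a simplified illustration on synthetic data, not the authors' pipeline; the function name `fold_amplitude` is hypothetical, and the projection shortcut assumes that movement directions sample the circle uniformly. Published analyses of this kind typically also estimate the preferred orientation on independent data, which is omitted here.

```python
import math

def fold_amplitude(angles_deg, signal, n_fold):
    """Amplitude of the n-fold sinusoidal component of `signal`.

    Projects the mean-centred signal onto cos(n*theta) and sin(n*theta).
    With directions spaced uniformly around the circle the two regressors
    are orthogonal, so this projection equals a least-squares fit.
    """
    m = len(signal)
    mean = sum(signal) / m
    cos_part = sum((s - mean) * math.cos(n_fold * math.radians(a))
                   for a, s in zip(angles_deg, signal))
    sin_part = sum((s - mean) * math.sin(n_fold * math.radians(a))
                   for a, s in zip(angles_deg, signal))
    return 2.0 * math.hypot(cos_part, sin_part) / m

# Synthetic signal with a pure 6-fold modulation (amplitude 0.5, phase 15 degrees)
angles = list(range(0, 360, 10))
signal = [1.0 + 0.5 * math.cos(6 * math.radians(a - 15)) for a in angles]

amp_6 = fold_amplitude(angles, signal, 6)
amp_3 = fold_amplitude(angles, signal, 3)
print(f"6-fold amplitude: {amp_6:.3f}, 3-fold amplitude: {amp_3:.3f}")
```

      On this synthetic input the 6-fold amplitude is recovered and the 3-fold amplitude is essentially zero, which is the sort of dissociation the reported entorhinal and hippocampal results imply.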

    1. Reviewer #1 (Public review):

      Summary

      The authors propose a transformer-based model for prediction of condition- or tissue-specific alternative splicing and demonstrate its utility in design of RNAs with desired splicing outcomes, which is a novel application. The model is compared to relevant existing approaches (Pangolin and SpliceAI) and the authors clearly demonstrate its advantage. Overall, a compelling method that is well thought out and evaluated.

      Strengths:

      (1) The model is well thought out: rather than modeling a cassette exon using a single generic deep learning model as has been done e.g. in SpliceAI and related work, the authors propose a modular architecture that focuses on different regions around a potential exon skipping event, which enables the model to learn representations that are specific to those regions. Because each component in the model focuses on a fixed length short sequence segment, the model can learn position-specific features. Furthermore, the architecture of the model is designed to model alternative splicing events, whereas Pangolin and SpliceAI are focused on modeling individual splice junctions, which is an easier problem.

      (2) The model is evaluated in a rigorous way - it is compared to the most relevant state-of-the-art models, uses machine learning best practices, and an ablation study demonstrates the contribution of each component of the architecture.

      (3) Experimental work supports the computational predictions: Regulatory elements predicted by the model were experimentally verified; novel tissue-specific cassette exons were verified by LSV-seq.

      (4) The authors use their model for sequence design to optimize splicing outcome, which is a novel application.

      Weaknesses:

      None noted.

    2. Reviewer #2 (Public review):

      Summary:

      The authors present a transformer-based model, TrASPr, for the task of tissue-specific splicing prediction (with experiments primarily focused on the case of cassette exon inclusion) as well as an optimization framework (BOS) for the task of designing RNA sequences for desired splicing outcomes.

      For the first task, the main methodological contribution is to train four transformer-based models on the 400bp regions surrounding each splice site, the rationale being that this is where most splicing regulatory information is. In contrast, previous work trained one model on a long genomic region. This new design should help the model more easily capture interactions between splice sites. It should also help in cases of very long introns, which are relatively common in the human genome.

      TrASPr's performance is evaluated in comparison to previous models (SpliceAI, Pangolin, and SpliceTransformer) on numerous tasks including splicing predictions on GTEx tissues, ENCODE cell lines, RBP KD data, and mutagenesis data. The scope of these evaluations is ambitious; however, significant details on most of the analyses are missing, making it difficult to evaluate the strength of evidence.

      In the second task, the authors combine Latent Space Bayesian Optimization (LSBO) with a Transformer-based variational auto encoder to optimize RNA sequences for a given splicing-related objective function. This method (BOS) appears to be a novel application of LSBO, with promising results on several computational evaluations and the potential to be impactful on sequence design for both splicing-related objectives and other tasks. However, comparison of BOS against existing methods for sequence design is lacking.

      Strengths:

      - A novel machine learning model for an important problem in RNA biology with excellent prediction accuracy.

      - Instead of being based on a generic design as in previous work, the proposed model incorporates biological domain knowledge (that regulatory information is concentrated around splice sites). This way of using inductive bias can be important to future work on other sequence-based prediction tasks.

      Weaknesses:

      - Most of the analyses presented in the manuscript are described in broad strokes and are often confusing. As a result, it is difficult to assess the significance of the contribution.

      - As more and more models are being proposed for splicing prediction (SpliceAI, Pangolin, SpliceTransformer, TrASPr), there is a need for establishing standard benchmarks, similar to those in computer vision (ImageNet). Without such benchmarks, it is exceedingly difficult to compare models.
        *This point is now addressed in the revision. Moreover, datasets have been made available by the authors on BitBucket.*

      - Related to the previous point, as discussed in the manuscript, SpliceAI and Pangolin are not designed to predict PSI of cassette exons. Instead, they assign a "splice site probability" to each nucleotide. Converting this to a PSI prediction is not obvious, and the method chosen by the authors (averaging the two probabilities (?)) is likely not optimal. It would be interesting to see what happens if an MLP is used on top of the four predictions (or the outputs of the top layers) from SpliceAI/Pangolin. This could also indicate where the improvement in TrASPr comes from: is it because TrASPr combines information from all four splice sites? Also consider fine-tuning Pangolin on cassette exons only (as you do for your model).
        *This point is still not addressed in the revision.*

      - L141, "TrASPr can handle cassette exons spanning a wide range of window sizes from 181 to 329,227 bases-thanks to its multi-transformer architecture." This is reported to be one of the primary advantages compared to existing models. Additional analysis should be included on how TrASPr performs across varying exon and intron sizes, with comparison to SpliceAI, etc.

      Added after revision: The authors have added additional analyses of performance based on both the length of the exon under consideration and the total length of the surrounding intronic contexts. The result that TrASPr performs well across various context sizes (i.e., the length of the sequence between the upstream and downstream exons, ranging from <1k to >10k) is highly encouraging and supports the claim that most of the sequence-based splicing logic is located proximal to the splice sites. It is also noteworthy that TrASPr performs well for exons longer than 200, suggesting that most of the "regulatory code" is present at the exon boundaries rather than in its center (which TrASPr is blind to).

      Additionally, Pearson correlation is used as the sole performance metric in many analyses (e.g., Fig 2 - Supp 2). The authors should consider alternative accuracy metrics, such as RMSE, which better convey the magnitude of prediction error and are more easily comparable across datasets. Pearson correlation may also be more sensitive to outliers on the smaller samples that arise when binning sequences.

      - L171, "training it on cassette exons". This seems like an important point: previous models were trained mostly on constitutive exons, whereas here the model is trained specifically on cassette exons. This should be discussed in more detail.
        *Our initial comment was incorrect, as pointed out by the authors.*

      - L214, ablations of individual features are missing.
        *This was addressed in the revision.*

      - L230, "ENCODE cell lines", it is not clear why other tissues from GTEx were not included.
        *This was addressed in the revision.*

      - L239, it is surprising that SpliceAI performs so badly, and might suggest a mistake in the analysis. Additional analysis and possible explanations should be provided to support these claims. Similarly for the complete failure of SpliceAI and Pangolin shown in Fig 4d.
        *The authors should consider adding SpliceAI/Pangolin predictions for the alternative 5' and 3' splice site selection tasks (and code for related analyses) to the BitBucket repository.*

      - BOS seems like a separate contribution that belongs in a separate publication. Instead, consider providing more details on TrASPr.

      *Minor comment added after revision: regarding the author response that "A completely independent evaluation would have required a high-throughput experimental system to assess designs, which is beyond the scope of the current paper.":
      It's not clear why BOS cannot be evaluated as a separate contribution by using different "teacher" models instead of TrASPr. Additionally, BOS lacks evaluation against existing methods for sequence optimization.*

      - The authors should consider evaluating BOS using Pangolin or SpliceTransformer as the oracle, in order to measure the contribution to the sequence generation task provided by BOS vs TrASPr.
        *See comment above.*
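
      The point made above about relying on Pearson correlation alone can be illustrated with toy numbers. In the sketch below the PSI values are invented for illustration and are not taken from the manuscript; the "predictions" are deliberately compressed toward 0.5, and the helper names `pearson` and `rmse` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Invented PSI values; predictions preserve the ranking but shrink toward 0.5
true_psi = [0.1, 0.3, 0.5, 0.7, 0.9]
pred_psi = [0.5 + 0.25 * (t - 0.5) for t in true_psi]

r = pearson(true_psi, pred_psi)
err = rmse(true_psi, pred_psi)
print(f"Pearson r = {r:.3f}, RMSE = {err:.3f}")
```

      The correlation is perfect because the predictions are a linear function of the truth, yet the typical error is about 0.2 PSI units. Reporting an error-magnitude metric alongside the correlation would expose this kind of systematic compression.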

    1. Joint Public Review:

      Summary:

      This is an excellent, timely study investigating and characterizing the underlying neural activity that generates the neuroendocrine GnRH and LH surges that are responsible for triggering ovulation. Abundant evidence accumulated over the past 20 years implicated the population of kisspeptin neurons in the hypothalamic RP3V region (also referred to as the POA or AVPV/PeN kisspeptin neurons) as being involved in driving the GnRH surge in response to elevated estradiol (E2), also known as the "estrogen positive feedback". However, while former studies used Cfos coexpression as a marker of RP3V kisspeptin neuron activation at specific times and found this correlates with the timing of the LH surge, detailed examination of the live in vivo activity of these neurons before, during, and after the LH surge remained elusive due to technical challenges.

      Here, Zhou and colleagues use fiber photometry to measure the long-term synchronous activity of RP3V kisspeptin neurons across different stages of the mouse estrous cycle, including on proestrus when the LH surge occurs, as well as in a well-established OVX+E2 mouse model of the LH surge.

      The authors report that RP3V kisspeptin neuron activity is low on estrus and diestrus, but increases on proestrus several hours before the late afternoon LH surge, mirroring prior reports of rising GnRH neuron activity in proestrus female mice. The measured increase in RP3V kisspeptin activation is long, spanning ~13 hours in proestrus females and extending well beyond the end of the LH secretion, and is shown by the authors to be E2 dependent.

      For this work, Kiss-Cre female mice received a Cre-dependent AAV injection, containing GCaMP6, to measure the neuronal activation of RP3V Kiss1 cells. Females exhibited periods of increased neuronal activation on the day of proestrus, beginning several hours prior to the LH surge and lasting for about 12 hours. Though oscillations in the pattern of GCaMP fluorescence were occasionally observed throughout the ovarian cycle, the frequency, duration, and amplitude of these oscillations were significantly higher on the day of proestrus. This increase in RP3V Kiss1 neuronal activation that precedes the increase in LH supports the hypothesis that these neurons are critical in regulating the LH surge. The authors compare this data to new data showing a similar increased activation pattern in GnRH neurons just prior to the LH surge, further supporting the hypothesis that RP3V Kiss1 cell activation causes the release of kisspeptin to stimulate GnRH neurons and produce the LH surge.

      Strengths:

      This study provides compelling data demonstrating that RP3V kisspeptin neuronal activity changes throughout the ovarian cycle, likely in response to changes in estradiol levels, and that neuronal activation increases on the day of the LH surge.

      The observed increase in RP3V kisspeptin neuronal activation precedes the LH surge, which lends support to the hypothesis that these neurons play a role in regulating the estradiol-induced LH surge. Continuing to examine the complexities of the LH surge and the neuronal populations involved, as done in this study, is critical for developing therapeutic treatments for women's reproductive disorders.

      This innovative study uses a within-subject design to examine neuronal activation in vivo across multiple hormone milieus, providing a thorough examination of the changes in activation of these neurons. The variability in neuronal activity surrounding the LH surge across ovarian cycles in the same animals is interesting and could not have been observed without this within-subject design. The inclusion and comparison of ovary-intact females and OVX+E2 females is valuable to help test mechanisms under these two LH surge conditions, and allows future studies to tease apart minor differences in the LH surge pattern between the two conditions.

      This study provides an excellent experimental setup able to monitor the daily activity of preoptic kisspeptin neurons in freely moving female mice. It will be a valuable tool to assess the putative role of these kisspeptin neurons in various aspects of altered female fertility (aging, pathologies...). This approach also offers novel and useful insights into the impact of E2 and circadian cues on the electrical activity of RP3V kisspeptin neurons.

      An intriguing cyclical oscillation in kisspeptin neural activity every 90 minutes exists, which may offer critical insight into how the RP3V kisspeptin system operates. Interestingly, there was also variability in the onset and duration of RP3V Kisspeptin neuron activity between and within mice in naturally cycling females. Preoptic kisspeptin neurons show an increased activity around the light/dark transition only on the day of proestrus, and this is associated with an increase in LH secretion. An original finding is the observation that the peak of kisspeptin neuron activation continues a few hours past the peak of LH, and the authors hypothesize that this prolonged activity could drive female sexual behaviors, which usually appear after the LH surge.

      The authors demonstrated that ovariectomy resulted in very little neuronal activity in RP3V kisspeptin neurons. When these ovariectomized females were treated with estradiol benzoate (EB) and an LH surge was induced, there was an increase in RP3V kisspeptin neuronal activation, as was seen during proestrus. However, the magnitude of the change in activity was greater during proestrus than during the EB-induced LH surge. Interestingly, the authors noted a consistent peak in activity about 90 minutes prior to lights out on each day of the ovarian cycle and during EB treatment, but not in ovariectomized females. The functional purpose of this consistent neuronal activity at this time remains to be determined.

      Though not part of this study, the comparison of neuronal activation of GnRH neurons during the LH surge to the current data was convincing, demonstrating a similar pattern of increased activation that precedes the LH surge.

      In summary, the study is well-designed, uses proper controls and analyses, has robust data, and the paper is nicely organized and written. The data from these experiments is compelling, and the authors' claims and conclusions are nicely supported and justified by the data. The data support the hypothesis in the field that these RP3V neurons regulate the LH surge. Overall, these findings are important and novel, and lend valuable insight into the underlying neural mechanisms for neuroendocrine control of ovulation.

      Weaknesses:

      (1) LH levels were not measured in many mice or in robust temporal detail, such as every 30 or 60 min, to allow a more detailed comparison between the fine-scale timing of RP3V neuron activation with onset and timing of LH surge dynamics.

      (2) The authors report that the peak LH value occurred 3.5 hours after the first RP3V kisspeptin neuron oscillation. However, it is likely, and indeed evident from the 2 example LH patterns shown in Figures 3A-B, that LH values start to increase several hours before the peak LH. This earlier rise in LH levels ("onset" of the surge) occurs much closer in time to the first RP3V kisspeptin neuron oscillatory activation, and as such, the ensuing LH secretion may not be as delayed as the authors suggest.

      (3) The authors nicely show that there is some variation (~2 hours) in the peak of the first oscillation in proestrus females. Was this same variability present in OVX+E2 females, or was the variability smaller or absent in OVX+E2 versus proestrus? It is possible that the variability in proestrus mice is due to variability in the timing and magnitude of rising E2 levels, which would, in theory, be more tightly controlled and similar among mice in the OVX+E2 model. If so, the OVX+E2 mice may have less variability between mice for the onset of RP3V kisspeptin activity.

      (4) One concern regarding this study is the lack of data showing the specificity of the AAV and the GCaMP6s signals. There are no data showing that GCaMP6s is limited to the RP3V and is not expressed in other Kiss1 populations in the brain. Given that 2 µl of the AAV was injected, which seems like a large volume considering the injection site was close to the ventricle, it is important to show that the signal and measured activity are specific to the RP3V region. Though the authors discuss potential reasons for the low co-expression of GCaMP6 and kisspeptin immunoreactivity, it does raise some concern regarding the interpretation of these results. The low co-expression makes it difficult to confirm the Kiss1 cell-specificity of the Cre-dependent AAV injections. In addition, if GFP (GCaMP6s) and kisspeptin protein co-localization is low, it is possible that the activation of these neurons does not coincide with changes in kisspeptin, or that these neurons are not even expressing Kiss1 or kisspeptin at the time of activation. It is important to remember that the study measures activation of the kisspeptin neuron, and it does not reveal anything specific about the activity of the kisspeptin protein.

      (5) One additional minor concern is that LH levels were not measured in the ovariectomized females during the expected time of the LH surge. The authors suggest that the lower magnitude of activation during the LH surge in these females, in comparison to proestrus females, may be the result of lower LH levels. It's hard to interpret the difference in magnitude of neuronal activation between EB-treated and proestrus females without knowing LH levels. In addition, it's possible that an LH surge did not occur in all EB-treated females, and thus, having LH levels would confirm the success of the EB treatment.

      (6) This kisspeptin neuron peak activity is abolished in ovariectomized mice, and estradiol replacement restored this activity, but only partially. Circulating levels of estradiol were not measured in these different setups, but the authors hypothesize that the lack of full restoration may be due to the absence of other ovarian signals, possibly progesterone.

      (7) Recordings in several mice show inter- and intra-individual variability in the time of peak onset. It is not shown whether this variability is associated with a similar variability in the timing of the LH surge onset in the recorded mice. The authors hypothesized that this variability indicates a poor involvement of the circadian input. However, no experiments were done to investigate the role of the (vasopressinergic-driven) circadian input on the kisspeptin neuron activation at the light/dark transition. Thus, we suggest that the authors be more tentative about this hypothesis.

    1. Reviewer #1 (Public review):

      This study aims to identify the proteins that compose the electrical synapse, which are much less understood than those of the chemical synapse. Identifying these proteins is important to understand how synaptogenesis and conductance are regulated in these synapses.

      Using a proteomics approach, the authors identified more than 50 new proteins and used immunoprecipitation and immunostaining to validate their interaction or localization. One new protein, a scaffolding protein (Sipa1l3), shows particularly strong evidence of being an integral component of the electrical synapse. The function of Sipa1l3 remains to be determined.

      Another strength is the use of two different model organisms (zebrafish and mice) to determine which components are conserved across species. This approach also expands the utility of this work to benefit researchers working with both species.

      The methodology is robust and there is compelling evidence supporting the findings.

      Comments on revisions:

      I thank the authors for responding to the comments. No further recommendations.

    2. Reviewer #3 (Public review):

      Summary:

      This study by Tetenborg S et al. identifies proteins that are physically closely associated with gap junctions in retinal neurons of mice and zebrafish using BioID, a technique that labels and isolates proteins in proximity to a protein of interest. These proteins include scaffold proteins, adhesion molecules, chemical synapse proteins, components of the endocytic machinery, and cytoskeleton-associated proteins. Using a combination of genetic tools and meticulously executed immunostaining, the authors further verified the colocalization of some of the identified proteins with connexin-positive gap junctions. The findings in this study highlight the complexity of gap junctions. Electrical synapses are abundant in the nervous system, yet their regulatory mechanisms are far less understood than those of chemical synapses. This work will provide valuable information for future studies aiming to elucidate the regulatory mechanisms essential for the function of neural circuits.

      Strengths:

      A key strength of this work is the identification of novel gap junction-associated proteins in AII amacrine cells and photoreceptors using BioID in combination with various genetic tools. The well-studied functions of gap junctions in these neurons will facilitate future research into the functions of the identified proteins in regulating electrical synapses.

      Comments on revisions:

      The authors have addressed my concerns in the revised manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      The authors note that there is a large corpus of research establishing the importance of LC-NE projections to the medial prefrontal cortex (mPFC) of rats and mice in attentional set or 'rule' shifting behaviours. However, this is a complex behavior, and the authors were attempting to gain an understanding of how locus coeruleus modulation of the mPFC contributes to set shifting.

      The authors replicated the ED-shift impairment following NE denervation of mPFC by chemogenetic inhibition of the LC. They further showed that LC inhibition changed the way neurons in mPFC responded to the cues, with a greater proportion of individual neurons responsive to 'switching', but the individual neurons also had broader tuning, responding to other aspects of the task (i.e., response choice and response history). The population dynamics were also changed by LC inhibition, with reduced separation of population vectors between early-post-switch trials, when responding was at chance, and later trials, when responding was correct. This was what they set out to demonstrate, and so one can conclude they achieved their aims.

      The authors concluded that LC inhibition disrupted mPFC "encoding capacity for switching" and suggest that this "underlie[s] the behavioral deficits."

      Strengths:

      The principal strength is combining inactivation of LC with calcium imaging in mPFC. This enabled detailed consideration of the change in behavior (i.e., defining epochs of learning, with an 'early phase' when responding is at chance being compared to a 'later phase' when the behavioral switch has occurred) and how these are reflected in neuronal activity in the mPFC, with and without LC-NE input.

      Comments on revised version:

      In their response to reviewers, the authors say "We report p values using 2 decimal points and standard language as suggested by this reviewer". However, no changes were made in the manuscript: for example, "P = 4.2e-3" rather than "p = 0.004".

      In their response to the reviewers, they wrote: "Upon closer examination of the behavioral data, we exclude several sessions where more trials were taken in IDS than in EDS." If those sessions in which EDS < IDS were excluded, it is unsurprising that EDS > IDS. Most problematic is the fact that the manuscript now reads "Importantly, control mice (pooled from Fig. 1e, 1h, Supp. Fig. 1a, 1b) took more trials to complete EDS than IDS (Trials to criterion: IDS vs. EDS, 10 ± 1 trials vs. 16 ± 1 trials, P < 1e-3, Supp. Fig. 1c), further supporting the validity of attentional switching (as in Fig. 1c)" without mentioning that data has been excluded.

    2. Reviewer #3 (Public review):

      Summary:

      Nigro et al. examine how the locus coeruleus (LC) influences the medial prefrontal cortex (mPFC) during attentional shifts required for behavioral flexibility. Specifically, they propose that LC-mPFC inputs enable mice to shift attention effectively from texture to odor cues to optimize behavior. The LC and its noradrenergic projections to the mPFC have previously been implicated in this behavior. The authors further establish this by using chemogenetics to inhibit LC terminals in mPFC and show a selective deficit in extradimensional set shifting behavior. But the study's primary innovation is the simultaneous inhibition of LC while recording multineuron patterns of activity in mPFC. Analysis at the single neuron and population levels revealed broadened tuning properties, less distinct population dynamics, and disrupted predictive encoding when LC is inhibited. These findings add to our understanding of how neuromodulatory inputs shape attentional encoding in mPFC and are an important advance. There are some methodological limitations and/or caveats that should be considered when interpreting the findings, and these are described below.

      Strengths:

      The naturalistic set-shifting task in freely-moving animals is a major strength, and the inclusion of localized suppression of LC-mPFC terminals builds confidence in the specificity of the behavioral effect. Combining chemogenetic inhibition of LC while simultaneously recording neural activity in mPFC with miniscopes is state-of-the-art. The authors apply analyses to population dynamics in particular that can advance our understanding of how the LC modifies patterns of mPFC neural activity. The authors show that neural encoding at both the single cell level and the population level is disrupted when LC is inhibited. They also show that activity is less able to predict key aspects of the behavior when the influence of LC is disrupted. This is quite interesting and adds to a growing understanding of how neuromodulatory systems sharpen tuning of mPFC activity.

      Weaknesses:

      Weaknesses are mostly minor, but there are some caveats that should be considered. First, the authors use a DBH-Cre mouse line and provide histological confirmation of overlap between HM4Di expression and TH immunostaining. While this strongly suggests modulation of noradrenergic circuit activity, the results should be interpreted conservatively as there is no independent confirmation that norepinephrine (NE) release is suppressed and these neurons are known to release other neurotransmitters and signaling peptides. In the absence of additional control experiments, it is important to recognize that effects on mPFC activity may or may not be directly due to LC-mPFC NE.

      Another caveat is that the imaging analyses are entirely from the extradimensional shift session. Without analyzing activity data from the intradimensional shift (IDS) session, one cannot be certain that the observed changes are to some feature of activity that is specific to extradimensional shifts. Future experiments should examine animals with LC suppression during the IDS as well, which would show whether the observed effects are specific to an extradimensional shift and might explain behavioral effects.

    1. Reviewer #1 (Public review):

      Summary:

      The paper uses rigorous methods to determine phase dynamics from human cortical stereotactic EEGs. It finds that the power of the phase is higher at the lowest spatial phase. The application to data illustrates the solidity of the method and its potential for discovery.

      Comments on revised submission:

      The authors have provided responses to the previous recommendations.

    2. Reviewer #3 (Public review):

      Summary:

      The authors propose a method for estimating the spatial power spectrum of cortical activity from irregularly sampled data and apply it to iEEG data from human patients during a delayed free recall task. The main findings are that the spatial spectra of cortical activity peak at low spatial frequencies and decrease with increasing spatial frequency. This is observed over a broad range of temporal frequencies (2-100 Hz).

      Strengths:

      A strength of the study is the type of data that is used. As pointed out by the authors, spatial spectra of cortical activity are difficult to estimate from non-invasive measurements (EEG and MEG) and from commonly used intracranial measurements (i.e. electrocorticography or Utah arrays) due to their limited spatial extent. In contrast, iEEG measurements are easier to interpret than EEG/MEG measurements and typically have larger spatial coverage than Utah arrays. However, iEEG is irregularly sampled within the three-dimensional brain volume and this poses a methodological problem that the proposed method aims to address.

      Weaknesses:

      Although the proposed method is evaluated in several indirect ways, a direct evaluation is lacking. This would entail simulating cortical current source density (CSD) with known spatial spectrum and using a realistic iEEG volume-conductor model to generate iEEG signals.

      Comments on revised version:

      In my original review, I raised the following issue:

      "The proposed method of estimating wavelength from irregularly sampled three-dimensional iEEG data involves several steps (phase-extraction, singular value-decomposition, triangle definition, dimension reduction, etc.) and it is not at all clear that the concatenation of all these steps actually yields accurate estimates. Did the authors use more realistic simulations of cortical activity (i.e. on the convoluted cortical sheet) to verify that the method indeed yields accurate estimates of phase spectra?"

      And the authors' response was:

      "We now included detailed surrogate testing, in which varying combinations of sEEG phase data and veridical surrogate wavelengths are added together. See our reply from the public reviewer comments. We assess that real neurophysiological data (here, sEEG plus surrogate and MEG manipulated in various ways) is a more accurate way to address these issues. In our experience, large scale TWs appear spontaneously in realistic cortical simulations, and we now cite the relevant papers in the manuscript (line 53)."

      The point that I wanted to make is not that traveling waves appear in computational models of cortical activity, as the authors seem to think. My point was that the only direct way to evaluate the proposed method for estimating spatial spectra is to use simulated cortical activity with known spatial spectrum. In particular, with "realistic simulations" I refer to the iEEG volume-conductor model that describes the mapping from cortical current source density (CSD) to iEEG signals, and that incorporates the reference electrodes and the particular montage used.

      Although in the revised manuscript the authors have provided indirect evidence for the soundness of the proposed estimation method, the lack of a direct evaluation using realistic simulations with ground truth, as described above, means that I remain sceptical about the soundness of the method.

    1. Reviewer #1 (Public review):

      Summary:

      The authors scrutinized differences in C-terminal region variant profiles between Rett syndrome patients and healthy individuals and showed that subtle genetic alterations can produce either benign or pathogenic outcomes, a finding with important implications for Rett syndrome diagnosis. They also propose a therapeutic strategy. This work will be beneficial to clinicians and basic scientists who work on Rett syndrome, and carries the potential to be applied to other rare Mendelian diseases.

      Strengths:

      Well-designed genetic and molecular experiments translate genetic differences into functional and clinical changes. This is a unique study resolving subtle changes in sequences that give rise to dramatic phenotypic consequences.

      Weaknesses:

      There are many base-editing and protein-expression changes throughout the manuscript, and they cause confusion. It would be helpful to readers if authors could provide a simple summary diagram at the end of the paper.

    2. Reviewer #2 (Public review):

      Summary:

      This study by Guy and Bird and colleagues is a natural follow-up to their 2018 Human Molecular Genetics paper, further clarifying the molecular basis of C-terminal deletions (CTDs) in MECP2 and how they contribute to Rett syndrome. The authors combine human genetic data with well-designed experiments in embryonic stem cells, differentiated neurons, and knock-in mice to explain why some CTD mutations are disease-causing while others are harmless. They show that pathogenic mutations create a specific amino acid motif at the C-terminus, where +2 frameshifts produce a PPX ending that greatly reduces MeCP2 protein levels (likely due to translational stalling) whereas +1 frameshifts generating SPRTX endings are well tolerated.

      Strengths:

      This is a comprehensive and rigorous study that convincingly pinpoints the molecular mechanism behind CTD pathogenicity, with strong agreement between the cell-based and animal data. The authors also provide a proof of principle that modifying the PPX termination codon can restore MeCP2-CTD protein levels and rescue symptoms in mice. In addition, they demonstrate that adenine base editing can correct this defect in cultured cells and increase MeCP2-CTD protein levels. Overall, this is a well-executed study that provides important mechanistic and translational insight into a clinically important class of MECP2 mutations.

      Weaknesses:

      The adenine base editing to change the termination codon is shown to be feasible in generated cell lines, but has yet to be shown in vivo in animal models.

    3. Reviewer #3 (Public review):

      Summary:

      Guy et al. explored the variation in the pathogenicity of carboxy-terminal frameshift deletions in the X-linked MECP2 gene. Loss-of-function variants in MECP2 are associated with Rett syndrome, a severe neurodevelopmental disorder. Although hundreds of pathogenic MECP2 variants have been found in people with Rett syndrome, 8 recurrent point mutations are found in ~65% of disease cases, and frameshift insertion/deletion (indel) variants resulting in production of carboxy-terminal truncated (CTT) MeCP2 protein account for ~10% of cases. Many of these occur in a "deletion prone region" (DPR) between c.1110-1210, with common recurrent deletions c.1157_1197del (CTD1) and c.1164_1207del (CTD2). While two major protein functional domains have been defined in MeCP2, the methyl-binding domain (MBD) and the NCoR interacting domain (NID), the functional role of the carboxy-terminal domain (CTD, beyond the NID, predicted to have a disordered protein structure) has not been identified, and previous work by this group and others demonstrated that a Mecp2 "minigene" lacking the CTD retains MeCP2 function, suggesting that the CTD is dispensable. This raises an important question: If the CTD is dispensable, what is the pathogenic basis of the various CTT frameshift variants? Prior work from this group demonstrated that genetically engineered mice expressing the CTD1 variant had decreased expression of Mecp2 RNA and MeCP2 protein and decreased survival, but those expressing the CTD2 variant had normal Mecp2 RNA and protein and survival. However, they noted that differences between the mouse and human coding sequences resulted in different terminal sequences for the two common CTDs, with CTD1 ending in -PPX in both mouse and human, but CTD2 ending in -PPX in human but -SPX in mouse. In the previous paper they demonstrated that, in humanized mouse ES cells (edited to have the same -PPX termination), the CTD2 deletion resulted in decreased Mecp2 RNA and protein levels.
      This previous work provides the underlying hypothesis that they sought to explore, which is that the pathological basis of disease-causing CTDs relates to the formation of truncated proteins that end with a specific amino acid sequence (-PPX), which leads to decreased mRNA and protein levels, whereas tolerated, non-pathogenic CTDs do not lead to production of truncated proteins ending in this sequence and retain normal mRNA/protein expression.

      In this manuscript, they evaluate missense variants, in-frame deletions, and frameshift deletions within the DPR from the Genome Aggregation Database (gnomAD) and find that the "apparently" normal individuals within gnomAD had numerous tolerated missense variants and in-frame deletions within this region, as well as frameshift deletions (in hemizygous males) in the defined region. All of the gnomAD deletions within this region resulted in the terminal amino acid sequence -SPRTX (due to a +1 frameshift), whereas nearly all deletion variants in this region from people with Rett syndrome (from the ClinVar copy of the former RettBase database) had a terminal -PPX sequence, due to a +2 frameshift. They hypothesized that terminal proline codons causing ribosomal stalling and "nonsense mediated decay like" degradation of mRNA (with subsequent decreased protein expression) were the basis of the specific pathogenicity of the +2 frameshift variants, and that utilizing adenine base editors (ABE) to convert the termination codon to a tryptophan could correct this issue. They demonstrate this by engineering the change into mouse embryonic stem cell lines and mouse lines containing the CTD1 deletion and show that this change normalized Mecp2 mRNA and protein levels and mouse phenotypes. Finally, they performed an initial proof-of-concept in an inducible HEK cell line and showed the ability of targeted ABE to edit the correct adenine and cause production of the expected larger truncated MeCP2 protein from CTD1 constructs.
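
      The reading-frame distinction described above can be illustrated with a minimal sketch. It assumes (this is not stated explicitly in the review) that the "n+1" and "n+2" labels correspond to the remainder of the deleted length divided by 3, and that the coordinates are inclusive; the function names are hypothetical. The two example coordinates are the recurrent deletions quoted above.

```python
def deletion_length(start: int, end: int) -> int:
    """Number of bases removed by a deletion spanning start..end (inclusive)."""
    if end < start:
        raise ValueError("end must be >= start")
    return end - start + 1


def frameshift_class(start: int, end: int) -> str:
    """Label a deletion by deleted length modulo 3 (assumed convention)."""
    remainder = deletion_length(start, end) % 3
    return {0: "in-frame", 1: "n+1", 2: "n+2"}[remainder]


# Recurrent deletions named in the review (coordinates taken as inclusive).
ctd1 = frameshift_class(1157, 1197)  # 41 bases deleted
ctd2 = frameshift_class(1164, 1207)  # 44 bases deleted
```

      Under this assumed convention, both recurrent deletions remove a number of bases that leaves a remainder of 2, which is consistent with the review grouping them with the -PPX class in the human sequence. The sketch says nothing about the resulting amino acid sequence, which depends on the downstream codons.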

      The findings of this manuscript provide a level of support for their hypothesis about the pathogenicity versus non-pathogenicity of some MECP2 CTT intragenic deletions and provide preliminary evidence for a novel therapeutic approach for Rett syndrome; however, limitations in their analysis do not fully support the broader conclusions presented.

      Strengths:

      (1) Utilization of publicly available databases containing aggregated genetic sequencing data from adult cohorts (gnomAD) and people with Rett syndrome (ClinVar copy of RettBase) to compare the composition of the terminal amino acid sequences resulting from deletions presumed to be pathogenic (n+2) versus presumed to be tolerated (n+1).

      (2) Evaluation of a unique human pedigree containing an n+1 deletion in this region that was reported as pathogenic, with demonstration of inheritance of this from the unaffected father and presence within other unaffected family members.

      (3) Development of a novel engineered mouse model of a previously assumed n+1 pathogenic variant to demonstrate lack of detrimental effect, supporting that this is likely a benign variant and not causative of Rett syndrome.

      (4) Creation and evaluation of novel cell lines and mouse models to test the hypothesis that the pathogenicity of the n+2 deletion variants could be altered by a single base change in the frameshifted stop codon.

      (5) Initial proof-of-concept experiments demonstrating the potential of ABE to correct the pathogenicity of these n+2 deletion variants.

      Weaknesses:

      (1) While the use of the large aggregated gnomAD genetic data benefits from the overall size of the data, the presence of genetic variants within this collection does not inherently mean that they are "neutral" or benign. While gnomAD does not include children, it does include aggregated data from a variety of projects targeting neuropsychiatric (and other) conditions, so there is information in gnomAD from people with various medical/neuropsychiatric conditions. The authors do make some acknowledgement of this and argue that the presence of intragenic deletion variants in their region of interest in hemizygous males indicates that it is highly likely that these are tolerated, non-pathogenic variants. While it is likely true that gnomAD MECP2 variants found in hemizygous males are unlikely to cause Rett syndrome in heterozygous females, this does not necessarily mean that these variants have no potential to cause other, milder, neuropsychiatric disorders. As a clear example, within gnomAD, there is a hemizygous male with the rs28934908 C>T variant that results in p.A140V (p.A152V in e1 transcript numbering convention). This pathogenic variant has been found in a number of pedigrees with an X-linked intellectual disability pattern, in which males have a clear neurodevelopmental disorder and heterozygous females have mild intellectual disability (see PMIDs 12325019, 24328834 as representative examples of a large number of publications describing this). Thus, while the authors can reasonably claim that hemizygous deletion variants in gnomAD are unlikely to cause Rett syndrome, they cannot make the definitive statement that these variants are not pathogenic and completely benign, especially when they are only found in a very small number of individuals in gnomAD.

      (2) The authors focus exclusively on deletions within the "DPR", which they define as between c.1110-1210, and say that these deletions account for 10% of Rett syndrome cases. However, the published studies that are the basis for this 10% estimate include all genetic variants (frameshift deletions, insertions, complex insertion/deletions, nonsense variants) resulting in truncations beyond the NID. For example, Bebbington 2010 (PMID: 19914908) includes frameshift indels as early as c.905 and beyond c.1210. Further specific examples from RettBase are described below, but the important point is that their evaluation of only frameshift variants within c.1110-1210 is not truly representative of the totality of genetic variants that collectively are considered CTT and account for 10% of Rett cases.

      (3) The authors say that they evaluated the putative pathogenic variants contained within RettBase (which is no longer available, but the data were transferred to ClinVar) for all cases with Classic Rett syndrome and de novo deletion variants within their defined DPR domain. Looking at the data from the ClinVar copy of RettBase, there are a number (n=143) of C-terminal truncating variants (either frameshift or nonsense) present beyond the NID, but the authors only discuss 14 deletion frameshift variants in this manuscript. A number of these variants have molecular features that do not fall into the pathogenic classification proposed by the authors and are not addressed in the manuscript. These variants do not support the generalization of the conclusions presented, especially the conclusion that the pathogenicity of all C-terminal truncating variants can be determined according to the proposed n+2 rule, or that all of the 10% of people with Rett syndrome and C-terminal truncating variants could be treated by using a base editor to correct the -PPX termination codon.

      (4) The HEK-based system utilized is convenient for doing the initial experiments testing ABE; however, it represents an artificial system expressing cDNA without splicing. Canonical NMD is dependent on splicing, and while non-canonical "NMD-like" processes are less well understood, a concern is whether the artificial system used can adequately predict efficacy in a native setting that includes introns and splicing.

    1. Reviewer #1 (Public review):

      Summary:

      The authors integrate multiple large databases to test whether body size is positively associated with urban tolerance across species. In general, many plant families showed a positive association between body size and urban tolerance, whereas a smaller, though still non-trivial, percentage of animal families showed the same pattern. Notably, the authors are careful in the interpretation of their findings and provide helpful context for the ways that this analysis can be generative in shaping new hypotheses and theory around how urbanization influences biodiversity at large. They are careful to discuss how body size is an important trait, but the absence of a relationship between body size and urban tolerance in many families suggests a variety of other traits undergird urban success.

      Strengths:

      The authors aggregated a large dataset, but they also applied robust filters to ensure they had an adequate and representative number of detections for a given species, family, geography, etc. The authors also applied their analysis at multiple taxonomic scales (family and order), which allowed for a better interpretation of the patterns in the data and at what taxonomic scale body size might be important.

      Weaknesses:

      My main concern is that it is not fully clear how the measure of body size might influence the result. The authors were unable to obtain consistent measures of body size (mean, median, maximum, or sex variation). This, of course, could be very consequential as means and medians can differ quite a bit, and they certainly will differ substantially from a maximum. And of course, sex differences can be marked in multiple directions or absent altogether. The authors do note that they selected the measure that was most common in a family, but it was not clear whether species in that family that did not have that measure were removed or not. This could potentially shape the variability in the dataset and obscure true patterns. This may require additional clarity from the authors and is also a real constraint in compiling large data from disparate sources.

    2. Reviewer #2 (Public review):

      I have completed a thorough review of this paper, which seeks to use the large datasets of species occurrences available through GBIF to estimate variation in how large numbers of plant and animal species are associated with urbanization throughout the world, describing what they call the "species urbanness distribution" or SUD. They explore how these SUDs differ between regions and different taxonomic levels. They then calculate a measure of urban tolerance and seek to explore whether organism size predicts variation in tolerance among species and across regions.

      The study is impressive in many respects. Over the course of several papers, Callaghan and coauthors have been leaders in using "big [biodiversity] data" to create metrics of how species' occurrence data are associated with urban environments, and in describing variation in urban tolerance among taxa and regions. This work has been creative, novel, and it has pushed the boundaries of understanding how urbanization affects a wide diversity of taxa. The current paper takes this to a new level by performing analyses on over 94000 observations from >30,000 species of plants and animals, across more than 370 plant and animal taxonomic families. All of these analyses were focused on answering two main questions:

      (1) What is the shape of species' urban tolerance distributions within regional communities?

      (2) Does body size consistently correlate with species' urban tolerance across taxonomic groups and biogeographic contexts?

      Overall, I think the questions are interesting and important, the size and scope of the data and analyses are impressive, and this paper has a potentially large contribution to make in pushing forward urban macroecology specifically and urban ecology and evolution more generally.

      Despite my enthusiasm for this paper and its potential impact, there are aspects that could be improved, and I believe the paper requires major revision.

      Some of these revisions ideally involve being clearer about the methodology or arguments being made. In other cases, I think their metrics of urban tolerance are flawed and need to be rethought and recalculated, and some of the conclusions are inaccurate. I hope the authors will address these comments carefully and thoroughly. I recognize that there is no obligation for authors to make revisions. However, revising the paper along the lines of the comments made below would increase the impact of the paper and its clarity to a broad readership.

      Major Comments:

      (1) Subrealms

      Where does the concept of "subrealms" come from? No citation is given, and it could be said that this sounds like an idea straight out of Middle Earth. How do subrealms relate to known bioclimatic designations like the Köppen climate classification, which would arguably be more appropriate? Or are subrealms more socio-ecologically oriented? From what I can tell, each subrealm lumps together climatically diverse areas. It might be better and more tractable to break things down by continent, as the rationale for subrealms is unclear, and it makes the analyses and results more confusing. The authors rationalized the use of subrealms to account for potential intraspecific differences in species' response to urbanization, but that is never a core part of the questions or interpretation in the paper, and averaging across subrealms also accounts for intraspecific variation. Another issue with using the subrealm approach is that the authors only included a species if it had 100 observations in a given subrealm, leading to a focus on only the most common species, which may be biased in their SUD distribution. How many more species would be included if they did their analysis at the continental or global scale, and would this change the shape of SUDs?

      (2) Methods - urban score

      The authors describe their "urban score" as being calculated as "the mean of the distribution of VIIRS values as a relative species specific measure of a response to urban land cover."

      I don't understand how this is a "relative species-specific measure". What is it relative to? Figures S4 and S5 show the mean distribution of VIIRS for various taxa, and this mean looks to be an absolute measure. Mean VIIRS for a given species would be fine and appropriate as an "urban score", but the authors then state in the next sentence: "this urban score represents the relative ranking of that species to other species in response to urban land cover".

      That doesn't follow from the description of how this is calculated. Something is missing here. Please clarify and add an explicit equation for how the urban score is calculated because the text is unclear and confusing.

      (3) Methods - urban tolerance

      How the authors are defining and calculating tolerance is unclear, confusing, and flawed in my opinion.

      Tolerance is a common concept in ecology, evolution, and physiology, typically defined as the ability for an organism to maintain some measure of performance (e.g., fitness, growth, physiological homeostasis) in the presence versus absence of some stressor. As one example, in the herbivory literature, tolerance is often measured as the absolute or relative difference in fitness of plants that are damaged versus undamaged (e.g., https://academic.oup.com/evolut/article/62/9/2429/6853425?login=true).

      On line 309, after describing the calculation of urban scores across subrealms, they write: "Therefore, a species could be represented across multiple subrealms with differing measures of urban tolerance (Fig. S4). Importantly, this continuous metric of urban tolerance is a relative measure of a species' preference, or affinity, to urban areas: it should be interpreted only within each subrealm".

      This is problematic on several fronts. First, the authors never define what they mean by the term "tolerance". Second, they refer to urban tolerance throughout the paper, but don't describe the calculation until lines 315-319, where they write (text in [ ] is from the reviewer):

      "Within each subrealm, we further accounted for the potential of different levels of urbanization by scaling each species' urban score by subtracting the mean VIIRS of all observations in the subrealm (this value is hereafter referred to as urban tolerance). This 'urban tolerance' (Fig. S5) value can be negative - when species under-occupy urban areas [relative to the average across all species] suggesting they actively avoid them - or positive - when species over-occupy urban areas [relative to the average across all species] suggesting they prefer them (i.e., ranging from urban avoiders to urban exploiters, respectively)."

      They are taking a relativized urban score and then subtracting the mean VIIRS of all observations across species in a subrealm. How exactly one interprets the magnitude isn't clear, and they admit this metric is "not interpretative across subrealms".

      This is not a true measure of tolerance, at least not in the conventional sense of how tolerance is typically defined. The problem is that a species distribution isn't being compared to some metric of urbanness, but instead it is relative to other species' urban scores, where species may, on average, be highly urban or highly nonurban in their distribution, and this may vary from subrealm to subrealm. A measure of urban tolerance should be independent of how other species are responding, and should be interpretable across subrealms, continents, and the globe.

      I propose the authors use one of two metrics of urban tolerance:

      (i) Absolute Urban Tolerance = Mean VIIRS of species_i - Mean VIIRS of city centers

      Here, the mean VIIRS of city centers could be taken from the center of multiple cities throughout a subrealm, across a continent, or across the world. The units are the original VIIRS units, where 0 would correspond to species being centered on the most extreme urban habitats, and the most extreme negative values would correspond to species that occupy the most non-urban habitats (i.e., no artificial light at night). In essence, this measure of tolerance would quantify how far a species' distribution is shifted relative to the most highly urbanized habitat available.

      (ii) % Urban Tolerance = (Mean VIIRS of species_i - Mean VIIRS of city centers) / Mean VIIRS of city centers * 100%

      This metric provides a % change in species mean VIIRS distribution relative to the most urban habitats. This value could theoretically be negative or positive, but will typically be negative, with -100% being completely non-urban, and 0% being completely urban tolerant.

      Both of these metrics can be compared across the world, as they would provide either an absolute (metric i) or a relative (metric ii) measure of urban tolerance that is comparable and easily interpretable in any region.
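
      As a minimal sketch of the two metrics proposed above, the following Python computes both from lists of VIIRS values. The function names and all numeric values are hypothetical and chosen only for illustration; they are not taken from the manuscript or its data.

```python
from statistics import mean


def absolute_urban_tolerance(species_viirs, city_center_viirs):
    """Metric (i): mean VIIRS of the species minus mean VIIRS of city centers."""
    return mean(species_viirs) - mean(city_center_viirs)


def percent_urban_tolerance(species_viirs, city_center_viirs):
    """Metric (ii): metric (i) expressed as a percentage of the city-center mean."""
    center = mean(city_center_viirs)
    if center == 0:
        raise ValueError("mean VIIRS of city centers must be non-zero")
    return (mean(species_viirs) - center) / center * 100.0


# Hypothetical VIIRS radiance values (illustrative only).
city_centers = [60.0, 80.0, 100.0]   # sampled at the centers of several cities
urban_species = [70.0, 90.0]         # occurrences centered on city-center brightness
dark_sky_species = [0.0, 0.0]        # occurrences with no artificial light at night

abs_urban = absolute_urban_tolerance(urban_species, city_centers)
pct_dark = percent_urban_tolerance(dark_sky_species, city_centers)
print(abs_urban, pct_dark)
```

      Under these assumed inputs, a species centered on city-center brightness scores 0 on metric (i), and a species found only where there is no artificial light scores -100% on metric (ii), matching the interpretation given in the text.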

      In summary, the definition of tolerance should be clear, the metric should be a true measure of tolerance that is comparable across regions, and an equation should be given.

      (4) Figure 1: The figure does not stand alone. For example, what is the hypothesis for thermophily or the temperature-size rule? The authors should expand the legend slightly to make the hypotheses being illustrated clearer.

      (5) SUDs: I don't agree with the conclusion given on line 83 ("pattern was consistent across subrealms and several taxonomic levels") or in the legend of Figure 2 ("there were consistent patterns for kingdoms, classes, and orders, as shown by generally similar density histograms shapes for each of these").

      The shapes of the curves are quite different, especially for the two Kingdoms and the different classes. I agree they are relatively consistent for the different taxonomic Orders of insects.

    3. Reviewer #3 (Public review):

      Summary:

      This paper reports on an association between body size and the occurrence of species in cities, which is quantified using an 'urban score' that can be visualized as a 'Species Urbanness Distribution' for particular taxa. The authors use species records from the Global Biodiversity Information Facility (GBIF) and link the occurrence data to nighttime lighting quantified using satellite data (Visible Infrared Imaging Radiometer Suite, VIIRS). They link the urban score to body size data and find a 'heterogeneous relationship between body size and urban tolerance across the tree'. The results are then discussed with reference to potential mechanisms that could possibly produce the observed effects (cf. Figure 1).

      Strengths:

      The novelty of this study lies in the huge number of species analyzed and the comparison of results among animal taxa, rather than in a thorough analysis of what traits allow species to persist under urban conditions. Such analyses have been done by other studies using a much more thorough approach that employs presence-absence data as well as a suite of traits, for example, Hahs et al. (2023) and Neate-Clegg et al. (2023). The dataset that the authors produced would also be very valuable if the raw data were published, both the cleaned species records and the body sizes.

      The paper could strongly add to our understanding of what species occur in cities when the open questions are addressed.

      Weaknesses:

      I value the approach of the authors, but I think the paper needs to be revised.

      In my view, the authors could more carefully validate their approach. Currently, any weakness or biases in the approach are quickly explained away rather than carefully explored. This concerns particularly the use of presence-only data, but also the calculation of the urban score.

      The vast majority of data in GBIF is presence-only data. This produces a strong bias in the analysis presented in the paper. For some taxa, it is likely that occurrences within the city are overrepresented, and for other taxa, the opposite is true (cf. Sweet et al. 2022). I think the authors should try to address this problem.

      The authors should compare their results to studies focusing on particular taxa where extensive trait-based analyses have already been performed, i.e., plants and birds. In fact, I strongly suggest that the authors should compare their results to previous studies on the relationship between traits, including body size and occurrences along a gradient of urbanisation, to draw conclusions about the validity of the approach used in the current study, which has a number of weaknesses.

      They should be more careful in coming up with post-hoc explanations of why the pattern found in this study makes sense or suggests a particular mechanism. This reviewer considers that there is no way in which the current study can disentangle the different possible mechanisms without further analyses and data, so I would suggest pointing out carefully how the mechanisms could be studied.

      More details should be given about the methodology. The readers should be able to understand the methods without having to read a number of other papers.

      References:

      Hahs, A. K., B. Fournier, M. F. Aronson, C. H. Nilon, A. Herrera-Montes, A. B. Salisbury, C. G. Threlfall, C. C. Rega-Brodsky, C. A. Lepczyk, and F. A. La Sorte. 2023. Urbanisation generates multiple trait syndromes for terrestrial animal taxa worldwide. Nature Communications 14:4751.

      Neate-Clegg, M. H. C., B. A. Tonelli, C. Youngflesh, J. X. Wu, G. A. Montgomery, Ç. H. Şekercioğlu, and M. W. Tingley. 2023. Traits shaping urban tolerance in birds differ around the world. Current Biology 33:1677-1688.

      Sweet, F. S. T., B. Apfelbeck, M. Hanusch, C. Garland Monteagudo, and W. W. Weisser. 2022. Data from public and governmental databases show that a large proportion of the regional animal species pool occur in cities in Germany. Journal of Urban Ecology 8:juac002.

    1. Reviewer #1 (Public review):

      Summary:

      In this paper, Qiu et al. developed a novel spatial navigation task to investigate the formation of multi-scale representations in the human brain. Over multiple sessions and diverse tasks, participants learned the location of 32 objects distributed across 4 different rooms. The key task was a "judgement of relative direction" task delivered in the scanner, which was designed to assess whether object representations reflect local (within-room) or global (across-room) similarity structures. In between the two scanning sessions, participants received extensive further training. The goal of this manipulation was to test how spatial representations change with learning.

      Strengths:

      The authors designed a very comprehensive set of tasks in virtual reality to teach participants a novel spatial map. The spatial layout is well-designed to address the question of interest in principle. Participants were trained in a multi-day procedure, and representations were assessed twice, allowing the authors to investigate changes in the representation over multiple days.

      Weaknesses:

      Unfortunately, I see multiple problems with the experimental design that make it difficult to draw conclusions from the results.

      (1) In the JRD task (the key task in this paper), participants were instructed to imagine standing in front of the reference object and judge whether the second object was to their left or right. The authors assume that participants solve this task by retrieving the corresponding object locations from memory, rotating their imagined viewpoint and computing the target object's relative orientation. This is a challenging task, so it is not surprising that participants do not perform particularly well after the initial training (performance between 60-70% accuracy). Notably, the authors report that after extensive training, participants reached more than 90% accuracy.

      However, I wonder whether participants indeed perform the task as intended by the authors, especially after the second training session. A much simpler behavioural strategy is memorising the mapping between a reference object and an associated button press, irrespective of the specific target object. This basic strategy should lead to quite high success rates, since the same direction is always correct for four of the eight objects (the two objects located at the door and the two opposite the door). For the four remaining objects, the correct button press is still the same for four of the six target objects that are not located opposite to the reference object. Simply memorising the button press associated with each reference object should therefore lead to a high overall task accuracy without the necessity to mentally simulate the spatial geometry of the object relations at all.

      I also wonder whether the random effect coefficients might reflect interindividual differences in such a strategy shift - someone who learnt this relationship between objects and buttons might show larger increases in RTs compared to someone who did not.

      (2) On a related note, the neural effect that appears to reflect the emergence of a global representation might be more parsimoniously explained by the formation of pairwise associations between reference and target objects. Since both objects always came from the same room, an RDM reflecting how many times an object pair acted as a reference-target pair will correlate with the categorical RDM reflecting the rooms corresponding to each object. Since the categorical RDM is highly correlated with the global RDM, this means that what the authors measure here might not reflect the formation of a global spatial map, but simply the formation of pairwise associations between objects presented jointly.
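
      The confound described in point (2) can be illustrated with a small sketch. The layout (32 objects, 4 rooms of 8) follows the task description, but the pairing scheme below is a hypothetical stand-in for the actual trial list, which is not given here; the point is only that any pairing restricted to same-room objects yields an RDM that correlates positively with the categorical room RDM.

```python
import numpy as np

n_rooms, per_room = 4, 8
n_objects = n_rooms * per_room
room = np.repeat(np.arange(n_rooms), per_room)  # room label of each object

# Categorical room RDM: 0 if two objects share a room, 1 otherwise.
room_rdm = (room[:, None] != room[None, :]).astype(float)

# Hypothetical pairing counts: only a subset of same-room pairs ever
# appear together as reference and target (assumption for illustration).
pair_counts = np.zeros((n_objects, n_objects))
for i in range(n_objects):
    for j in range(i + 1, n_objects):
        if room[i] == room[j] and (i + j) % 2 == 1:
            pair_counts[i, j] = pair_counts[j, i] = 1

# Convert counts to dissimilarities: frequently paired objects are "closer".
pair_rdm = pair_counts.max() - pair_counts

# Compare the two RDMs on their upper triangles (excluding the diagonal).
iu = np.triu_indices(n_objects, k=1)
r = np.corrcoef(pair_rdm[iu], room_rdm[iu])[0, 1]
print(f"correlation between pairing RDM and room RDM: {r:.2f}")
```

      A partial-correlation or regression-based RSA that includes such a pairing RDM as a covariate would be one way to check whether the global effect survives once pairwise associations are accounted for.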

      (3) In general, the authors attribute changes in neural effects to new learning. But of course, many things can change between sessions (expectancy, fatigue, change in strategy, but also physiological factors...). Baseline physiological effects are less likely to influence patterns of activity, so the RSA analyses should be less sensitive to this problem, but especially the basic differences in activation for the contrast of post-learning > pre-learning stages in the judgment of relative direction (JRD) task could in theory just reflect baseline differences in blood oxygenation, possibly due to differences in time of day, caffeine intake, sleep, etc. To really infer that any change in activity or representation is due to learning, an active control would have been great.

      (4) RSA typically compares voxel patterns associated with specific stimuli. However, the authors always presented two objects on the screen simultaneously. From what I understand, this is not considered in the analysis ("The β-maps for each reference object were averaged across trials to create an overall β-map for that object."). Furthermore, participants were asked to perform a complex mental operation on each trial ("imagine standing at A, looking at B, then perform the corresponding motor response"). Assuming that participants did this (although see points 1 and 2 above), this means that the resulting neural representation likely reflects a mixture of the two object representations, the mental transformation and the corresponding motor command, and possibly additionally the semantic and perceptual similarity between the two presented words. This means that the βs taken to reflect the reference object representation must be very noisy.

      This problem is aggravated by two additional points. Firstly, not all object pairs occurred equally often, because only a fraction of all potential pairs were sampled. If the selection of the object pairs is not carefully balanced, this could easily lead to sampling biases, which RSA is highly sensitive to.

      Secondly, the events in the scanner are not jittered. Instead, they are phase-locked to the TR (1.2 sec TR, 1.2 sec fixation, 4.8 sec stimulus presentation). This means that every object onset starts at the same phase of the image acquisition, making HRF sampling inefficient and hurting trial-wise estimation of betas used for the RSA. This likely significantly weakens the strength of the neural inferences that are possible using this dataset.
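
      The phase-locking concern can be made concrete with a short sketch using the timings quoted above (1.2 s TR, 1.2 s fixation, 4.8 s stimulus). The trial order (fixation followed by stimulus) and the jitter values are assumptions for illustration only. Times are kept in integer milliseconds to avoid floating-point rounding in the modulo.

```python
TR_MS = 1200        # repetition time
FIXATION_MS = 1200  # fixation duration
STIMULUS_MS = 4800  # stimulus duration


def onset_phases(n_trials, jitters_ms=(0,)):
    """Return the set of stimulus-onset phases (ms into a TR) across trials.

    jitters_ms is cycled through and added to each fixation period;
    the default of (0,) corresponds to a design with no jitter.
    """
    phases = set()
    t = 0
    for k in range(n_trials):
        t += FIXATION_MS + jitters_ms[k % len(jitters_ms)]
        phases.add(t % TR_MS)  # where in the acquisition cycle the stimulus starts
        t += STIMULUS_MS
    return phases


locked_phases = onset_phases(20)
jittered_phases = onset_phases(20, jitters_ms=(0, 300, 600, 900))
print(sorted(locked_phases), sorted(jittered_phases))
```

      Because fixation plus stimulus spans exactly five TRs, every onset in the unjittered design falls at the same point of the acquisition cycle, so the hemodynamic response is always sampled at the same post-onset delays. Adding jitter that is not a multiple of the TR spreads onsets over several phases.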

      (5) It is not clear why the authors focus their report of the results in the main manuscript on the preselected ROIs instead of showing whole-brain results. This can be misleading, as it provides the false impression that the neural effects are highly specific to those regions.

      (6) I am missing behavioural support for the authors' claims.

      Overall, I am not convinced that the main conclusion that global spatial representations emerge during learning is supported by the data. Unfortunately, I think there are some fundamental problems in the experimental design that might make it difficult to address the concerns.

      However, if the authors can provide convincing evidence for their claims, I think the paper will have an impact on the field. The question of how multi-scale representations are represented in the human brain is a timely and important one.

    2. Reviewer #2 (Public review):

      Summary:

      Qiu and colleagues studied human participants who learned about the locations of 32 different objects located across 4 different rooms in a common spatial environment. Participants were extensively trained on the object locations, and fMRI scans were done during a relative direction judgement task in a pre- and post-session. Using RSA, the authors report that the hippocampus increased global relative to local representations with learning; the RSC showed a similar pattern, but also increased effects of both global and local information with time.

      Strengths:

      (1) The manuscript asks a generally interesting question concerning the learning of global versus local spatial information.

      (2) The virtual environment task provides a rich and naturalistic spatial setting for participants, and the setup with 32 objects across 4 rooms is interesting.

      (3) The within-subject design and use of verbal cues for spatial retrieval are elegant.

      Weaknesses:

      (1) My main concern is that the global Euclidean distances and room identity are confounded. I fear this means that all neural effects in the RSA could be alternatively explained by associations to the visual features of the rooms that build up over time.

      (2) The direction judgement task is not very informative about cognitive changes, as only objects in a room are compared. The setup also discourages global learning, and leaves unclear whether participants focussed on learning the left/right relationships required by the task.

      (3) With N = 23, the power is low, and the effects are weak.

      (4) It appears that no real multiple-comparisons correction is done for the ROI-based approach, and significance across ROIs is not tested directly.

    3. Reviewer #3 (Public review):

      Summary:

      The manuscript by Qiu et al. explores the issue of spatial learning in both local (rooms) and global (connected rooms) environments. The authors use a pointing task, which involves pressing either the right or left button in the scanner to indicate where an object is located relative to another object. Participants are repeatedly exposed to rooms over sessions of learning, with one "pre" and one "post" learning session. The authors report that the hippocampus shifted from lower to higher RSA for the global but not the local environment after learning. RSC and OFC showed higher RSA for global object pointing. Other brain regions also showed effects, including ACC, which seemed to show a similar pattern as the hippocampus, as well as other regions shown in Figure S5. The authors attempt to tie their results in with local vs. global spatial representations.

      Strengths:

      Extensive testing of subjects before and after learning a spatial environment, with data suggesting that there may be fMRI codes sensitive to both global and local codes. Behavioral data suggest that subjects are performing well at the task and learning both global and local object locations, although see further comments.

      Weaknesses:

      (1) The authors frame the entire introduction around confirming the presence of the cognitive map either locally or globally. There are some significant issues with this framing. For one, the introduction appears to be confirmatory and not testing specific hypotheses that can be falsified. What exactly are the hypotheses being tested? I believe that this relates to testing whether neural representations are global and/or local. However, this is not clear. Given that a previous paper (Marchette et al. 2014 Nature Neuro, which bears many similarities) showed only local coding in RSC, this paper needs to be discussed in far more depth in terms of its similarities and differences. This paper looked at both position and direction, while the current paper looks at direction. Even here, direction in the current study is somewhat impoverished: it involves either pointing right or left to an object, and much of this could be categorical or even lucky guesses. From what I could tell, all behavioral inferences are based on reaction time and not accuracy, and therefore, it is difficult to determine if the subject's behavior actually reflects knowledge gained or simply faster reaction time, either due to motor learning or a speed-accuracy trade-off. The pointing task is largely egocentric: it can be solved by remembering a facing direction and an object relative to that. It is not the JRD task as has been used in other studies (e.g., Huffman et al. 2019 Neuron), which is continuous and has an allocentric component. This "version" of the task would be largely egocentric. In this way, the pointing task used does not test the core tenets of the cognitive map during navigation, which is defined as allocentric and Euclidean (please see O'Keefe and Nadel 1978, The Hippocampus as a Cognitive Map). Since neither of these assumptions appears valid, the paper should be reframed to reflect spatial representations more broadly or even egocentric spatial representations.

      (2) The fMRI data workup is insufficient. What do the authors mean by "deactivations" in Figure 3b? Does this mean the object task showed more activation than the spatial task in HSC? Given that HSC is one of these regions, this would seem to suggest that the hippocampus is more involved in object than spatial processing, although it is difficult to tell from how things are written. The RSA is more helpful, but now a concern is that the analysis focuses on small clusters that are based on analyses determined previously. This appears to be the case for the correlations shown in Figure 3e as well. The issues here are several-fold. For one, it has been shown in previous work that basing secondary analyses on related first analyses can inflate the risk of false positives (i.e., Kriegeskorte et al. 2009 Nature Neuro). The authors should perform secondary analyses in ways that are unbiased by the first analyses, preferably, selecting cluster centers (if they choose to go this route) from previous papers rather than their own analyses. Another option would be to perform analyses at the level of the entire ROI, meaning that the results would generalize more readily. The authors should also perform permutation tests to ensure that the RSA results are reliable, as these can run the risk of false positives (e.g., Nolan et al. 2018 eNeuro). If these results hold, the authors should perform post-hoc (corrected) t-tests for global vs. local before and after learning to ensure these differences are robust and not simply rely on the interaction effect. The figures were difficult to follow in this regard, and an interaction effect does not necessarily mean the differences that are critical (global higher than local after) are necessarily significant. The end part of the results was hard to follow. If ACC showed a similar effect to HC and RSC, why is it not being considered? Many other areas that seemed to show local vs. global effects were dismissed, but these should instead be discussed in terms of whether they are consistent or inconsistent with the hypotheses.
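
      One common form of the permutation test requested in point (2) is sketched below. It is a generic label-permutation scheme, not necessarily the procedure of the cited papers; the function name, the use of Pearson correlation, and the simulated RDMs are all assumptions for illustration. Object labels of the neural RDM are shuffled (rows and columns together) to build a null distribution for the neural-model correlation.

```python
import numpy as np


def rsa_permutation_test(neural_rdm, model_rdm, n_perm=999, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = np.random.default_rng(seed)
    n = neural_rdm.shape[0]
    iu = np.triu_indices(n, k=1)  # upper triangle, excluding the diagonal
    model_vec = model_rdm[iu]
    observed = np.corrcoef(neural_rdm[iu], model_vec)[0, 1]

    null = np.empty(n_perm)
    for k in range(n_perm):
        perm = rng.permutation(n)
        # Permute rows and columns with the same ordering so the matrix
        # remains a valid RDM over relabelled objects.
        shuffled = neural_rdm[np.ix_(perm, perm)]
        null[k] = np.corrcoef(shuffled[iu], model_vec)[0, 1]

    # Add-one correction so the p-value can never be exactly zero.
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value


# Simulated example: a model RDM from random 2D positions and a noisy copy
# standing in for a neural RDM (hypothetical data).
rng = np.random.default_rng(1)
positions = rng.random((12, 2))
model_rdm = np.linalg.norm(positions[:, None] - positions[None, :], axis=-1)
noise = rng.normal(scale=0.05, size=model_rdm.shape)
neural_rdm = model_rdm + (noise + noise.T) / 2  # keep the matrix symmetric
np.fill_diagonal(neural_rdm, 0.0)

observed_r, p_value = rsa_permutation_test(neural_rdm, model_rdm)
print(observed_r, p_value)
```

      A rank correlation (e.g., Spearman) is often preferred in RSA and could be substituted for the Pearson correlation without changing the permutation logic.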

      (3) Concerns about the discussion: there are areas involving reverse inference about brain areas rather than connecting the findings with hypotheses (see Poldrack et al. 2006 Trends in Cognitive Science). The authors also argue for "transfer" of information (for example, from ACC to OFC), but did not perform any connectivity analyses, so these conclusions are not based on any results. Instead, the authors should carefully compare what can be concluded from the reaction time findings and the fMRI data. What is consistent vs. inconsistent with the hypotheses? The authors should also provide a much more detailed comparison with past work. The Marchette et al. paper comes to different conclusions regarding RSC and involves more detailed analyses than those done here, including position. What is different in the current paper that might explain the differences in results? Another previous paper came to a different conclusion (hippocampus local, retrosplenial global) and should be carefully considered and compared, as it also involved learning of environments and comparisons at different phases (e.g., Wolbers & Buchel 2005 J Neuro). Other papers that have used the JRD task have demonstrated similar, although not identical, networks (e.g., Huffman et al. 2019 Neuron), and the results here should be more carefully compared, as the current task is largely egocentric while the Huffman et al. paper involves a continuous and allocentric version of the JRD task.

      (4) The authors cite rodent papers involving single neuron recordings. These are quite different experiments, however: they involve rodents, the rodents are freely moving, and single neurons are recorded. Here, the study involves humans who are supine and an indirect vascular measure of neural activity. Citations should be to studies of spatial memory and navigation in humans using fMRI: over-reliance on rodent studies should be avoided for the reasons mentioned above.

    1. Reviewer #1 (Public review):

      Summary:

      Englert et al. proposed a functional connectivity-based Attractor Neural Network (fcANN) to reveal attractor states and activity flows across various conditions, including resting state, task-evoked, and pathological conditions. The large-scale brain attractors reconstructed by fcANNs show an orthogonal organization, which is in line with the free-energy theoretical framework. Additionally, the fcANN demonstrates differences in attractor states between individuals with autism and typically developing individuals.

      The study used seven datasets, which ensures robust replication and validation of generalization across various conditions. The study is a representative example that combines experimental evidence based on fcANN and the theoretical framework. The fcANN projection offers an interesting way of visualization, allowing researchers to observe attractor states and activity flow patterns directly. Overall, the study may offer valuable insights into brain dynamics and computational neuroscience.

      Comments on revision:

      The authors have addressed my previous concerns and substantially improved the manuscript. Fig. 4 and Fig. 5 still use the label fcHNN rather than the updated fcANN.

    2. Reviewer #2 (Public review):

      Summary:

      Englert et al. use a novel modelling approach called functional connectome-based Hopfield Neural Networks (fcHNN) to describe spontaneous and task-evoked brain activity, and the alterations in brain disorders. Given its novelty, the authors first validate the model parameters (the temperature and noise) with empirical resting-state functional data and against null models. Through the optimisation of the temperature parameter, they first show that the optimal number of attractor states is four before fixing the optimal noise that best reflects the empirical data, through stochastic relaxation. Then, they demonstrate how these fcHNN-generated dynamics predict task-based functional activity relating to pain and self-regulation. To do so, they characterise the different brain states (here as different conditions of the experimental pain paradigm) in terms of the distribution of the data on the fcHNN projections and flow analysis. Lastly, a similar analysis was performed on a population with autism. Through Hopfield modeling, this work proposes a comprehensive framework that links various types of functional activity under a unified interpretation with high predictive validity.

      Strengths:

      The phenomenological nature of the Hopfield model and its validation across multiple datasets presents a comprehensive and intuitive framework for the analysis of functional activity. The results presented in this work further motivate the study of phenomenological models as an adequate mechanistic characterisation of large-scale brain activity.

      Following up from Cole et al. 2016, the authors put forward a hypothesis that many of the changes to the brain activity, here, in terms of task-evoked and clinical data, can be inferred from the resting-state brain data alone. This brings together neatly the idea of different facets of brain activity emerging from a common space of functional (ghost) attractors.

      The use of the null models motivates the benefit of non-linear dynamics in the context of phenomenological models when assessing the similarity to the real empirical data.

      Comments on revision:

      I am happy with how the authors addressed the comments and am happy to move ahead without further comments.

    1. Reviewer #2 (Public review):

      Summary:

      Tissue-resident macrophages are increasingly thought to exert key homeostatic functions and contribute to physiological responses. In the report of O'Brien and colleagues, the idea that the macrophage-expressed scavenger receptor MARCO could regulate adrenal corticosteroid output at steady-state was explored. The authors found that male MARCO-deficient mice exhibited higher plasma aldosterone levels and higher lung ACE expression as compared to wild-type mice, while the availability of cholesterol and the machinery required to produce aldosterone in the adrenal gland were not affected by MARCO deficiency. The authors take these data to conclude that MARCO in alveolar macrophages can negatively regulate ACE expression and aldosterone production at steady-state and that MARCO-deficient mice suffer from a secondary hyperaldosteronism.

      Strengths:

      If properly demonstrated and validated, the fact that tissue-resident macrophages can exert physiological functions and influence endocrine systems would be highly significant and could be amenable to novel therapies.

      Major weakness:

      The comparison between C57BL/6J wild-type mice and knock-out mice for which precise information about the genetic background and the history of breedings and crossings is lacking can lead to misinterpretation of the results obtained. Hence, MARCO-deficient mice should be compared with true littermate controls.

    1. Reviewer #1 (Public review):

      Summary:

      The article presents the details of the high-resolution light-sheet microscopy system developed by the group. In addition to presenting the technical details of the system, its resolution has been characterized and its functionality demonstrated by visualizing subcellular structures in a biological sample.

      Strengths:

      (1) The article includes extensive supplementary material that complements the information in the main article.

      (2) However, in some sections, the information provided is somewhat superficial.

      Weaknesses:

      (1) Although a comparison is made with other light-sheet microscopy systems, the presented system does not represent a significant advance over existing systems. It uses high numerical aperture objectives and Gaussian beams, achieving resolution close to theoretical after deconvolution. The main advantage of the presented system is its ease of construction, thanks to the design of a perforated base plate.

      (2) Using similar objectives (Nikon 25x and Thorlabs 20x), the results obtained are similar to those of the LLSM system (using a Gaussian beam without laser modulation). However, the article does not mention the difficulties of mounting the sample in the implemented configuration.

      (3) The authors present a low-cost, open-source system. Although they provide open source code for the software (navigate), the use of proprietary electronics (ASI, NI, etc.) makes the system relatively expensive. Its low cost is not justified.

      (4) The fibroblast images provided are of exceptional quality. However, these are fixed samples. The system lacks the necessary elements for monitoring cells in vivo, such as temperature or pH control.

    2. Reviewer #2 (Public review):

      Summary:

      The authors present Altair-LSFM (Light Sheet Fluorescence Microscope), a high-resolution, open-source microscope, that is relatively easy to align and construct and achieves sub-cellular resolution. The authors developed this microscope to fill a perceived need that current open-source systems are primarily designed for large specimens and lack sub-cellular resolution or are difficult to construct and align, and are not stable. While commercial alternatives exist that offer sub-cellular resolution, they are expensive. The authors' manuscript centers around comparisons to the highly successful lattice light-sheet microscope, including the choice of detection and excitation objectives. The authors thus claim that there remains a critical need for high-resolution, economical, and easy-to-implement LSFM systems.

      Strengths:

      The authors succeed in their goals of implementing a relatively low-cost (~ USD 150K) open-source microscope that is easy to align. The ease of alignment rests on using custom-designed baseplates with dowel pins for precise positioning of optics based on computer analysis of opto-mechanical tolerances, as well as the optical path design. They simplify the excitation optics over lattice light-sheet microscopes by using a Gaussian beam for illumination while maintaining lateral and axial resolutions of 235 and 350 nm across a 260-µm field of view after deconvolution. In doing so they rest on foundational principles of optical microscopy: what matters for lateral resolution is the numerical aperture of the detection objective and proper sampling of the image field onto the detector, and the axial resolution depends on the thickness of the light-sheet when it is thinner than the depth of field of the detection objective. This concept has unfortunately not been completely clear to users of high-resolution light-sheet microscopes and is thus a valuable demonstration. The microscope is controlled by open-source software, Navigate, developed by the authors, and it is thus foreseeable that different versions of this system could be implemented depending on experimental needs while maintaining easy alignment and low cost. They demonstrate system performance successfully by characterizing their sheet, point-spread function, and visualization of sub-cellular structures in mammalian cells, including microtubules, actin filaments, nuclei, and the Golgi apparatus.
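      The resolution principles summarized above can be sketched numerically. The following is a minimal illustration, not the authors' analysis: it uses the Rayleigh criterion (0.61 λ/NA) for lateral resolution, a Nyquist factor of 2 for pixel sampling, and the simplification that the axial resolution is set by whichever is smaller, the light-sheet thickness or the detection depth of field. The emission wavelength, numerical aperture, and depth-of-field values are assumptions chosen for illustration only.

```python
def rayleigh_lateral_resolution_nm(wavelength_nm: float, na: float) -> float:
    """Diffraction-limited lateral resolution by the Rayleigh criterion."""
    return 0.61 * wavelength_nm / na


def nyquist_pixel_size_nm(resolution_nm: float) -> float:
    """Largest sample-plane pixel size that still samples the resolution twice."""
    return resolution_nm / 2.0


def axial_resolution_nm(sheet_thickness_nm: float, depth_of_field_nm: float) -> float:
    """Simplified rule: the thinner of light-sheet and detection depth of field wins."""
    return min(sheet_thickness_nm, depth_of_field_nm)


if __name__ == "__main__":
    # Assumed values for illustration (not taken from the manuscript).
    wavelength_nm = 520.0      # hypothetical emission wavelength
    detection_na = 1.1         # hypothetical detection numerical aperture
    sheet_nm = 400.0           # hypothetical light-sheet thickness
    dof_nm = 1000.0            # hypothetical detection depth of field

    lateral = rayleigh_lateral_resolution_nm(wavelength_nm, detection_na)
    print(f"lateral resolution: {lateral:.1f} nm")
    print(f"Nyquist pixel size: {nyquist_pixel_size_nm(lateral):.1f} nm")
    print(f"axial resolution:   {axial_resolution_nm(sheet_nm, dof_nm):.1f} nm")
```

      Under these assumptions the raw lateral figure is coarser than the post-deconvolution value quoted in the review, which is expected, since deconvolution is not modeled here.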

      Weaknesses:

      There is a fixation on comparison to the first-generation lattice light-sheet microscope, which has evolved significantly since then:

      (1) The authors claim that commercial lattice light-sheet microscopes (LLSM) are "complex, expensive, and alignment intensive". I believe this statement applies to the open-source version of LLSM, which was made available for wide dissemination. Since then, a commercial solution has been provided by 3i, which is now being used in multiple cores and labs but does require routine alignments. However, Zeiss has also released a commercial turn-key system, which, while expensive, is stable, and the complexity does not interfere with the experience of the user. In general, though, statements on ease of use and stability that are unreferenced or unsupported by data might be considered anecdotal and may not belong in a scientific article.

      (2) One of the major limitations of the first-generation LLSM was the use of a 5 mm coverslip, which was a hindrance for many users. However, the Zeiss system elegantly solves this problem, and so does Oblique Plane Microscopy (OPM), while the Altair-LSFM retains this feature, which may dissuade widespread adoption. This limitation and how it may be overcome in future iterations is not discussed.

      (3) Further, on the point of sample flexibility, all generations of the LLSM, and by the nature of its design, the OPM, can accommodate live-cell imaging with temperature, gas, and humidity control. It is unclear how this would be implemented with the current sample chamber. This limitation would severely limit use cases for cell biologists, for which this microscope is designed. There is no discussion on this limitation or how it may be overcome in future iterations.

      (4) The authors' comparison to LLSM is constrained to the "square" lattice, which, as they point out, is the most used optical lattice (though this also might be considered anecdotal). The LLSM original design, however, goes far beyond the square lattice, including hexagonal lattices, the ability to do structured illumination, and greater flexibility in general in terms of light-sheet tuning for different experimental needs, as well as not being limited to just sample scanning. Thus, the Altair-LSFM cannot compare to the original LLSM in terms of versatility, even if comparisons to the resolution provided by the square lattice are fair.

      (5) There is no demonstration of the system's live-imaging capabilities or temporal resolution, which is the main advantage of existing light-sheet systems.

      While the microscope is well designed and completely open source, it will require experience with optics, electronics, and microscopy to implement and align properly. Experience with custom machining or soliciting a machine shop is also necessary. Thus, in my opinion, it is unlikely to be implemented by a lab that has no prior experience with custom optics and cannot hire someone who does. Altair-LSFM may not be as easily adaptable or implementable as the authors describe or perceive in any lab that is interested, even if they can afford it. The authors indicate they will offer "workshops," but this does not necessarily remove the barrier to entry, or lower it as significantly as the authors describe.

      There is a claim that this design is easily adaptable. However, the requirement of custom-machined baseplates and in silico optimization of the optical path basically means that each new instrument is a new design, even if the Navigate software can be used. It is unclear how Altair-LSFM demonstrates a modular design that reduces times from conception to optimization compared to previous implementations.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript introduces a high-resolution, open-source light-sheet fluorescence microscope optimized for sub-cellular imaging.

      The system is designed for ease of assembly and use, incorporating a custom-machined baseplate and in silico optimized optical paths to ensure robust alignment and performance. The authors demonstrate lateral and axial resolutions of ~235 nm and ~350 nm after deconvolution, enabling imaging of sub-diffraction structures in mammalian cells.

      The important feature of the microscope is the clever and elegant adaptation of simple Gaussian beams, smart beam shaping, galvo pivoting, and high-NA objectives to ensure a uniform thin light-sheet of around 400 nm in thickness over a 266-micron-wide field of view, pushing the axial resolution of the system beyond the regular diffraction-limited tradeoffs of light-sheet fluorescence microscopy.

      Compelling validation using fluorescent beads and multicolor cellular imaging highlights the system's performance and accessibility. Moreover, a very extensive and comprehensive manual of operation is provided in the form of supplementary materials. This provides a DIY blueprint for researchers who want to implement such a system.

      Strengths:

      (1) Strong and accessible technical innovation:

      With an elegant combination of beam shaping and optical modelling, the authors provide a high-resolution light-sheet system that overcomes the classical light-sheet tradeoff limit of a thin light-sheet and a small field of view. In addition, the integration of in silico modelling with a custom-machined baseplate is very practical and allows for ease of alignment procedures. Combining these features with the solid and super-extensive guide provided in the supplementary information, this provides a protocol for replicating the microscope in any other lab.

      (2) Impeccable optical performance and ease of mounting of samples:

      The system takes advantage of the same sample-holding method seen already in other implementations, but reduces the optical complexity. At the same time, the authors claim to achieve similar lateral and axial resolution to lattice light-sheet microscopy (although without a direct comparison; see below in the "Weaknesses" section). The optical characterization of the system is comprehensive and well-detailed. Additionally, the authors validate the system by imaging sub-cellular structures in mammalian cells.

      (3) Transparency and comprehensiveness of documentation and resources:

      A very detailed protocol documents the setup, the optical modeling, and the total cost.

      Weaknesses:

      (1) Limited quantitative comparisons:

      Although some qualitative comparison with previously published systems (diSPIM, lattice light-sheet) is provided throughout the manuscript, some side-by-side comparison would be of great benefit for the manuscript, even in the form of a theoretical simulation. While having a direct imaging comparison would be ideal, it's understandable that this goes beyond the interest of the paper; however, a table referencing image quality parameters (taken from the literature), such as signal-to-noise ratio, light-sheet thickness, and resolutions, would really enhance the features of the setup presented. Moreover, based also on the necessity for optical simplification, an additional comment on the importance/difference of dual objective/single objective light-sheet systems could really benefit the discussion.

      (2) Limitation to a fixed sample:

      In the manuscript, there is no mention of incubation temperature, CO₂ regulation, humidity control, or possible integration of commercial environmental control systems. This is a major limitation for an imaging technique that owes its popularity to fast, volumetric, live-cell imaging of biological samples.

      (3) System cost and data storage cost:

      While the system presented has the advantage of being open-source, it remains relatively expensive (considering the 150k without laser source and optical table, for example). The manuscript could benefit from a more direct comparison of the performance/cost ratio of existing systems, considering academic settings with budgets that most of the time would not allow for expensive architectures. Moreover, it would also be beneficial to discuss the adaptability of the system in case a 30k objective is not feasible. Will this system work with different optics (with the obvious limitations coming with the lower NA objective)? The adaptability of the system to lower budgets or more cost-effective choices, depending on the needs, could be an interesting point of discussion.

      Last, not much is said about the need for data storage. Light-sheet microscopy's bottleneck is the creation of increasingly large datasets, and it could be beneficial to discuss more about the storage needs and the quantity of data generated.
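      To give a sense of scale for the storage concern, the following back-of-the-envelope sketch estimates the raw size of a light-sheet acquisition. All acquisition parameters here (frame size, bit depth, number of planes, channels, and time points) are hypothetical values chosen for illustration and are not taken from the manuscript.

```python
def dataset_bytes(width_px: int, height_px: int, n_planes: int,
                  n_channels: int, n_timepoints: int,
                  bytes_per_pixel: int = 2) -> int:
    """Raw (uncompressed) size of an acquisition in bytes."""
    return (width_px * height_px * n_planes
            * n_channels * n_timepoints * bytes_per_pixel)


def dataset_gigabytes(*args, **kwargs) -> float:
    """Same estimate expressed in decimal gigabytes (1 GB = 1e9 bytes)."""
    return dataset_bytes(*args, **kwargs) / 1e9


if __name__ == "__main__":
    # Hypothetical acquisition: 2048 x 2048 frames, 16-bit pixels,
    # 200 planes per volume, 3 channels.
    single_volume = dataset_gigabytes(2048, 2048, 200, 3, 1)
    time_lapse = dataset_gigabytes(2048, 2048, 200, 3, 100)
    print(f"one multicolor volume: {single_volume:.2f} GB")
    print(f"100 time points:       {time_lapse:.1f} GB")
```

      Even under these modest assumptions a single multicolor volume is several gigabytes, so a discussion of storage and data handling would be relevant to prospective users.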

      Conclusion:

      Altair-LSFM represents a well-engineered and accessible light-sheet system that addresses a longstanding need for high-resolution, reproducible, and affordable sub-cellular light-sheet imaging. While some aspects (comparative benchmarking and validation, and the limitation to fixed samples) would benefit from further development, the manuscript makes a compelling case for Altair-LSFM as a valuable contribution to the open microscopy scientific community.

    1. Reviewer #1 (Public review):

      Summary:

      Zhou and colleagues developed a computational model of replay that heavily builds on cognitive models of memory in context (e.g., the context-maintenance and retrieval model), which have been successfully used to explain memory phenomena in the past. Their model produces results that mirror previous empirical findings in rodents and offers a new computational framework for thinking about replay.

      Strengths:

      The model is compelling and seems to explain a number of findings from the rodent literature. It is commendable that the authors implement commonly used algorithms from wakefulness to model sleep/rest, thereby linking wake and sleep phenomena in a parsimonious way. Additionally, the manuscript's comprehensive perspective on replay, bridging humans and non-human animals, enhances its theoretical contribution.

      Weaknesses:

      This reviewer is not a computational neuroscientist by training, so some comments may stem from misunderstandings. I hope the authors would see those instances as opportunities to clarify their findings for broader audiences.

      (1) The model predicts that temporally close items will be co-reactivated, yet evidence from humans suggests that temporal context doesn't guide sleep benefits (instead, semantic connections seem to be of more importance; Liu and Ranganath 2021, Schechtman et al 2023). Could these findings be reconciled with the model or is this a limitation of the current framework?

      (2) During replay, the model is set so that the next reactivated item is sampled without replacement (i.e., the model cannot get "stuck" on a single item). I'm not sure what the biological backing behind this is and why the brain can't reactivate the same item consistently. Furthermore, I'm afraid that such a rule may artificially generate sequential reactivation of items regardless of wake training. Could the authors explain this better or show that this isn't the case?

      (3) If I understand correctly, there are two ways in which novelty (i.e., less exposure) is accounted for in the model. The first and more talked about is the suppression mechanism (lines 639-646). The second is a change in learning rates (lines 593-595). It's unclear to me why both procedures are needed, how they differ, and whether these are two different mechanisms that the model implements. Also, since the authors controlled the extent to which each item was experienced during wakefulness, it's not entirely clear to me which of the simulations manipulated novelty on an individual item level, as described in lines 593-595 (if any).

      As to the first mechanism - experience-based suppression - I find it challenging to think of a biological mechanism that would achieve this and is selectively activated immediately before sleep (somehow anticipating its onset). In fact, the prominent synaptic homeostasis hypothesis suggests that such suppression, at least on a synaptic level, is exactly what sleep itself does (i.e., prune or weaken synapses that were enhanced due to learning during the day). This begs the question of whether certain sleep stages (or ultradian cycles) may be involved in pruning, whereas others leverage its results for reactivation (e.g., a sequential hypothesis; Rasch & Born, 2013). That could be a compelling synthesis of this literature. Regardless of whether the authors agree, I believe that this point is a major caveat to the current model. It is addressed in the discussion, but perhaps it would be beneficial to explicitly state to what extent the results rely on the assumption of a pre-sleep suppression mechanism.

      (4) As the manuscript mentions, the only difference between sleep and wake in the model is the initial conditions (a0). This is an obvious simplification, especially given the last author's recent models discussing the very different roles of REM vs NREM. Could the authors suggest how different sleep stages may relate to the model or how it could be developed to interact with other successful models such as the ones the last author has developed (e.g., C-HORSE)? Finally, I wonder how the model would explain findings (including the authors') showing a preference for reactivation of weaker memories. The literature seems to suggest that it isn't just a matter of novelty or exposure, but encoding strength. Can the model explain this? Or would it require additional assumptions or some mechanism for selective endogenous reactivation during sleep and rest?

      (5) Lines 186-200 - Perhaps I'm misunderstanding, but wouldn't it be trivial that an external cue at the end-item of Figure 7a would result in backward replay, simply because there is no potential for forward replay for sequences starting at the last item (there simply aren't any subsequent items)? The opposite is true, of course, for the first-item replay, which can't go backward. More generally, my understanding of the literature on forward vs backward replay is that neither is linked to the rodent's location. Both commonly happen at a resting station that is further away from the track. It seems as though the model's result may not hold if replay occurs away from the track (i.e. if a0 would be equal for both pre- and post-run).

      (6) The manuscript describes a study by Bendor & Wilson (2012) and tightly mimics their results. However, notably, that study did not find triggered replay immediately following sound presentation, but rather a general bias toward reactivation of the cued sequence over longer stretches of time. In other words, it seems that the model's results don't fully mirror the empirical results. One idea that came to mind is that perhaps it is the R/L context - not the first R/L item - that is cued in this study. This is in line with other TMR studies showing what may be seen as contextual reactivation. If the authors think that such a simulation may better mirror the empirical results, I encourage them to try. If not, however, this limitation should be discussed.

      (7) There is some discussion about replay's benefit to memory. One point of interest could be whether this benefit changes between wake and sleep. Relatedly, it would be interesting to see whether the proportion of forward replay, backward replay, or both correlated with memory benefits. I encourage the authors to extend the section on the function of replay and explore these questions.

      (8) Replay has been mostly studied in rodents, with few exceptions, whereas CMR and similar models have mostly been used in humans. Although replay is considered a good model of episodic memory, it is still limited due to limited findings of sequential replay in humans and its reliance on very structured and inherently autocorrelated items (i.e., place fields). I'm wondering if the authors could speak to the implications of those limitations on the generalizability of their model. Relatedly, I wonder if the model could or does lead to generalization to some extent in a way that would align with the complementary learning systems framework.

    2. Reviewer #3 (Public review):

      In this manuscript, Zhou et al. present a computational model of memory replay. Their model (CMR-replay) draws from temporal context models of human memory (e.g., TCM, CMR) and claims replay may be another instance of a context-guided memory process. During awake learning, CMR-replay (like its predecessors) encodes items alongside a drifting mental context that maintains a recency-weighted history of recently encoded contexts/items. In this way, the presently encoded item becomes associated with other recently learned items via their shared context representation, giving rise to typical effects in recall such as primacy, recency and contiguity. Unlike its predecessors, CMR-replay has built-in replay periods. These replay periods are designed to approximate sleep or wakeful quiescence, in which an item is spontaneously reactivated, causing a subsequent cascade of item-context reactivations that further update the model's item-context associations.

      Using this model of replay, Zhou et al. were able to reproduce a variety of empirical findings in the replay literature: e.g., greater forward replay at the beginning of a track and more backwards replay at the end; more replay for rewarded events; the occurrence of remote replay; reduced replay for repeated items, etc. Furthermore, the model diverges considerably (in implementation and predictions) from other prominent models of replay that, instead, emphasize replay as a way of predicting value from a reinforcement learning framing (i.e., EVB, expected value backup).

      Overall, I found the manuscript clear and easy to follow, despite not being a computational modeller myself. (Which is pretty commendable, I'd say). The model also was effective at capturing several important empirical results from the replay literature while relying on a concise set of mechanisms - which will have implications for subsequent theory building in the field.

      The authors addressed my concerns with respect to adding methodological detail. I am satisfied with the changes.

    1. Joint Public Review:

      Meiotic recombination begins with DNA double-strand breaks (DSBs) generated by the conserved enzyme Spo11, which relies on several accessory factors that vary widely across eukaryotes. In C. elegans, multiple proteins have been implicated in promoting DSB formation, but their functional relationships and how they collectively recruit the DSB machinery to chromosome axes have remained unclear.

      In this study, Raices et al. investigate the biochemical and genetic interactions among known DSB-promoting factors in C. elegans meiosis. Using yeast two-hybrid assays and co-immunoprecipitation, they map pairwise protein interactions and identify a connection between the chromatin-associated protein HIM-17 and the transcription factor XND-1. They also confirm the established interaction between DSB-1 and SPO-11 and show that DSB-1 associates with the nematode-specific factor HIM-5, which is required for X-chromosome DSB formation.

      The authors extend these findings with genetic analyses, placing these factors into four epistasis groups based on single- and double-mutant phenotypes. Together, these biochemical and genetic data support a model describing how these proteins engage chromatin loops and localize to chromosome axes. The work provides a clearer view of how C. elegans assembles its DSB-forming machinery and how this process compares to mechanisms in other organisms.

      Comment from the Reviewing Editor on the revised version:

      The authors have adequately addressed the prior review comments. At this point, after going through multiple rounds of reviews and revisions, the community will be better served by having this paper out in public. This version was assessed by the editors without further input from the reviewers.

    1. Reviewer #1 (Public review):

      Summary:

      In their article, Guo and coworkers investigate the Ca²⁺ signaling responses induced by Enteropathogenic Escherichia coli (EPEC) in epithelial cells and how these responses regulate NF-κB activation. The authors show that EPEC induces rapid, spatially coordinated Ca²⁺ transients mediated by extracellular ATP released through the type III secretion system (T3SS). Using high-speed Ca²⁺ imaging and stochastic modeling, they propose that low ATP levels trigger "Coordinated Ca²⁺ Responses from IP₃R Clusters" (CCRICs) via fast Ca²⁺ diffusion and Ca²⁺-induced Ca²⁺ release. These responses may dampen TNF-α-induced NF-κB activation through Ca²⁺-dependent modulation of O-GlcNAcylation of p65. The interdisciplinary work suggests a new perspective on calcium-mediated immune response by combining quantitative imaging, bacterial genetics, and computational modeling.

      Strengths:

      The study provides a new concept for host responses to bacterial infections and introduces the concept of Coordinated Ca²⁺ Responses from IP₃R Clusters (CCRICs) as synchronized, whole-cell-scale Ca²⁺ transients with the fast kinetics typical of local events. This is elegantly done by an interdisciplinary approach using quantitative measurements and mechanistic modelling.

      Weaknesses:

      (1) The effect of coordination by fast diffusion for small eATP concentrations is explained by the resulting low Ca²⁺ concentration that is not as strongly affected by calcium buffers compared to higher concentrations. While I agree with this statement on the relative level, CICR is based on the resulting absolute concentration at neighboring IP₃Rs (to activate them). Thus, I do not fully agree with the explanation, or at least would expect to use the modelling approach to demonstrate this effect. Simulations for different activation and buffer concentrations could strengthen this point and exclude potential inhibition of channels at higher stimulation levels.

      In this respect, I would also include the details of the modelling, such as implementation environment, parameters, and benchmarking. The description in the Supplementary Methods is very similar to the description in the main text. In terms of reproducibility, it would be important to at least provide simulation parameters, and providing the code would align with the emerging standards for reproducible science.

      (2) Quantitative characterization of CCRICs:

      The paper would benefit from a clearer definition of the term CCRICs and quantitative descriptors like duration, amplitude distribution, frequency, and spatial extent (also in relation to the comment on the EGTA measurements below). Furthermore, it remains unclear to me whether CCRICs represent a population of rapidly propagating micro-waves or truly simultaneous events. Kymographs or wave-front propagation analyses (at least from simulations, if the experimental resolution is insufficient) would strengthen this point.

      (3) Specificity of pharmacological tools:

      Suramin and U73122 are known to have off-target effects. Control experiments using alternative P2 receptor antagonists such as PPADS, or the inactive analog U73343, would strengthen the causal link.