    1. Reviewer #1 (Public review):

      This rigorous and creative study uses an elegant combination of metabolomics, transcriptomics, and budding yeast molecular genetics to discover that (i) activating AMPK to maintain mitochondrial respiration fueled by cytosolic Acetyl CoA and (ii) increasing fatty acid synthesis independent of respiration drive independent pathways that increase the fitness of replicatively-aged budding yeast cells, albeit without increasing their lifespan. This work will be of interest to scientists in the field of aging and metabolism. Some clarifications in the text would address the following concerns, which would increase the impact of the study:

      (1) What does activation of AMPK (via PGPD-SAK1 expression) do to the replicative lifespan? How many bud scars, in general, do the subpopulations that are older - yet have less Tom70 (increased mitochondrial fitness) - have after the 48 hr time point that they are examining? How many divisions occurred in this 48 hr time period, i.e., is it long enough for all cells to reach the end of their replicative lifespan? This information is important to rule out that a subset of the mutant cells just divided faster and hence had more divisions within 48 hr (growing faster and living longer are different things). Having identical growth curves doesn't indicate per se that they all divide at the same rate, as there may be a subpopulation that divides faster and a subpopulation that doesn't grow so well.

      (2) A2A cells do not have an extended replicative lifespan (RLS) but show an increase in the "low senescence" population (Figure 2). If the cells are not becoming senescent, why don't they have a longer RLS? Not having a longer lifespan seems inconsistent with the statement that "bud scar counting confirmed that A2A cells reach a higher age than wild type", which comes back to how many times the cells can divide within the 48 hr time point studied and to their rate of cell division. Also, the lifespan curve shown is plotted against time, not cell division number, which does not take into account the different division times of cells within the population (described above). It would be much more useful to show standard lifespan curves plotting the number of cell divisions completed per cell.

      (3) Increased "fitness" of the old cells is inferred from the increased size of the colonies that the old cells can make. However, this is a measure of the fitness of the daughters per se, not of the old mother cells. Are the old mothers just passing on healthier mitochondria and more lipids to the daughters, such that the daughters can divide more times? If the aged cells have an "increased fitness", why don't they divide more times themselves (i.e., live longer)?

      (4) The statement is made that "these experiments define two classes of aging cells with distinct metabolic needs, coherent with the model of two aging trajectories previously proposed (referencing Nan Hao's work)". However, the big difference here is that in Nan Hao's work, their two aging trajectories influenced the length of lifespan, but that does not appear to be the case here. That distinction should be made clear. Perhaps the authors could also speculate as to why the A2A yeast stops dividing after presumably the same number of cell divisions, even though they have an activated AMPK and activated fatty acid synthesis pathway.

      (5) I am a bit confused by the use of the word "senescence" by this lab here and in their previous growth on galactose studies. If yeast don't senesce, which is usually defined as an irreversible arrest of the cell cycle where cells stop dividing, shouldn't the yeast that do not senesce still be dividing and hence have a longer lifespan? Should a different term be used rather than senescence? Such as "fitness late in life". The authors giving their definition of senescence may help reduce this apparent contradiction.
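      The arithmetic behind point (1) can be made concrete with a small sketch. It is only an illustration of the concern: the function name `divisions_in_window`, the division times, and the lifespan cap of 25 divisions are all hypothetical and are not taken from the manuscript.

```python
def divisions_in_window(window_h, division_time_h, max_lifespan):
    """Divisions completed within a fixed time window, capped at the
    replicative lifespan (all inputs are hypothetical illustration values)."""
    if division_time_h <= 0:
        raise ValueError("division time must be positive")
    return min(int(window_h // division_time_h), max_lifespan)

# Two subpopulations with the SAME assumed lifespan (25 divisions)
# but different assumed division times, sampled at 48 hr.
fast = divisions_in_window(window_h=48, division_time_h=2, max_lifespan=25)
slow = divisions_in_window(window_h=48, division_time_h=3, max_lifespan=25)
print(fast, slow)
```

      Under these assumed numbers, the faster subpopulation carries more bud scars at 48 hr even though neither subpopulation lives longer, which is why lifespan curves plotted against division number would be more informative than curves plotted against time.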

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors investigate how cytosolic acetyl-CoA metabolism influences replicative aging in budding yeast. They propose that acetyl-CoA regulates aging through three major pathways: (1) mitochondrial transport to support mitochondrial function, (2) fatty acid synthesis, and (3) global protein acetylation. The data show that AMPK activation promotes mitochondrial import of acetyl-CoA and partially mitigates mitochondrial decline in a subset of aging cells.

      Furthermore, the engineered A2A strain, which enhances mitochondrial acetyl-CoA utilization while relieving inhibition of fatty acid synthesis, increases the proportion of cells exhibiting a "low senescence" phenotype.

      Overall, this is a thoughtful and potentially impactful study that advances our understanding of metabolic control of aging. Addressing the points below, particularly by refining interpretations and, where feasible, incorporating additional analyses, will further strengthen the manuscript and its conclusions.

      Strengths:

      The study has several notable strengths. It addresses an important question by shifting the focus from lifespan to preservation of late-life fitness, which is highly relevant to aging biology. The work integrates metabolic, genetic, and functional analyses to link cytosolic acetyl-CoA flux with distinct aging outcomes, and the engineering of the A2A strain provides a clear and elegant demonstration of how coordinated pathway modulation can improve cellular fitness.

      Weaknesses:

      (1) While the manuscript focuses on mitochondrial transport and fatty acid synthesis, cytosolic acetyl-CoA is also a key regulator of histone acetylation and chromatin silencing. It would strengthen the study to consider whether acetyl-CoA depletion contributes to improved fitness through enhanced rDNA silencing. Given the well-established role of rDNA instability in yeast aging, additional experiments examining rDNA silencing and stability would be valuable. For example, monitoring rDNA copy number changes (not necessarily ERCs) under AMPK activation, oleic acid supplementation, and in the A2A strain, similar to approaches used in the authors' prior work, would help clarify whether chromatin regulation contributes to the observed phenotypes.

      (2) The current data do not fully distinguish whether AMPK activation and oleic acid supplementation act on distinct subpopulations of aging cells. An alternative explanation is that oleic acid supplementation enhances mitochondrial function and acts additively with AMPK activation, thereby increasing the fraction of cells in the "low senescence" state. Since this distinction is not central to the main conclusions, I suggest softening the language around subpopulation specificity. Emphasizing instead that the A2A strain coordinately modulates multiple branches of acetyl-CoA metabolism to improve late-life fitness would maintain the strength of the central message without overinterpretation.

      (3) The manuscript proposes that lipid starvation and excess acetyl-CoA are major drivers of senescence in distinct subpopulations of wild-type aging cells. This conclusion is not yet fully supported by the presented data. Direct measurements of age-dependent divergence in acetyl-CoA and fatty acid levels at the single-cell level would be needed to substantiate this model. Based on the current evidence, a more conservative interpretation would be that aging cells exhibit differential sensitivity to perturbations in acetyl-CoA and lipid metabolism. Accordingly, I recommend revising the statement in the Abstract ("We further implicate lipid starvation and excess acetyl coenzyme A availability as major drivers of senescence...") and the corresponding discussion text to better align with the data.

    3. Reviewer #3 (Public review):

      Summary:

      These findings suggest that PGPD-SAK1 yeast show a subpopulation with lowered TOM70-GFP expression among aged cells with high bud scar staining. Deletion of CAT2 or MLS1 reduces this effect. A PGPD-SAK1 acc1S1157A double mutant (called "A2A" here) shows an even larger reduction of TOM70-GFP expression among aged cells with high bud scar staining. Utilization of various additional mutants involved in acetyl-CoA transport, the carnitine shuttle, respiration, etc., leads the authors to conclude that these shifts in TOM70-GFP in aged cells are linked to the AMPK-fatty acid metabolic regulatory system.

      Strengths:

      These extensive and clearly described experiments reveal interesting changes in TOM70-GFP intensity in subsets of aged yeast in several mutants eventually identified as linked to the AMPK-fatty acid metabolic regulatory system.

      Weaknesses:

      (1) Three biological replicates for mRNA-seq is low.

      (2) While "Traditional conceptions of ageing implicate a progressive accumulation of damage leading to systemic degradation in performance until death, with evolutionary pressures acting to maximise early life fitness and fecundity at the expense of ageing health." is tangential perhaps to the data and conclusions of the study, both claims of this sentence are at best controversial, and the manuscript is no weaker for their omission.

      (3) The statement that "Here, we determine the basis of senescence and fitness loss in replicatively ageing yeast" is a bit strong as a summary of the careful work presented here. If the authors had created yeast mutants that retained fitness indefinitely, a claim of this strength would be more appropriate.

    1. Reviewer #1 (Public review):

      [Editor's note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      In this work, the authors investigate the molecular dynamics of MinD, a component of the Bacillus subtilis Min system, in vitro and in vivo. In Escherichia coli, the Min system is highly dynamic and displays rapid pole-to-pole oscillation, whereby a time-averaged minimum of the Min proteins at mid-cell is established. However, in B. subtilis, this is not the case, and there is no MinE present. MinD in B. subtilis dynamically relocalizes from the poles to division sites, and binds to MinC and MinJ, which mediates its interaction with DivIVA. This paper reports biochemical characterization of B. subtilis MinD in vitro and the dynamics of MinD variants in vivo, providing insight into the mechanism of dynamic localization.

      Strengths:

      In the current study, the authors perform a detailed biochemical characterization of the in vitro ATPase activity of MinD and demonstrate that rapid hydrolysis is elicited by adding phospholipids. They further show using a collection of substitution mutants of MinD that both monomers and dimers bind to the membrane, and ATP occupancy changes the on and off rates. Identification, quantification, and tracking of discrete Halo-MinD populations was nicely done and showed that mutations in MinD alter dynamic localization, correlating with PL binding on and off rates in vitro.

      In the revised manuscript, the authors now demonstrate localization and tracking data for minC and minJ deletion strains, which suggest that MinJ impacts MinD membrane cycling, but MinC does not. Additional in vitro work showed that the PDZ domain of MinJ modifies MinD ATP hydrolysis rates, and the authors propose that MinJ may promote MinD dimer formation.

      Weaknesses of the revised version: No major weaknesses.

    2. Reviewer #2 (Public review):

      Summary:

      Feddersen & Bramkamp determined important characteristics of how the MinD protein binds to and dissociates from the membrane, and how it dimerizes in relation to its ATPase activity. The presented data clearly show the differences in function between the MinD homologs from B. subtilis and E. coli.

      Strengths:

      The work presents well-executed experiments that lead to interesting conclusions and a new model of how the Min system works during B. subtilis mid-cell division. Importantly, this model is supported by in vitro characterization of well-chosen mutants in the functional domains of MinD. Notably, most of the in vitro data are confirmed by single-molecule localization microscopy.

    1. Reviewer #1 (Public review):

      Summary:

      The study by the Obata group characterizes the dynamics of the canonical malate dehydrogenase-citrate synthase metabolon in yeast.

      Strengths:

      The study is well-written and appears to give clear demonstrations of this phenomenon.

      Studies of the dynamics of metabolon formation are rare; if the authors can address the concern detailed below, then they have provided such for one of the canonical metabolons in nature.

      Weaknesses:

      There is a fundamental issue with the study, which is that the authors do not provide enough support or information concerning the split luciferase system that they use. Is the binding reversible or not? How the data is interpreted is massively influenced by this fact. What are the pros and cons of this method in comparison to, for example, FLIM-FRET? The authors state that the method is semi-quantitative - can they document this? All of the conclusions are based on the quality of this method. I know that it has been used by others, but at least some preliminary documentation to address these questions is required.

      Comments on revised version:

      I feel that the authors have adequately addressed my prior concerns. I have no further critiques of their work.

    2. Reviewer #2 (Public review):

      This study explores the dynamic association between malate dehydrogenase (MDH1) and citrate synthase (CIT1) in Saccharomyces cerevisiae, with the aim of linking this interaction to respiratory metabolism. Utilizing a NanoBiT split-luciferase system, the authors monitor protein-protein interactions in vivo under various metabolic conditions.

      Major Concerns:

      (1) NanoBiT Signal May Reflect Protein Abundance Rather Than Interaction Strength
      In Figure 1C, the authors report increased MDH1-CIT1 interaction under respiratory (acetate) conditions and decreased interaction during fermentation (glucose), as indicated by NanoBiT luminescence. However, this signal appears to correlate strongly with the expression levels of MDH1 and CIT1, raising the possibility that the observed luminescence reflects protein abundance rather than specific interaction dynamics. To resolve this, NanoBiT signals should be normalized to the expression levels of both proteins to distinguish between abundance-driven and interaction-driven changes.

      (2) Lack of Causal Evidence
      The study presents a series of metabolic perturbation experiments (e.g., arsenite, AOA, antimycin A, malonate) and correlates changes in metabolite levels with NanoBiT signals. However, these data are correlative and do not establish a functional role for the MDH1-CIT1 interaction in metabolic regulation. To demonstrate causality, the authors should implement approaches to specifically disrupt the MDH1-CIT1 interaction. One strategy could involve using a 15-residue peptide (Pept1) derived from the Pro354-Pro366 region of CIT1, previously shown to mediate the interaction, or introducing the cit1Δ3 (Arg362Glu) mutation, which perturbs binding. Metabolic flux analysis using 13C-labeled glucose and mitochondrial respiration assays (e.g., Seahorse) could then assess functional consequences.

      (3) Absence of Protein Expression Controls Under Perturbation Conditions
      In experiments involving acetate, arsenite, AOA, antimycin A, and malonate, the authors infer changes in MDH1-CIT1 association based solely on NanoBiT signals. However, no accompanying data are provided on MDH1 and CIT1 protein levels under these conditions. This omission weakens the conclusions, as altered expression rather than interaction strength could underlie the observed luminescence changes. Immunoblotting or quantitative proteomics should be used to confirm constant protein expression across conditions.
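      The normalization requested in points (1) and (3) could take a form like the sketch below. This is an illustration under stated assumptions, not the authors' analysis: the function name `normalize_nanobit`, the choice of the geometric mean of the two protein abundances as the denominator, and every number are hypothetical.

```python
import math

def normalize_nanobit(luminescence, mdh1_level, cit1_level):
    """Divide raw NanoBiT luminescence by the geometric mean of the two
    partner abundances. The denominator is an assumed normalization choice;
    all quantities are in arbitrary units."""
    if mdh1_level <= 0 or cit1_level <= 0:
        raise ValueError("protein levels must be positive")
    return luminescence / math.sqrt(mdh1_level * cit1_level)

# Hypothetical values: raw signal is 4x higher on acetate than on glucose,
# but both proteins are also assumed to be 4x more abundant.
glucose = normalize_nanobit(luminescence=100.0, mdh1_level=1.0, cit1_level=1.0)
acetate = normalize_nanobit(luminescence=400.0, mdh1_level=4.0, cit1_level=4.0)
print(glucose, acetate)
```

      With these made-up inputs, the normalized signals are equal, which is the abundance-driven scenario the normalization is meant to detect. Other denominators (for example, the product of the two abundances) are equally defensible, and the appropriate choice depends on the binding regime.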

      Conclusion:

      Although the central question is compelling and the use of NanoBiT in live cells is a strength, the manuscript requires additional experimental rigor. Specifically, normalization of interaction signals, introduction of causative perturbations, and validation of protein expression are essential to substantiate the study's claims.

      Comments on revised version:

      The manuscript is much improved.

    3. Reviewer #3 (Public review):

      Summary:

      Metabolons are multisubunit complexes that promote the physical association of sequential enzymes within a metabolic pathway. Such complexes are proposed to increase metabolic flux and efficiency by channeling reaction intermediates between enzymes. The TCA cycle enzymes malate dehydrogenase (MDH1) and citrate synthase (CIT1) have been linked to metabolon formation, yet the conditions under which these enzymes interact, and whether such interactions are dynamic in response to metabolic cues, remain unclear, particularly in the native cellular context. This study uses a NanoBiT protein-protein interaction assay to map the dynamic behavior of the MDH1-CIT1 interaction in response to multiple metabolic stimuli and challenges in yeast. Beyond mapping these interactions in real time, the authors also performed GC-MS metabolomics to map whole-cell metabolite alterations across experimental conditions. Finally, the authors use microscale thermophoresis to determine components that alter the MDH1-CIT1 interaction in vitro. Collectively, the authors synthesize their collected data into a model in which the MDH1-CIT1 metabolon dissociates in conditions of low respiratory flux, and is stimulated during conditions of high respiratory flux. While their data largely support this model, some key exceptions are found that suggest it is likely oversimplified and will require further work to understand the complexities associated with MDH1-CIT1 interaction dynamics. Nonetheless, the authors put forth an interesting and timely toolkit to begin to understand the interaction kinetics and dynamics of key metabolic enzymes that should serve as a platform to begin disentangling these important yet understudied aspects of metabolic regulation.

      Strengths:

      - The authors address an important question: how do metabolon-associated protein-protein interactions change across altered metabolic conditions?

      - The development and validation of the MDH1-CIT1 NanoBiT assay provides an important tool to allow the quantification of this protein-protein interaction in vivo. Importantly, the authors demonstrate that the assay allows kinetic and real-time assessment of these protein interactions, which reveals interesting and dynamic behavior across conditions.

      - The use of classic biochemical techniques to confirm that pH and various metabolites can alter the MDH1-CIT1 interaction in vitro is rigorous and supports the model put forth by the authors.

      Weaknesses:

      The authors have addressed identified weaknesses within the revision of their manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigated how visuospatial attention influences the way people build simplified mental representations to support planning and decision-making. Using computational modeling and virtual maze navigation, the authors examined whether spatial proximity and the spatial arrangement of obstacles determine which elements are included in participants' internal models of a task. The study developed and tested an extension of the value-guided construal (VGC) model that incorporates features of spatial attention for selecting simpler mental representations of the task.

      Strengths:

      (1) Original Perspective: The study introduces an explicit attentional component to established models of planning, offering an approach that bridges perception, attention, and decision-making.

      (2) Methodological Approach: The combination of computational modeling, behavioral data, and eye-tracking provides converging measures to assess the relationship between attention and planning representations.

      (3) Cross-validated data: The study relies on the analysis of three separate datasets, two already published and an additional novel one. This allows for cross-validation of the findings and enhances the robustness of the evidence.

      (4) Focus on Individual Differences: Reports of how individual variability in attentional "spillover" correlates with the sparsity of task representations and spatial proximity add depth to the analysis.

      Appraisal of Aims and Results:

      The study sets out to determine how spatial attention shapes the construction of task representations in planning contexts. The authors provide evidence that spatial proximity and arrangement influence which environmental features are incorporated into internal models used for navigation, and that accounting for these effects improves model predictions. There is clear documentation of individual variation, with some participants showing greater attentional spillover and more sparse awareness profiles.

      Comments on revised version:

      The authors did a great job and I am very happy with the revised manuscript.

    2. Reviewer #2 (Public review):

      Summary:

      Castanheira et al. investigate the role of spatial attention for planning during three maze navigation experiments (one new experiment and two existing datasets). Effective planning in complex situations requires the construction of simplified representations of the task at hand. The authors find that these mental representations (as assessed by conscious awareness) of a given stimulus are influenced by (spatially) surrounding stimuli. Individual participants varied in the degree to which attention influenced their task representations, and this attentional effect correlated with the sparsity of representations (as measured by the range of awareness reports across all stimuli). Spatially grouping task-relevant information on either the left or right side of the maze led to mental representations more similar to optimal representations predicted by the value-guided construal (VGC) model - a normative model describing a theoretical approach to simplifying complex task information. Finally, the authors propose an update to this model, incorporating an attentional spotlight component; the revised descriptive model predicts empirical task representations better than the original (normative) VGC model.

      Strengths:

      The novelty of this study lies in the proposal and investigation of a cognitive mechanism through which a normative model like value-guided construal can enable human planning. After proposing attention as this mechanism, the authors make concrete hypotheses about mismatches between the VGC predictions and real human behavior, which are experimentally validated. Thus, not only does this study describe a possible mechanism for simplification of task information for planning, but the authors also propose a descriptive model, revising VGC to incorporate this attentional component.

      A strength of this paper is the variety of investigative approaches: analysis of existing data, novel experiment, and a computational approach to predict experimental findings from a theoretical model. Analyzing pre-existing datasets increases the size of the participant cohort and strengthens the authors' conclusions. Meanwhile, comparing the predictions of the existing normative model and the authors' own refined model is a clever approach to substantiate their claims. In addition, the authors describe several crucial controls, which are key to the interpretability of their results. In particular, the eye tracking results were critical.

      In summary, this paper constitutes an important step toward a more complete understanding of the human ability to plan.

      Comments on revised version:

      I am overall happy with the revision and agree that the authors have addressed most of the comments.

    3. Reviewer #3 (Public review):

      Summary:

      The authors build on a recent computational model of planning, the "value-guided construal" framework by Ho et al. (2022), which proposes that people plan by constructing simple models of a task, such as by attending to a subset of obstacles in a maze. They analyze both published experimental data and new experimental data from a task in which participants report attention to objects in mazes. The authors find that attention to objects is affected by spatial proximity to other objects (i.e., attentional overspill) as well as whether relevant objects are lateralized to the same hemifield. To account for these results, the authors propose a "spotlight-VGC" model, in which, after calculating attention scores based on the original VGC model, attention to objects is enhanced based on distance. They find that this model better explains participant responses when objects are lateralized to different hemifields. These results demonstrate complex interactions between filtering of task-relevant information and more classical signatures of attentional selection.
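      The idea of enhancing VGC attention scores by distance can be sketched as follows. This is a toy illustration only: the function name `spotlight_scores`, the Gaussian kernel, the `width` and `gain` parameters, and the example values are assumptions, not the functional form or parameters used in the manuscript.

```python
import math

def spotlight_scores(vgc_scores, positions, width=2.0, gain=0.5):
    """Add distance-weighted spillover from other obstacles to each score.

    The Gaussian kernel, `width`, and `gain` are illustrative assumptions.
    Scores are clipped at 1.0 so they remain valid attention probabilities.
    """
    adjusted = []
    for i, (score_i, pos_i) in enumerate(zip(vgc_scores, positions)):
        spill = 0.0
        for j, (score_j, pos_j) in enumerate(zip(vgc_scores, positions)):
            if i == j:
                continue  # an obstacle does not spill over onto itself
            dist = math.dist(pos_i, pos_j)
            spill += score_j * math.exp(-dist**2 / (2 * width**2))
        adjusted.append(min(1.0, score_i + gain * spill))
    return adjusted

# Hypothetical maze: one relevant obstacle and two irrelevant ones,
# one adjacent to the relevant obstacle and one far away.
scores = spotlight_scores([0.9, 0.1, 0.1], [(0, 0), (1, 0), (10, 0)])
print(scores)
```

      In this toy setup, the irrelevant obstacle next to the relevant one receives a higher adjusted score than the distant one, which is the qualitative spillover pattern the summary describes.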

      Strengths:

      (1) The paper builds on existing modeling work in a novel manner and integrates classic results on attention into the computational framework.

      (2) The authors report new and extensive analyses of existing data that shed light on additional sources of systematic variability in responses related to attentional spillover effects.

      (3) They collect new data using new stimuli in the original paradigm that directly test predictions related to the lateralization of task-relevant information, including eye tracking data that allows them to control for possible confounds.

      (4) The extended model (spotlight-VGC) provides a formal account of these new results.

      Comments on revised version:

      I also agree that the authors addressed our comments and the manuscript is much stronger now.

    4. Reviewer #1 (Public review):

      Summary:

      This study investigated how visuospatial attention influences the way people build simplified mental representations to support planning and decision-making. Using computational modeling and virtual maze navigation, the authors examined whether spatial proximity and the spatial arrangement of obstacles determine which elements are included in participants' internal models of a task. The study developed and tested an extension of the value-guided construal (VGC) model that incorporates features of spatial attention for selecting simpler mental representations of the task.

      Strengths:

      (1) Original Perspective: The study introduces an explicit attentional component to established models of planning, offering an approach that bridges perception, attention, and decision-making.

      (2) Methodological Approach: The combination of computational modeling, behavioral data, and eye-tracking provides converging measures to assess the relationship between attention and planning representations.

      (3) Cross-validated data: The study relies on the analysis of three separate datasets, two already published and an additional novel one. This allows for cross-validation of the findings and enhances the robustness of the evidence.

      (4) Focus on Individual Differences: Reports of how individual variability in attentional "spillover" correlates with the sparsity of task representations and spatial proximity add depth to the analysis.

      Weaknesses:

      (1) Clarity of the VGC model and behavioral task: The exposition of the VGC model lacks sufficient detail for non-expert readers. It is not clear how this model infers which maze obstacles are relevant or irrelevant for planning, nor how the maze tasks specifically operationalize "planning" versus other cognitive processes.

      The method for classifying obstacles as relevant or irrelevant to the task and connecting metacognitive awareness (i.e., participants' reports of noticing obstacles) to attentional capture is not well justified. The rationale for why awareness serves as a valid attention proxy, as opposed to behavioral or neurophysiological markers, should be clearer.

      (2) Attention framework: The account of attention is largely limited to the "spotlight" model. When solving a maze, participants trace the correct trail, following it mentally with their overt or covert attention. In this perspective, relevant concepts are also rooted in attention literature pertaining to object-based attention using tasks like curve tracing (e.g., Pooresmaeili & Roelfsema, 2014) and to mental maze solving (e.g., Wong & Scholl, 2024), which may be highly relevant and add nuance to the current work. This view of attention may be more pertinent to the task than models of simultaneously tracking multiple objects cited here. Prior work (notably from the Roelfsema group) indicates that attentional engagement in curve-tracing tasks may be a continuous, bottom-up process that progressively spreads along a trajectory, in time and space, rather than a "spotlight" that simply travels along the path. The spread of attention depends on the spatial proximity to distractors - a point that could also be pertinent to the findings here.

      Moreover, the tracing of a "solution" trail in a maze may be spontaneous and not only a top-down voluntary operation (Wong & Scholl, 2024), a finding that requires a more careful framing of the link to conscious perception discussed in the manuscript.

      Conceptualizing attention as a spatial spotlight may therefore oversimplify its role in navigation and planning. Perhaps the observed attentional modulation reflects a perceptual stage of building the trail in the maze rather than a filter for a later representation for more efficient decision making and planning. A fuller discussion of whether the current model and data can distinguish between these frameworks would benefit readers.

      (3) Lateralization of attention: The analysis considers whether relevant information is distributed bilaterally or unilaterally across the visual display, but does not sufficiently address evidence for attentional asymmetries across the left and right visual fields due to hemispheric specialization (e.g., Bartolomeo & Seidel Malkinson, 2019). Whether effects differ for left versus right hemifield arrangements is not made explicit in the presented findings.

      (4) Individual differences: Individual differences in attentional modulation are a strength of the work, but similar analyses exploring individual variation in lateralization effects could provide further insight, and the lack of such analyses may mask important effects.

      (5) Distinction between overt and covert attention: The current report at times equates eye movement patterns with the locus of attention. However, attention can be covertly shifted without corresponding gaze changes (see, for example, Pooresmaeili & Roelfsema, 2014).

      The implications for interpreting the relationship between eye movement, memory, and attention in this setting are not fully addressed. The potential dynamics of attention along a maze trajectory and their impact on lateralization analysis would benefit from further clarification.

      Appraisal of Aims and Results:

      The study sets out to determine how spatial attention shapes the construction of task representations in planning contexts. The authors provide evidence that spatial proximity and arrangement influence which environmental features are incorporated into internal models used for navigation, and that accounting for these effects improves model predictions. There is clear documentation of individual variation, with some participants showing greater attentional spillover and more sparse awareness profiles.

      However, some conceptual and methodological aspects would be clearer with greater engagement with the broader literature on attention dynamics, a more explicit justification of operational choices, and more targeted lateralization analyses.

    5. Reviewer #2 (Public review):

      Summary:

      Castanheira et al. investigate the role of spatial attention for planning during three maze navigation experiments (one new experiment and two existing datasets). Effective planning in complex situations requires the construction of simplified representations of the task at hand. The authors find that these mental representations (as assessed by conscious awareness) of a given stimulus are influenced by (spatially) surrounding stimuli. Individual participants varied in the degree to which attention influenced their task representations, and this attentional effect correlated with the sparsity of representations (as measured by the range of awareness reports across all stimuli). Spatially grouping task-relevant information on either the left or right side of the maze led to mental representations more similar to optimal representations predicted by the value-guided construal (VGC) model - a normative model describing a theoretical approach to simplifying complex task information. Finally, the authors propose an update to this model, incorporating an attentional spotlight component; the revised descriptive model predicts empirical task representations better than the original (normative) VGC model.

      Strengths:

      The novelty of this study lies in the proposal and investigation of a cognitive mechanism through which a normative model like value-guided construal can enable human planning. After proposing attention as this mechanism, the authors make concrete hypotheses about mismatches between the VGC predictions and real human behavior, which are experimentally validated. Thus, not only does this study describe a possible mechanism for simplification of task information for planning, but the authors also propose a descriptive model, revising VGC to incorporate this attentional component.

      A strength of this paper is the variety of investigative approaches: analysis of existing data, novel experiment, and a computational approach to predict experimental findings from a theoretical model. Analyzing pre-existing datasets increases the size of the participant cohort and strengthens the authors' conclusions. Meanwhile, comparing the predictions of the existing normative model and the authors' own refined model is a clever approach to substantiate their claims. In addition, the authors describe several crucial controls, which are key to the interpretability of their results. In particular, the eye tracking results were critical.

      In summary, this paper constitutes an important step toward a more complete understanding of the human ability to plan.

      Weaknesses:

      (1) There is a critical conceptual gap in the study and its interpretation, mainly due to the reliance on a self-report metric of awareness (rather than an objective measure of behavioral performance).

      a. Awareness is tested by a 9-point self-report scale. It is currently unclear why awareness of task-irrelevant obstacles in this task would necessarily compromise optimal planning. There is no indication of whether self-reported awareness affects performance (e.g., navigation path distance, time to complete the maze, number of errors). Such behavioral evidence of planning would be more compelling.

      b. Relatedly, it would have been more convincing to have an objective measure of awareness, for instance, how the presence or absence of a "task-irrelevant" obstacle affects performance (e.g., change navigation path distance or time to complete the maze), or whether participants can accurately recall the location of obstacles.

      c. Consequently, I'm not sure that we can conclude that the spatial context does impact participants' ability to plan spatial navigation or to "incorporate task-relevant information into their construal". We know that the spatial context affects subjective (self-reported) awareness, but the authors do not present evidence that spatial context affects behavioral performance.

      d. Another concern that may complicate interpretation is the following: Figure 3c shows improved VGC model predictions (steeper slope) for mazes with greater lateralization. However, there are notable outliers in these plots, where a high lateralization index does not correspond to good model performance. There is currently no discussion/explanation of these cases.

      (2) I noticed an issue with clarity regarding task-relevance. It is currently not fully clear which obstacles are "task irrelevant". Also, the term is used inconsistently and is sometimes conflated with "awareness". For example, in the "Attentional spotlight model of task representations" section, the authors state that "task-relevant information becomes less relevant when surrounded by task-irrelevant information". But they really mean that participants become less aware of those task-relevant obstacles. I assume task-relevance is an objective characteristic related to maze organization, not to a participant's construal. Indeed, the following paragraph provides evidence of model predictions of awareness.

      (3) The behavioral paradigm has some distinct disadvantages, and the validity of the task is not backed up by behavioral data.

      a. I understand the need for central fixation, but it also makes the task less naturalistic.

      b. The task with its top-down grid view does not seem to mimic real human navigation. Though this grid may be similar to mental maps we form for navigation, the sensory stimuli corresponding to possible paths and to spatial context during real-life navigation are very different.

      c. Behavioral performance is not reported, so it is unknown whether participants are able to properly complete the task. The task seems pretty difficult to navigate, especially when the obstacles disappear, and in combination with the central fixation.

      d. There is no discussion of whether/how this navigation task generalizes to other forms of planning.

    6. Reviewer #3 (Public review):

      Summary:

      The authors build on a recent computational model of planning, the "value-guided construal" framework by Ho et al. (2022), which proposes that people plan by constructing simple models of a task, such as by attending to a subset of obstacles in a maze. They analyze both published experimental data and new experimental data from a task in which participants report attention to objects in mazes. The authors find that attention to objects is affected by spatial proximity to other objects (i.e., attentional overspill) as well as whether relevant objects are lateralized to the same hemifield. To account for these results, the authors propose a "spotlight-VGC" model, in which, after calculating attention scores based on the original VGC model, attention to objects is enhanced based on distance. They find that this model better explains participant responses when objects are lateralized to different hemifields. These results demonstrate complex interactions between filtering of task-relevant information and more classical signatures of attentional selection.

      Strengths:

      (1) The paper builds on existing modeling work in a novel manner and integrates classic results on attention into the computational framework.

      (2) The authors report new and extensive analyses of existing data that shed light on additional sources of systematic variability in responses related to attentional spillover effects.

      (3) They collect new data using new stimuli in the original paradigm that directly test predictions related to the lateralization of task-relevant information, including eye tracking data that allows them to control for possible confounds.

      (4) The extended model (spotlight-VGC) provides a formal account of these new results.

      Weaknesses:

      (1) The spotlight-VGC model has a free parameter - the "width" of the attentional spotlight. This seems to have been fixed to be 3 squares. It would be good if the authors could describe a more principled procedure for selecting the width so that others can use the model in other contexts.

      (2) Have the authors considered other ways in which factors such as attentional spillover and lateralization could be incorporated into the model? The spotlight-VGC model, as presented, involves first computing VGC predictions and only afterwards computing spillover. This seems psychologically implausible, since it supposes that the "optimal" representation is first formed and then it gets corrupted. Is there a way to integrate these biases directly into the VGC framework, perhaps as a prior on construals? The authors gesture towards this when they talk about "inductive biases", but this is not formalized.

      (3) Can the authors rule out that the lateralization effects are the result of memory biases since the main measure used is a self-report of attention?
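
      To make the ordering concern in point (2) concrete, a two-stage spillover step could be sketched as follows. This is only an illustration, not the authors' implementation: the Gaussian weighting, the weighted-average rule, and the names `spotlight_average`, `scores`, and `positions` are all assumptions. The default `width` of 3 grid squares mirrors the fixed value mentioned in point (1), but treating it as a Gaussian standard deviation is also an assumption.

```python
import math

def spotlight_average(scores, positions, width=3.0):
    """Blend each obstacle's baseline score with its neighbours' scores.

    scores    -- baseline (e.g. VGC-style) scores in [0, 1], one per obstacle
    positions -- (x, y) grid coordinates, one per obstacle
    width     -- assumed Gaussian spread of the spotlight, in grid squares
    """
    blended = []
    for xi, yi in positions:
        # Weight every obstacle (including this one) by its distance.
        weights = [
            math.exp(-((xi - xj) ** 2 + (yi - yj) ** 2) / (2.0 * width ** 2))
            for xj, yj in positions
        ]
        # The weighted average keeps the result inside the original range.
        blended.append(sum(w * s for w, s in zip(weights, scores)) / sum(weights))
    return blended

# Hypothetical maze: one relevant obstacle, one irrelevant obstacle next to it,
# and one irrelevant obstacle far away.
scores = [1.0, 0.0, 0.0]
positions = [(0, 0), (1, 0), (10, 0)]
blended = spotlight_average(scores, positions)
print([round(b, 3) for b in blended])
```

      In this sketch the baseline scores are computed first and only then blurred, which is exactly the sequence questioned above. The nearby irrelevant obstacle gains more than the distant one, and the relevant obstacle loses some of its score. A prior over construals would instead have to enter before the baseline scores are computed.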

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors derive a mean-field model for a network of Hodgkin-Huxley neurons, retaining the equations for ion exchange between the intracellular and extracellular space. The mean-field model derived in this work relies on approximations and heuristic arguments that, on the one hand, allow a closed-form derivation of the mean-field equations, and on the other hand restrict its validity to a limited regime of activity corresponding to quasi-synchronous neuronal populations. Therefore, rather than an exact mean-field representation, the model provides a description of a mesoscopic population of connected neurons driven by ion exchange dynamics.

      Strengths:

      The central idea is valuable: deriving a mean-field model that relates the slow-timescale biophysical mechanisms of ion exchange and transport in the brain to the fast-timescale electrical activity of large neuronal ensembles.

      Weaknesses:

      The idea underlying this work is not completely implemented in practice.

      The derived mean-field model does not show a one-to-one correspondence with the neural network simulations, except in strongly synchronous regimes. The agreement with the in vitro experiment is hardly evident, both for the mean-field model and for the network model. The assumptions made to derive the closed-form equations of the mean-field model have not been justified on biological grounds; they merely allow for the mathematical derivation. The final form of the mean-field equations does not clarify whether or not microscopic variables are used together with macroscopic variables in an inconsistent mixture.

      Comments on revisions:

      The main weaknesses I listed in the first report are still present, since the authors did not answer my questions on a solid basis. I report the list for completeness:

      (1) It seems that the reduction methodology that is employed is not the most suitable one for the single-neuron model they are considering.

      (2) The formulation of the mean-field derivation is unnecessarily complicated. It could be heavily simplified by following previously published approaches to derive biologically realistic neural masses.

      (3) The model seems to work only for highly synchronized situations and not for the standard asynchronous evolution usually observed in neural circuits.

      Therefore, my statement remains unchanged.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aim to develop a neural mass model characterized by a few collective variables that mimics the dynamics of a network of Hodgkin-Huxley neurons with ion-exchange mechanisms. They describe in detail the derivation of the mean-field model, then compare experimental results obtained from the mouse hippocampus with the neural network simulations and the mean-field results. Furthermore, they report a bifurcation analysis of the developed model and a simulation of a small network of coupled neural masses, a step towards the simulation of an entire connectome.

      Strengths:

      The authors attempt to develop a mean-field model for a globally coupled network of heterogeneous Hodgkin-Huxley neurons with explicit ion-exchange mechanisms between the cell interior and exterior.

      Weaknesses:

      (1) They do not employ the reduction methodology best suited to the single-neuron model they consider.

      (2) Their derivation of the neural mass model is based on several assumptions, not all of which are well justified.

      (3) Their formulation of the mean-field derivation is unnecessarily complicated; it could be greatly simplified by following previously published approaches to derive biologically realistic neural masses.

      (4) Their model seems to work only for highly synchronized situations and not for the standard asynchronous evolution usually observed in neural circuits.

      General Statements:

      The authors honestly declare the many limitations of their approach. Given these limitations, the mean-field results are, as expected, somewhat inconsistent with the neural network simulations.

      The authors suggest employing this model for whole-connectome simulations to follow seizure propagation; however, I believe that a simpler model, such as the Epileptor, remains superior in this respect. The present model does include biophysical parameters, but their correspondence with those employed in the network dynamics remains elusive, owing to the many assumptions required to derive the mean-field model. Furthermore, because it is more complicated than the Epileptor, I do not think that the present model will be widely adopted by the community.

      Comments on revisions:

      The authors have corrected mistakes present in the manuscript and put a correct list of references.

      However, they refuse:

      (1) To simplify the formulation of the model. The model contains unnecessary complications, as I clearly wrote in my report; the authors agree, but they do not want to change the formulation.

      (2) To derive the mean-field model in the simplest possible way, as I asked many times in my referee report. This would help readers understand the important aspects of the derivation without unnecessary and confusing complications.

      (3) To compare direct simulations of the network with neural mass results in sub-section "Bifurcation analysis: emergent network states and multistability" to show bistability, as I asked.

      As a matter of fact, the modifications made do not resolve my previous doubts about the validity of the results reported in the manuscript.

      Therefore, my previous assessments remain valid.

    1. Reviewer #1 (Public review):

      Pyne and Pandey et al. report the observation of early DNA degradation at the phagocytic cup during macrophage engulfment. Using an elegant experimental system that combines actin staining to visualise cup formation with direct monitoring of DNA degradation, the authors identify rapid recruitment of the membrane-bound nuclease DNase X (DNase1L1) to nascent phagocytic cups. This recruitment occurs within minutes of cup formation, is independent of DNA presence at the substrate, and appears to originate from intracellular membrane structures rather than from the extracellular environment. The results support the conclusion that DNase X activity is present at the phagocytic cup and that DNA digestion can begin prior to phagolysosomal maturation.

      The study is technically strong. The experimental system is clean, specific, and allows precise spatial and temporal detection of DNA degradation. The imaging-based approaches are carefully executed and enable convincing visualisation of DNase X recruitment and activity. The use of an alternative substrate beyond the primary SNS system strengthens the core observation, and the data broadly support the authors' central claim.

      However, several limitations temper the physiological interpretation. The system relies largely on short, free DNA substrates, leaving open how efficiently DNase X processes more complex or physiologically relevant DNA structures, such as nucleosome-bound DNA or neutrophil extracellular traps (NETs). It remains unclear whether DNase X deficiency would alter macrophage responses to larger nucleic acid structures, influence engulfment efficiency, or modify downstream inflammatory signalling pathways such as TLR9 or STING activation. Moreover, the experimental setup prevents full phagocytic cup closure, potentially prolonging DNase activity compared with physiological phagocytosis, which typically proceeds rapidly to cargo internalisation. For example, the peak signal observed in Figure 5 occurs approximately 90 minutes after phagocytic cup formation, a time point at which many phagocytic cups would be expected to have already closed under physiological conditions. Additional work using fully engulfed cargo in more physiological contexts would clarify whether early DNase X activity meaningfully contributes to overall DNA clearance kinetics.

      Mechanistically, the signal that triggers DNase X recruitment remains unresolved. Although actin rearrangement was excluded as the primary driver, the upstream cues that direct DNase X-containing membrane structures to the forming cup are not yet defined.

      In the broader context, early DNase X activity at the phagocytic cup could represent an additional safeguard against inflammatory signalling by limiting extracellular or surface-associated DNA before phagolysosomal degradation by DNase II. This mechanism may be particularly relevant in settings where DNA fragmentation before engulfment is incomplete, such as necroptosis or NET formation. Determining whether DNase X deficiency exacerbates inflammatory responses, alters DNA clearance efficiency in vivo, or contributes to immune pathology will be critical for establishing its physiological and disease relevance.

      Overall, this is a compelling study that introduces a novel concept of pre-phagolysosomal DNA digestion. The conclusions are well supported within the in vitro system used, but further investigation using diverse DNA substrates and physiologically relevant models will be required to fully define the impact of this mechanism on immune regulation and disease.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents an elegant and innovative imaging approach to visualize DNase activity at the interface between macrophages and extracellular substrates. The platform is technically strong and enables the study of localized DNA degradation with high spatial resolution. The work is of clear interest and provides a useful framework to investigate how immune cells process extracellular DNA. However, several aspects of the mechanistic interpretation and conceptual framing would benefit from clarification.

      Strengths:

      (1) The study introduces a creative and well-designed imaging platform that allows visualization of localized DNase activity at cell-substrate interfaces.

      (2) The approach is technically robust and represents a valuable tool that could be broadly useful to the field.

      (3) The experiments are thoughtfully designed and address an important question regarding how immune cells interact with extracellular DNA.

      (4) The work opens interesting avenues for studying DNA processing in contexts such as infection and inflammation.

      Weaknesses:

      While the experimental approach is strong, several key conclusions rely on interpretations that would benefit from further clarification:

      (1) First, the conclusion that DNaseX is recruited to phagocytic cups from the "cytoplasm" appears conceptually imprecise. Given that DNaseX is a membrane-anchored protein, it is unlikely to exist as a freely soluble cytoplasmic pool. A more plausible interpretation is that DNaseX is supplied from intracellular membrane compartments. This interpretation would also be more consistent with the data showing dependence on a membrane anchor.

      (2) Second, the interpretation that actin polymerization is not required for DNaseX recruitment raises concerns. Phagocytic cup formation is known to depend strongly on actin dynamics, and it is therefore unclear whether the structures observed under actin inhibition represent fully formed functional cups or partial cell-substrate contacts. This distinction is important for interpreting recruitment versus activity, particularly since enzymatic activity is reduced under these conditions.

      (3) Third, the identification of DNaseX as the main nuclease responsible for the observed activity is not fully resolved. The conclusions rely primarily on gene silencing and staining approaches, but the specificity of these strategies relative to other nucleases is not addressed. It therefore remains possible that additional enzymes contribute to the observed activity.

      (4) Finally, the interpretation of the biofilm experiments may be overstated. While the data clearly show localized DNA degradation in contact with macrophages, it is not fully established that this process depends specifically on phagocytic cup structures. An alternative explanation is that membrane-associated DNase activity more generally mediates this effect. In addition, the physiological relevance of this mechanism would benefit from further discussion.

      Overall, the study is technically strong and introduces a valuable methodology, but several central conclusions are only partially supported by the current data and would benefit from more cautious interpretation and clearer conceptual framing.

    1. Reviewer #1 (Public review):

      Summary:

      During erythroid differentiation, hematopoietic progenitors relinquish multipotency and activate lineage programs. The switch from GATA2 to GATA1 is particularly important in this process, yet GATA2 chromatin‑binding kinetics remain undefined. The authors investigated GATA2-chromatin interaction dynamics during erythroid differentiation in three different cell systems using single‑molecule live‑cell imaging, and they also used CUT&Tag to profile GATA2 chromatin occupancy.

      By single‑molecule imaging, the authors report two interaction modes for GATA2: short‑lived (<1 s) and long‑lived (>5 s) binding. The proportion of long‑lived molecules, the number of binding events, and the duration of long‑lived binding change (or are maintained) during differentiation. Notably, long‑lived chromatin engagement by GATA2 increases during early erythroid differentiation and decreases at the late stage. CUT&Tag identifies regulatory elements selectively occupied by GATA2 during the early transition stage. Together, these results support a model in which transcription factor kinetics form a dynamic chromatin‑engagement profile that characterizes the GATA2‑to‑GATA1 transition.

      Strengths:

      (1) Characterizing transcription‑factor binding kinetics during the GATA2->GATA1 transition addresses a fundamental mechanism in erythroid differentiation.

      (2) Combining single‑molecule live imaging with CUT&Tag provides both dynamic and locus‑specific perspectives.

      (3) Single-molecule analysis across three different cell systems strengthens the potential generalizability of the findings and highlights biological variability.

      Weaknesses:

      I agree that single‑molecule imaging is a powerful approach for investigating GATA2 kinetics, but the single‑molecule data are the most important part of the paper and need improvement. The analyses focus on three measures: (i) duration of long binding, (ii) proportion of short‑ and long‑binding molecules, and (iii) total binding events. However, several methodological and control issues limit confidence in the kinetic interpretations. The authors should address the following major concerns.

      (1) Two binding states: justification and controls

      The authors propose two states of GATA2 binding. Are there only two states? Studies that separate short‑ and long‑lived binding (e.g., Chen et al., 2014, PMID: 25342811) address the two states of transcription factors very carefully. Some long‑binding duration distributions here are very long‑tailed (e.g., Figure 2D middle), suggesting a possible third state. The authors must explain how they determined that two states provide the "best fit" to the data and how they classified "short" versus "long" binding.

      Controls should be included for long‑lived and short‑lived binding (e.g., histone proteins, HaloTag‑NLS, or a binding‑deficient GATA2 mutant) as in other studies. These controls are essential to exclude alternative explanations (see points below).

      (2) Exclude photophysical and focal‑plane artifacts

      The authors should exclude contributions from (i) photobleaching, (ii) blinking, and (iii) Z‑axis motion (disappearance from the focal plane). Although photobleaching correction is mentioned in the Methods, no details are provided. Describe and quantify the photobleaching correction and demonstrate that it was applied across all cell types and conditions.

      Some spots in the supplementary movies appear to blink or to move substantially between frames. Provide analyses or controls that distinguish true dissociation events from photophysical blinking/bleaching or axial motion.

      (3) HILO illumination and nuclear region sampled

      HILO is powerful but sensitive to illumination angle: slight changes sample different nuclear regions (e.g., nuclear interior versus periphery). The nuclear periphery is enriched in heterochromatin and may bias binding statistics. Explain how the authors controlled the HILO angle and confirmed that comparable nuclear regions were imaged across cells and conditions.

      (4) Quantification of event counts and long‑binding durations

      The number of binding events and measured long‑binding durations are strongly affected by imaging conditions (labeling/staining, bleaching, nucleus size, cell cycle state, focal plane, spot detectability, etc.). Imaging clarity appears to differ among cells/conditions in the supplementary movie. Provide more careful analysis describing how these variables were controlled or corrected for, and assess the sensitivity of results to choices in detection and tracking parameters.

      (5) Evidence that spots are single molecules

      The authors state that spots represent single molecules but do not provide supporting evidence. Spot brightness varies considerably in the movies. Brightness differences may reflect axial position. Provide evidence supporting single‑molecule assignment (e.g., single‑step photobleaching traces, brightness distributions compared to a known single‑molecule control, or photon count analysis).

      (6) Description of spot‑analysis pipeline

      The manuscript lacks a sufficient description of the spot‑analysis method. I reviewed the STRAP pipeline paper cited (Haque and Coleman 2025 bioRxiv) and the GitHub code, but the Methods in the current manuscript should include a detailed description of the STRAP pipeline. This would enable readers to evaluate and reproduce the analyses.

      (7) Differences among cell systems

      The three cell systems yield notably different results (e.g., Figure 2C vs 4C and Figure 2D/3D vs 4D). Provide a more detailed explanation for these differences and discuss how biological variability, technical differences, or imaging biases might account for the discrepancies.
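
      As an illustration of the model-selection question raised in point (1) above, one common way to ask whether two binding states suffice is to fit exponential mixtures with one, two, and three components to the dwell-time distribution and compare them with an information criterion. The sketch below is not the authors' STRAP analysis. The EM fit, the use of BIC, the function names, and the synthetic dwell times are all assumptions chosen for illustration, and the sketch ignores photobleaching, censoring, and the minimum detectable dwell time.

```python
import math
import random

def fit_exp_mixture(times, k, n_iter=150):
    """Fit a k-component exponential mixture by EM.

    Returns (weights, rates, log_likelihood). Censoring and the minimum
    detectable dwell time are ignored; real tracking data would need both.
    """
    n = len(times)
    mean_t = sum(times) / n
    # Spread the starting rates around the overall mean rate so components differ.
    rates = [(1.0 / mean_t) * 4.0 ** (i - (k - 1) / 2.0) for i in range(k)]
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        sum_r = [0.0] * k   # summed responsibilities per component
        sum_rt = [0.0] * k  # responsibility-weighted dwell time per component
        for t in times:
            # E-step: share of this dwell time attributed to each component.
            dens = [w * r * math.exp(-r * t) for w, r in zip(weights, rates)]
            total = sum(dens)
            for i in range(k):
                resp = dens[i] / total
                sum_r[i] += resp
                sum_rt[i] += resp * t
        # M-step: update mixing weights and dissociation rates.
        weights = [s / n for s in sum_r]
        rates = [sum_r[i] / sum_rt[i] if sum_rt[i] > 0 else rates[i]
                 for i in range(k)]
    loglik = sum(
        math.log(sum(w * r * math.exp(-r * t) for w, r in zip(weights, rates)))
        for t in times
    )
    return weights, rates, loglik

def bic(loglik, k, n):
    """Bayesian information criterion with k rates and k - 1 free weights."""
    return (2 * k - 1) * math.log(n) - 2.0 * loglik

# Synthetic dwell times in seconds (not data from the manuscript):
# 70% short-lived (mean 0.5 s) and 30% long-lived (mean 8 s).
random.seed(0)
dwell = [
    random.expovariate(2.0) if random.random() < 0.7 else random.expovariate(0.125)
    for _ in range(800)
]
bic_by_k = {}
for k in (1, 2, 3):
    _, _, ll = fit_exp_mixture(dwell, k)
    bic_by_k[k] = bic(ll, k, len(dwell))
print(bic_by_k)
```

      A lower BIC indicates a better trade-off between fit and number of parameters. Reporting such a comparison for each cell type and stage, together with the threshold used to label events as "short" or "long", would address the request above.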

    2. Reviewer #2 (Public review):

      In this study, the authors address the molecular mechanism underlying the transcriptional changes during erythroid differentiation from hematopoietic progenitor cells. The authors combine single-molecule live cell imaging and CUT&RUN to analyze the chromatin binding properties of the GATA2 transcription factor prior to and after initiation of differentiation into the erythroid cell lineage. Using three distinct cellular systems, the authors demonstrate that the chromatin binding of GATA2 is transiently increased early in the differentiation process, as evidenced by increased chromatin binding residence time and the emergence of new genomic binding sites identified by CUT&RUN. The strength of the study lies in the combination of single-molecule imaging, which reports on binding dynamics but is agnostic of the binding site, with CUT&RUN, which reports on the binding sites but does not provide dynamic information. The authors clearly demonstrate that chromatin binding of GATA2 is altered early in the differentiation process and is later displaced as cells switch to expression of GATA1, which has been previously observed. The use of three distinct cell lines, in particular the GATA2-SNAP mouse model, is a strength in principle; however, the results are not fully consistent between the different cell systems. A key difference is that the G1E-ER4 and HPC7 cell line models express HaloTagged GATA2 in addition to the endogenous GATA2 protein. The authors go to great lengths to control GATA2-HaloTag expression levels, but they use polyclonal cell lines and do not analyze expression levels of the GATA2-HaloTag transgene, which is a key variable in interpreting their experimental results. Finally, a key variable determined in their single-molecule analysis is the number of binding events observed during the distinct differentiation stages. The number of binding events observed is influenced by the expression level of the tagged protein, which in turn is controlled by the Shield-1 ligand, and the fraction of molecules labeled with the HaloTag ligand. Since transgene protein levels and the labeling efficiency were not determined, it is hard to assess how reliable the measurements of the number of binding events are across all cell lines.

      To address the weaknesses summarized above the authors could take the following steps:

      (1) Determine the expression levels of the GATA2-HaloTag transgene over the course of differentiation under the conditions used for single-molecule imaging. This will not only allow them to determine the expression of the transgene but also the endogenous untagged protein with which the GATA2-HaloTag fusion proteins compete for binding sites.

      (2) To determine the fraction of molecules labeled during imaging, the authors could carry out a titration of the HaloTag ligand and compare the amount of labeled protein under single-molecule imaging conditions to that of saturating labeling of the HaloTag. This approach will ensure that the number of labeled molecules per cell is comparable across experimental conditions and allow the authors to draw more solid conclusions regarding the number of binding events.

      (3) The analysis of residence times using single-molecule imaging requires robust single-particle tracking without gaps or interruptions of trajectories. The authors should show images of their particle trajectories to demonstrate that their tracking is robust. Even better would be movies superimposing the trajectories onto the imaging data.

    3. Reviewer #3 (Public review):

      Hobbs et al. use live-cell single-molecule tracking (SMT) of HaloTag- and SNAP-tagged GATA2 combined with CUT&Tag chromatin profiling to examine how GATA2 chromatin engagement evolves during erythroid differentiation. Across three complementary systems, G1E-ER4 cells, HPC7 cells, and primary bone marrow progenitors from a new Gata2-SNAP knock-in mouse, they report a transient strengthening of long-lived GATA2 chromatin binding at the "Early" (2 h) erythroid stage, manifested either as increased residence time (G1E-ER4) or expansion of the long-lived bound fraction (HPC7, primary cells). CUT&Tag identifies 1,167 Early-restricted GATA2 peaks partitioning into GATA2-only (promoter-proximal, GATA/RUNX motifs) and GATA2+GATA1 co-bound (distal, GATA/E-box motifs) subclasses. The authors propose that this kinetic phase represents a previously unappreciated dimension of the GATA switch.

      This is a strong study with a genuinely novel finding: the non-monotonic kinetic behavior of GATA2 during erythroid priming, supported by complementary measurements in three biological systems. The issues below are largely clarifications, additional analyses of existing data, and modest refinements to the discussion. With these addressed, the manuscript will make a valuable contribution. I recommend a minor revision.

      Specific points:

      (1) Clarify the photobleaching correction and report per-cell bleach lifetimes.

      The long-lived residence time claim in G1E-ER4 cells depends on careful accounting for photobleaching, which the Methods indicate was done via a right-censoring model. For reviewer and reader confidence, the authors should report the per-stage (or per-cell distribution of) photobleaching lifetimes and the photobleach-corrected residence time values alongside the apparent values in Figure 2D. If feasible, including a brief supplementary analysis with an H2B-Halo or similar long-lived control under matched conditions would further solidify the quantitative claims. This is an analysis of existing data and should not require new imaging.

      (2) Unify or explicitly discuss the mechanistic differences across systems.

      The three systems show qualitatively different signatures: a residence time change in G1E-ER4, and a bound fraction expansion in HPC7 and primary cells. The authors currently group these under "enhanced engagement," but these signatures imply different underlying mechanisms (a decrease in koff vs. an increase in kon or in the specific-binding-competent pool). The Discussion partially addresses this by noting engineered vs. native differences, but a more explicit framing in both Results and Discussion would help readers. Specifically, reporting an on-rate proxy (for example, binding events per unit time normalized to detectable molecule number) alongside koff would let readers see how the mechanistic pieces fit together. I do not think this changes the central message; it sharpens it.

      (3) Per-cell GATA2 concentration would strengthen the "uncoupling" claim.

      A central claim of the Figure 6 model is that chromatin engagement is uncoupled from protein abundance. The ectopic Shield-1 stabilization system is a reasonable design choice, but quantifying total nuclear GATA2-Halo signal (for example, from the pre-bleach frame or a brief high-power acquisition) on a per-cell basis across stages would directly support the interpretation. For the primary cells, where the biological claim is strongest, a western blot or quantitative immunofluorescence on the flow-sorted populations would make the uncoupling argument much more defensible. I recognize this may be one additional experiment, but it is a high-value one.

      (4) Additional single-cell distribution analysis.

      Figure 1E and Figures 2 to 4 show substantial cell-to-cell heterogeneity, and the Early populations in particular look potentially bimodal. Given that the authors cite Wheat et al. and Palii et al. on probabilistic hematopoietic transitions, a brief supplementary analysis using distribution-based statistics (K-S test, or mixture model) rather than, or alongside, mean-based ANOVA would align the analysis with this conceptual framing and may reveal whether the Early state represents a subpopulation transition rather than a uniform shift. This is purely an analysis of existing data.

      (5) Quantitative integration of CUT&Tag with SMT.

      The manuscript presents SMT and CUT&Tag as complementary but does not attempt to quantitatively connect them. A back-of-the-envelope calculation of whether a 21% increase in residence time (G1E-ER4), or the fraction expansion in other systems, is consistent with the acquisition of the 1,167 Early-restricted sites, given plausible site affinities, would substantially strengthen integration. Even if the calculation is approximate, framing it explicitly would help readers appreciate that the two datasets reinforce each other.

      (6) Short-lived kinetic interpretation and tracking parameters.

      The 1.5 s gap allowance in tracking is long relative to the 0.55 to 0.73 s short-lived residence times reported in primary cells (Figure Supplement 1F), which could affect the interpretation of the "slowing of target search" claim. A brief sensitivity analysis with tighter gap parameters in the supplement would reassure readers that this effect is robust. Additionally, please clarify how the inferred slowing of search, which should reduce kon, is reconciled with the increased number of binding events per cell at the Early stage.

      (7) CUT&Tag peak definition.

      The Early-restricted peak set is defined by presence and absence at q less than 0.01, which can be sensitive to near-threshold peaks. Please report either (a) the CUT&Tag signal intensity distribution at the 1,167 sites across all three stages as a quantitative scatter or density plot, beyond the heatmap in Figure 5C, or (b) the result of a differential binding analysis (for example, DESeq2 on read counts in a union peak set) as a supplementary confirmation. Please also state the number of CUT&Tag replicates per stage and the overlap of Early-restricted sets across replicates.

      (8) Knock-in mouse validation.

      The Gata2-SNAP allele is a valuable new tool, and it would benefit from slightly more quantitative validation in the supplement. A brief characterization of basic hematopoietic parameters in homozygotes (CBC, LSK/HSPC frequencies, or colony assays) would confirm that the tagged allele is truly physiological and would serve the community that will want to use this mouse going forward. If this has been done, please include it; if not, a statement about what was checked would suffice.
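
      Two of the analyses requested above, the photobleach-corrected residence time in point (1) and the distribution-based comparison in point (4), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the correction uses the common rate-subtraction form (apparent loss rate = dissociation rate + bleach rate) rather than the right-censoring model described in the Methods, the function names are hypothetical, and all numbers are invented for illustration.

```python
def corrected_residence_time(tau_apparent_s, tau_bleach_s):
    """Photobleach-corrected residence time in seconds.

    Assumption: dissociation and photobleaching are independent
    first-order processes, so their rates add in the apparent lifetime.
    """
    if tau_apparent_s <= 0 or tau_bleach_s <= 0:
        raise ValueError("lifetimes must be positive")
    k_off = 1.0 / tau_apparent_s - 1.0 / tau_bleach_s
    if k_off <= 0:
        raise ValueError("apparent lifetime is not shorter than the bleach lifetime")
    return 1.0 / k_off


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical
    cumulative distribution functions.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n_a, n_b = len(a), len(b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d_max = 0.0
    while i < n_a and j < n_b:
        x = min(a[i], b[j])
        # Step both ECDFs past every observation <= x (handles ties).
        while i < n_a and a[i] <= x:
            i += 1
        while j < n_b and b[j] <= x:
            j += 1
        d_max = max(d_max, abs(i / n_a - j / n_b))
    return d_max


# Hypothetical lifetimes (seconds), for illustration only.
tau_corrected = corrected_residence_time(tau_apparent_s=10.0, tau_bleach_s=40.0)

# Hypothetical per-cell values with identical means but different shapes.
unimodal_like = [4.0, 4.5, 5.0, 5.5, 6.0]
bimodal_like = [2.0, 2.5, 5.0, 7.5, 8.0]
d_stat = ks_statistic(unimodal_like, bimodal_like)
```

      The two hypothetical samples share the same mean, so a mean-based comparison would see no difference, whereas the K-S statistic is non-zero. For p-values, an established implementation such as `scipy.stats.ks_2samp` would be preferable to this hand-rolled statistic.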

    1. Reviewer #1 (Public review):

      This manuscript is very interesting and timely. By introducing the critical effects of desolvation barriers and solvent (water)-separated minima into the implicit-solvent potentials (of mean force, PMFs) for coarse-grained molecular dynamics simulations of biomolecular liquid-liquid phase separation (LLPS), this work fills a gap that should be apparent to researchers of protein folding in the past couple of decades but has so far escaped deserved attention such that these basic features of aqueous solvation have seldom, though not never, been invoked in recent studies of biomolecular condensates. Although the present paper deals almost exclusively with homopolymers, this work can be a foundation for the future development of a new, more physical coarse-grained interaction scheme for simulating amino acid sequence-dependent effects, which I presume is the authors' ongoing or next endeavor. The results presented in this manuscript are highly valuable.

      However, there is room for improvement in the authors' description of (i) the broader impact of effects of desolvation barrier and solvent-separated minimum in the thermodynamics of biomolecular condensates, especially with regard to the ramifications on hydrostatic pressure-dependent effects; (ii) the physical implication of using a 20-parameter hydropathy scale rather than a 210-parameter pairwise amino acid interaction scheme; and (iii) temperature-dependent effects, including the authors' discussion of "enthalpic" and "entropic" contributions. In all these aspects, the authors' discussion should be put in a more comprehensive context of the existing literature. At a few other places, the description of the methods and results should be clarified as well. Accordingly, the authors should revise the manuscript to address the following items thoroughly within the revised manuscript (not merely in the response letter) with the additional references mentioned below included in the revised discussion:

      (1) In several places, e.g., on line 77 (p.2), the authors appear to suggest that "implicit-solvent representation" is the origin of the deficiency in commonly utilized coarse-grained potentials that this study is aiming to rectify. But desolvation barriers and solvent-separated minima are also features of implicit-solvent representations; they are just features that should be incorporated in more accurate implicit-solvent potentials. This point is stated quite clearly and accurately in the Abstract (p.1) but not consistently in the rest of the text. The authors should check the entire text carefully to ensure that a coherent, accurate perspective is presented.

      (2) In the discussion of the importance of desolvation barriers and solvent-separated minima in the Introduction (pp.1-3), connections should be drawn to recent works that utilize these PMF features to rationalize hydrostatic pressure (P)-modulated effects on biomolecular LLPS, including the P-dependent reentrant phase separation of alpha elastin; see Cinar et al. (2019) Chem Eur J 25:13049 (https://chemistry-europe.onlinelibrary.wiley.com/doi/full/10.1002/chem.201902210) and references therein, especially discussions around Figures 10, 11 & 13 in this reference.

      (3) In the lower panels of Figures 2D, E (p.5), what do the differently colored small circles in the double-minimum free energy profiles represent? Does the color shading have the same meaning as that in the upper panels? If so, what do the positions of the circles on the free energy profile represent? The authors should clarify this.

      (4) The discussion regarding entropy and enthalpy around Figure 2 is quite confusing as it stands. What exactly do the authors mean by the association of entropy or enthalpy with the desolvation barrier or the solvent-separated minimum? Are they referring to conformational entropy?

      (5) Do the authors assume that the PMF (effective implicit-solvent potential) is a purely enthalpic term? It appears to be the authors' assumption. If so, the assumption has to be stated clearly in their discussion of "entropy" vs "enthalpy" around Figure 2.

      (6) Closely related to points 3-5 above, it should be stated clearly that the "temperature" used in the authors' simulations does not represent experimental temperature if the authors are using purely enthalpic effective potentials because PMFs are in fact temperature-dependent. This clarification is necessary to avoid misunderstanding. In this regard, it should be noted that temperature-dependent effective interactions have been used for modeling biomolecular condensates in analytical theory (Lin, Song, Forman-Kay & Chan, J Mol Liq 2017, already in the citation list) as well as in coarse-grained molecular dynamics simulations [Dignon et al. (2019) ACS Cent Sci 5:821-830 (https://pubs.acs.org/doi/10.1021/acscentsci.9b00102); Chakravarti & Joseph (2025) Protein Sci 34:e70284 (https://onlinelibrary.wiley.com/doi/10.1002/pro.70284)]. The latter two studies, not cited currently, are particularly relevant and thus should be cited because the authors may wish to incorporate temperature-dependent features in their ongoing or future effort in constructing a more comprehensive coarse-grained interaction scheme for biomolecular LLPS simulation.

      (7) In tackling "entropy" vs "enthalpy", it should be noted that the temperature dependence of the effective interactions entails an entropic contribution (which is itself temperature dependent) in addition to conformational entropy. As for the effective potential with desolvation barrier and solvent-separated minimum, it should be noted that the decomposition into entropic and enthalpic contributions at the direct contact, desolvation barrier, and solvent-separated minimum can be dramatically different; see, e.g., MacCallum et al. (2007) PNAS 104:6206-6210 (https://www.pnas.org/doi/full/10.1073/pnas.0605859104) and references therein.

      (8) P.7, line 340: The proportionality relation follows directly from the standard Flory-Huggins result T_c = T chi(T)/chi_c, thus the proportionality constant is exactly 1/chi_c. Is this the standard relation that the authors are invoking here? The authors should clarify this.

      (9) The study on dynamic consequences on pp.8-11 is interesting, but clarifications are necessary:

      (i) The vertical schematic in Figure 4A should be explained in detail in its entirety. As it stands, no explanation is provided either in the figure caption or in the text. In particular, what does "elasticity driven" refer to?

      (ii) The top snapshot in Figure 4A is labeled t_sim = 0 ns. Does it mean that the snapshot shown is the only chain configuration that the authors used to start the simulation, and that the snapshot does NOT represent the result of any time evolution, no matter how short the duration is? However, if that is the case, why is this snapshot identified with spinodal decomposition if it is not the product of a time evolution from a more homogeneous configuration?

      (iii) Related to (ii): do the rectangular boxes shown represent the entire simulation box or just the part of the box containing the polymer chains? One would imagine that, if the top snapshot represents spinodal decomposition, the simulation would have been started from a more uniform distribution a short time prior. Why is this not the case?

      (iv) What precisely do the small yellow beads and black-colored springs in the zoom-in image of Figure 4E represent?

      (10) In discussing dynamic effects, it is useful to draw connections to related works on the effect of chain flexibility on "aging" of condensates [Biswas & Potoyan (2024) PRX Life 2:023011 (https://journals.aps.org/prxlife/abstract/10.1103/PRXLife.2.023011)] and on the characterization of viscoelasticity in simulations of biomolecular condensates [Tejedor et al. (2023) J Phys Chem B 127:4441-4459 (https://pubs.acs.org/doi/10.1021/acs.jpcb.3c01292)], as the effects of desolvation can be explored further based on these prior works.

      (11) Much of the present study is based on the original HPS formulation of Dignon et al. (2018). In this regard and also in anticipation of future development of improved interaction schemes, several issues should be stated and discussed, even if briefly:

      (i) The original HPS model has a basic shortcoming in accounting for the relative interaction strengths of, among others, arginine vs lysine residues [Das et al. (2020) PNAS 117:28795-28805 (https://www.pnas.org/doi/10.1073/pnas.2008122117)].

      (ii) Compared to 210-parameter pairwise interaction schemes, such as KH in Dignon et al. (2018) and Joseph et al. (2021), the 20-parameter interaction scheme is likely too restrictive to account for pairwise amino acid residue interactions [Wessén et al. (2022) J Phys Chem B 126:9222-9245 (https://pubs.acs.org/doi/10.1021/acs.jpcb.2c06181)].

      (iii) The height of the desolvation barrier may vary significantly for different amino acid residue pairs, see, e.g., Figure 11 of Cinar et al. (2019) mentioned above (and references therein). The authors should discuss these nuances in the revised version. They may also wish to take them into consideration in future investigations.
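
      For point (8) above, the relation can be written out as a short worked step. This is a sketch under an assumption that is not taken from the manuscript: the contact energy $\epsilon$ is purely enthalpic and temperature independent, so that $T\chi(T)$ is a constant. The symbols $\epsilon$, $k_B$, and $N$ (chain length in segments) are introduced here for illustration.

```latex
\chi(T) = \frac{\epsilon}{k_B T},
\qquad
\chi(T_c) = \chi_c
\;\Longrightarrow\;
T_c = \frac{\epsilon}{k_B\,\chi_c} = \frac{T\,\chi(T)}{\chi_c},
\qquad
\chi_c = \frac{1}{2}\left(1 + \frac{1}{\sqrt{N}}\right)^{2}
```

      The last expression is the standard Flory-Huggins critical value for a homopolymer of $N$ segments in a monomeric solvent. If the effective interaction itself depends on temperature, $T\chi(T)$ is no longer constant and the simple proportionality with constant $1/\chi_c$ does not hold in this form.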

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript addresses an important and timely question in the molecular simulation of biomolecular condensates. Most residue-level coarse-grained models used for IDP phase separation employ implicit solvent and represent effective interactions through relatively simple pairwise potentials. While these models have been very useful, they usually do not explicitly distinguish direct contacts from solvent-separated interactions, nor do they include an energetic barrier associated with water removal. This manuscript attempts to address that limitation by introducing desolvation-inspired terms into coarse-grained models and examining their consequences for phase behavior, chain conformations, dense-phase packing, and dynamics.

      Strengths:

      The central idea is physically well motivated. Using a simple homopolymer model, the authors show that increasing the desolvation barrier suppresses phase separation, whereas stabilizing solvent-separated contacts enhances phase separation. They further show that solvent-separated interactions can reduce dense-phase over-compaction, which is a meaningful result given the known challenges in obtaining both accurate single-chain dimensions and realistic dense-phase properties from the same coarse-grained model. The finding that desolvation-like terms can reshape dense-phase packing without simply rescaling the overall interaction strength is interesting and could be useful for future model development. I also found the attempt to connect conformational changes across dilute and dense phases with thermal distance from the critical point to be intriguing. The dynamic analysis, including the FRAP-like simulations and the discussion of kinetic arrest during coarsening, adds another useful dimension to the work.

      Weaknesses:

      At the same time, there are several places where the manuscript would benefit from more careful framing. First, the desolvation terms are still effective coarse-grained parameters rather than a direct representation of water molecules. The language sometimes gives the impression that desolvation is being treated explicitly, whereas the model introduces desolvation-inspired effective interactions into an implicit-solvent framework. Second, the conformational analysis is interesting, but the broader context of prior work on dilute-to-dense phase conformational reorganization of IDPs could be more clearly discussed. This would help clarify what is new in the present work, whether it is the conformational change itself, its dependence on desolvation terms, or the proposed scaling with distance from the critical point. Third, the dynamic results are potentially useful, but the manuscript should more clearly articulate what is nontrivial beyond the expected slowing of local rearrangements by an added barrier in the potential.

      Overall, I think this is a useful and potentially important contribution.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents an original quantitative approach for tracking the online formation and updating of prior beliefs. In an Alternating Serial Reaction Time task, participants were exposed to probabilistic visual streams, and their pre-stimulus saccadic behavior (i.e., the first eye movement after the previous stimulus disappeared) was monitored via eye-tracking. Since the stimuli followed an alternating probabilistic sequence, upcoming events did not appear with full certainty: some stimuli had a higher probability and some a lower one. By comparing anticipatory oculomotor behavior between high- and low-probability events, the authors dissociated learning/belief updating from general oculomotor noise. Noise-driven errors were more frequent than learning-dependent errors, which nonetheless triggered more belief updating (i.e., a change in oculomotor behavior in a subsequent encounter of the same event). Interestingly, updating depended more strongly on whether a prior belief was consistent with the task's probabilistic structure than on prediction errors. These findings suggest that incidental, implicit statistical learning may rely on conservative updating with a relatively low learning rate, or on errorless algorithms, rather than prediction errors per se.

      Strengths:

      By applying a fine-grained analysis of anticipatory oculomotor behavior, this work establishes new continuous metrics to quantify the gradual learning and refinement of prior expectations during statistical learning. These metrics provide convincing evidence of the dynamics of anticipatory oculomotor behavior.

      The method is paradigm-independent, offering generalizable metrics for tracking the dynamic formation and refinement of predictive models in any task involving probabilistic stimulus streams. In the future, computational modeling may leverage these continuous metrics to better dissect the mechanisms underlying statistical learning.

      Weaknesses:

      The authors subscribe to the idea that statistical learning is not a unified concept but rather is implemented via multiple underlying mechanisms. However, it remains unspecified what these different mechanisms could be, and how eye movements could contribute to distinguishing between them.

      The authors claim that they developed a novel methodological approach to probe whether anticipatory eye movements directly reflect priors, thereby filling an outstanding gap. However, this claim ignores mounting relevant work on structure learning using eye-tracking in the developmental field.

      The authors claim that their framework quantifies trial-by-trial oculomotor dynamics, while in fact the analyses use epochs (i.e. groups of multiple trials) as predictors. Why not use trial number as a predictor to truly investigate trial-by-trial dynamics that directly reflect anticipation, surprisal, and revision?

    2. Reviewer #2 (Public review):

      Summary:

      Hann and colleagues introduce a gaze-based analytical framework designed to capture, on a trial-by-trial basis, how people form and revise their predictions during implicit probabilistic sequence learning. Using an eye-tracking adaptation of an alternating sequence task, they record the first anticipatory saccade during the response-stimulus interval and classify each such saccade along two dimensions: whether it was directed toward a high- or low-probability upcoming stimulus (the learning-dependent vs. not-learning-dependent distinction), and whether the anticipated location coincided with the stimulus that actually appeared. A complementary iterative-updating metric codes whether a participant's prediction for a given three-element context is repeated or revised on successive encounters of that context.

      On the basis of these measures, the authors report that errors congruent with the inferred regularity - which they interpret as reflecting environmental noise - become progressively more frequent than errors reflecting an inaccurate internal model; that participants show a pronounced tendency to repeat their previous prediction rather than revise it; and that updates depend more on whether a prior belief is congruent with the task's statistical structure than on whether the previous prediction was confirmed. They interpret these results as evidence that statistical learning is less error-driven and more repetition-based (Hebbian in character) than is typically assumed.

      Strengths:

      The methodological ambition of the work is considerable, and the paper makes several contributions that are likely to be useful to the implicit-learning and predictive-processing communities. Using the first anticipatory saccade as a pre-response behavioral readout of prediction is conceptually well-motivated: it provides a trial-by-trial index of predictive orienting at a temporal resolution that manual reaction times cannot deliver, and it does so before the outcome of the trial is known. The explicit distinction between errors arising because the task's outcome is stochastic - that is, predictions congruent with the statistical structure but unconfirmed by the stochastic sample - and errors arising because the internal model is inaccurate is a theoretically meaningful move: predictive-coding and Bayesian accounts have long argued that these two sources of surprise should carry different weight for model revision, and the authors offer a behavioral operationalization of that distinction. The analytical pipeline is not tied to the specific paradigm used here and could be applied to other probabilistic sequence-learning tasks, which gives it broader methodological utility than a single-paradigm report. Finally, the demonstration that learners maintain their prior across successive occurrences of the same context, even when it has been disconfirmed by the most recent outcome, is a robust behavioral observation that speaks directly to an unresolved debate about whether statistical learning is dominantly error-driven.

      Weaknesses:

      The framework and the core behavioral observations are valuable, but several inferential steps - from the gaze signal to the cognitive constructs the authors invoke - are not fully supported by the present design, and these gaps affect how readers should interpret the stronger theoretical conclusions.

      The "process-pure" framing conflates sensitivity with construct purity. The authors repeatedly describe the eye-tracking measure as providing a more process-pure index of statistical learning than manual-response paradigms. Anticipatory saccades are themselves a learned motor behavior - the oculomotor system is among the most plastic motor outputs the primate brain generates, and sequence learning in the saccadic system is well-documented. The present design does not dissociate learning of the statistical structure from learning of the oculomotor sequence that expresses it, so the measure is not, on its face, free from the motor-learning confound that the authors criticize in button-press paradigms. The framing should be read as aspirational rather than as demonstrated by the present data.

      The oculomotor reaction-time data do not show the canonical signature of statistical learning. Reaction times for low-probability trials rise across epochs while those for high-probability trials remain approximately flat (Figure 5). The emerging difference between the two trial types, therefore, appears to be driven by a slowing of responses to low-probability stimuli rather than by a facilitation of responses to high-probability ones, and the authors do not rule out the alternative interpretations that this pattern reflects fatigue, a motor floor effect, or inhibition of unexpected locations. Because no fixation constraint is imposed during the response-stimulus interval, pre-stimulus gaze drift toward the anticipated location will artifactually reduce reaction time on precisely those trials the authors wish to treat as learning-driven; the fact that measured reaction times remain well above zero even on trials classified as correct anticipations is itself evidence that this contamination is present. The oculomotor reaction-time data, therefore, do not provide as clean a verification of learning as the manuscript implies.

      The correct/error labeling of anticipatory saccades incorporates information that the participant did not have. Because the first saccade occurs during the response-stimulus interval - that is, before the upcoming stimulus is revealed - the participant's internal predictive state is identical whether the trial is subsequently classified as a learning-dependent correct response or a learning-dependent error. Any difference in the epochwise frequency of these two categories must therefore be driven, at least in part, by the external stochastic structure of the task rather than by a difference in the predictive process itself. In particular, the observation that learning-dependent errors are the most frequent saccade type (Figure 7) is predicted by the prior probabilities of the outcomes alone, given a high-probability prediction, without appeal to any difference in predictive state. Readers should recognize that the theoretically meaningful contrast is between learning-dependent and not-learning-dependent anticipations (two categories), and that the four-way split risks confounding predictive state with outcome stochasticity.

      The iterative-updating metric does not distinguish prior revision from alternative processes. The binary update / no-update code, computed across non-contiguous occurrences of the same three-element context, does not discriminate between a genuine update of the internal model, simple episodic retrieval of a previously encountered triplet, and oculomotor perseveration. Without a formal generative model to anchor the interpretation, the central theoretical claim - that statistical learning is less error-driven than commonly assumed - is underdetermined by the data. The repetition pattern the authors observe is equally consistent with an error-driven model equipped with a low learning rate in a stable environment, an interpretation the authors themselves acknowledge in the Discussion. Adjudicating between these possibilities requires comparison against explicit computational models, which the present manuscript does not provide.
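
      The low-learning-rate alternative mentioned above can be illustrated with a toy simulation. This is a minimal sketch, not a model fitted to the authors' data: it assumes a single context with two possible outcomes, a delta-rule (Rescorla-Wagner style) update of the estimated outcome probability, and a learner that always predicts the outcome it currently considers more likely. The function name and all parameter values are hypothetical.

```python
import random


def repeat_rate_after_error(learning_rate, p_high=0.75, n_encounters=2000, seed=0):
    """Fraction of disconfirmed predictions that are repeated on the
    next encounter of the same context, for a delta-rule learner."""
    rng = random.Random(seed)
    value = 0.5  # current estimate of P(high-probability outcome)
    prev_prediction = None
    prev_was_error = False
    after_error = 0  # encounters that follow a disconfirmed prediction
    repeats = 0      # of those, how many repeat the previous prediction
    for _ in range(n_encounters):
        # Predict the outcome currently believed to be more likely.
        prediction = 1 if value >= 0.5 else 0
        if prev_was_error:
            after_error += 1
            if prediction == prev_prediction:
                repeats += 1
        # Stable environment: outcome probability never changes.
        outcome = 1 if rng.random() < p_high else 0
        # Error-driven update: move the estimate toward the outcome.
        value += learning_rate * (outcome - value)
        prev_prediction = prediction
        prev_was_error = prediction != outcome
    return repeats / after_error if after_error else float("nan")


slow = repeat_rate_after_error(learning_rate=0.05)
fast = repeat_rate_after_error(learning_rate=0.9)
```

      Under these assumptions, the slow learner keeps repeating its prediction after most disconfirmations, while the fast learner switches after every one, even though both are purely error-driven. Repetition after an error therefore does not by itself rule out error-driven learning, which is why comparison against explicit computational models is needed.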

      Data loss and the absence of fixation control. An interpretable saccade is detected on fewer than half of all trials (48.76%; line 889), and the manuscript does not report the distribution of saccade counts per interval, the per-condition trial counts after all exclusions, or the decomposition of the 20% missing-data threshold into its underlying causes. Given that the entire inferential apparatus rests on this subset of trials, the degree of data loss is a relevant context for the reader. Separately, no fixation constraint is imposed between trials: the participant's starting gaze position at the onset of each response-stimulus interval is whatever position was reached at the end of the preceding response, and this starting position carries trial-history information correlated with the upcoming stimulus. This leaves open the possibility that what is classified as predictive orienting partly reflects the mechanical consequences of where the eye happened to be at the end of the previous trial. The authors defend the absence of a fixation cross on the grounds that it would transform the transitional structure of the task, but this is an empirical claim presented without a supporting citation.

      Heterogeneity within the high-probability condition is not addressed. The two routes to a high-probability triplet in the design - pattern-random-pattern (50% of trials) and random-pattern-random (12.5%) - differ both in their base rate and in the reliability of the contextual cue they provide. Collapsing across these subtypes is an analytical choice that may conceal heterogeneity in the underlying learning process.

      Appraisal: Do the results support the authors' conclusions?

      The framework succeeds in providing a trial-by-trial behavioral readout of predictive orienting that is more fine-grained than conventional reaction-time measures, and the behavioral dissociation between errors congruent with the regularity and errors reflecting an inaccurate internal model is a genuine empirical contribution. The conclusions about the mechanistic nature of statistical learning should be read as motivating hypotheses for future modeling work rather than as settled empirical claims.

      Impact and utility:

      The analytical framework introduced here is likely to be useful to researchers working on implicit learning, predictive processing, and Bayesian models of perception and cognition. The measure of predictive orienting and the iterative-updating code could be adapted to a range of probabilistic learning paradigms, and the behavioral dissociation between noise-driven and model-mismatch errors fills a methodological gap that the field has long acknowledged. The authors share their data and code openly, which will facilitate reuse. The most durable contribution of the paper is methodological; the theoretical claims about the nature of statistical learning will require additional computational modeling before they can be regarded as established.

    1. Reviewer #1 (Public review):

      This paper reports an auditory-directed analysis of the HCP 7T short movie dataset. It has the goal of using the film audio to create tonotopic (pRF) maps and combine these with other HCP-provided data (e.g., T1/T2 ratio) to improve understanding of auditory cortex organization and relative functional segregation, particularly in reference to speech processing.

      The paper is ambitious, uses well-founded existing tools for combining data across subjects, and, in the Discussion in particular, makes a lot of careful points about interpretation. The paper shows that, at least for a very large dataset at 7T (and for at least a few individual participants), good quality cross-subject-average tonotopic maps can be extracted from fMRI movie datasets via basic spectral modelling of the movie soundtracks. It also suggests ways that these movie-based maps can be combined to come up with potential models of cortical organization. The PCA analysis is a creative way of combining maps (see below for comments).

      These are valuable tools for the field in exploiting/exploring existing data, and I look forward to trying them out myself. I want to emphasize that this is not 'damning with faint praise' - a concrete demonstration of this approach with freely available tools/examples is not only the product of a lot of effort (thank you!), but will be an impetus to research going forward.

      In terms of the contribution to our understanding of auditory cortex organization, using this large N cohort, they replicate a number of findings in the literature from the last couple of decades, including the overlap of low frequency preference with greater speech stimulus preference (e.g. Moerel, de Martino, & Formisano, 2012, J Neuro), patterns of BF width across cortex (Moerel et al., various; Thomas et al. 2015), use of shorter and longer natural sounds (Moerel et al., 2012, 2014; Dick et al., 2012), the importance/influence of sustained spectral attention for tonotopic mapping (da Costa et al., 2013; Dick et al., 2017; Riecke et al. 2017), the use of tonotopy and 'myelin' mapping to establish areal or regional boundaries (Dick et al., 2012; de Martino et al., 2015; Besle et al., 2018, etc) and the overall shape and consistency of tonotopic maps (e.g., Talavage et al., 2004, Humphries et al., 2010 and many others). To my knowledge/memory, this is the first tonotopy paper that has used the cross-subject cortical-surface-based averaging techniques that are driven by more than curvature/sulcal alignment.

      The paper focuses in particular on creating new sets of ROIs based on the various maps derived from the data. Despite being quite familiar with this body of work, I found it difficult to follow how the ROIs were derived, and how and why they were different and/or an improvement over existing parcellation schemes (see for instance Sereno, Sood, & Huang, 2022 for a comprehensive parcellation framework across modalities including auditory, based on combined receptive surface mapping, myelin estimates, and other metrics).

      Given the hour of fast(ish) fMRI data on a 7T with pretty big voxels (so high SNR), one aspect of the results that I found surprising - and potentially informative - was the lack of reliable tonotopic 'mappability' in the majority of participants. The authors' analytic approach to computing the pRFs seems completely reasonable (and shows good average maps), and yet individual maps seem unreliable except for the very best examples. I wondered if this might be due to problems in data collection with earbuds becoming slightly uncoupled and therefore delivering a lot less lower-frequency response and also not preventing scanner noise from getting to the ear; this is often a problem with any in-scanner earbud system (including the Sensimetrics). I wondered if the robustness of the 'speech maps' was associated with that of tonotopy; if they are highly associated, that would suggest that either there were huge individual differences in auditory attention, or perhaps that there was some variability in the acoustic signal delivered to each participant.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors leverage a high-powered 7T fMRI dataset of subjects viewing naturalistic audiovisual movies to elucidate the topographic organization of the human auditory cortex. By applying a nonlinear pRF model, they successfully map tonotopic gradients extending beyond the auditory core into the STG and STS areas. A primary finding is a medial-to-lateral gradient of increasing response compressivity, which the authors claim mirrors the hierarchical cascade architecture of the visual system. Furthermore, the modeling reveals that regions exhibiting high speech selectivity predominantly occupy the low-frequency portions of non-primary tonotopic maps. The authors argue that this architecture reflects an efficient coding mechanism where the cortex magnifies specific spectral features to facilitate the transition from acoustic encoding to flexible speech representation.

      Overall, the study presents concise analyses and compelling high-resolution results that advance our understanding of auditory cortical organization. However, the manuscript currently exhibits several significant theoretical and methodological gaps that temper its broader claims. Most notably, the authors' reliance on a spatial, retinotopic-like analogy overlooks the fundamentally temporal nature of audition. Decoding continuous, natural speech relies heavily on dynamic, full-spectrum temporal integration and contextual recurrent computations, which are difficult to reconcile with the purely static, low-frequency spatial tuning observed here.

      Strengths:

      (1) The utilization of ultra-high-field 7T functional imaging combined with large-scale, naturalistic continuous stimuli provides an excellent signal-to-noise ratio and captures cortical responses under ecologically valid conditions.

      (2) The application of a non-linear pRF encoding model provides a robust, quantitative method for parameterizing and mapping tonotopic features across the cortex, moving beyond simple contrast-based parcellations.

      (3) The manuscript effectively demonstrates the relationship between category selectivity (e.g., speech) and underlying tonotopy, drawing an elegant and structurally useful analogy to the well-established relationship between category selectivity and retinotopy in the visual cortex.

      Weaknesses:

      (1) While the PCA mapping of the functional and structural parameter space is visually compelling, the robustness of this representational geometry across varying acoustic contexts remains ambiguous. Because the model relies on the specific statistical regularities of a single naturalistic audiovisual stimulus set, it is unclear if this low-dimensional structure would hold when tested against isolated speech sounds, environmental noise, or spectrally matched non-speech control stimuli.

      (2) The methodological descriptions currently lack the computational precision required for replication and deep evaluation. I would suggest that the exact mathematical formulation of the encoding model be fully specified in the Methods section. This should include an explicit definition of the objective function, a clear accounting of all terms and hyperparameters utilized during the fitting process, and the exact dimensionalities of both the input feature space and the resulting parameter space.

      (3) There is a critical theoretical disconnect between the observed static, low-frequency tuning in the STG and the known acoustic requirements for continuous speech perception. Speech is a full-spectrum signal; while fundamental frequencies and formants dominate the lower spectrum (which is vital for processing dynamic pitch contours), high-frequency bands (>1 kHz) carry indispensable phonetic information, such as the rapid spectrotemporal dynamics of consonants, especially fricatives. If the speech-responsive cortex is primarily and statically tuned to a low-frequency spectrum, it is unclear how the dynamic, high-frequency spectral information required for semantic decoding is represented. A rich body of electrophysiological literature documents diverse spectrogram coding in the STG. For example, Mesgarani et al. (Science, 2014) demonstrated using spectrotemporal receptive field models that neural populations in the STG are tuned to both low and high-frequency spectrograms well above 1 kHz. The authors must address this discrepancy and attempt to reconcile their static tonotopic findings with the existing literature on dynamic speech encoding.

      (4) While drawing parallels between visual and auditory processing hierarchies is conceptually attractive, the modalities face fundamentally different computational challenges. Vision is largely resolved in space, making a retinotopic spatial coding strategy ecologically and computationally sound. Audition, however, evolves continuously in time. Complex temporal structure, continuous temporal integration, and contextual recurrent computations are paramount for auditory processing, particularly for speech comprehension. In this sense, a purely spatial or tonotopic coding framework is insufficient to fully explain the complex temporal processing dynamics required in the higher-order auditory domain.

    3. Reviewer #3 (Public review):

      Summary:

      The work has the potential to clarify the topographical organization of the auditory cortex, which remains controversial in studies that use non-naturalistic sound stimulation. It does so by applying population receptive field mapping, an elegant approach developed in the visual domain to study the organization of the visual system, under naturalistic stimulation conditions.

      Strengths:

      This work presents a topographic analysis of auditory cortical organization, using a substantial Human Connectome Project 7-Tesla functional imaging dataset in which 174 participants viewed naturalistic movies.

      Weaknesses:

      The key issue for the paper is that even the authors seem undecided on what the topographical results are and whether these results, obtained from this massive dataset under movie-watching conditions, are consistent with, refute, or expand our notion of human auditory cortical field organization. Without this clarity, and with much of the discussion of the issues surrounding topographic mapping buried in the Supplementary materials, it is not clear what the authors think the advance of the current work is beyond the large dataset.

      On the flip side, there is little consideration of the challenges of mapping the auditory cortex with naturalistic stimuli, which prevent dissociating visual from auditory stimulation and may contribute to the lack of clarity in the tonotopic mapping.

      As such, the current manuscript struggles to achieve its full potential.

    1. Reviewer #1 (Public review):

      Summary:

      This study by Tsuji et al. explores a mechanical threat model in Drosophila using air puffs as a stimulus. The authors first establish the paradigm and show that air puffs induce cardiac deceleration along with increased locomotion. They then identify dopamine as a key regulator of this response and go on to map the underlying circuit. In doing so, they pinpoint two pairs of DA-WED neurons as critical players. They carefully used intersectional strategies to achieve relatively clean labeling of these neurons, which helps ensure that the observed effects can be attributed specifically to DA-WED neurons. They further show that DA-WED neurons are both required and sufficient to drive cardiac deceleration, and that their activity increases in response to air puff stimulation. These neurons also contribute to the locomotor response. Directly inducing cardiac deceleration via optogenetic manipulation of cardiomyocytes also increases locomotion, suggesting a link between cardiac state and behavioral output.

      Strengths:

      Overall, the experiments are thoughtfully designed, well-controlled, and clearly presented. The figures are easy to follow, and the conclusions are generally well supported by the data. The manuscript is also clearly written, with a discussion that acknowledges potential caveats and outlines future directions. The genetic tools, behavioral paradigm, heart rate measurement approaches, and stimulation methods introduced here will be valuable resources for the community.

      Weaknesses:

      A few minor points to add to the clarity of the manuscript:

      (1) The DA-WED driver (R48A08-AD ∩ VT008692-DBD ∩ TH-FLP) appears quite clean in the brain. However, since the study focuses on cardiac function and locomotion, it would be helpful to check expression in cardiomyocytes and the ventral nerve cord. This would help rule out any off-target expression that might contribute to the phenotypes and further support the idea of a descending pathway from brain dopaminergic neurons.

      (2) DA-WED>Kir2.1 abolishes the puff-induced locomotor response (Figure 5b), suggesting that DA-WED neurons are directly involved in mediating locomotion. In the model (Figure 5L), it might make more sense for the pathway from mechanical threat to locomotion to pass through DA-WED neurons. The authors could consider adjusting the schematic if they agree.

      (3) In line 408, Figure 5K should be 5L as it's a discussion of the model.

      (4) In Figure 5j, the x-axis is missing time labels. Even if it matches Figure 5h, adding labels would make it easier to interpret at a glance.

      (5) In line 312, it would be helpful to briefly explain why a 28 ms light pulse was used, compared to other pulse durations elsewhere in the paper.

      (6) The cardiac deceleration seems to recover quickly after the air puff ends, whereas the locomotor response persists longer (around 10-15 seconds; see Figure 1 and Figure 5). This difference might suggest that DA-WED neurons influence locomotion through an additional or partially independent pathway, beyond their role in cardiac regulation. It could be worth briefly discussing this possibility.

    2. Reviewer #2 (Public review):

      Summary:

      The authors study cardiac deceleration during threat responses in Drosophila, focusing in particular on identifying the neuronal control of this deceleration. Using behavioral and cardiac tracking and analysis, genetics, and calcium imaging, they identify two pairs of dopaminergic neurons involved in cardiac deceleration during air puff responses.

      Strengths:

      The study is overall well done, and the paper is clearly written. In particular, the work on identifying the two pairs of dopaminergic neurons involved in cardiac deceleration, using a series of drivers and generating new ones, is rigorous and extensive. Finally, the authors manipulate the heartbeat to investigate how it influences threat responses.

      Weaknesses:

      There are, however, several points that need to be clarified, as some claims are not entirely supported by evidence.

      The authors, for example, claim that dopaminergic neurons are responsible for cardiac deceleration (during the air puff, lines 182-3, page 9). However, based on the work in this study, it seems that other neurons could be involved in this control as well. In addition to dopaminergic neurons, the authors test serotonergic and octopaminergic neurons, which, based on silencing experiments, are also implicated in heartbeat deceleration. Furthermore, because they find that dopaminergic neurons are the only ones that, upon thermogenetic activation, lead to a lower heartbeat frequency, they conclude that the dopaminergic neurons are responsible for air-puff-induced cardiac deceleration.

      However, these activation experiments are done in a different context than the air puff experiments (at a higher temperature, which could have an effect on the heartbeat changes upon activation of different neuron groups), and because silencing of other monoaminergic neuron types during the air puff also resulted in less cardiac deceleration, one cannot exclude the implication of octopaminergic or serotonergic neurons in air-puff-induced deceleration.

      Activation experiments without high temperatures (using, for example, optogenetics) and/or in the presence of the air puff would be important to determine that the dopaminergic neurons are the main type of monoaminergic neurons involved in air-puff-induced cardiac deceleration. Otherwise, the related claims should be rephrased in a way that clearly doesn't exclude a possible implication of other monoaminergic neurons.

      Regarding the interactions between the cardiac deceleration and locomotion, the authors propose, based on the results, that the optogenetic cardiac deceleration is sufficient to induce an increase in locomotion, and that it is the decrease in heartbeat that would be responsible via interoceptive pathways to trigger an increase in locomotion. In the model they propose, the DA-WED neurons would induce a decrease in heartbeat that, in turn, would trigger an increase in locomotion. There is not enough proof that cardiac deceleration is the one that triggers an increase in locomotion during air puff responses. As the authors themselves state, the experiments that would demonstrate this would involve preventing cardiac deceleration while optogenetically activating DA-WED. It can therefore not be excluded that the DA-WED neurons trigger an increase in locomotion that is possibly modulated by the cardiac activity. Both alternatives should be considered (models in Figures 4 and 5).

    3. Reviewer #3 (Public review):

      Summary:

      In this elegant study, Tsuji et al. identify a relationship in Drosophila between cardiodynamics and threatening stimuli where mild air puffs elicit a brief bradycardia that coincides with locomotion increases. They then take advantage of the arsenal of genetic tools available in the fruit fly to reveal the indispensability of dopamine, through the action of Dop1R2, in this phenomenon. Further, they pinpoint the source of this dopamine to two specific pairs of neurons - DA-WED that are threat-activated. They then test and find a potential role for cardiac interoception from the heart in linking behavior and cardiodynamics.

      Strengths:

      This is an interesting and timely story that brings together the tools of fruit fly systems neuroscience and links it with physiology. The experiments are well done and tell a very nice story. In particular, the primary message of the paper - that the authors have identified specific dopaminergic neurons that regulate cardiac activity - is sound.

      Weaknesses:

      There are no important problems with the scientific approach. Rather, there are some interpretive changes I would consider.

      (1) The changes in heart rate are small (10% or so), and, as far as I can tell, are evident for a beat or two. So the data may be better interpreted not as a change in rate but as a lengthening of diastole for a beat or two. That may seem a petty difference, but it might point to particular stretch-activated systems or changes in blood flow as the determinant.

      Heart rate must be averaged over time, and so might be blurring the effects. It may be useful to produce figures centered on beat count and duration rather than time. Because the effect may even be just on a single beat, we suggest the authors try plotting the average beat duration for each beat that follows the air puff. If it's really just the first beat, using a quantification of the change of this duration relative to the average that precedes the puff may produce more striking figures.

      (2) The authors' model that cardiac deceleration leads to walking is only partially supported by their data. In the first figure, the relationship between cardiac deceleration and walking probability seems to be inverted relative to their model (weak stimulus -> strong cardiac effect and weak locomotor effect; strong stimulus -> weak cardiac effect and strong locomotor effect). It is possible that this discrepancy may disappear when the authors look at beat duration rather than heart rate (for instance, if following the strong stimulus, there is a very long beat that is followed by tachycardia, thus weakening their observed HR change). It would also be easier to relate the data in Figure 1 to their interoceptive model if some data were shown that illustrated the relative timing of the cardiac change and the locomotor start.

      (3) Also, since the locomotor and cardiac changes are probabilistic, it would be very useful to see how their respective probabilities change when conditioned on the other. According to their interoceptive model, locomotion should preferentially increase on trials where cardiac deceleration occurs. The authors should discuss this incongruity and also potential alternative interpretations of their cardiac manipulation experiments. Perhaps the bradycardia makes them more sensitive to threats - as suggested in the introduction? Control flies show a mild increase in locomotion following green light (Figure 5j), so perhaps by slowing the heart, they are more sensitive and thus respond more strongly to this stimulus?

      (4) Looking at the example shapes of the beats in Figure 5g versus Figure 1c, the optogenetically induced diastole has a very different shape from the naturally occurring long beat. Thus, the exact cardiac stimulus may be unnatural. If this is true across trials and animals, it may be worth considering that the funny beat (like an anxiogenic atrial fibrillation in mammals) is the source of the fear and, in turn, locomotor behavior (also interesting!) rather than being a true replication of the cardiac events seen following the puff stimulus.
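
      The beat-centred analysis suggested in points (1) to (3) can be sketched as follows. This is only an illustrative sketch under stated assumptions: it assumes that beat onset times (in seconds) and per-trial yes/no labels for cardiac deceleration and walking are available, and the function names, the five-beat baseline window, and the three-beat post-puff window are hypothetical choices, not the authors' pipeline.

```python
from statistics import mean


def beat_durations(beat_times):
    """Inter-beat intervals (s) from a sorted list of beat onset times (s)."""
    return [b - a for a, b in zip(beat_times, beat_times[1:])]


def post_puff_relative_durations(beat_times, puff_time, n_baseline=5, n_post=3):
    """Duration of each of the first n_post beats that end after puff onset,
    expressed relative to the mean duration of the last n_baseline beats
    that were completed before the puff (1.0 means no change)."""
    durations = beat_durations(beat_times)
    # Beat i spans beat_times[i] .. beat_times[i + 1].
    pre = [d for i, d in enumerate(durations) if beat_times[i + 1] <= puff_time]
    post = [d for i, d in enumerate(durations) if beat_times[i + 1] > puff_time]
    baseline = mean(pre[-n_baseline:])
    return [d / baseline for d in post[:n_post]]


def conditional_locomotion_probability(trials):
    """trials: list of (decelerated, walked) booleans, one pair per trial.
    Returns (P(walk | deceleration), P(walk | no deceleration))."""
    with_dec = [walked for dec, walked in trials if dec]
    without_dec = [walked for dec, walked in trials if not dec]
    p_with = sum(with_dec) / len(with_dec) if with_dec else float("nan")
    p_without = sum(without_dec) / len(without_dec) if without_dec else float("nan")
    return p_with, p_without
```

      If the interoceptive model holds, the first value returned by conditional_locomotion_probability would be expected to exceed the second, and a lengthening confined to one or two beats would show up as isolated values above 1.0 in the relative durations.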

    1. Reviewer #1 (Public review):

      Summary:

      The current study is a follow-up to a previously published study by the same research group (Nold et al. 2025). In the previous study, the authors had included a set of exploratory analyses which assessed the effects of fitness level (denominated by a relative FTP), sex, and drug treatment (naloxone versus placebo). In this previous study, the authors state that "exploratory analysis showed a significant main effect of fitness level on differences in pain ratings in the [saline] condition... suggesting increased hypoalgesia with increasing fitness levels, pooled across all stimulus intensities".

      In the current study, the authors have recruited an additional 22 female participants (21 included in analysis) from local cycling clubs to assess if fitness level does indeed impact exercise-induced hypoalgesia responses to experimental thermal and pressure pain models.

      Strengths:

      The current study has the potential to present a convincing argument about the effect of fitness level and potentially other factors (e.g., sex) on exercise-induced hypoalgesia responses. Combining data across two of their primary studies would be highly fruitful to the research community interested in this area. Specifically, it has the potential to inform sports medicine practitioners and how they administer exercise protocols to help those experiencing pain with a further consideration for the fitness level (and maybe sex) of their patients.

      Weaknesses:

      However, the current study makes several bold claims about the role of fitness level and sex on exercise-induced hypoalgesia, which I do not believe that this study on its own - or in conjunction with the previously published study by the same authors - can make at present. Namely, the current study does not appear to conduct any specific analyses between the cohorts from either study (current and previous). The results mention a difference in the group mean values in "fitness level" between cohorts, but the analysis itself on pain responses/exercise-induced hypoalgesia is limited only to the cohort from the current study. If the authors wanted to provide a convincing argument that fitness level has an effect on exercise-induced hypoalgesia, then the analysis of this study would have to include an analysis between the groups considered to be of "high" and "low" fitness level. I do not think the current study does this. Instead, it makes an assumption from the previous study (Nold et al. 2025) which only states that "exploratory analysis showed a significant main effect of fitness level on differences in pain ratings in the [saline] condition... suggesting increased hypoalgesia with increasing fitness levels, pooled across all stimulus intensities". The analysis of this study would have to include the fitness level ("high fitness" versus "low fitness") of participants across both studies in its statistical model to properly discern if fitness level has an impact on exercise-induced hypoalgesia.

      A similar comment can be made with respect to sex differences, as these have not been assessed in the analysis of this study either.

      Another area of weakness in this study is how "fitness level" has been demarcated across participants. One issue is how authors have assumed that the current cohort is 'fit', whereas the previous cohort was 'less fit', meaning that the authors could be coming to false conclusions about fitness level. In detail, figures within the current study show a large overlap between the 'fit' and 'less fit' cohorts, where some participants have a higher relative functional threshold power (FTP) in the 'less fit' cohort than the 'fit' cohort and vice versa. Therefore, I believe the authors should better demarcate between those that are in the 'more fit' and 'less fit' groups according to a validated and well-established criterion from the kinesiology and sport science literature. That being said, I think this may be problematic in some ways as FTP is considered a relatively poor measure to denote fitness levels, a limitation highlighted in the previous study's review.

      Altogether, whilst I commend the researchers on their body of work across the two studies, the current methods and analysis provide an incomplete assessment of their primary research question, and therefore, I would urge the authors to reconsider some of their methods/analysis and the framing of their results to better reflect the main research question they have attempted to answer. Likewise, I would recommend that readers ensure they consider the current results with caution until the authors have addressed some areas of concern which currently limit their main conclusions.

    2. Reviewer #2 (Public review):

      This study addresses an important question regarding exercise-induced modulation of pain in women, but the conclusions appear to be based on relatively limited and selective evidence. The authors report an interaction between exercise intensity and stimulus intensity, which they interpret as evidence for exercise-induced hypoalgesia and conclude that fitness, but not sex, modulates this effect. However, this main result relies on a relatively small interaction that emerges only under specific conditions, with inconsistent findings across pain modalities and stimulus intensities, and an analysis approach that does not fully exploit the continuous pain ratings collected. The lack of a baseline condition further limits the interpretability of the findings as reflecting hypoalgesia, and overall, the data provide a rather constrained basis for drawing broader conclusions.

      Strengths:

      (1) The focus on women is important and timely, particularly given the ambiguity in prior findings and the historical bias toward male-dominated samples.

      (2) The attempt to revisit previous findings in a new cohort is valuable in principle.

      Weaknesses:

      (1) The core interpretation may not be fully supported by the data

      The central claim, that the results demonstrate exercise-induced hypoalgesia and its dependence on fitness but not sex, does not appear to be fully supported by the evidence presented.

      1.1 Lack of baseline condition

      The absence of a no-exercise baseline substantially limits interpretation. The study compares high- and low-intensity exercise, but without a baseline, it is not possible to determine whether either condition produces hypoalgesia or hyperalgesia relative to calibration. The observed HI-LI difference, therefore, reflects only a relative contrast between exercise intensities, not an absolute reduction in pain. As a result, attributing the findings to "hypoalgesia" may be difficult to justify fully.

      1.2 Lack of internal replication across conditions

      The reported effect is highly specific and does not clearly generalise across the experimental design. It emerges significantly only for heat pain at the highest stimulus intensity, with no clear effects for other intensities or for pressure pain. Moreover, the main statistical result is a relatively small interaction effect with a modest p value, which translates into a difference of approximately 6-8 VAS units on a 0-150 scale. This combination of a small effect size, limited statistical strength, and restriction to a single condition substantially weakens the evidence for a robust or generalisable effect.

      1.3 Deviations from the original study and selective use of data

      Although framed as a follow-up to previous work, the current study introduces substantial methodological changes, particularly in the acquisition and scaling of pain ratings (continuous vs post-hoc ratings, modified VAS with sub-threshold range). Despite collecting rich continuous data, the analysis focuses on peak responses to approximate the previous study. While this may aid comparability, it results in a strong emphasis on a single data point (highest intensity), rather than leveraging the full dataset. This limits both interpretability and comparability.

      1.4 Over-reliance on null results regarding sex differences

      The conclusion that fitness, but not sex, modulates exercise-induced pain may not be directly supported by the data presented. The current study includes only highly fit women, and comparisons with men or less-fit women rely on non-significant differences in a previous cohort. The absence of a significant difference does not provide evidence for equivalence, and no formal statistical support for a null effect is provided. As such, conclusions about the absence of sex differences would benefit from more cautious interpretation.

      (2) Limited sample and lack of diversity

      The dataset is narrow in scope, comprising a small sample (N = 21) of healthy, highly fit women. Key demographic characteristics (e.g. age range, BMI distribution) are not fully presented, explored or discussed. This limits generalisability and makes it difficult to draw broader conclusions about exercise-induced pain modulation in women, which is the main focus of the study.

      (3) Methodological choices limit the interpretability of the data

      Several methodological decisions would benefit from stronger justification:

      3.1 The use of a non-standard VAS scale (0-150 with a fixed pain threshold at 50) is unconventional and may influence how participants report pain, while limiting comparability with related literature.

      3.2 Participants explicitly reported expecting exercise to reduce pain, introducing a potential confound that is not presently addressed.

      3.3 A more comprehensive use of the full time series of pain ratings would provide a stronger and more transparent basis for interpretation of the present findings.
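
      To illustrate the kind of formal support for a null effect that point 1.4 refers to, the following is a minimal sketch of an equivalence test using two one-sided tests (TOST). It is a simplification under stated assumptions: it uses a large-sample normal approximation rather than a t distribution, it treats the two groups as independent, and the function name and the equivalence margin are hypothetical. A suitable margin (in VAS units) would have to be justified by the authors.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_normal_approx(x, y, margin):
    """Two one-sided tests for equivalence of two independent group means.

    Returns the larger of the two one-sided p-values. Equivalence within
    +/- margin is supported when this value is below the chosen alpha.
    Uses a normal approximation, which is crude for small samples.
    """
    diff = mean(x) - mean(y)
    # Standard error of the difference with unpooled sample variances.
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = NormalDist()
    p_lower = 1.0 - z.cdf((diff + margin) / se)  # H0: diff <= -margin
    p_upper = z.cdf((diff - margin) / se)        # H0: diff >= +margin
    return max(p_lower, p_upper)
```

      A non-significant difference test combined with a non-significant equivalence test would mean the data are inconclusive, which is distinct from showing that sex has no effect.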

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript investigates how cellular NAD/NADH ratios are controlled in cancer cell lines in vitro. The authors build on previous work, which shows that serine synthesis is sensitive to NAD/NADH ratios and PHGDH expression. Here, the authors demonstrate that serine synthesis is variable across a panel of cell lines, even when controlling for expression of serine synthesis enzymes such as PHGDH. The authors show that cellular NAD/NADH ratios correlate with the ability to synthesize serine and grow in serine-deprived environments when PHGDH levels remain constant. Investigating this variability in NAD/NADH ratios, the authors find that the cells that can positively respond to serine deprivation are able to increase oxygen consumption and cellular NAD/NADH ratios. Cells that do not increase oxygen consumption in response to serine deprivation do not increase NAD/NADH ratios and cannot grow well without serine. The authors go on to show that in cells with the ability to increase oxygen consumption upon serine deprivation, PHGDH expression alone is sufficient to fully restore growth in the absence of serine; in cells that cannot increase oxygen consumption, both PHGDH expression and interventions to increase NAD/NADH ratios are required to increase growth. Thus, cells need both PHGDH and NAD/NADH increases to maximize serine synthesis in response to serine deprivation. The authors previously showed that lipid synthesis likewise requires NAD regeneration. Interestingly, one cell line that does not increase oxygen consumption in response to serine limitation tends to increase oxygen consumption in response to lipid deprivation; accordingly, depriving this cell line of lipids increases the synthesis of serine.
      Together, these findings show that how cells respond to nutrient deprivation is highly variable and that the response to nutrient deprivation (for example, whether or not oxygen consumption is increased) will determine how well cells tolerate depletion of nutrients with related biosynthetic constraints. This work sheds light on the complexity of cancer cell metabolism and helps to explain why it is difficult to predict which nutrients will be limiting to any cancer cell type or environment.

      Strengths:

      (1) The authors use multiple interventions to manipulate NAD/NADH ratios in cells.

      (2) Experiments are well controlled and appropriately interpreted.

      Comments on revised version:

      The authors thoughtfully and thoroughly responded to all reviewer comments. The revised manuscript addresses the critiques.

    2. Reviewer #2 (Public review):

      In the manuscript "Cancer cells differentially modulate mitochondrial respiration to alter redox state and enable biomass synthesis in nutrient-limited environments", Chang et al investigate how cancer cells respond to the limitation of certain environmental nutrients by regulating the cellular NAD+/NADH ratio. They focus on serine and lipid metabolism, pathways known to be controlled by the NAD+/NADH ratio, and propose that changes in mitochondrial respiration in response to deprivation of these nutrients can influence the NAD+/NADH ratio, thereby impacting biomass synthesis.

      While the study is descriptive in nature and does not investigate specific molecular mechanisms that explain the crosstalk between nutrient availability and mitochondrial redox changes, the experimental component is robust, and the conclusions are well supported by the results. Some suggestions could further refine the conclusions and enhance the quality of the manuscript.

      Comments on revised version:

      The authors have provided a very comprehensive response. Their updated paper has improved, and the critiques have been mitigated.

    1. Reviewer #1 (Public review):

      The authors conducted a comprehensive benchmarking and evaluation of co-folding platforms, including AlphaFold3, Boltz-2, Chai-1, and the docking algorithm Dock3.7, which employs a physics-based scoring function that incorporates van der Waals interactions, electrostatics, and ligand desolvation energies. The system of interest was the SARS-CoV-2 NSP3 macrodomain (Mac1), an increasingly popular antiviral target, and the ligand sets comprised 557 unseen ligand poses (keeping the training for these co-folding platforms in mind). Additionally, the authors investigated whether the co-folding models could distinguish true ligands from non-binding small molecules. The study is thorough, with extensive statistical support and consensus across multiple metrics (chemoinformatics for quantifying ligand similarity and efficacy). The questions that the authors aim to address are whether the co-folding models struggle with memorization, whether they can distinguish between a true and a false binder, whether they replicate experimental binding affinities and efficacy, and how they compare to the physics-based docking algorithm (Dock3.7).

      Strengths:

      Overall, this is a scientifically solid paper.

      The work is highly detailed and well executed, featuring thorough data analysis and statistical assessment.

      Comments on revised version:

      The authors have adequately addressed my concerns.

    2. Reviewer #3 (Public review):

      Summary:

      Core conclusions are well-supported by data: co-folding outperforms docking in known ligand pose/affinity prediction (validated by RMSD and IC₅₀ correlation), struggles with false positive discrimination in virtual screens (lower AUC values), and is complementary to docking (non-correlated errors, distinct strengths in drug discovery stages).

      Strengths:

      The unprecedented prospective design with 557 novel Mac1-ligand complexes ensures rigorous, independent evaluation of co-folding methods and provides an unbiased benchmark dataset containing structures and compounds absent from the co-folding models' training sets. Comprehensive comparison of 3 co-folding tools (AlphaFold3, Chai-1, Boltz-2) with DOCK3.7 across diverse targets and metrics enables nuanced performance assessment. The revised results clarify an intriguing finding: co-folding can predict correct ligand poses even when protein conformations are mispredicted. The study clearly demonstrates complementary roles of co-folding (superior pose/affinity prediction for known ligands) and docking (better hit prioritization), and addresses deep learning memorization concerns via ligand similarity analysis.

      Weaknesses:

      The study identifies a major limitation of co-folding: its failure to capture rare protein conformational changes, which deserves future investigation. The authors include uncalibrated Boltz-2 affinity data (addressing a prior comment) but note that large-scale free energy perturbation (FEP) comparisons are beyond their capabilities.

      Appraisal of Aims Achieved:

      The authors successfully achieved their primary aims, and the results provide strong, well-supported evidence for their core conclusions. Key conclusions are grounded in the study's unbiased, training-set-independent data, which ensures that the conclusions are not confounded by model memorization and are broadly applicable to the field's use of these co-folding models.

      Field Impact:

      This study provides a critical reality check for the field: co-folding models are powerful tools for pose prediction but are not yet standalone solutions for virtual screening, a key distinction that will prevent over-reliance on these models and guide more rational tool selection.

    1. Reviewer #1 (Public review):

      Summary:

      Eroglu and Hobert demonstrate that injecting CRISPR guides and repair constructs to target three genes at a time, tagging each with a different fluorescent protein, and selecting which gene to tag with which fluorophore based on genes' expression levels, can improve efficiency of gene tagging.

      Strengths:

      This manuscript demonstrates that three genes can be targeted efficiently with three different fluorophores. It also presents some practical considerations, like using the fluorophore least complicated by agar/worm autofluorescence for genes with low expression levels, and cost calculations if the same methods were used on all genes.

      Weaknesses:

      Eroglu has demonstrated in a previous publication that single-stranded DNA injection can increase efficiency of CRISPR in C. elegans, while inserting two fluorescent proteins and a co-CRISPR marker into three loci, and Paix et al 2015 demonstrated simultaneous insertion of two fluorescent tags. The current work is a valuable, incremental advance. In general, I applaud the authors' willingness to strategize about how whole proteome tagging might be accomplished. I predict that the advance here will be one of many small advances that will get the field to that goal. The title oversells the advance presented, in my view, since it seems like one among many key advances, and the first sentence of the Discussion seems a more apt summary of the key advance here.

      Some injections targeted genes on the same chromosome together, which will create unnecessary issues for the crosses that some future experiments will require. This made me wonder if injecting 3 together really is helpful vs targeting each gene separately, since only 5 worms need to be injected. It cuts time down by 2/3, but perhaps avoiding targeting the same chromosome with two tags would be useful.

      The limited utility of current blue fluorescent proteins makes me wonder if they are worth using at this stage, before there are better blue fluorescent proteins, or better yet, far-red ones, to avoid issues with live imaging under phototoxic UV or near-UV illumination.

    2. Reviewer #2 (Public review):

      Original Review:

      The manuscript by Eroglu and Hobert presents a set of strains each harboring up to three fluorescently tagged endogenous proteins. While there is technically nothing wrong with the method and the images are beautiful, we struggled to appreciate the advance of this work - who is this paper for?

      As a technical method, the advance is minimal since the first author had already demonstrated that three mutations (fluorophore insertion and co-CRISPR marker) could be introduced simultaneously.

      As a pilot for creating genome-scale resources, it is not clear whether three different fluorophores in one animal, while elegantly designed and implemented, will be desired by the broader community.

      Finally, the interpretation of the patterns observed in the created lines leaves much to be desired. A Table with all the observations must be included and can replace the tedious (and often wrong) descriptions of the observations with the different lines. It would be too much to point out every mistaken expectation of protein expression. Two examples include:

      The expectation that ACDH-10 is enriched in the intestine and epidermal tissues (hypodermis) is naïve - there are multiple paralogs of this protein (look at WormPaths or WormFlux) that may share functions in different tissues. There is also no reason to assume that fatty acid metabolism does not occur in other tissues (including the germline). Finally, there are no published studies about this enzyme, so we really don't know for sure what it's doing.

      The expectation that HXK-1 is ubiquitously expressed is similarly naïve. There are three paralogous enzymes that are all associated with the same reaction, and we have shown that these three function redundantly in vivo, perhaps in different tissues (PMID: 40011787). Moreover, single cell RNA-seq data (PMID: 38816550) also shows enrichment of hxk-1 in gonadal sheath cells.

      The table should have at least the following information: gene/protein name - Wormbase ID - TPM levels of single cell data assigned to tissues for L2, L4 and adult (all published) - tissues in which expression is observed in the lines presented by the authors.

      Other points:

      (1) We would encourage the authors to provide systematic validation of the reported insertions. The manuscript reports that 24 of 30 tags were isolated and visible but does not clearly state whether each isolated line was confirmed by sequence‑level validation to be correctly in‑frame and free of unintended mutations at the target locus.

      (2) The manuscript presents aggregated success counts (e.g., 8/10 mTagBFP2 tags, 9/10 mStayGold, 7/10 mScarlet3) and useful narrative descriptions of injection outcomes. We also suggest including per‑locus success rates.

      (3) For pools that required re‑injection after initial failures, we would like to see a description of the specific changes that were made to the injection mixes or procedures (e.g., new repair template prep, different Cas9 reagent lot, guide redesign). This will be useful troubleshooting information for others.

      (4) The authors state that the fluorophore sequences are codon-optimized for C. elegans. We suggest they provide the exact donor/tag sequences used and specifically state whether the fluorophore sequences contain any synthetic/artificial introns and whether other sequence modifications (e.g., silent PAM‑disrupting mutations) were included in the donor templates.

      (5) Page 3: Include a reference for "The C. elegans genome encodes around 20,000 genes"

      We hope these comments are useful.

      Comments on Revised Version:

      Overall, we found the responses to be quite recalcitrant.

      We have one remaining composite concern about the comparison between observed expression patterns with the new strains versus published data.

      First, the authors only report patterns for one stage, although it should not be too much effort to image the different life stages. However, since this is a revision, we are not formally requesting they do this.

      Second, in the now-provided Table (thank you), 'observed expression' (last column) is lacking for 9 of the 30 proteins, and for 6 of these the procedure was not successful. Why not report patterns for the other three? It is also confusing because on page 5, the authors say that "overall, 24 of 30 tags ...all of which were visible with fluorescence stereomicroscopy" - are we missing something? Also, they then said that they "obtained 6/9 of the originally failed tags"; why are the corresponding patterns not included in Table 1, and why are 9 proteins still labeled as "no" in the "success?" column?

      Third, we strongly feel that the response to our comments about expression patterns is not adequate. On page 5 the authors say that "all proteins were expected to be ubiquitously expressed" and that "scRNA-seq indicated that transcript abundance was ubiquitous and without strong tissue-specific enrichment with few exceptions". However, in their rebuttal, the authors now argue for tissue-specific expression for proteins with paralogs, turning around their own argument! Moreover, their Table indicates that many genes show tissue-enriched expression by RNA-seq while many of their tagged proteins exhibit ubiquitous expression.

      Overall, this indicates that both the overall accomplishment of generating tagged protein strains and analyzing their expression is oversold.

    3. Reviewer #3 (Public review):

      Summary:

      The authors argue that establishing the expression pattern and sub-cellular localisation of an animal's proteome will highlight hypotheses for further study. This claim is probably accepted by many in the community. This manuscript seeks to confirm the feasibility of establishing such a resource, by using current transgenic methods to knock in DNA encoding different colored fluorescent tags into C. elegans genes.

      Strengths:

      The authors make the points above. For example, they provide evidence that the C. elegans germline harbors two populations of mitochondria that differ qualitatively in the proteins they express. They also confirm that labelling the whole proteome is an achievable goal with relatively limited resources and time.

      Weaknesses:

      The work is somewhat incremental in that it uses existing transgenic technology. Cell biology in C. elegans is challenging because of the small size of many of its cells, notably neurons. This can make establishing the sub-cellular localisation of a fluorescently tagged protein, or co-localizing it with another protein, tricky. The authors point out in their introduction that advances in light microscopy such as diSPIM, STED and ISM (a close relative of SIM), have increased the resolution of light microscopy. They also point out that recent advances in expansion microscopy can similarly help overcome the resolution limit. However, they do not use these technologies to characterize their transgenic strains.

    4. Reviewer #4 (Public review):

      Summary:

      Tagging the entire proteome of a metazoan would be a landmark achievement, providing a powerful complement and extension to existing "omic" catalogs in model systems. Here, Eroglu and Hobert argue that efficiently tagging multiple loci in a single "batch" would make the community-based achievement of this goal realistic. They provide rigorous evidence that such an approach is indeed feasible, exploring issues related to efficiency, design and screening strategies, disruption of gene function, and the potential for endogenously tagged alleles to reveal unexpected aspects of protein expression and localization. While the work has some minor gaps that are important to rigorously assess the feasibility of the proposed effort, the detailed and valuable insights that emerge should provide impetus to the community to coordinate efforts to make this ambitious goal a reality.

      Strengths:

      The work has numerous strengths. The authors provide compelling evidence that:

      - three distinct loci can be efficiently targeted with three distinct fluorescent tags in a single injection.

      - thoughtful targeting design can reduce the likelihood of disruption of function by the tag.

      - systematic design principles based on expression level and predicted localization/function can be used to optimize tagging strategies.

      - the resulting tags can provide unexpected insight into patterns of protein production and subcellular localization.

      Not all of these advances are novel in themselves, but taken together, they represent an important technical and conceptual advance. The most important strength comes from the exceptionally high value of the goal itself, in that the work has the potential to spur a community-wide effort toward achieving the ambitious goal of proteome-wide tagging.

      Weaknesses:

      The work's shortcomings are minor.

      - One concern has to do with the feasibility of the proposed screening strategies. The experimental design cleverly coinjects tags for three loci in different gene expression 'zones'; this expression level determines which tag will be used. As the authors allude to, among genes with the same overall FPKM value, there is an important distinction between those that are expressed broadly and those that are focally expressed in a specific tissue. The proposed strategy claims that there are a sufficient number of highly expressed genes "to be used as visible markers" for recovering successfully edited animals. It would be useful for the authors to discuss the issue of broad vs focused expression among this set of genes a bit more thoroughly, with an eye toward the issue of how likely it is that these genes could indeed consistently be used as visible markers, particularly for those at the low end of this limit.

      - What fraction of the proteome (on a per-gene basis) is secreted proteins? How difficult will it be to screen these for successful tags? Are there specific tags that would be more optimal for secreted proteins? (The authors mention the use of an SL2 or T2A cassette to label the cells in which these proteins are expressed but note that there are technical challenges associated with doing this at scale.)

      - For secreted and/or weakly expressed genes, it would be useful for the authors to estimate for what fraction of these would successful insertions need to be screened by PCR, and what resources (time and money) this would likely entail.

      - For how many genes would a single tag not capture all predicted isoforms?

      - Finally, some readers might object to the authors' assertion in the abstract that this work is "a first step in this direction" (presumably referring to designing a strategy for whole-proteome tagging). There is no concern that the authors are disregarding the extensive work of other groups, as they explicitly mention the contributions of other groups to the foundation that enables the present work. However, the spirit of the abstract could be misinterpreted by a well-intentioned reader.

    1. Reviewer #1 (Public review):

      [Editor's note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The authors aimed to uncover novel therapeutic vulnerabilities in APC-mutant colorectal cancer (CRC), which constitutes the majority of CRC cases. They hypothesized that modulating oxygen-sensing pathways (via PHD inhibition) could disrupt adaptive stress responses in these tumours.

      Strengths:

      The study employs a powerful, two-pronged approach to identify Molidustat's targets. By using both Thermal Proteome Profiling (TPP) and an orthogonal chemical proteomic competition assay, the authors provide compelling evidence that GSTP1 is a genuine, direct off-target, effectively addressing the common limitation of indirect effects in proteomic screens.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aimed to determine Molidustat targets and the potential utility of these findings. They clearly demonstrate that Molidustat interferes with GSTP1 and some other proteins on top of PHD2. They also demonstrate that PHD2 deletion is not sufficient to recapitulate Molidustat effects in cells and proteomes. Finally, they demonstrate synthetic lethality in organoids for Molidustat and APC deletion.

      Strengths:

      The data on Molidustat proteomes, GSTP1 binding, inhibition and metabolic health of organoids is really clear. All biochemical, docking and omic data are really strong. The potential impact of these findings could be the use of Molidustat in APC null tumours and awareness of potential off-target effects.

    3. Reviewer #3 (Public review):

      In this paper, the authors revealed that Molidustat can induce a dose-dependent increase in Caspase-3/7 activity in the HT29 cell line, which is an APC-mutant colorectal cancer cell line. More importantly, they found that targeting PHD2 alone cannot cause cell death. By using thermal proteome profiling (TPP) and orthogonal chemical proteomic competition assays, they determined GSTP1 as a previously undiscovered off-target of Molidustat. They also revealed that combined PHD2 and GSTP1 loss leads to an increase in intracellular ROS and apoptosis. Moreover, they evaluated the effects of Molidustat in colonic organoids and showed that Molidustat has a high selectivity for colonic organoids with activated WNT signaling and/or KRAS pathway alterations, and this effect is not reproduced by hydroxylase inhibition alone, providing a new potential approach to targeting both PHD2 and GSTP1 for the treatment of APC-mutant CRC.

    1. Reviewer #1 (Public review):

      Summary:

      Knowing that small pupil-size variations accompany brightness variations (even when these are illusory), the authors asked whether pupil constrictions would accompany the synesthetic perception of a brighter color (compared with a darker one), induced by the presentation of a black-white character. This grapheme-colour synesthesia is only experienced by few participants, sixteen of whom were enrolled in this study. The results reliably showed that a relative pupil constriction would "betray" the perception of a brighter color in these participants, while no such effect would be observed in control participants who were asked to report a color in association with each grapheme, even though they did not perceive any.

      Strengths:

      The main strength of the study lies in its combination of psychophysics (brightness ratings) and pupillometry, which allowed for showing clear-cut results.

      Weaknesses:

      I only see the following relatively minor weaknesses, namely:

      - The pupil traces in Figure 3 (main results) are heavily pre-processed (per-participant demeaned), losing any feature besides the effect of interest. As I argued in my first review, I worry that this format gives unrealistic expectations about the effect (the perception of dark/bright colors does not generate a net dilation/constriction of the pupil; perception-related modulations of pupil size are always relative and generally small compared to the numerous other effects registered in pupil size; these include a pupil dilation that is more prominent in the controls and that gets analyzed later on in the manuscript; I do not think that eliminating one of the effects of interest from a main results figure helps the reader understand the results). In the revised manuscript, the authors addressed this concern by adding a Supplementary Figure 4, where a more complete representation of the results is shown (traces from individual trials are baseline corrected and averaged, resulting in more informative timecourses). I would strongly recommend that Supplementary Figure 4 is brought to the main text (Figure 3 could be presented in Supplementary).

      - Responses to physical brightness modulations were only measured in the synesthetes group, not in controls. The authors point out that pupillary light responses have been thoroughly characterized in previous studies, and conclude that synesthetes' responses were in line with the expectations both in terms of amplitude and latency. However, as we are not dealing with standardized measurements, subtle differences in pupil reactivity across the two populations remain a possibility. I recommend that this possibility is mentioned in the discussion.

      Impact:

      This work is likely to improve our understanding of synesthesia, providing a new tool to quantify the subjective sensations; an interesting potential extension would be using pupillometry for tracking changes over time of the synesthetic experiences, opening up the possibility to evaluate the importance of learning for this peculiar experience.

    2. Reviewer #2 (Public review):

      Synesthesia is a neurological condition where stimulation of one sensory channel leads to involuntary, automatic, and consistent experience of another, unrelated percept. For example, Sir Francis Galton (1880, Nature) famously described the robust tendency of some individuals (synesthetes) to associate numerals with a distinct color. Ever since, synesthesia has kept attracting broad interest in the cognitive neurosciences in light of its implications for the study of domains such as perception, consciousness, and brain connectivity, among others.

      Strauch, Leenaars, and Rouw measured pupil size in a group of 16 grapheme-color synesthetes and two matched control groups. The participants were presented with gray digits - that is, visual stimuli having identical physical properties in terms of brightness. Each participant subsequently rated the corresponding evoked color and brightness: unlike controls, synesthetes did so in a very consistent and reliable fashion. Accordingly, this was also shown in their pupils: despite the same objective luminance, digits associated with brighter percepts caused their pupils to constrict and digits associated with darker percepts caused their pupils to dilate more than controls. These results highlight how crossmodal correspondences are deeply rooted in synesthetes, and put forward pupillometry as a particularly appealing biomarker for some phenomenological experiences (at least those grounded in "brightness").

      Further strengths of the technique are its temporal resolution and its responsiveness to several constructs. Across several tasks, the authors show for example that responses to synesthetic light are somewhat slower than responses to real light (i.e., they are likely mediated), but at the same time faster than responses to mental imagery. The role of mental imagery can also be reasonably dismissed when considering the second feature of pupil size: its responsiveness to mental effort and cognitive load. The pupils tend to dilate with demanding, challenging tasks, and this was the case when control participants were asked to report the color of a digit for which they did not consistently experience a synesthetic association. The same task was, instead, seemingly effortless for synesthetes, again speaking in favor of the automaticity of number-color correspondences in their case.

      Overall, the findings by Strauch, Leenaars, and Rouw are highly significant for the field and likely to be impactful. The strength of their evidence, when accounting for the relatively small sample size and the inherent variability of both phenomenology (color perception and subjective reporting) and physiology (pupil size), is adequate and sufficiently convincing.

      Comments on revisions:

      I thank the authors for addressing all my comments in a satisfactory way. I think that the paper has improved, especially in terms of transparency of the reporting and clarity of the results.

    3. Reviewer #3 (Public review):

      Summary:

      In the present study, the authors examined pupillary responses to uncolored stimuli (number graphemes) among number-color synesthetes and non-synesthetes. After seeing a digit, the synesthetes and active control participants were asked to indicate which color they perceived using three dimensions of hue, saturation, and lightness. The lightness values were the primary independent variable for follow-up analyses. To see how the pupil responded to psychologically "bright" and "dark" digits, the authors split the reported lightness values at the median and plotted them. The synesthetes showed a pupillary constriction to digits they perceived as bright and dilation to digits they perceived as dark. Active control participants did not show that effect. In a subsequent block, only the synesthetes were shown the colors they reported perceiving as colored discs. Their pupillary responses were similar. The authors also found that the differences in pupillary responses between light and dark perceptions (with digits) were only slightly delayed in their onset relative to the perception of a colored disc, and therefore the color perception accompanying a digit is unlikely to be effortful or a retrieved association, but occurs rather automatically.

      Strengths:

      The authors employed a well-controlled and well-designed quasi-experiment comparing color-grapheme synesthetes to non-synesthetes and showed convincingly that the color perceptions accompanying graphemes alter the physical perception of brightness. They also made a reasoned attempt to rule out the possibility that color associations occur effortfully via retrieved associations.

      The following are questions which I had asked in a first round of reviews, and which were answered adequately by the authors:

      (1) Are the pupillary responses among synesthetes, which objectively do not seem to match the degree of physical stimulation entering the retina, in any way maladaptive for eye functioning? I understand the constriction/dilation of the pupil to not only benefit visual acuity but also to protect the retina from damage. Are synesthetes at any risk of retinal damage due to over-dilation of the pupil to brighter stimuli? Or are these effects of a magnitude that is too small to matter? As reported in arbitrary units, it was hard to know how large these effects were in terms of measurable changes in dilation (e.g., millimeters).

      (2) Likewise, is the automatic synesthetic merging of two percepts something that could be learned such that natural synesthetes and "artificial" synesthetes would look similar? For example, if a group of non-synesthetic participants were to learn a color-grapheme association to automaticity, would you expect their pupillary responses to the graphemes to look similar to those of the synesthetes? If so (or if not), what would this tell us about the phenomenology of synesthesia?

      (3) Do the synesthetic perceptions of digit graphemes merge in a sensible way? For example, if a synesthete sees a particular color with the digit 1, and a different color with the digit 9, what do they perceive when they see 19? or 1-9, or 1 9? Is there color blending, or an altogether different color perception?

    1. Reviewer #1 (Public review):

      This work compiles a comprehensive atlas of ncORFs across mammalian tissues and cell types, derived from reanalysis of ~400 public ribosome profiling datasets. The authors then evaluate cross-species conservation and functional signatures, proposing that evolutionarily ancient ncORFs tend to have higher translation potential, stronger expression, and closer relationships with canonical coding sequences.

      Strengths:

      In general, the study provides a large-scale and timely resource of annotated ncORFs, which could be broadly useful for the community. The authors collected ~400 public ribosome profiling datasets for annotations of ncORFs, which, to my best knowledge, is the largest collection of data for such purpose. The catalog could facilitate future investigations into ncORF biology and broaden understanding of the coding potential of the "non-coding" genome.

      Weaknesses:

      Some of the analyses built on the ncORF catalog were not properly done, and some of the results are merely descriptive.

      (1) Bias and representations of data source. Public ribo-seq datasets are unevenly distributed across tissues and cell lines, raising concerns about heterogeneity and underrepresentation of certain contexts. This may limit the generalizability of the catalog.

      (2) The discussion on modular domains of ncORFs is unclear, and the claim that they may originate via TE-related mechanisms is not well supported. Stronger evidence or clearer reasoning is needed.

      (3) The conservation comparisons are not fully convincing. Figure S7 shows only mild differences between ncORFs and CDS, and statistical significance is not clearly demonstrated. Comparisons with other non-coding RNAs should be added, and overlapping sequences between ncORFs and CDS should be excluded to avoid bias.

      (4) Figure 3 indicates that some ncORFs are subject to evolutionary constraint, which is not surprising. The authors should provide more granular analyses contrasting the detailed features of these "conserved" ncORFs with the "non-conserved" ones. Informative work of this kind has been done in Drosophila, worms, mouse, and human. In fact, small ORFs, especially uORFs, have been extensively studied for their functions and cross-species conservation. The authors should explicitly show what is new in their analyses.

      (5) Translation levels are reported using RPF counts. However, translation efficiency (normalized by RNA expression) is a more appropriate measure to account for expression heterogeneity.

      (6) The correlation analyses between ncORF translation levels and PhyloCSF are confusing and largely descriptive. These sections need sharper framing and clearer conclusions.

      (7) Public ribo-seq datasets, generated by different research labs, are known for their strong batch effects. The representation of tissues and cell types is also very unbalanced. Therefore, the co-translation analysis between ncORFs and canonical CDS is not well controlled. This should be done by referring to a recent large-scale ribo-seq meta-analysis (Nat Biotechnol. 2025. doi: 10.1038/s41587-025-02718-5).
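
      To make point (5) concrete, the following is a minimal sketch of how translation efficiency could be computed per ORF from matched ribo-seq and RNA-seq counts. The function names, the counts-per-million normalization, and the pseudocount are assumptions chosen for illustration and are not taken from the manuscript; it also assumes that both count tables were quantified over the same ORF coordinates, so that feature length cancels in the ratio.

```python
import math


def cpm(counts):
    """Scale raw counts to counts per million within one library."""
    total = sum(counts.values())
    return {orf: 1e6 * c / total for orf, c in counts.items()}


def translation_efficiency(rpf_counts, rna_counts, pseudocount=1.0):
    """Return log2(RPF CPM / RNA CPM) for every ORF present in both tables.

    The pseudocount (a hypothetical default) avoids division by zero for
    ORFs with no RNA-seq reads; set it to 0 only if all counts are positive.
    """
    rpf = cpm(rpf_counts)
    rna = cpm(rna_counts)
    return {
        orf: math.log2((rpf[orf] + pseudocount) / (rna[orf] + pseudocount))
        for orf in rpf
        if orf in rna
    }


# Toy example with made-up counts for two hypothetical ORFs.
te = translation_efficiency(
    {"orf_a": 100, "orf_b": 300},
    {"orf_a": 200, "orf_b": 200},
    pseudocount=0.0,
)
```

      In this toy case both ORFs have equal RNA abundance, so the difference in RPF counts translates directly into a difference in translation efficiency; with unequal RNA abundance, raw RPF counts and translation efficiency can rank ORFs differently.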

      Comments on revisions:

      The authors have made efforts to address most of the previous concerns, and several points have been clarified or improved in the revision. However, in a number of cases, the responses rely more on acknowledgment and reframing than on substantive analytical strengthening. Overall, the manuscript is improved, particularly in terms of clarity, transparency, and positioning of claims. I support its publication and look forward to seeing how the field engages with and discusses these claims.

    2. Reviewer #2 (Public review):

      Summary:

      Chang et al. attempted to analyze a large number of ribo-seq datasets through a standardized pipeline, identifying novel non-canonical ORFs and elucidating their evolutionary and expression characteristics.

      Strengths:

      (1) The datasets analyzed by the authors are sufficiently comprehensive, and the use of standardized pipelines ensures excellent analytical consistency.

      (2) Their analyses of ORF evolution and co-expression further deepen our understanding of these ORFs.

      Weaknesses:

      (1) The authors primarily conducted analyses through bioinformatics, lacking sufficient wet-lab experimental evidence.

      (2) Some analytical methods and standards were not clearly presented in the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      Meijer et al. sought to investigate the role of cortical layer 6b (L6b) neurons in modulating sleep-wake states and cortical oscillations under baseline and sleep-deprived conditions and in response to orexin A and B. Using chronic EEG recordings in mice with silencing of Drd1a+ neurons (via constitutive Cre-dependent knockout of SNAP25), the authors report that while overall baseline sleep-wake architecture and the response to sleep deprivation are minimally affected or unchanged, "L6b silencing" leads to a slowing of theta activity during wakefulness and REM sleep, and a reduction in EEG power during NREM sleep. The manuscript is written with clarity and transparency. Although Drd1a+ neurons are not exclusive to L6b, the authors describe key future studies to identify a causal role for L6b neurons in brain state regulation. These studies contribute to a growing body of evidence that cortex, in addition to subcortical brain regions, plays a role in brain state regulation.

      Strengths:

      (1) The text is well written.

      (2) The authors are transparent about methodological details and study limitations.

      (3) The stated sleep, circadian, and orexin infusion experiments are well designed, executed, and analyzed.

      Weaknesses:

      (1) Outcomes are attributed to silencing cortical L6b neurons, but the genetic manipulation is not specific to L6b neurons or cortex. The authors acknowledge this as a limitation and offer targets for future studies to identify L6b neuron-specific contributions to stated outcomes that include spatially restricted manipulations.

      (2) Experiments use only male mice, which limits generalizability to females.

      Comments on revised version:

      The authors took great care in addressing my previous comments, and I do not have any additional concerns.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Meijer and colleagues investigated the effects of inactivation (conditional silencing) of cortical layer 6b neurons on sleep-wake states and EEG spectral power under the following three conditions: during natural sleep-wake states, after sleep deprivation, or after intracerebroventricular administration of orexin A and B. The authors report that silencing of L6b neurons did not have a significant effect on the total time spent in sleep-wake states, duration or number of state epochs, or the response to sleep deprivation. However, silencing of L6b neurons did slow down theta-frequency (6-9 Hz) during wake and REM sleep, and reduced the total EEG power during NREM sleep. Infusion of orexin A in the mice in which cortical layer 6b neurons were inactivated produced an increase in wakefulness. A similar effect was observed after infusion of orexin A in the mice in which these neurons were not silenced, but the effect (i.e., increase in wakefulness) was of a smaller magnitude. Silencing of cortical layer 6b neurons attenuated the effect of orexin B in increasing theta activity, as was observed in the control mice. The authors conclude that the cortical neurons in layer 6b play an essential role in state-dependent dynamics of brain activity, vigilance state control and sleep regulation.

      Strengths:

      - A focus on cortical layer 6b neurons, which is an understudied neuronal population, especially in the context of brain and behavioral state transitions.

      - The authors used a well-established mouse model to study the effect of inactivation of cortical layer 6b neurons.

      Weaknesses:

      - Although the authors used a highly selective approach to silence layer 6b neurons, the observed changes in EEG oscillations cannot be solely attributed to layer 6b neurons because of the ICV route for orexin administration.

      - The rationale for using only male mice is not provided.

      Comments on revised version:

      The authors have addressed my concerns.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript explores the role of the Evening Complex (EC), specifically focusing on ELF3, a disordered protein component of the EC, and its temperature-dependent phase behavior. The study highlights the role of polyQ tracts in modulating temperature-sensitive condensate formation and provides a combination of computational approaches, including REST2 simulations and coarse-grained Martini simulations, to investigate how polyQ tract length and sequence context influence this behavior.

      Strengths:

      The study addresses a key question in plant biology - how temperature influences circadian clock-mediated growth regulation through protein phase behavior. The manuscript introduces the novel finding that polyQ tract length modulates the temperature-dependent formation of helices and condensates.

      Weaknesses:

      (1) Coarse-Grained Simulation Results Not Supported by Data:

      The results presented in Figure 6A of the manuscript do not seem to show a clear trend in the number of clusters formed as a function of polyQ tract length. This is particularly evident in the comparison between 0Q and 7Q polyQ lengths, which display statistically similar values in terms of the number of clusters. The lack of distinction between these values raises questions about the sensitivity of the coarse-grained simulations to polyQ tract length, which the authors claim as a key modulator of condensate formation. This discrepancy weakens the argument that polyQ length directly impacts the clustering behavior in the simulations.

      Suggested Analysis:

      a) A more detailed statistical analysis should be performed to assess whether the observed differences between polyQ lengths are significant. This could involve hypothesis testing or the use of error bars in the graphs to better communicate the variability in the data.

      b) Additionally, the authors should examine whether there are other features, such as cluster shape or internal structure, that might differentiate between different polyQ lengths, even if the total number of clusters is similar.

      (2) Inconsistency in Cluster Size Across Temperatures (Figure 6B):

      The results in Figure 6B show a striking difference in the size of the largest cluster between temperatures of 290K and 300K. This abrupt shift in behavior lacks a clear mechanistic explanation. Typically, phase transitions driven by temperature are more gradual, unless there is some underlying structural or chemical shift that the authors have not accounted for. Without a clear explanation, this sudden change in behavior reduces confidence in the simulation results.

      Suggested Analysis:

      a) The authors should explore possible explanations for the dramatic difference in cluster size between 290K and 300K. For example, they could investigate whether specific interactions (such as the breaking or formation of hydrogen bonds or hydrophobic contacts) might explain the behavior at higher temperatures.

      b) It is important to check whether the coarse-grained simulation model has been adequately parameterized and scaled for accurate temperature dependence. Atomistic simulations of monomers and dimers with varying polyQ tract lengths could be used to fine-tune the coarse-grained model, ensuring it accurately reflects molecular behavior. The gross estimate of a 10% scaling factor might be insufficient and could lead to inaccurate representations of cluster formation.

      (3) Scaling of Coarse-Grained Model with Atomistic Simulations:

      As mentioned, the coarse-grained model used in the study may not have been properly scaled against atomistic data. A simple scaling factor of 10% may not be appropriate for accurately capturing the behavior of polyQ tracts across different lengths, especially considering their sensitivity to subtle changes in temperature. Without rigorous validation against atomistic simulations, the coarse-grained model's predictions could be skewed.

      Suggested Analysis:

      a) To address this, the authors should compare the coarse-grained model with atomistic simulations of monomeric and dimeric forms of ELF3 with different polyQ tract lengths. By comparing key structural parameters (e.g., radius of gyration, contact maps, and clustering propensity), the authors could adjust the coarse-grained model to more accurately reflect the atomistic behavior. The authors have a wealth of atomistic simulation data that could support such benchmarking and the identification of a scaling factor.

      b) Additionally, the authors should investigate whether the assumed scaling factor of 10% is appropriate for each polyQ length or whether it needs to be refined based on specific properties, such as the number of hydrophobic interactions or secondary structure stability.

      (4) Lack of Analysis for Liquid-Like Behavior in Phase Separation:

      The simulations presented in the manuscript do not analyze the liquid-like behavior of ELF3 condensates, which is a key characteristic of liquid-liquid phase separation (LLPS). In LLPS systems, condensates are often dynamic, with chains exchanging between clusters, indicating liquid-like rather than solid-like behavior. The authors fail to probe this crucial aspect, which is necessary to support the claim that ELF3 undergoes phase separation.

      Suggested Analysis:

      a) The authors should conduct additional analyses to probe the liquid-like nature of the clusters formed by ELF3. One approach would be to analyze the dynamics of chain exchange between clusters, measuring how frequently chains leave one cluster and join another over time. This analysis would reveal whether the condensates behave as liquid-like, dynamic structures or more static, solid-like aggregates.

      b) Additionally, the temperature dependence of these exchange dynamics should be investigated. In true liquid-liquid phase separation, the rate of chain exchange is often sensitive to temperature. Observing how this rate changes between 290K and 300K, for instance, could help explain the abrupt shift in cluster size seen in Figure 6B.

      c) The authors should also analyze whether the internal structures of the condensates are consistent with a liquid-like phase. For example, radial distribution functions and contact lifetimes could be calculated to reveal whether the clusters exhibit liquid-like organization.
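
      As an illustration of the chain-exchange analysis suggested in (a), the sketch below computes how much of the dominant cluster's membership is retained after a given lag. It assumes that cluster membership has already been extracted from the trajectory as one set of chain indices per frame; the function name and the input format are hypothetical and not taken from the manuscript. A retention curve that stays near 1 at long lags would point toward static, solid-like aggregates, whereas a decaying curve would be consistent with chains exchanging with the surrounding phase.

```python
def membership_retention(frames, lag):
    """Mean fraction of dominant-cluster chains still present `lag` frames later.

    frames: list of sets, each holding the chain indices that belong to the
            dominant cluster in one trajectory frame (assumed input format).
    lag:    non-negative number of frames separating the two snapshots.
    """
    fractions = []
    for t in range(len(frames) - lag):
        now, later = frames[t], frames[t + lag]
        if now:  # skip frames in which the cluster is empty
            fractions.append(len(now & later) / len(now))
    if not fractions:
        raise ValueError("lag too large or no populated frames")
    return sum(fractions) / len(fractions)


# Toy trajectory of three frames with made-up chain indices.
toy_frames = [{0, 1, 2, 3}, {0, 1, 2, 4}, {0, 1, 5, 4}]
retention_by_lag = {lag: membership_retention(toy_frames, lag) for lag in (0, 1, 2)}
```

      One caveat of this simplified design is that the identity of the dominant cluster can change when clusters merge or dissolve, so at temperatures where the largest cluster fluctuates strongly, a per-chain residence-time analysis may be more appropriate.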

      (5) Lack of justification of polydispersity of polyQ:

      The authors do not provide any rationale for the choice of the different polyQ copy numbers used in their chain-growth simulation studies. This choice would be better motivated by precedent experimental observations.

      (6) Lack of initiative to connect to Experiments:

      While the computational models and simulations provide robust theoretical insights, the absence of direct experimental validation weakens the overall impact of the manuscript. For example, experimental data on how specific mutations in the polyQ tract influence ELF3 behavior in vivo would significantly bolster the authors' claims. The manuscript would benefit from either citing existing experimental studies that corroborate these findings or from suggesting future experimental directions.

      Comments on revised version:

      The authors have now adequately addressed the key concerns about the manuscript. In its present form, the manuscript is significantly improved.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigate how ELF3, a disordered scaffolding protein in the plant circadian Evening Complex, responds to temperature by forming reversible nuclear condensates. They focus on the C-terminal prion-like domain and on a variable polyglutamine tract within it, asking how the tract length and surrounding sequence context tune temperature-responsive structural and condensation behavior. Using a tiered set of computational approaches, including sequence heuristics, hierarchical chain-growth ensembles, all-atom enhanced-sampling simulations, and coarse-grained condensate simulations of 100 monomers, they characterize wild-type, polyQ deletion, polyQ expansion, and an aromatic-disrupting F527A variant. In the revised manuscript, the central claim has been reframed so that polyQ length is now described as tuning condensate material properties rather than driving temperature-sensitive phase separation, with temperature-responsive condensation attributed primarily to a sticker-rich aromatic contact network.

      Strengths:

      The biological question is important and timely, and the multiscale computational strategy provides a fresh view of an intrinsically disordered protein and its variants. The all-atom enhanced sampling analyses identify a temperature-dependent long-range aromatic contact involving F527 and a methionine-tyrosine coordination motif, which are concrete and mechanistically interesting observations beyond what coarse-grained or sequence-only methods could provide. In response to the previous round of review the authors have added replicate averaged statistics with error bars on the new condensate analyses, introduced new dynamics observables including effective diffusivity, an anomalous diffusion exponent, the self van Hove function, shape anisotropy, per chain radius of gyration in the condensed phase, and a condensate lifetime, provided cluster size time series for transparency, justified the choice of polyQ tract lengths against published Arabidopsis polymorphisms, expanded the Methods with explicit formulas for the new analyses, and included a split half convergence check for the all atom ensembles. The reframing toward a sticker spacer interpretation is consistent with recent experimental work and represents a more cautious and defensible reading of the data.

      Weaknesses:

      Despite these substantive additions, several core concerns from the previous review remain only partially addressed, and, on close reading, the new supplementary analyses do not robustly support the reframed claim that polyQ length tunes condensate material properties. Error bars and replicate-averaged statistics were added to the new condensate panels, but the helical propensity and per-residue analyses throughout the rest of the manuscript still show only a single curve per temperature, so variability for these key observables remains unreported. Several of the newly added dynamics observables show that the variants are essentially indistinguishable within the reported uncertainty: the self van Hove distributions, the shape anisotropy distributions, and the per chain radius of gyration distributions in the condensed phase overlap almost entirely across variants, and the anomalous diffusion exponent has between replica spreads at low temperature that exceed the variant to variant differences, with variant orderings that change with temperature. The variant-dependent signal that does survive, namely a drop in condensate lifetime for the polyQ expansion and the aromatic mutant at the highest temperature studied, rests on a single temperature point, with replicate spreads spanning most of the metric's dynamic range.

      The cluster size time series at higher temperatures shows the dominant cluster oscillating over a wide range across replicas, indicating intermittent dissolution and incomplete convergence in the very temperature regime where the variant-specific claims are made. The only convergence test provided is a split-half radius-of-gyration analysis for the all-atom ensembles, with no slab-geometry or coexistence-density check for the coarse-grained condensate simulations. The polyQ deletion variant forms dominant clusters comparable in size to wild type at low and intermediate temperatures, which on its own argues that variable polyQ presence is not a primary determinant of clustering and supports the earlier concern that the temperature sensitive behavior is dominated by generic chain length and aromatic sticker effects rather than polyQ specific sequence effects, a concern that the reframing softens but does not resolve. Statistical significance is not assessed anywhere, and with three replicas and largely overlapping error bars, claims of variant-specific differences would benefit from explicit statistical tests. Minor quality control issues are also visible in the supplementary material, including a mislabeling of the aromatic mutant in two analysis panels and an inconsistent trajectory length for one variant at one temperature.
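
      To illustrate what an explicit statistical test could look like with so few replicas, the following is a minimal sketch of an exact two-sided permutation test on a difference in means. The function name and the example values are hypothetical and not taken from the manuscript. The sketch also makes a limitation visible: with three replicas per variant there are only 20 possible relabelings, so the smallest attainable two-sided p-value is 0.1 even for completely separated groups.

```python
from itertools import combinations


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(a) / len(a) - sum(b) / len(b))
        if diff >= observed - 1e-12:  # small tolerance for float round-off
            extreme += 1
        total += 1
    return extreme / total


# Made-up replica values for two hypothetical variants, fully separated.
p_separated = exact_permutation_p([1.0, 2.0, 3.0], [10.0, 11.0, 12.0])
```

      Under this test, no variant-versus-variant comparison based on three replicas each can reach the conventional 0.05 threshold, which is one reason additional replicas would be needed before variant-specific claims can be tested formally.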

      Additional Context for Readers:

      Readers should interpret the molecular mechanism proposed here with caution. The reframing from polyQ length driving temperature-sensitive phase separation to polyQ length tuning of condensate material properties is more scientifically measured and aligns with recent experimental work, but several of the supplementary observables introduced to support this revised claim indicate that the variants studied are statistically indistinguishable within the reported replicate uncertainty. The most robust observation in the revised work is that the prion-like domain undergoes a temperature-responsive break of an aromatic contact in all-atom simulations and that aromatic sticker contacts dominate inter-protein interactions in coarse-grained condensate simulations. The mechanistic role of the polyQ tract, beyond generic chain length and hydration effects, remains, as in the original submission, not clearly established by the simulations presented. Independent experimental validation of the proposed aromatic contact and of the predicted material-state differences between polyQ variants will be needed to establish the molecular mechanism, and improved condensate convergence tests, uniformly reported error bars across all simulation-derived figures, and explicit statistical tests of variant-versus-variant differences would substantially strengthen confidence in the conclusions.

    1. Reviewer #1 (Public review):

      Summary:

      The authors present a novel approach to subcellular spatial proteomics by combining laser microdissection with expansion microscopy and LC-MS/MS analysis (SPEx). They implement two different workflows for LMD and LC-MS/MS quantification:

      (1) The standard approach, where an area of interest is cut out by LMD, subjected to proteomics analysis, and compared to the rest of the cell without the dissected ROI.

      (2) The subtraction approach, where ROIs are removed, and the remaining cellular material is compared to samples containing both the surrounding material and the ROI.

      The authors assess the technique by applying it to subcellular targets of various sizes, volumes, and protein compositions such as the nucleus, nucleoli, and Golgi. They demonstrate that SPEx can identify proteins enriched or reduced in ROIs.

      Strengths:

      The broad, relatively easy, and inexpensive applicability of this approach to potentially many cell types and subcellular areas of interest provides an exciting alternative to subcellular fractionation, native immunoprecipitation, or genetically encoded proximity labeling constructs. Moreover, by visually selecting ROIs for subsequent analysis, subcellular context or organelle morphology can be taken into account, as discussed by the authors in the discussion section.

      Weaknesses:

      While strongly supporting the sharing of this approach, we have a number of comments and questions that will improve the impact of the manuscript:

      (1) General:

      a) The manuscript would benefit from restructuring and language revision. In its current form, the writing is sometimes dense and verbose (in particular, the Results section). This makes it difficult to follow the authors' arguments.

      b) The authors mention the possibility of selecting organelles based on morphology. This is left for the discussion, but it seems like a missed opportunity - the authors could compare individual organelles in different morphological states, e.g., connected vs. fragmented mitochondria.

      (2) Technical:

      a) Why do the authors aim and optimize for a 10x expansion factor? Is SPEx compatible with a more standard 4x expansion, as used, e.g., in the classic U-ExM approach (https://www.nature.com/articles/s41592-018-0238-1)? This could be added to the discussion.

      b) The U-ExM approach shows improved ultrastructural preservation when using fixation with 3% FA and 0.1% glutaraldehyde (GA). Is SPEx compatible with the use of low amounts of GA for fixation?

      c) Related to the above, was the anchoring efficiency reduced only to achieve a 10x expansion factor or does this additionally affect the proteome coverage?

      d) Have the authors considered using alternative anchoring approaches, such as GMA (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0291506#pone.0291506.s001), which potentially increase the amount of sample retained in the hydrogel, thus allowing for better proteome coverage? This could be added to the discussion.

      e) The limitation of the approach to near-2D samples should be mentioned, and alternative approaches for more 3D samples could be discussed.

      f) How are peptides that are directly anchored to the hydrogel dealt with during LC-MS/MS analysis? Are they excluded, or can they be identified during the spectral search? The latter would allow us to get a deeper structural understanding of how proteins are actually anchored into hydrogels, which so far has not been assessed.

      An alternative approach to address this question would be to investigate if the peptide coverage of proteins detected by SPEx is enriched for peptides representing the folded core of proteins as opposed to the surface-exposed regions, which likely get more anchored into the hydrogel.

      g) Same question regarding peptides with NHS labeling. Can they be identified, or do they just compete for ionization and thus negatively affect coverage and dynamic range of the LC-MS/MS approach?

      h) How are the primary and secondary antibodies affecting the proteomics analysis identified as contaminants?

      i) Have the authors observed differences in proteomics coverage between antibody-only labeling and NHS labeling? Depending on the answers to the questions above, could pure antibody-based labeling increase proteomic coverage?

    2. Reviewer #2 (Public review):

      Summary:

      This study introduces a method that combines physical expansion of cells, imaging-guided isolation of defined regions, and protein identification to enable compartment-resolved analysis of protein composition at the subcellular scale. The authors aim to address a central limitation in existing approaches, namely the loss of spatial information during sample preparation or the indirect nature of proximity-based labeling methods. Using several cellular compartments as examples, they demonstrate that their approach can recover compartment-enriched protein sets and identify candidate proteins with previously unassigned localization.

      Strengths:

      A major strength of this work is the conceptual simplicity and accessibility of the approach. By combining established techniques in a modular way, the method avoids the need for genetic manipulation or specialized labeling strategies, making it broadly adaptable across experimental systems. The ability to directly select regions of interest based on imaging represents a clear advantage over indirect enrichment strategies and allows flexible targeting of both membrane-bound and non-membrane-bound compartments.

      The experimental design is also a strong aspect of the study. The use of complementary comparison strategies (analyzing isolated compartments alongside matched "subtracted" controls) provides an internal framework for assessing enrichment and depletion, increasing confidence in spatial assignment. The application of the method across multiple organelles of different sizes and properties demonstrates versatility, and the reported specificity for several compartments is encouraging. In particular, the ability to profile small and biochemically challenging structures highlights a potentially important niche for the approach.

      Weaknesses:

      Despite these strengths, several methodological limitations constrain the interpretation of the results. The most important relates to spatial accuracy in three dimensions. While lateral resolution is improved through physical expansion, the lack of depth resolution introduces uncertainty regarding contributions from structures above and below the selected region. Although the authors argue that this does not substantially affect specificity, the current evidence is largely indirect, and a more rigorous quantification of potential contamination would strengthen this conclusion.

      Quantitative interpretation also remains challenging. Because the measurements reflect total protein abundance rather than local concentration, differences in compartment size and protein density can influence enrichment values, particularly for small structures embedded within larger volumes. This issue is evident in the analysis of smaller compartments and complicates direct comparison across conditions. Additional normalization or modeling would help clarify how to interpret these measurements.

      Another limitation concerns variability in the expansion process and its downstream consequences. Differences in expansion factor across samples may affect the definition of regions of interest and introduce variability in sampling, yet the impact of this variability is not fully explored. Similarly, the use of a modified chemical treatment to preserve proteins for downstream analysis is central to the workflow but is not extensively validated with respect to preservation of spatial organization.

      While the identification of previously unannotated proteins is an appealing aspect of the study, validation is limited to a small number of examples, and broader support from independent datasets or literature context is lacking. In addition, the study primarily focuses on steady-state measurements in a single cell type, and therefore does not yet demonstrate the ability of the method to capture dynamic or condition-dependent changes in protein localization.

      Finally, the positioning of the method relative to existing approaches could be more clearly articulated. Although qualitative comparisons are provided, a more systematic and quantitative benchmarking against alternative strategies would help readers better understand the specific advantages and trade-offs.

    3. Reviewer #3 (Public review):

      Franziscus et al. describe an elegant approach for spatially specific proteome analysis. To achieve this, they expand fixed cells and subsequently use a laser to micro-dissect a region of interest, which is then analyzed by mass spectrometry.

      They demonstrate the effectiveness of their approach by analyzing the nucleus, nucleolus, and the Golgi, and benchmark their hits against previous datasets for these organelles.

      The manuscript is very well written and nicely guides the reader through the applied methods. The presented data is convincing, and I do not see the need for additional experimental verification of the protocol. The only minor concern is the novelty of the method and the presentation. A combination of expansion, laser microdissection, and proteomics has been applied in the past (PMID: 36450705, PMID: 39477916). In the manuscript, one of these studies is cited, though it does not become clear that this approach is already described. However, Franziscus et al. describe the approach better and make it more accessible to the reader, especially since the other studies described this methodology in combination with tissue expansion and not in combination with single cell expansion as it is done here. I would ask the authors to be clearer in the introduction about what others have already done and what their contribution is here. In general, I am convinced that the community will benefit from the presented protocol to analyze organelle proteomics in detail.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates a fundamental question in cognitive science: is our ability to reason about the physical world an abstract mental process, or is it "embodied", that is, directly rooted in our real-time physical interactions with the environment? The authors compared participants' performance in computerized reasoning games with and without Galvanic Vestibular Stimulation (GVS). They suggest that participants failed more often and utilized suboptimal strategies under GVS compared to a sham stimulation condition. Furthermore, they found that this detrimental effect of GVS was reduced when the games were governed by altered gravity (hyper- and hypo-gravity). Consequently, the authors conclude that the physical experience of the body modifies high-level cognitive skills, such as reasoning.

      Strengths:

      The manuscript is well-written, organized, and easy to follow, making complex concepts accessible. Also, combining a specialized physical reasoning task with real-time vestibular disruption (GVS) is an intriguing approach to testing the boundaries of embodied cognition.

      Weaknesses:

      (1) Lack of Overall Effects and Inflated Type I Error for Game-Level Effects

      The study utilizes a within-subject design. Taking Study 1 as an example, each subject participated in a familiarization session (4 games), a baseline session (12 games without stimulation), a GVS session (14 games), and a sham session (14 games). No game was repeated for any single subject. Performance was quantified using three primary measures (success rate, number of attempts, and time per attempt) and two strategy measures (tool switching and the distance between tool placements).

      For Study 1, to identify condition differences at the game level (i.e., Figure 2), the authors effectively conducted 70 independent t-tests (5 measures × 14 games). While 7 significant results were reported, this large number of independent tests invites an inflated Type I error rate, as no multiple-comparison correction appears to have been applied.

      A similar inflation is expected in Study 2, where 50 independent t-tests (5 measures × 10 games) yielded 5 significant comparisons (Figure 4). Although the authors might argue the direction of the differences is systematic, implying GVS generally impairs performance, at least one significant comparison shows the opposite effect: tool switching indicates that GVS led to better performance for the 'Table_A' game in Study 2 (Figure 4d), whereas the same variable indicated GVS led to worse performance in Study 1 (Figure 2d). I suspect that none of the significant game-level results would survive a proper statistical correction. If possible, the authors can redo statistical testing with corrections (FDR or Bonferroni) or with LMM using game as a random effect. Before proper statistical analyses, I strongly encourage the authors to refrain from drawing broad conclusions based on these isolated game-level results.
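
      The scale of the concern above can be made concrete with a short calculation. The sketch below is illustrative only: it assumes a per-test threshold of alpha = 0.05 and fully independent tests, neither of which is confirmed by the manuscript, and the function names are hypothetical.

```python
# Illustrative arithmetic for the multiple-comparisons concern.
# Assumptions (not taken from the manuscript): alpha = 0.05 per test,
# and all tests are independent with every null hypothesis true.

def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of significant results if every null is true."""
    return n_tests * alpha

def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the familywise error rate at alpha."""
    return alpha / n_tests

if __name__ == "__main__":
    # Test counts as stated in the review: 5 measures x 14 games, 5 x 10 games.
    for label, n_tests in (("Study 1", 5 * 14), ("Study 2", 5 * 10)):
        print(
            label,
            "tests:", n_tests,
            "expected false positives:", round(expected_false_positives(n_tests), 2),
            "familywise error rate:", round(familywise_error_rate(n_tests), 3),
            "Bonferroni threshold:", round(bonferroni_threshold(n_tests), 5),
        )
```

      Under these assumptions, roughly 3.5 of the 70 tests in Study 1 and 2.5 of the 50 tests in Study 2 would be expected to reach significance by chance alone, which is the same order of magnitude as the 7 and 5 significant comparisons reported.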

      Furthermore, when analyzing data across all games, the study found no significant effect of GVS on overall performance or strategy measures in either Study 1 or Study 2. This lack of an aggregate effect contradicts the authors' conclusion that participants failed more often or utilized suboptimal strategies under GVS.

      (2) Missing Rationale for Classification Analysis

      It is puzzling why the authors pursued two exploratory analyses on tool placement after revealing that the two related primary measures (tool positioning and switching) did not generate significant condition differences in Study 1. These additional analyses (the Dirichlet Process Gaussian Mixture Model and leave-one-out classification) were not pre-registered. In the absence of overall condition differences, the authors appear to be "doubling down" by applying sophisticated classification tools to the raw data without a clear prior rationale.

      (3) Insufficient Evidence for the Reduced Effect of GVS Under Altered Gravity

      To compare Study 1 and Study 2, the authors devised a "gravity-weighted index," but its definition is not sufficiently justified. The index assigns weights of 1, 2, and 3 to low-, medium-, and high-gravity-dependent games, respectively. The choice of these specific weights appears arbitrary, making the quantitative results difficult to interpret. More importantly, there is no citation or explanation regarding how these three levels of "gravity impact" were defined in the first place (Line 468). This index was also not pre-registered.

      The authors state that for the success rate index, a value close to -1 indicates a large negative difference for GVS, 0 indicates no difference, and 1 indicates a large positive difference. These are theoretical bounds; the actual distribution of each index should be examined to validate such claims. However, the paper lacks descriptive statistics for this composite index.

      Notably, the "reduction" of the GVS effect in altered gravity was only demonstrated in one of the five available indices (success rate, p = 0.046). In fact, the success rate in Study 2 was 66.7 (Sham) vs. 67.3 (GVS) in Table 2. It is highly debatable whether this marginal result justifies the conclusion that GVS effects "were reduced when the games included reasoning about altered gravity".

      (4) Questionable Assumptions Regarding Strategy

      The authors assume that "big changes in tool positioning and frequent tool switching indicate poor evaluation of the failed outcome". This assumption is questionable. In solving this cognitive task, participants must explore and exploit solutions based on feedback. Large shifts in positioning or frequent tool switching might reflect active, adaptive exploration based on failed outcomes rather than a failure to evaluate them.

      (5) Confounding Factors in GVS Interpretation

      The central theoretical question is whether physical reasoning is grounded in physical experience. GVS is used here to manipulate that experience. However, GVS does not selectively target the vestibular nerve; it also activates distributed fronto-parietal attention networks and hippocampal circuits essential for any reasoning task. Additionally, the vestibular system is linked to the limbic system and the cerebellum, which regulate emotional reactivity and arousal. Because attention and emotion are likely affected by GVS, the authors should be much more cautious in attributing their behavioral findings solely to changes in the "physical experience of the body."

    2. Reviewer #2 (Public review):

      Summary

      The paper investigates whether the real-time physical experience of the body shapes high-level physical reasoning. Participants played a set of computerized tool-use reasoning games (the Virtual Tools paradigm) in which they must use knowledge of physical laws - including gravity, collisions, and inertia - to guide a ball into a target area. In Study 1, participants played the games under terrestrial gravity while receiving either Galvanic Vestibular Stimulation (GVS), which introduces noise into the vestibular organ and disrupts gravitational signalling, or a Sham condition with matched skin sensation. In Study 2, a separate cohort played the same games redesigned under hypogravity (0.5 g - half Earth g) or hypergravity (2 g - double Earth g), again with concurrent GVS or Sham stimulation. Performance was assessed through success rate, number of attempts, and time per attempt; strategy was assessed through the spatial distance between successive tool placements and the frequency of tool switching across attempts. A post-hoc gravity-weighted index (GWI) was computed to compare the effect of vestibular perturbation across the two studies. The main finding is that GVS impairs performance in gravity-dependent games under terrestrial gravity, yet the same perturbation appears to be neutral or even beneficial when the game environment involves non-terrestrial gravity - a result the authors interpret as evidence for an adaptable, body-grounded internal model of physics.

      Strengths

      One of the most notable strengths of this work is its conceptual positioning at the intersection of embodied cognition and physical reasoning. Rather than treating the human body either as an abstract information-processing device or as a purely biomechanical system, the authors take seriously the idea that cognition is scaffolded by ongoing sensorimotor state - and they test this idea with a paradigm that is both tractable and theoretically motivated. The use of the Virtual Tools paradigm is well-suited to this goal: the games vary systematically in their reliance on gravitational predictions, allowing selective impairment (rather than general disruption) to serve as a signature of embodied physical reasoning.

      The dual-study design is another strength. Testing the same vestibular perturbation under terrestrial and altered game-gravity conditions, and observing a reversal in its effect depending on context, provides a form of internal control that is conceptually compelling. The additional clustering analyses (Dirichlet Process Gaussian Mixture Model and leave-one-out kernel density classification) strengthen the strategy results beyond raw distance measures, confirming that GVS systematically shifts participants' spatial exploration strategies.

      The paper is also clearly written and engages meaningfully with relevant theoretical frameworks - predictive coding, embodied cognition, and stochastic resonance - making it accessible and stimulating for a broad audience.

      Weaknesses

      (1) Absence of multiple-comparisons correction. A large number of game-level pairwise t-tests are conducted in both studies (upward of twenty per study) without correction for familywise error rate. The game-level effects that anchor the main narrative - in Study 1 alone: Remove, GoalMove, Spiky, Falling_A, Shafts_B, Gap, and Chaining - arise from an uncorrected pool of comparisons. The probability that some of these constitute false positives is non-trivial. The authors should apply a correction (e.g., Benjamini-Hochberg) or at a minimum discuss this limitation explicitly.
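
      For readers unfamiliar with the suggested correction, the Benjamini-Hochberg step-up procedure can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name is hypothetical, the p-values in the usage note are invented placeholders, and the standard procedure assumes independent (or positively dependent) tests.

```python
# Minimal sketch of the Benjamini-Hochberg (BH) step-up procedure.
# The function name and any p-values passed to it are illustrative
# assumptions; none of them come from the manuscript under review.

def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the null is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= k * q / m.
    largest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * q / m:
            largest_passing_rank = rank

    # Reject every hypothesis ranked at or below that k (the step-up rule).
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_passing_rank:
            rejected[index] = True
    return rejected
```

      As a usage example with placeholder values, `benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216])` rejects only the first two hypotheses, even though five of the ten p-values fall below an uncorrected 0.05 threshold.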

      (2) The facilitation claim rests on a post-hoc and arbitrarily parameterized index. The gravity-weighted index (GWI), which drives the central cross-study comparison, uses integer coefficients (1, 2, 3) to weight games by gravity dependency level. These coefficients are entirely arbitrary and bear no principled relationship to the actual gravitational magnitudes used in the study. Why not use the gravity dependency ratings themselves, or the empirically estimated gravity impact scores from the computational modelling mentioned in the Methods? The choice of weights should be either principled or tested across a range of values to demonstrate robustness. Furthermore, the notation in equation (1) as currently typeset reads as "Gravity minus Weighted Index" rather than "Gravity-Weighted Index"; this should be corrected.
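
      The robustness check requested above could take the following form. This is a sketch under strong assumptions: the exact GWI formula is not reproduced in this review, so the code uses a hypothetical stand-in (a weighted mean of per-game GVS-minus-Sham differences), and the function names, level labels, and candidate weights are all illustrative.

```python
# Sketch of a weight-sensitivity sweep for a gravity-weighted index.
# ASSUMPTION: the index is modelled here as a weighted mean of per-game
# (GVS - Sham) differences. This is a hypothetical stand-in, not the
# authors' equation (1), and any differences passed in are placeholders.
from itertools import product

def weighted_index(diffs_by_level, weights):
    """Weighted mean of per-game differences, weighted by gravity-dependency level."""
    numerator = 0.0
    denominator = 0.0
    for level, diffs in diffs_by_level.items():
        for diff in diffs:
            numerator += weights[level] * diff
            denominator += weights[level]
    return numerator / denominator

def sweep_weights(diffs_by_level, candidates=(1, 2, 3)):
    """Recompute the index for every non-decreasing (low, medium, high) weight triple."""
    results = {}
    for w_low, w_medium, w_high in product(candidates, repeat=3):
        if w_low <= w_medium <= w_high:
            weights = {"low": w_low, "medium": w_medium, "high": w_high}
            results[(w_low, w_medium, w_high)] = weighted_index(diffs_by_level, weights)
    return results
```

      If the sign and approximate size of the index, and of the Study 1 versus Study 2 contrast built on it, remain stable across such a sweep, the specific choice of (1, 2, 3) matters little; if they do not, the cross-study conclusion depends on an arbitrary parameter.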

      (3) The "facilitation" interpretation exceeds what the data in Study 2 directly support. Across all games in Study 2, GVS versus Sham differences in absolute performance are non-significant in all directions. The facilitation claim derives entirely from the GWI being higher in Study 2 than in Study 1 - a between-subjects comparison involving different participant groups and a non-pre-registered metric. The language of "facilitation" should be tempered accordingly, or the authors should provide additional analyses to support this framing.

      (4) Gravitational manipulation is visual only, and the vestibular system is only one component of the gravity-sensing network. Gravity perception results, as the authors very well know, from a distributed multisensory integration process that involves, in addition to the vestibular system, visual, proprioceptive, and visceral inputs. The present paradigm manipulates gravitational context solely through visual cues and targets the vestibular system through GVS - a point the authors acknowledge but do not discuss in sufficient depth. It is important to distinguish clearly between real gravitational alterations (as achieved in parabolic flight or centrifuge environments, where the entire body is physically subjected to a different gravitational vector) and virtually altered gravity, where only one sensory modality is targeted while others remain anchored to 1 g. The scope of the conclusions should reflect this distinction.

      (5) The choice of 0.5 g and 2 g may lack sensitivity. Combining the two altered-gravity conditions in Study 2, because no significant effect of hypo versus hypergravity was found, is statistically pragmatic but conceptually unsatisfying. There is evidence in the space physiology literature that gravitational processing is not linearly symmetric around 1 g: threshold effects exist below and above terrestrial gravity that may not be captured by modest deviations (half and double g) - see refs below. It is worth discussing whether the absence of a hypo/hyper distinction in Study 2 reflects a genuine equivalence or a lack of sensitivity, and whether more extreme conditions (e.g., near-zero g or 4-5 g) might reveal different processing regimes. Whether 0.5 g and 2 g were sufficient to saturate the system or merely insufficient to perturb it remains an open question with direct implications for the interpretation of the null GWI effects on strategy measures.

      Lee SMC, Ribeiro LC, Martin DS, Zwart SR, Feiveson AH, Laurie SS, Macias BR, Crucian BE, Krieger S, Weber D, Grune T, Platts SH, Smith SM, and Stenger MB. Arterial structure and function during and after long-duration spaceflight. J Appl Physiol (1985) 129: 108-123, 2020.

      de Winkel KN, Clément G, Groen EL, and Werkhoven PJ. The perception of verticality in lunar and Martian gravity conditions. Neurosci Lett 529: 7-11, 2012.

      Clément G, Moore ST, Raphan T, and Cohen B. Perception of tilt (somatogravic illusion) in response to sustained linear acceleration during spaceflight. Exp Brain Res 138: 410-418, 2001.

      Benson AJ, Kass JR, and Vogel H. European vestibular experiments on the Spacelab-1 mission: 4. Thresholds of perception of whole-body linear oscillation. Exp Brain Res 64: 264-271, 1986.

      (6) High-level reasoning is not defined with sufficient precision. The term "high-level reasoning" appears from the title onward and in the heading of the Study 1 results section (line 138), but it is never formally defined. The reader needs a clearer account of what distinguishes high-level physical reasoning from low-level sensorimotor prediction, and where the games used here fall along that continuum. What specific physical competencies - ballistic trajectories, free-fall predictions, collision dynamics, frictional forces, inertial effects - are required across the game set? When describing the subset of games that drive key effects, this information is critical for evaluating whether effects are specific to gravity reasoning or to some other physical concept.

      (7) Performance measures are disconnected from underlying kinematics. The performance measures (success rate, number of attempts, time per attempt) are coarse, high-level summaries. Time per attempt is used as a proxy for performance efficiency, yet participants received no instructions regarding speed, and different individuals may have adopted systematically different speed-accuracy trade-offs. It would be valuable to know whether time per attempt correlates with attempt number within a given game (which would indicate within-game learning) and whether mouse movement data - trajectory, velocity, hesitation - were recorded and could be analysed to provide more mechanistic insight into strategy formation.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript investigates a theoretically important question in cognitive science: whether higher-level physical reasoning is an abstract, modular process or is grounded in real-time body-environment interactions. To address this question, the authors combine galvanic vestibular stimulation (GVS) with the Virtual Tools task to test whether perturbing vestibular gravity signals affects performance in physical reasoning. The study is conceptually innovative and has the potential to bridge embodied sensory processing and higher-level cognition. However, in its current form, the evidence only partially supports the main claims, and several aspects of the analysis and interpretation limit the strength of the conclusions.

      Strengths:

      A major strength of the manuscript is the originality of the experimental paradigm. The combination of galvanic vestibular stimulation (GVS), which perturbs gravity-related vestibular signals, with computerized game-based tasks that require physical reasoning provides a novel way to test whether ongoing bodily experience influences higher-level cognition. Conceptually, the study is highly original and meaningfully bridges two domains that are often studied separately: sensorimotor processing and higher-level cognition.

      Weaknesses:

      The main weakness of the manuscript is that its central conclusion is not strongly supported by the data. The key finding depends on a marginally significant cross-study comparison, whereas direct GVS-versus-Sham differences in Study 2 are minimal across aggregate measures. In addition, many game-level analyses involve a large number of uncorrected multiple comparisons, raising the possibility that some of the reported effects may reflect chance findings. The manuscript's most important metric, the Gravity-Weighted Index, was not preregistered and is exploratory in nature, yet it is treated as a primary basis for confirmatory conclusions. The cross-study comparison is also difficult to interpret because the two studies differ in participant samples, number of games, and partially in the stimulus set. Finally, the mechanistic claims in the Discussion, particularly those invoking predictive coding, stochastic resonance, or updating of internal gravity models, go well beyond what can be directly inferred from the present behavioral data. Overall, the study provides intriguing but limited evidence that vestibular signals may influence some physical reasoning tasks under specific conditions, rather than strong evidence for a broad account of physical reasoning as grounded in online vestibular processing.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors study two residues in the GHKL ATPase active site of Aq MutL and GyrB, and argue that the catalytic base function is shared between two conserved acidic residues that are 3 residues apart.

      They generated mutant versions in MutL and GyrB (both Ala and the appropriate Asn/Gln version) and performed ATPase analysis. They also generated high-resolution crystal structures of the GyrB NTD with AMPPnP for WT and mutants of the two acidic residues. The data show that mutation in either of these residues does not fully kill activity (with the exception of the alanine mutation of the first of the two, which interferes with ATP (or AMPPnP) binding). When the acidic residues are mutated to Asn/Gln, the catalytic water can still be positioned, and hence these mutants are more active than the Ala mutants. In both cases, the double mutation is catalytically dead.

      The authors then perform phylogenetic analysis and ancestral gene reconstruction, and based on this, they argue that HSP90 forms a different class of GHKL ATPases, and lost rather than gained this separate status.

      Strengths:

      The biochemical analysis seems solid.

      Weaknesses:

      (1) A major question that remains is why the mutations have a much more detrimental effect in MutL (100-fold lower kcat/KM) than they do in GyrB (3-fold lower). Can the authors explain this? Doesn't this argue against the proposed catalytic conservation?

      (2) The structure figures all have omit maps for just the AMPPnP and the water, whereas the density for the acidic residues and their mutants is not shown.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Fukui et al. re-examined the ATP hydrolysis mechanism in GHKL ATPases, revealing a cooperative role of two conserved acidic residues rather than one. The authors have used a range of biochemical and structural techniques on various mutants from different members of the GHKL ATPase family to test and validate their proposed mechanism.

      Through a detailed re-analysis of their previously published structure of the aqMutL NTD (ATPase domain) in complex with AMPPCP, they identified Glu29 and Glu32 as interacting with the nucleophilic water for catalysis. The authors carefully dissected the respective roles of these two acidic residues with a series of site-directed mutations. Mutations at Glu29 impaired ATPase activity without affecting protein secondary structure or ATP binding in the case of the E29Q mutant. Moreover, mutations at Glu32 did not affect secondary structure (except for E32G) but reduced ATPase activity. Activity was abolished when both residues (E29Q/E32Q) were mutated.

      The authors extended their study to another GHKL ATPase, aqGyrB. Their findings further supported the cooperative function of the corresponding acidic residues in aqGyrB (Glu48 and Asp51) during ATP hydrolysis. Mutation of these residues partially impaired ATP hydrolysis without affecting protein secondary structure. ATPase activity was completely lost in the double mutant E48Q/D51M. While the E48Q mutant retained the ability to bind ATP, the E48A mutant did not. High-resolution structures of the WT and E48A, E48Q, D51A, and D51N mutants of the aqGyrB NTD demonstrated that nucleophilic water positioning depended on these residues. E48 played a dominant role in water positioning and is critical for stabilising ATP lid formation and associated conformational changes, whereas D51 contributed cooperatively to catalysis.

      The authors investigated the functional impact of mutating the corresponding residues in the human MutL homologs PMS2 and MLH1. Clinical variants consistently exhibited reduced or abolished ATPase activity, providing a potential molecular basis for Lynch syndrome through impaired DNA mismatch repair.

      Lastly, through evolutionary analysis, the authors inferred that the second acidic residue was likely present in the common ancestor of MutL, GyrB, and MORC proteins, but was lost in the case of Hsp90.

      Strengths:

      (1) This study contains a detailed structural and biochemical analysis of a biologically important set of GHKL ATPases. The authors identify a second acidic residue that is conserved and contributes to catalysis in a large subset of GHKL ATPases. An updated and extended mechanistic model of ATP hydrolysis by this class of enzymes is proposed, which involves cooperative and partially overlapping roles for the catalytic residue pair. This revised mechanistic model is invaluable for the interpretation of clinical variants of GHKL ATPases such as PMS2 and MLH1.

      (2) The work described was performed to an excellent and rigorous technical standard. The structural and biochemical data are sound. The evidence supporting the claims is compelling.

      Weaknesses:

      (1) The identification in this study of a second acidic residue contributing to catalysis but not absolutely essential for catalysis is a useful finding. However, given that many structures of GHKL ATPases have been determined with different nucleotide analogs bound and that the essential role of the first acidic residue is well established, the importance and scope of the advances described here remain focused within the field of study of GHKL ATPases.

      (2) The authors assessed the consequences of variants in the human MutL homologs PMS2 and MLH1, but various other human GHKL ATPases contain clinically relevant variants, some of which have stronger disease associations than the mutations examined in this study. A broader analysis of the effect (or likely effect) of disease-linked mutations in GHKL ATPases would have strengthened this study.

      (3) In MLH1, the E37K mutation completely abolishes ATPase activity, but the corresponding mutations in aqMutL, aqGyrB, and PMS2 do not. It remains unclear why E37K in MLH1 leads to complete loss of activity, as the authors propose that water molecule positioning via the first acidic residue, as well as ATP lid stabilisation and associated conformational changes, should still be possible.

      (4) The authors do not examine ATP binding in the E32 mutants of aqMutL NTD and the D51 mutants of aqGyrB, or AMPPNP binding of the MLH1 and PMS2 mutants. Hence, the relative contributions of the acidic residues to ATP binding and hydrolysis remain partially unclear.

      (5) The ATPase assays for PMS2 and MLH1 (Figure 7 and Table 1) were performed with purification/solubility tags still present. Hence, it cannot be ruled out that these tags influence the measured activities.

      (6) The authors suggest that the two-acidic-residue mechanism proposed in this study could be shared among several GHKL ATPase families, yet they also state that the hydrogen-bonding network was not observed in MutL and MORC family proteins. This raises doubt about how conserved the mechanism is, e.g., in MutL and MORC proteins.

    1. Reviewer #1 (Public review):

      Summary:

      By using an established NAFLD model, choline-deficient high-fat diet, Barros et al show that LPS challenge causes excessive IFN-γ production by hepatic NK cells which further induces recruitment and polarization of a PD-L1 positive neutrophil subset leading to massive TNFα production and increased host mortality. Genetic inhibition of IFN-γ or pharmacological blockade of PD-L1 decreases recruitment of these neutrophils and TNFα release, consequently preventing liver damage and decreasing host death.

      Since NAFLD is often accompanied by chronic, low-grade inflammation, it can lead to an overactive but dysfunctional immune response and increase the body's overall susceptibility to infections; this is therefore a very important research question.

      Strengths:

      The biggest strength of the manuscript is the vast number of mouse strains used.

      Weaknesses:

      After the review, there are still some open questions from my side:

      (1) I would like the authors to defend their choice of diet type since this has not been done in the review/response to authors. In case they cannot, we need additional proof (HFD or WD model).

      (2) Since the authors used the same control groups (chow and HFCD), as required by the animal ethics committee, they should have a power analysis showing that the number of controls (and of animals in the other groups) is sufficient to detect the effect. Please provide it.

    2. Reviewer #2 (Public review):

      Summary:

      This is an extremely interesting mouse study, trying to understand how sepsis is tolerated during obesity/NAFLD. The researchers combine a well-established model of NASH (Choline-deficiency with High Fat Diet) with a sepsis model (IP injection of 10 mg/kg LPS), leading to dramatic mortality in mice. Using this model, they characterize the complex contributions of immune cells. Specifically, they find that NK cells and neutrophils contribute the most to mortality in this model due to IFNG and PD-L1+ neutrophils.

      Strengths:

      The biggest strength of the manuscript is how clear the primary phenotypes/endpoints of their model are. Within 6 hours of LPS injection, there is a stark elevation of liver inflammation and damage, which is exacerbated by a High Fat/Choline-Deficient diet (HFCD). After 1 day, almost all of the mice die. Using these endpoints, the authors were able to identify which cells were critical for mortality in the model and the specific mediators involved.

      Comments on revisions:

      I have no further comments.

    1. Reviewer #1 (Public review):

      Summary:

      This study asks whether synapses formed by the same broad neuronal class (excitatory pyramidal neurons, PN) adapt their presynaptic organization in a cortex-specific manner, comparing the prefrontal cortex (PFC) with the primary somatosensory cortex (S1). The authors combine sophisticated electrophysiology (paired recordings and extracellular minimal stimulation), pharmacological perturbations of presynaptic Ca²⁺-secretion coupling, bouton Ca²⁺ imaging, and mechanistic modeling. Across two prominent excitatory connections (Layer 5 (L5) PN-L5PN and L2/3-L5PN), they provide convergent evidence that mature PFC synapses operate with looser Ca²⁺ channel-release sensor coupling than their S1 counterparts.

      Overall, the study provides an appealing mechanistic link between synaptic nano/micro-architecture and cortical-area specialization. The idea that PFC synapses retain a more "plasticity-favoring" presynaptic state, while the primary sensory cortex emphasizes reliability and timing precision, is potentially impactful for how we think about circuit computation and plasticity across cortical hierarchies.

      Strengths:

      A major strength is the multi-pronged experimental strategy. The paper first establishes robust, area-dependent differences in synaptic efficacy, reliability, timing, and short-term plasticity (facilitation prevailing in PFC versus depression in S1), using both paired recordings and minimal extracellular stimulation paradigms. The coupling interpretation is then directly supported by differential sensitivity to EGTA (and appropriate positive-control effects of fast chelators). Finally, volume-averaged calcium signals are reported to be similar across areas, arguing against trivial explanations based on gross differences in calcium influx, and the modeling provides a quantitative framework for interpreting the observed chelator effects.

      Weaknesses:

      Limitations are minor and concern interpretation/clarity rather than core results. Some key inferences rely on indirect readouts (chelator sensitivity, fluctuation analysis-derived parameters, bouton-averaged calcium signals), each of which carries assumptions and potential confounds that should be discussed more explicitly. In particular, the repatching paradigm for the paired-recording EGTA experiment, though very impressive, and the limited number of extracellular calcium conditions used for fluctuation analysis (three concentrations), can influence quantitative estimates and the confidence intervals around them.

    2. Reviewer #2 (Public review):

      Schwarze et al. investigated whether synaptic efficacy is brain-region specific. To this end, they compared synaptic connections established by layer 5 (L5) neocortical pyramidal cells and between L5 and L2/3 pyramidal cells. In order to identify the mechanism of this brain region specificity, the authors employed several experimental approaches, including paired electrophysiological recordings, extracellular stimulation, low- and high-affinity intracellular calcium chelators (EGTA and BAPTA), multiple probability fluctuation analysis (MPFA), and intracellular measurements of calcium transients as well as computational modelling. The findings of the present study indicate that synaptic connections in the primary somatosensory cortex (S1) are significantly stronger and more reliable than those in the prefrontal cortex (PFC).

      The study is timely, and the topic is of significant interest to the neuroscience community. Despite the extensive research that has been carried out on the neuroanatomy and receptor distribution of different brain regions, comparatively little attention has been paid to differences in synaptic physiology. The authors' approach is characterised by its elegance and comprehensive nature, and the conclusions drawn are compelling. Nevertheless, there are a number of unresolved issues.

      Major points:

      (1) The authors state that data from the S1 cortex were obtained in a previous study. In the context of an explicitly comparative study (PFC vs. S1 cortex), it would have been advantageous for the authors to perform a subset of experiments in which both cortices were obtained from a single animal. This is a feasible undertaking, given the spatial separation of the PFC and S1 cortex.

      (2) Figure 1A is somewhat misleading because it could suggest that the authors have performed dual recordings in identified PFC pyramidal cells.

      (3) PFC and S1 cortex in rodents differ markedly in their morphological organisation. For example, in all sensory cortices, layer 4 is very pronounced; however, in the PFC of rodents, no clear layer 4 can be found. On the other hand, PFC shows a clear separation of layers 2 and 3, which is not visible in the S1 cortex. Furthermore, PFC pyramidal cells in layers 2, 3, and 5 exhibit significant heterogeneity, diverging considerably from those found in layers 5a and 5b of S1 cortex. Thus, there is no clear correspondence between L5 pyramidal cells in the PFC and the S1 cortex. In order to achieve a meaningful comparison of the data obtained in PFC and S1 cortex, it is necessary for the authors to determine whether they recorded from similar pyramidal cell populations.

      (3) In addition, PFC pyramidal cells in layers 2, 3 and 5 are highly heterogeneous and differ markedly from those in layers 5a and 5b of S1 cortex. To achieve a meaningful comparison of the data obtained in the PFC and the S1 cortex, the authors need to determine whether they recorded from similar pyramidal cell populations.

      (4) For the S1 cortex in rats, it has been found that synaptic connections between pairs of L5a pyramidal cells and between pairs of L5b pyramidal cells differ markedly with respect to mean EPSP amplitude, latency and coefficient of variation (CV, a surrogate measure for the synaptic release probability) (cf. Markram et al., 1997; Frick et al., 2008). It is therefore likely that PFC and S1 pre- and postsynaptic pyramidal cells are distinct not only morphologically and electrophysiologically but also with respect to their synaptic properties. At the least, the authors need to discuss these confounding issues and preferably address them experimentally. For example, it would be helpful to demonstrate that paired recordings were made from the same pyramidal cell types, perhaps by documenting their morphology and/or firing patterns. In addition, they should discuss the marked difference in EPSP amplitude and putative release probability between their data and the earlier studies.

      (5) For multiple probability fluctuation analysis (MPFA), a parabolic fit with a mere three points is inadequate, particularly because 2 mM and 5 mM Ca2+ are close to the peak of the variance-to-mean parabola, and only 1 mM Ca2+ is on its initial linear part. A more meaningful result would have been obtained with an additional Ca2+ concentration between 1.0 and 2.0 mM, as this is closer to the physiological range. In this context, the authors should have cited the more recent and more detailed papers by the Silver group (Saviane and Silver, 2006; Lanore and Silver, 2016) and not just the Clements and Silver review paper.

      (6) Methods: The authors should clarify whether their paired recordings from L5 pyramidal cells involved whole-cell recordings from both pre- and postsynaptic neurons. From Figure 1B, it appears as if the presynaptic neurons were not recorded in whole-cell mode but rather stimulated in cell-attached mode. This is also reflected in the artefact visible in the current trace recorded in the postsynaptic neuron. The authors should explicitly state their methodological approach and mention how reliable the timing of the presynaptic action potential was under these circumstances. The same holds true for the extracellular stimulation protocol. A significantly more detailed description of the experimental protocol is necessary here.

      (7) Methods: The authors use Student's t-test for data comparison. The authors should verify that the data distribution was indeed normal, e.g. by using a Shapiro-Wilk test. If this is not the case, non-parametric tests should be used.
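To illustrate the concern in point (5), here is a minimal sketch of a variance-mean fit under a simple binomial release model, where variance = q·mean − mean²/N. It is not the authors' analysis pipeline: the function name, the quantal size, the number of release sites, and the release probabilities are all hypothetical values chosen for illustration. With three calcium conditions and two free parameters, the fit has a single residual degree of freedom, so a modest error in one variance estimate can shift q and N noticeably.

```python
def fit_variance_mean(means, variances):
    """Least-squares fit of variance = q*mean - mean**2 / n.

    Returns (q, n): quantal size and number of release sites under a
    simple binomial model with uniform release probability.
    """
    s2 = sum(m ** 2 for m in means)
    s3 = sum(m ** 3 for m in means)
    s4 = sum(m ** 4 for m in means)
    t1 = sum(m * v for m, v in zip(means, variances))
    t2 = sum(m * m * v for m, v in zip(means, variances))
    det = s2 * s4 - s3 * s3
    if det == 0:
        raise ValueError("means do not constrain the parabola")
    # Solve the 2x2 normal equations with Cramer's rule.
    linear = (t1 * s4 - t2 * s3) / det      # coefficient of mean    (= q)
    quadratic = (s2 * t2 - s3 * t1) / det   # coefficient of mean**2 (= -1/n)
    return linear, -1.0 / quadratic


def binomial_point(n, p, q):
    """Mean and variance of the response for n sites, release probability p, quantal size q."""
    return n * p * q, n * p * (1 - p) * q * q


# Hypothetical synapse: 5 release sites, quantal size 20 (arbitrary units).
TRUE_N, TRUE_Q = 5, 20.0
# Hypothetical release probabilities at three extracellular Ca2+ concentrations.
release_probs = [0.2, 0.6, 0.8]

points = [binomial_point(TRUE_N, p, TRUE_Q) for p in release_probs]
means = [m for m, _ in points]
variances = [v for _, v in points]

# Noise-free data recover the true parameters.
q_hat, n_hat = fit_variance_mean(means, variances)

# Inflate the variance at the highest-Ca2+ point by 10% and refit.
noisy = variances[:2] + [variances[2] * 1.1]
q_noisy, n_noisy = fit_variance_mean(means, noisy)

print("noise-free fit:", q_hat, n_hat)
print("one point perturbed by 10%:", q_noisy, n_noisy)
```

Adding a fourth condition on the rising limb of the parabola, as the reviewer suggests, would add a degree of freedom and constrain the linear coefficient (q) more directly.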

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Max Schwarze and colleagues examined the coupling distance between presynaptic Ca²⁺ channels and the vesicular release sensor at neocortical synapses in mice. They propose that Ca²⁺ channel-release sensor coupling differs across cortical areas, with relatively loose (microdomain) coupling in prefrontal cortex (PFC) and tighter (nanodomain) coupling in primary somatosensory cortex (S1) for comparable pyramidal-neuron synapse types. To test this, they combine paired recordings and minimal stimulation with chelator manipulations (EGTA/BAPTA), mean-variance/MPFA-style analyses, presynaptic Ca²⁺ imaging, and computational modeling. They conclude that presynaptic coupling organization is area-specific in the mature cortex and contributes to regional differences in synaptic timing, reliability, and short-term plasticity.

      Strengths:

      This study tackles an important question and is strengthened by a cohesive body of evidence assembled from multiple complementary approaches. A major asset is the inclusion of high-value datasets, particularly the paired recordings between L5 pyramidal neurons and the systematic assessment of EGTA sensitivity, which provide a solid functional foundation for the authors' central claims. The work is further distinguished by its genuinely multimodal design: combining electrophysiology with presynaptic calcium imaging (and integrating these observations with quantitative analyses and modeling) offers a more mechanistic view of neurotransmitter release than any single method could provide. Overall, the direct, within-framework comparison of presynaptic release-control mechanisms across cortical areas for comparable synapse types is compelling and gives the conclusions a level of robustness and interpretability that is often difficult to achieve in studies of cortical synaptic diversity.

      Weaknesses:

      Several aspects would benefit from clearer explanation, stronger integration with the existing literature, and a more explicit discussion of limitations and potential confounds. Without these additions, some conclusions remain speculative. Throughout the manuscript, the authors also often imply that different measurements reflect the same underlying synapse population. This is unlikely to be strictly true across all experiments and makes it difficult to integrate results from the various approaches into a single, unified set of functional synaptic properties. In addition, some statements, particularly those linking coupling mode to "higher-order neocortical functions", appear broader than what is directly supported by the experiments and should be tempered or more precisely scoped.

      Below, I list several topics that could help better frame the main findings of the present study and clarify how it relates to previously published work.

      (1) The authors use EGTA sensitivity of EPSCs (together with additional metrics) to argue that S1 and PFC synapses differ in Ca²⁺ channel-release sensor coupling. While this is a plausible interpretation, EGTA effects are not uniquely determined by coupling distance and can also reflect differences in Ca²⁺ entry kinetics, action potential waveform, endogenous buffering/extrusion, or release-sensor/vesicle state. The authors use a constrained modeling approach, but the rationale for the different constraint sets is not fully clear from the current description. It would be helpful to expand and clarify the Methods section to explain how these constraints were defined, justified, and applied (and how alternative constraint choices would affect the results). In this context, the Abstract's broader claim that the study "reveals microdomain coupling as a presynaptic structure-function correlate of higher-order neocortical functions" appears overstated. Given the well-known diversity of cortical synapses even within a single region (e.g., synapses onto different interneuron subclasses or different PN cell types, extracortical sources like thalamus), the authors should clarify the intended scope: is the conclusion meant to apply broadly across synapse classes in S1 and PFC, or only to the specific connection type(s) examined here?

      (2) The chelator logic is sound in principle, but the Discussion should more explicitly acknowledge standard caveats and alternative explanations. The authors partly address this by including presynaptic Ca²⁺ imaging and modeling, yet it would help to explain more clearly how the combination of (i) chelator sensitivity, (ii) presynaptic Ca²⁺ signals, and (iii) model constraints rules out (or substantially reduces the likelihood of) changes in AP waveform, Ca²⁺ influx kinetics, buffering/extrusion, or sensor/vesicle state as the primary drivers. In addition, recent hypotheses emphasizing vesicle priming and/or release-site occupancy as contributors to apparent EGTA sensitivity should be discussed as a complementary or alternative interpretation.

      (3) A substantial portion of the S1 comparison appears to rely on previously published datasets. This should be made unambiguous in the Results and Methods, and it would be helpful to summarize this clearly (e.g., in a table indicating which figures/analyses use new data versus reanalysis of published data). If this information is already present, it should be highlighted more prominently.

      (4) The modeling is informative, but the choice of a specific VGCC-release-site geometry and channel arrangement is not sufficiently justified. The manuscript adopts a particular spatial configuration, yet the rationale for selecting this geometry, rather than other plausible architectures discussed in the literature, is not clearly explained, nor is it meaningfully revisited in the Discussion. The authors should justify why the same organization is assumed across two distinct cortical areas and, ideally, include (or at a minimum discuss) a sensitivity analysis showing how key inferences (e.g., coupling distance and channel number) depend on the assumed geometry.

      (5) The calcium imaging data are valuable, but given the diversity of synapses within each cortical layer, it is not clear that imaged boutons can be confidently assigned to the specific connection types being interrogated electrophysiologically. A substantial fraction of boutons likely corresponds to different postsynaptic targets (including interneurons and distinct pyramidal-cell classes), and this heterogeneity could complicate interpretation. This limitation should be discussed explicitly.

      (6) In unitary connections, the authors assess EGTA effects alongside other functional parameters (strength, delay, short-term plasticity), which is a major strength. However, for L2/3 to L5 connections, it appears that EGTA sensitivity was tested primarily using extracellular stimulation. Given anatomical and circuit differences between PFC and S1, extracellular stimulation may recruit different synapse populations across regions, potentially confounding regional comparisons of EGTA sensitivity. This limitation should be acknowledged explicitly. While I am not requesting technically demanding L2/3↔L5 paired recordings in S1, the possibility that different synapse identities are being sampled should be treated as a meaningful source of uncertainty. The Discussion would also benefit from placing the magnitude of EGTA effects in the context of prior "loose coupling" literature, where comparatively large EGTA effects have been reported in some systems. In addition, the reported difference between adult PFC EGTA effects and S1 inhibition appears small (on the order of <10%) and should be interpreted cautiously, especially given that PFC and S1 mature on different timelines and P21-P26 is unlikely to reflect a mature PFC circuit state. The adult cohort (P90-P100) is therefore important, but the age mismatch complicates PFC-S1 comparisons; ideally, S1 should be assessed at matched ages, or this limitation should be discussed explicitly. Finally, for statistical robustness, in panel D of Figure 2, were the comparisons corrected for multiple testing to control Type I error?

      (7) Alterations in initial release probability are often associated with changes in short-term plasticity. In the present manuscript, the authors report similar initial release probability at PFC and S1 synapses, yet observe differences in short-term plasticity profiles. The mechanistic basis for this apparent dissociation is not addressed and should be discussed explicitly, including potential explanations.

      (8) There are multiple instances where the text appears to cite non-existent or misnumbered figure panels (e.g., references to "Figure 4G-I / 4J" when the relevant material appears elsewhere). These should be corrected throughout, as they currently reduce readability and confidence.

      (9) The Methods describe P21-P26 animals, whereas the Results include older cohorts (e.g., P90-P100) and additional regions (e.g., mPFC). The Methods should be updated so that all cohorts and regions analyzed in the Results are fully described.
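The question at the end of point (6), whether the comparisons in Figure 2D were corrected for multiple testing, can be made concrete with a short sketch of the Holm-Bonferroni step-down adjustment. This is one common way to control the family-wise Type I error rate and is not necessarily the correction the authors used or should use. The p-values below are hypothetical and do not come from the manuscript.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank k-1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four pairwise comparisons.
raw_p = [0.010, 0.040, 0.030, 0.005]
adjusted_p = holm_adjust(raw_p)

ALPHA = 0.05
significant = [p <= ALPHA for p in adjusted_p]

for raw, adj, sig in zip(raw_p, adjusted_p, significant):
    print(f"raw={raw:.3f}  adjusted={adj:.3f}  significant={sig}")
```

In this made-up example, all four raw p-values fall below 0.05, but only two comparisons remain significant after adjustment, which is why the reviewer's question matters when several comparisons share one panel.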

    1. Reviewer #1 (Public review):

      Very nice and coherent body of work with appropriate in vitro to in vivo transition in methods.

      Lovely and easy to follow figures that can be understood even without the manuscript.

      My recommendation is that a sentence or two be added clearly stating the authors think nafamostat is off the table and suggest other approaches/drugs that might be considered instead of just making a general statement. I think all this can be done in a few sentences.

      Gabexate was administered to a snakebite victim in the following case report from about 20 years ago, which is also a good example of the now better recognized threat to pregnancy.

      Nasu K, Ueda T, Miyakawa I. Intrauterine fetal death caused by pit viper venom poisoning in early pregnancy. Gynecol Obstet Invest. 2004;57(2):114-6. doi: 10.1159/000075676. Epub 2003 Dec 19. PMID: 14691344

    2. Reviewer #2 (Public review):

      Summary:

      The authors set out to test whether a defined set of small molecules can lessen damaging effects caused by venoms from several Bothrops species, and whether these effects are consistent enough to suggest a broadly applicable approach. They present a cross-venom dataset spanning in-vitro activity readouts and blood-based functional outcomes, and include a chicken embryo model to explore whether venom inhibition can translate into improved survival. The central message is that certain small molecules can reduce specific venom-driven effects across multiple samples, providing a comparative resource for the field and a basis for prioritizing future validation.

      Strengths:

      The main value of this work is the breadth and structure of the dataset, which places multiple venoms and multiple readouts into a single, comparable framework that should be useful for readers evaluating patterns across samples. The experimental flow is generally coherent, moving from activity measurements to functional outcomes and then to an in-vivo test, which helps the reader understand how the authors link mechanism-oriented assays to more integrated endpoints. The manuscript also provides practical information for the community by highlighting which readouts appear most consistently affected across venoms, which can help guide hypothesis generation and study design in follow-up work.

      Comments on revisions:

      I would like to thank the authors for answering my questions. The manuscript has gained in quality, knowing the limitations that are now better stated in the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents a new Bayesian approach to estimate importation probabilities of malaria, combining epidemiological data, travel history, and genetic data through pairwise IBD estimates. Importation is an important factor challenging malaria elimination, especially in low-transmission settings. This paper focuses on Magude and Matutuine, two districts in southern Mozambique with very low malaria transmission. The results show isolation-by-distance in Mozambique, with genetic relatedness decreasing at distances larger than 100 km, no spatial correlation at distances between 10 and 100 km, and strong spatial correlation again at distances smaller than 10 km. They report high genetic relatedness between Matutuine and Inhambane, higher than between Matutuine and Magude. Inhambane is the main source of importation in Matutuine, accounting for 63.5% of imported cases. Magude, on the other hand, shows lower importation and travel rates than Matutuine, as it is a rural area with less mobility. Additionally, they report higher levels of importation and travel in the dry season, when transmission is lower. No association with importation was found for occupation, sex and other factors. These data have practical implications for public health strategies aimed at malaria elimination, for example, testing and treating travelers from Matutuine in the dry season.

      Strengths:

      The strength of this study lies in the combination of different sources of data (epidemiological, travel and genetic) to estimate importation probabilities, and in the statistical analyses.

      Weaknesses:

      The authors recognize the limitations related to sample size and the biases of travel reports.

    2. Reviewer #2 (Public review):

      Summary:

      Based on a detailed dataset, the authors present a novel Bayesian approach to classify malaria cases as either imported or locally acquired.

      Strengths:

      The proposed Bayesian approach for case classification is simple, well justified, and allows the integration of parasite genomics, travel history, and epidemiological data.
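As a rough illustration of what such a classification involves, the sketch below applies Bayes' rule to a single case. It is not the authors' model: the function name, the prior, and the two likelihood values are hypothetical, and the real method presumably derives these quantities from travel reports, epidemiological data, and pairwise IBD estimates in ways not described in this review.

```python
def posterior_imported(prior_imported, lik_imported, lik_local):
    """Posterior probability that a case is imported, by Bayes' rule.

    prior_imported: prior probability of importation (e.g. informed by travel history).
    lik_imported:   likelihood of the observed genetic data if the case is imported.
    lik_local:      likelihood of the observed genetic data if the case is local.
    """
    if not 0.0 <= prior_imported <= 1.0:
        raise ValueError("prior must lie in [0, 1]")
    numerator = prior_imported * lik_imported
    evidence = numerator + (1.0 - prior_imported) * lik_local
    if evidence == 0.0:
        raise ValueError("observed data have zero probability under both hypotheses")
    return numerator / evidence


# Hypothetical case: a 20% prior of importation, and genetic data that are
# six times more likely under importation than under local transmission.
p_imported = posterior_imported(prior_imported=0.2, lik_imported=0.6, lik_local=0.1)
print(f"posterior probability of importation: {p_imported:.2f}")
```

The posterior is a per-case probability rather than a hard label, so any downstream summary (such as the share of imported cases per district) depends on how these probabilities are thresholded or aggregated.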

      Weakness:

      While the authors aim to classify cases as imported or locally acquired, the method does not quantify the contribution of each case type to overall transmission, which the authors leave for future study.

    3. Reviewer #3 (Public review):

      This work provides a novel statistical model to identify imported malaria cases, which are an important challenge for elimination, particularly in low-transmission areas. This tool was applied in Plasmodium falciparum populations in Mozambique and determined differences in importation rates in two low-transmission districts in the South.

      Strengths:

      The study has several strengths, particularly the development of a novel Bayesian model integrating genomic, epidemiological, and travel data to estimate importation probabilities. The findings provided important insights into malaria transmission dynamics, including the identification of importation sources and regional differences in importation rates across Mozambique. These results highlight the potential value of targeted interventions among traveler populations to support malaria elimination efforts. Moreover, this approach could be adapted to other epidemiological settings.

      Weaknesses:

      The study has some limitations, including uneven sample representation across provinces, incomplete metadata for risk factor analysis, and reliance on a proxy for transmission intensity. Future work will include a new sample collection effort and the incorporation of monthly malaria incidence estimates.

    1. Reviewer #1 (Public review):

      Summary:

      Sidarta-Oliveira et al. present TopOMetry, a novel dimensionality reduction method based on the eigendecomposition of an approximated Laplace-Beltrami operator. In short, TopOMetry is an iterative version of existing spectral methods (e.g., Laplacian Eigenmaps or diffusion maps). It approximates the Laplacian operator twice, once in a "phenotypic space" and then again in the space spanned by the eigenbasis. By doing this, the approximated operator will contain more information about the manifold, which allows for more robust and accurate downstream analyses.

      Strengths:

      - Introduces operator-native fidelity scores and Riemannian diagnostics to single-cell analysis, enabling researchers to evaluate and trust embeddings, a functionality absent in prior methods.

      - The approach was rigorously tested on synthetic and real single-cell RNA-seq datasets.

      - The package is well-made and easily scalable to millions of cells.

      - The comprehensive documentation helps end-users run the desired analyses.

      Weaknesses:

      - The method is an extension of the current state-of-the-art methods, not a fundamentally new one.

      Comments on revised version:

      The revised manuscript partially addresses the concerns raised in the prior review. The jargon weakness has been substantially mitigated by relocating mathematical derivations to the Methods section and simplifying language in the main text; this weakness has been updated accordingly.

      The introduction of operator-native fidelity scores and Riemannian diagnostics represents a meaningful addition and has been added to the Strengths. The benchmarking scope has also been notably expanded.

      The core weakness - that the method is an extension of existing spectral methods rather than a fundamentally new contribution - remains unchanged, as the authors' rebuttal did not provide a sufficiently precise mathematical argument to overturn it.

    2. Reviewer #2 (Public review):

      Summary:

      This work introduces a novel framework to systematically learn the latent dimensions of single-cell data, grounded in the theory of the Riemannian manifold. The authors demonstrate how this framework can be applied to various important tasks, such as estimating intrinsic dimensionalities, annotating cell types, etc. They did a great job of tackling an important but not yet established problem in the field and approaching it with a theoretically sound and novel approach. I think after a more rigorous and comprehensive validation, this work could be impactful.

      Strengths:

      - Dimensionality reduction is a routine step in analyzing many high-dimensional data, such as molecular data. While the downstream analysis results depend heavily on this step, existing methods rely on strong assumptions and are sometimes heuristic. The authors present a novel, theoretically grounded approach to address this important problem.

      - The authors demonstrated its usability in downstream analysis in a comprehensive manner. Especially, they show evidence suggesting novel T-cell subpopulations.

      - I commend the authors for releasing and maintaining their software well with comprehensive documentation. This significantly increases the usability and accessibility of the method.

      Weaknesses:

      - The paper lacks experiments that validate the results. It would be beneficial to see additional evaluation settings with better-established ground truths to more strongly demonstrate the method's effectiveness.

      - Batch effects are prevalent in single-cell data. The paper does not adequately address how the proposed method handles this issue.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Ding et al. use genetic mouse models to demonstrate that atrial trabeculation is more dependent on Tie1/Tie2 signaling than ventricular trabeculation. With additional experimentation that would support the current claims, the results may hold significant value, as atrial trabeculation remains an understudied phenomenon in cardiac biology with potential implications for atrial cardiomyopathy and atrial fibrillation.

      Strengths:

      Detailed characterization of atrial versus ventricular trabeculation across different developmental timepoints, and the use of appropriate animal models to address the scientific question at hand.

      Weaknesses:

      The authors have consistently treated mice with tamoxifen after ventricular, but not atrial, trabeculation has already started. As such, the observed cardiac phenotypes - where predominantly atrial trabeculation is affected - might be a mere consequence of the precise time window in which Tie1/2 signaling was impaired, rather than a direct measurement of its relative importance for atrial versus ventricular trabeculation. The conclusions of the paper may thus be significantly strengthened by depleting Tie1/2 signaling prior to the onset of ventricular trabeculation, as is done for atrial trabeculation.

    2. Reviewer #2 (Public review):

      Summary:

      Ding et al. examine the role of TIE1 in cardiac chamber morphogenesis using genetic mouse models targeting Tie1, Tek, or both, and analyzing endocardial cell-mediated chamber formation across multiple embryonic developmental and postnatal stages, supported by analysis of published single-cell datasets and new bulk RNA seq analyses of murine cardiac tissue. The authors find that Tie1 and Tek expression is higher in atrial than ventricular endocardial cells. Notably, endothelial Tie1 is required for atrial trabeculation at E12.5, but is less critical in ventricular trabeculation. TIE1 also acts synergistically with TIE2 during atrial trabeculation. While Tie1 deficiency alone does not cause defects at E10.5, combined heterozygous deletion of Tek disrupts both atrial and ventricular development at E10.5. This synergy is further supported by analyses at later embryonic stages and in postnatal hearts.

      Strengths:

      The study is well-designed, clearly written, and supported by high-quality figures. The performed experiments demonstrate a previously unrecognized role for Tie1 in cardiac development and identify synergistic control of cardiac morphogenesis by Tie1 and Tie2. This synergy is consistent with the previously identified roles of Tie1 and Tek in venous development and with Tie1 involvement in angiopoietin-dependent postnatal vascular and lymphatic remodeling. Together, these findings support a role for Tie1 as a contributor to Ang1-Tie2 signaling during heart development.

      Weaknesses:

      The manuscript does not include direct mechanistic studies; however, RNA seq analysis of atria and ventricles showed reduced expression of Tek, Dll1, and Notch1 upon Tie1 deficiency in developing hearts. Although previously reported mechanisms, such as TIE1-TIE2 heterodimer formation and effects on endothelial junctions, migration, or survival are discussed, no direct mechanistic experiments are performed. Addressing some of these mechanisms would have clarified the basis of Tie1-Tie2 synergy. As two distinct Tie1 models are used, including one targeting the kinase domain, the authors should state whether phenotypes differed or were similar between models.

    3. Reviewer #3 (Public review):

      Summary:

      Ding et al. investigate the roles of TIE1 and TEK (Tie2) in mouse cardiac development, with a particular focus on atrial trabeculation. The authors employ multiple genetic models, including Tie1ICDflox/flox (with Cdh5-CreERT2), a knockout-first allele (EUCOMM, Tie1 tm1a/tm1a), and a Tek deletion model.

      Based on the dataset from Feng et al. 2022 Nat Commun, the authors report increased expression of Tie1 and Tek transcripts in atrial endocardial cells compared to ventricular cells at embryonic day (E) 14.5. Loss of Tie1 leads to early atrial trabeculation defects detectable at E12.5, whereas ventricular defects appear later and are less pronounced at E14.5. Chamber-specific RNA sequencing reveals stronger transcriptional changes in atrial tissue.

      Conditional deletion of Tek results in a similar phenotype, with more pronounced atrial defects. Combined deletion of Tie1 and Tek (Tie1 ΔICD/ΔICD; Tek+/-) leads to earlier and more severe defects in both atrial and ventricular trabeculation and results in embryonic lethality around E12.5, suggesting a synergistic interaction between the two genes.

      Conditional endothelial deletion of Tie1 combined with heterozygous global Tek deletion allows analysis at later embryonic time points and again shows more severe defects in atrial trabeculation. Postnatal analysis of this model reveals reduced heart-to-body weight ratios and potential mild atrial abnormalities.

      Strengths:

      (1) The authors address chamber-specific signaling mechanisms underlying atrial versus ventricular trabeculation, an area of high developmental and clinical relevance.

      (2) The study provides a comprehensive temporal analysis across multiple embryonic stages.

      (3) The use of multiple genetic models strengthens the overall conclusions and allows comparative interpretation.

      (4) While focusing on trabeculation, the authors also include observations on coronary vessel development, increasing the broader relevance of the work. The findings are therefore of interest to the wider cardiovascular research community.

      Weaknesses:

      (1) Timing of recombination vs. trabeculation onset

      Ventricular trabeculation begins earlier than atrial trabeculation. Since tamoxifen (in contrast to 4-hydroxytamoxifen) requires metabolic activation, Cre-mediated recombination will occur with a delay. This suggests that atrial trabeculation may be targeted before its onset, whereas ventricular trabeculation may already be underway for 2-3 days at the time of effective gene deletion.

      How do the authors account for this discrepancy in their interpretation?

      Have earlier induction time points been tested to better capture the onset of ventricular trabeculation? This limitation should be explicitly discussed.

      (2) Clarity of genetic models and experimental design

      The study employs several genetic constructs. It would improve clarity if, for each experiment, the specific genetic model and tamoxifen regimen were clearly described before presenting the results.

      (3) Tie1 tm1a/tm1a phenotype vs. known global knockout

      Previous studies (PMID: 8846781, 7596437) show that complete Tie1 loss leads to severe edema, vascular rupture, and embryonic lethality around E13.5-E14.5.

      How does the Tie1 tm1a/tm1a allele differ, given that animals appear to survive longer? Is this allele hypomorphic rather than a full knockout?

      This point requires clarification.

      (4) Limited mechanistic insight

      While the authors aim to investigate underlying mechanisms, the current study is largely descriptive and based on mRNA expression and genetic interaction analyses (Tie1/Tek co-deletion). Direct mechanistic insights into signaling pathways remain limited. However, the dataset provides a valuable foundation for future mechanistic studies, which should be more clearly acknowledged in the discussion.

    1. Reviewer #1 (Public review):

      Summary:

      In this review paper, the authors describe the concept of neural correlates of consciousness (NCC) and explain how noninvasive neuroimaging methods fall short of being able to properly characterise an unconfounded NCC. They argue that intracranial research is a means to address this gap and provide a review of many intracranial neuroimaging studies that have sought to answer questions regarding the neural basis of perceptual consciousness.

      Strengths:

      The authors have provided an in-depth, timely, and scholarly contribution to the study of NCCs. First and foremost, the review surveys a vast array of literature. The authors synthesise findings such that a coherent narrative of what invasive electrophysiology studies have revealed about the neural basis of consciousness can be easily grasped by the reader. The authors also succeed in describing how single-cell recordings can interface with task-design to help mitigate the impact of confounded neural activity when searching for NCCs.

      The review is also, to the best of my knowledge, the first review to specifically target intracranial approaches to consciousness and to describe their results in a single article. This is a credit to the authors - as it becomes ever harder to apply strict tests to theories of consciousness using methods such as fMRI and M/EEG, it is important to have informative resources describing the results of human intracranial research so that theorists will have to constrain their theories further in accordance with such data. Additionally, the authors provide a compelling case for single-celled research in consciousness science, despite the dominance of theories situated at the system and circuit level of analysis. As far as the authors were aiming to provide a complete and coherent overview of intracranial approaches to the study of NCCs, I believe they have achieved their aim.

      Weaknesses:

      Overall, I feel positive about this paper. The authors have addressed my comments from my previous review and I see no significant weaknesses in the current version.

      Comment on revised version:

      No comments - congratulations to the authors!

    2. Reviewer #2 (Public review):

      Summary:

      In this work, the authors review the study of the neural correlates of consciousness (NCCs). They discuss several of the difficulties that researchers must face when studying NCCs, and argue that several of these difficulties can be alleviated by using intracranial recordings in humans.

      They describe what constitutes an NCC, and the difficulty of distinguishing an NCC proper from the prerequisites and consequences of conscious processing.

      They also describe the two main types of experimental designs used to study NCCs. These are the contrastive approach (with its report and non-report variants), and the supraliminal approach, each with their own merits and pitfalls.

      They discuss the limitations of non-invasive methods, such as fMRI, EEG and MEG, as well as the limitations of the use of invasive recordings in non-human animals.

      After setting the stage in this way, the authors provide an extensive review of the knowledge acquired by using invasive recordings in humans. This included population-level measurements in vision and in other sensory modalities, as well as single-neuron level studies. The authors also discuss studies of subcortical NCCs.

      The second half of this work discusses the theoretical insights gained through the use of intracranial recordings, as well as their limitations, and a perspective for future work.

      Strengths:

      This work offers an impressive review, which will serve as a useful reference document, both for newcomers to the study of NCC and for experienced researchers. The inclusion of non-visual and subcortical NCCs is of particular merit, as these have been understudied.

      Besides serving as a review, this work includes a perspective, exploring several directions to pursue for the progress of the field.

      Weaknesses:

      No major weaknesses.

      Appraisal of whether the authors achieved their aims:

      In this work, the authors have gathered an impressive review, and have discussed several important problems in the field of study of NCCs, as well as provided a perspective on how the field could move forward.

      Discussion of the likely impact of the work on the field:

      This work has the potential of becoming a must-read for anyone working in the field of consciousness research.

      Comment on revised version:

      The authors have addressed all my concerns. Once again, my compliments for a nice piece of work.

    3. Reviewer #3 (Public review):

      Summary:

      This narrative review provides a clear, well-structured, and comprehensive synthesis of intracerebral recording work on the neural correlates of consciousness. It is written in an accessible manner that will be useful to a broad community of researchers, from those new to iEEG to specialists in the field.

      Strengths:

      The manuscript successfully integrates methodological and theoretical perspectives and offers a balanced overview of current, sometimes contradicting evidence. As such, the manuscript is important as it calls for a concerted and better exploration of NCCs using iEEG in the future.

      Weaknesses:

      The manuscript extensively discusses the use of "report" as a criterion for identifying conscious perception and its limitations for separating between correlates of consciousness and post-consciousness processes, yet the term is not defined at the outset. The authors should specify what they mean by "report" (e.g., verbal report, nonverbal self-report, or any meta-cognitive indication of experience). Importantly, this definition should be explicitly linked to the theoretical landscape: whether the authors adopt an access-consciousness perspective in which (self) reportability is central, or whether the review also aims to address phenomenal consciousness. Making this conceptual grounding explicit at the beginning will help readers interpret the empirical work surveyed throughout the review.

      In addition, the review would benefit from an earlier introduction of the distinction between states and contents of consciousness. This distinction becomes important in the later section on anesthesia, sleep, and epileptic seizures, where the focus shifts from content-specific NCCs to alterations in global states. Presenting these definitions upfront, and briefly explaining how states and contents interact, would strengthen the coherence of the manuscript.

      Overall, this is an excellent and timely review. With clearer initial theoretical definitions of consciousness, the manuscript will offer an even stronger conceptual framework for interpreting intracerebral studies of consciousness.

      Comments on revised version:

      The current version of the manuscript is clear and complete. Kudos to the authors for their thorough revisions.

      My only remaining point concerns the definition of "report": "We define a report as any explicit behavioral response (whether verbal, manual, or otherwise) that communicates a participant's subjective state."

      It would be helpful to clarify whether this definition is intended to exclude purely internal, explicit self-reports that are not externally expressed. As currently formulated, the definition appears to require overt behavioral communication. However, this raises a conceptual issue in relation to the no-report paradigm literature, where the distinction between report, metacognitive access, and overt motor/verbal expression is precisely at stake.

      Could the authors specify whether "report" is meant to (i) be restricted to externally observable, behaviorally expressed reports, or (ii) extend to internally generated, explicit metacognitive judgments even when they are not communicated? Clarifying this point would help situate the manuscript more precisely within ongoing debates on the role of report in identifying neural correlates of consciousness.

    4. Reviewer #1 (Public review):

      Summary

      In this review paper, the authors describe the concept of neural correlates of consciousness (NCC) and explain how noninvasive neuroimaging methods fall short of being able to properly characterise an unconfounded NCC. They argue that intracranial research is a means to address this gap and provide a review of many intracranial neuroimaging studies that have sought to answer questions regarding the neural basis of perceptual consciousness.

      Strengths

      The authors have provided an in-depth, timely, and scholarly contribution to the study of NCCs. First and foremost, the review surveys a vast array of literature. The authors synthesise findings such that a coherent narrative of what invasive electrophysiology studies have revealed about the neural basis of consciousness can be easily grasped by the reader. The review is also, to the best of my knowledge, the first review to specifically target intracranial approaches to consciousness and to describe their results in a single article. This is a credit to the authors: as it becomes ever harder to apply strict tests to theories of consciousness using methods such as fMRI and M/EEG, it is important to have informative resources describing the results of human intracranial research so that theorists will have to constrain their theories further in accordance with such data. As far as the authors were aiming to provide a complete and coherent overview of intracranial approaches to the study of NCCs, I believe they have achieved their aim.

      Weaknesses

      Overall, I feel positive about this paper. However, there are a couple of aspects to the manuscript that I think could be improved.

      (1) Distinguishing NCCs from their prerequisites or consequences

      This section in the introduction was particularly confusing to me. Namely, in this section, the authors' aim is to explain how intracranial recordings can help distinguish 'pure' NCCs from their antecedents and consequences. However, the authors almost exclusively describe different tasks (e.g., no-report tasks) that have been used to help solve this problem, rather than elaborating on how intracranial recordings may resolve this issue. The authors claim that no-report designs rely on null findings, and invasive recordings can be more sensitive to smaller effects, which can help in such cases. However, this motivation pertains to the previous sub-section (limits of noninvasive methods), since it is primarily concerned with the lack of temporal and spatial resolution of fMRI and M/EEG. It is not, in and of itself, a means to distinguish NCCs from their confounds.

      As such, in its current formulation, I do not find the argument that intracranial recordings are better suited to identifying pure NCCs (i.e. separating them from pre- or post-processing) convincing. To me, this is a problem solved through novel paradigms and better-developed theories. As it stands, the paper justifies my position by highlighting task developments that help to distinguish NCCs from prerequisites and consequences, rather than giving a novel argument as to why intracranial recordings outperform noninvasive methods beyond the reasons they explained in the previous section. Again, this position is justified when, from lines 505-506, the authors describe how none of the reported single-cell studies were able to dissociate NCCs from post-perceptual processing. As such, it seems as if, even with intracranial recording, NCCs and their confounds cannot be disentangled without appropriate tasks.

      The section 'Towards Better Behavioural Paradigms' is a clear attempt to address these issues and, as such, I am sure the authors share the same concerns as I am raising. Still, I remain unconvinced that the distinguishing of NCCs from pre-/post- processing is a fair motivation for using intracranial over noninvasive measures.

      (2) Drawing misleading conclusions from certain studies

      There are passages of the manuscript where the authors draw conclusions from studies that are not necessarily warranted by the studies they cite. For instance:

      Lines 265 - 271: "The results of these two studies revealed a complex pattern: on the one hand, HGA in the lateral occipitotemporal cortex and the ventral visual cortex correlated with stimulus strength. On the other hand, it also correlated with another factor that does not appear to play a role in visibility (repetition suppression), and did not correlate with a non-sensory factor that affects visibility reports (prior exposure). These results suggest that activity in occipitotemporal cortex regions reflecting higher-order visual processing may be a precursor to the NCC but not an NCC proper."

      It's possible to imagine a theory that would predict HGA could correlate with stimulus strength and repetition suppression, or that it would not correlate with prior exposure (e.g. prior exposure could impact response bias without affecting subjective visibility itself). The authors describe this exact ambiguity in interpretation later in the article (line 664), but in its current form, at least in line 270 (when the study is most extensively discussed), the manuscript heavily implies that HGA is not an NCC proper. This generates a false impression that intracranial recordings have conclusively determined that occipitotemporal HGA is not a pure NCC, which is certainly a premature conclusion.

      Line 243: "Altogether, these early human intracranial studies indicate that early-latency visual processing steps, reflected in broadband and low gamma activity, occur irrespective of whether a stimulus is consciously perceived or not. They also identified a candidate NCC: later (>200 ms) activity in the occipitotemporal region responsible for higher-order visual processing."

      The authors claim in this section that later (>200ms) activity in occipitotemporal regions may be a candidate for an NCC. However, the Fisch et al. (2009) study they describe in support of this conclusion found that early (~150ms) activity could dissociate conscious and unconscious processing. This would suggest that it is early processing that lays claim to perceptual consciousness. The authors explicitly describe the Fisch et al. results as showing evidence for early markers of consciousness (line 240: '...exhibited an early...response following recognized vs unrecognised stimuli.'). Yet only a few lines later they use this to support the conclusion that a candidate NCC is 'later (>200ms) activity in the occipitotemporal region' (line 245). As such, I am not sure what conclusion the authors want me to make from these studies.

      This problem is repeated in lines 386-387: "Altogether, studies that investigated the cortical correlates of visual consciousness point to a role of neural responses starting ~250 ms after stimulus onset in the non-primary visual cortex and prefrontal cortex."

      This seems to be directly in conflict with the Fisch et al. results, which show that correlates of consciousness can begin ~100ms earlier than the authors state in this passage.

      (3) Justifying single-neuron cortical correlates of consciousness

      The purpose of the present manuscript is to highlight why and how intracortical measures of neural activity can help reveal the neural correlates of perceptual consciousness. As such, in the section 'Single-neuron cortical correlates of perceptual consciousness', I think the paper is lacking an argument as to why single-neuron research is useful when searching for the NCC. Most theories of consciousness are based around circuit or system-level analyses (e.g., global ignition, recurrent feedback, prefrontal indexing, etc.) and usually do not make predictions about single cells. Without any elaboration or argument as to why single-cell research is necessary for a science of consciousness, the research described in this section, although excellent and valuable in its own right, seems out of place in the broader discussion of NCCs. A particularly strong interpretation here could be that intracranial recordings mislead researchers into studying single cells simply because it is the finest level of analysis, rather than because it offers helpful insight into the NCCs.

      (4) No mention of combined fMRI-EEG research

      A minor point, but I was surprised that the authors did not mention any combined fMRI-EEG research when they were discussing the limits of noninvasive recordings. Intracortical recordings are one way to surpass the spatial and temporal resolution limits of M/EEG and fMRI respectively, but studies that combine fMRI and EEG are also an alternative means to solve this problem: by combining the spatial resolution of fMRI with the temporal resolution of EEG, researchers can - in theory - compare when and where certain activity patterns (be they univariate ERPs or multivariate patterns) arise. The authors do cite one paper (Dellert et al., 2021 JNeuro) that used this kind of setup, but they discuss it only with respect to the task and ignore the recording method. The argument for using intracranial recordings is weaker for not mentioning a viable, noninvasive alternative that resolves the same issues.

    5. Reviewer #2 (Public review):

      Summary:

      In this work, the authors review the study of the neural correlates of consciousness (NCCs). They discuss several of the difficulties that researchers must face when studying NCCs, and argue that several of these difficulties can be alleviated by using intracranial recordings in humans.

      They describe what constitutes an NCC, and the difficulty of distinguishing an NCC proper from the prerequisites and consequences of conscious processing.

      They also describe the two main types of experimental designs used to study NCCs. These are the contrastive approach (with its report and non-report variants), and the supraliminal approach, each with its own merits and pitfalls.

      They discuss the limitations of non-invasive methods, such as fMRI, EEG and MEG, as well as the limitations of the use of invasive recordings in non-human animals.

      After setting the stage in this way, the authors provide an extensive review of the knowledge acquired by using invasive recordings in humans. This included population-level measurements in vision and in other sensory modalities, as well as single-neuron level studies. The authors also discuss studies of subcortical NCCs.

      The second half of this work discusses the theoretical insights gained through the use of intracranial recordings, as well as their limitations, and a perspective for future work.

      Strengths:

      This work offers an impressive review, which will serve as a useful reference document, both for newcomers to the study of NCC and for experienced researchers. The inclusion of non-visual and subcortical NCCs is of particular merit, as these have been understudied.

      Besides serving as a review, this work includes a perspective, exploring several directions to pursue for the progress of the field.

      Weaknesses:

      The intention of the authors is to argue how some of the problems faced when studying NCCs are alleviated by the use of intracranial recordings in humans. But in some cases, the link between the problems related to the study of NCCs and the advantages of intracranial recordings over non-invasive methods is not clear.

      For example, the authors explain the difficulties in distinguishing true NCCs from their prerequisites and consequences. This constitutes a difficult conceptual problem that plagues all recording techniques. The authors don't provide a convincing explanation of how intracranial recordings offer advantages over EEG or MEG when dealing with these problems.

      For example, the authors explain how the use of non-report designs to rule out post-perceptual processing relies on null results, which, according to them, are harder to interpret given the low resolution of non-invasive methods. But the interpretation of null results is actually more complicated in the case of intracranial recordings. As the coverage achieved by the electrodes is sparse, if a null result is attested, it remains possible that a true effect was present in a nearby patch of cortex out of coverage.

      The authors argue that the spatial resolution of intracranial recordings is better than that of EEG and MEG. While this is technically true (especially compared to EEG), the true spatial scale of the NCCs is unknown. If NCCs' span is in the mm range, then the additional spatial resolution of intracranial recordings might not be an advantage.

      Another factor that should be taken into consideration when assessing the spatial resolution of intracranial recordings is that while the listening zone of individual intracranial contacts is small, coverage is sparse and defined by clinical criteria (something that the authors discuss). In practice, the activity recorded by contacts is usually attributed to anatomically defined ROIs with a scale in the cm range. Given the sparse and uneven (across regions and patients) coverage afforded by intracranial recordings, the advantage of intracranial recordings in terms of spatial resolution is overstated.

      Appraisal of whether the authors achieved their aims:

      In this work, the authors have gathered an impressive review and have discussed several important problems in the field of study of NCCs, as well as provided a perspective on how the field could move forward.

      What is less clear is how the use of intracranial recordings per se holds potential to overcome problems such as the distinction between true NCCs and the prerequisites and consequences of conscious processing.

      Discussion of the likely impact of the work on the field:

      This work has the potential of becoming a must-read for anyone working in the field of consciousness research.

    6. Reviewer #3 (Public review):

      Summary:

      This narrative review provides a clear, well-structured, and comprehensive synthesis of intracerebral recording work on the neural correlates of consciousness. It is written in an accessible manner that will be useful to a broad community of researchers, from those new to iEEG to specialists in the field.

      Strengths:

      The manuscript successfully integrates methodological and theoretical perspectives and offers a balanced overview of current, sometimes contradicting evidence. As such, the manuscript is important as it calls for a concerted and better exploration of NCCs using iEEG in the future.

      Weaknesses:

      The manuscript extensively discusses the use of "report" as a criterion for identifying conscious perception and its limitations for separating between correlates of consciousness and post-consciousness processes, yet the term is not defined at the outset. The authors should specify what they mean by "report" (e.g., verbal report, nonverbal self-report, or any meta-cognitive indication of experience). Importantly, this definition should be explicitly linked to the theoretical landscape: whether the authors adopt an access-consciousness perspective in which (self) reportability is central, or whether the review also aims to address phenomenal consciousness. Making this conceptual grounding explicit at the beginning will help readers interpret the empirical work surveyed throughout the review.

      In addition, the review would benefit from an earlier introduction of the distinction between states and contents of consciousness. This distinction becomes important in the later section on anaesthesia, sleep, and epileptic seizures, where the focus shifts from content-specific NCCs to alterations in global states. Presenting these definitions upfront and briefly explaining how states and contents interact would strengthen the coherence of the manuscript.

      Overall, this is an excellent and timely review. With clearer initial theoretical definitions of consciousness, the manuscript will offer an even stronger conceptual framework for interpreting intracerebral studies of consciousness.

  2. May 2026
    1. Reviewer #1 (Public review):

      Summary:

      The authors use genetic code expansion to tag TDP-43 and G3BP1, compare this protein tagging system (Anap) with antibody-based detection, and evaluate protein trafficking and stress granule formation in response to sodium arsenite stress. They find similar staining to antibodies in HeLa cells, mouse embryonic stem cells and primary mouse cortical neurons. By incorporating the intrinsically fluorescent noncanonical amino acid Anap at carefully selected sites, the authors enable live-cell and neuronal visualization of protein localization, stress-induced redistribution, and dynamic behavior without the structural and functional compromises often associated with large fluorescent protein tags. The work provides a technical framework that will be useful for live imaging of tagged proteins.

      Strengths:

      A key strength is the demonstration of the specificity of the Anap fluorescence signal through appropriate controls and the agreement between Anap labeling and antibody-based detection across multiple cell types, including primary neurons. The ability to visualize stress-induced redistribution of both G3BP1 and TDP-43 in living cells highlights the practical value of this approach.

      The functional validation of TDP-43-Anap is compelling. The rescue of both cell viability and RNA splicing defects in TDP-43 knockout models provides evidence that Anap incorporation preserves core protein functions. This is important, as functional disruption is a central concern for any alternative tagging strategy applied to aggregation-prone or RNA-binding proteins.

      Weaknesses:

      While some inherent limitations of genetic code expansion remain (e.g., variable amber suppression efficiency and the inability to directly assess endogenous protein behavior), these are acknowledged and discussed appropriately. Importantly, these limitations do not undermine the central contributions of the study.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Chen and colleagues describe a novel means of labeling two RNA binding proteins, G3BP1 and TDP-43, using genetic code expansion. Overexpressed constructs that incorporate the intrinsically-fluorescent non-canonical amino acid Anap redistribute to cytoplasmic granules upon application of external stressors such as sodium arsenite. Similar labeling and redistribution of overexpressed G3BP1 and TDP-43 was observed in cultures of mouse primary neurons.

      Genetic code expansion and non-canonical amino acid labeling have many advantages over traditional fusion proteins for tracking protein redistribution in living cells. The authors show that they are able to label exogenous G3BP1 and TDP-43 with the non-canonical amino acid Anap, and follow labeled proteins in living cells with and without stress.

      I suspect that this method could be incredibly valuable to many investigators studying the dynamics and interactions of proteins that are difficult to label or detect by conventional methods.

    1. Reviewer #1 (Public review):

      Summary:

      LRRK2 protein is familially linked to Parkinson's disease by the presence of several gene variants that confer a gain-of-function effect on LRRK2 kinase activity.

      The authors examine the effects of BDNF stimulation in immortalized neuron-like cells, cultured mouse primary neurons, hIPSC-derived neurons, and brain tissue from genetically modified mice. They examine a LRRK2 regulatory phosphorylation residue, LRRK2 binding relationships, other kinase phosphorylation status, and measures of synaptic structure and function.

      Strengths:

      The study addresses an important research question: how does a PD-linked protein interact with other proteins, and contribute to responses to a well-characterized neuronal signalling pathway involved in the regulation of synaptic function and cell health.

      They employ a range of models and techniques to convincingly demonstrate that BDNF stimulation alters LRRK2 phosphorylation at pS935 and binding to many proteins. Several independent data sets lead to some exciting conclusions.

      In this re-revised manuscript, some aspects are very convincing and well validated, e.g., drebrin binding to LRRK2, increased by BDNF, and reduced LRRK2 protein levels in young (but not mature) drebrin KO mice. A phosphoproteomic analysis of PD mutant knock-in mouse brain is included. Overall, the links between LRRK2, LRRK2 activity, and the changes to synaptic molecules, structures, and activity are intriguing.

      Weaknesses:

      Enthusiasm for the title claim that "LRRK2 regulates synaptic function through BDNF signalling" is tempered by disconnected results across different model systems and inconsistent alterations upon kinase phosphorylation in the SHSY5Y cell line and primary neurons. Exciting conclusions are sometimes not consistently supported by the data and/or rest on experiments conducted in only one of the models.

      BDNF increasing pS935 LRRK2 is quite well supported in cell lines, as is BDNF regulation of drebrin-LRRK2 binding. However, there is a lack of connection between this result and subsequent alterations to LRRK2 substrates, e.g., phosphorylation of Rab GTPases, especially in neurons. Interesting omic data sets are provided, but with very little or no validation. For example, only drebrin protein was assessed in the BDNF treatment omic data, and the phosphoproteomic analysis of the PD mutant knock-in mouse stands alone with no validation, and G2019S is not explored elsewhere in the study.

      The major disconnect this reviewer struggles with is the conclusion that the quite clear data in SHSY5Y cells is the same as that from neurons regarding BDNF / LRRK2 and ERK / Akt. It seems they are not.

      ERK and Akt phosphorylation by BDNF is absent in CRISPR KO SHSY5Y cells.

      This conclusion is at odds with the interpretation of the neuronal data. To explain: in DIV14 neurons, BDNF's transient increase in pLRRK2 is seen and strongly prevented by MLi2. BDNF also increased pAkt & pERK1&2 in WT... but also in LRRK2 KO. Furthermore, this happened in the presence of MLi2 in WT despite no pLRRK2 increase. While the 5min BDNF induced increase to pAkt appears reduced in LKO, the same time BDNF in LKO with MLi2 is as high as WT (in these unquantified examples) and ERK is almost identical. This is described as "significantly reduced" but I see no replicates or quantification, and face value assessment of the blot argues against this.

      Thus, there is little or no evidence supporting that LRRK2 activity is involved in BDNF-stimulated increases in pAkt or pERK, upstream, in neurons as neither MLi2 nor KO prevented this.

      Synapse markers increased in WT neurons with BDNF treatment, which did not happen in LKO neurons. So this process requires pLRRK2, but is unrelated to pAkt or pERK (which do still go up with BDNF in KO)? Similarly, an increase in synaptic activity in WT hiPSC neurons in response to BDNF seems lost in LRRK2 KO hiPSC neurons, although their activity is already increased and depending on the age of the cells the effects were different. Both of these experiments lack supporting evidence by other measures e.g., LRRK2 inhibition effects on BDNF-induced increases in WT and parallel biochemistry of p'd LRRK2, Akt, ERK in WT & KO.

      LRRK2 activating Akt1 has been published before (e.g., Ohta 2011 - not cited), but Ohta also concluded that LRRK2 gain of function mutations (more LRRK2 kinase activity) were associated with a reduced ability of LRRK2 to bind AND phosphorylate Akt at the same residue, in contradiction to the mechanism proposed here? This should be discussed. Here the authors also conclude Akt is upstream of LRRK2. However, it appears from the data here in neurons that pLRRK2 increases in response to BDNF are separate from BDNF signalling to Akt.

      Of note, in comparison to bTubulin control, LKO total Akt levels appear consistently higher in this single example blot; a large increase in Akt would skew the ratio down, while absolute levels of pAkt (probably the most important matter for an active enzyme - what is the ratio against total protein stain) are similar or increased. These are major problems for the conclusions as presented.

      BDNF increased mEPSC frequency in hiPSC neurons, which didn't happen in LKO, which already had high frequency. Earlier in the manuscript BDNF is shown to alter synapse number in WT but not LKO mouse neurons, but no increase in synapse number was seen following BDNF treatment in any WT or LKO hiPSC neurons +/- BDNF.

      If we are to assume that the WT neurons have LRRK2 (not demonstrated), and that LRRK2 KO neurons have similar drebrin (not demonstrated) it is unclear how to interpret this result in the model of BDNF-LRRK2 being upstream of pERK/Akt. There is no evidence that the BDNF increase in WT is blocked by LRRK2 inhibition, nor has it been associated with changes (or not) to pAkt or ERK1, which would be expected in both WT and KO based on Figure 4C.

      There are many reports of acute and longer-term BDNF application increasing event frequency in brain slices & primary neurons. Overexpression of BDNF in NPCs has also been shown to increase synapse function in hiPSC neurons derived from them. Here, BDNF has an effect on frequency in only one of 6 comparisons (3 timepoints, two lines). Is it not concerning that expected BDNF effects occur at only one time point in WT, and that generally a lack of effect is more common both in WT and LKO... is this due to slow appearance of TrkB receptors and degeneration at 90 days?

      There are no other data provided to show that BDNF was having a consistent expected effect in human neurons (pAkt, pLRRK2, etc.), and there is little to link these data to those in previous figures of the study.

      The discussion of some of the weaknesses is mostly fair, aside from the disparities noted above, which are not.

    2. Reviewer #2 (Public review):

      The data show that BDNF regulates the PD-associated kinase LRRK2, they place LRRK2 within well-described BDNF pathways biochemically, and they show that LRRK2 can play a role mediating BDNF-driven synaptic outcomes at excitatory synapses. The chief strength is that the data provide a potential focal point for multiple observations that have been made across many labs. The findings will be of broad interest because LRRK2 has emerged as a protein that is likely to be part of Parkinson's pathology and its normal and pathological actions remain poorly understood.

      A major strength of the study is the multiple approaches that were used (biochemistry, bioinformatics, light and electron microscopy and electrophysiology) across different experimental models (cells, primary neurons, human neurons, mice) to identify and examine the impact of BDNF on LRRK2 signaling and functions. Noteworthy is also the employment of LRRK2KO preparations to validate outcomes and to place LRRK2 actions up or downstream.

      The demonstration that LRRK2 and drebrin interact directly is important and suggests that other interacting proteins identified biochemically and bioinformatically in the paper will be important to pursue.

    1. Reviewer #2 (Public review):

      Summary:

      The authors investigated whether early-life malaria exposure has long-term effects on immune responses to unrelated antigens. They leveraged a natural experiment in coastal Kenya where two adjacent communities (Junju and Ngerenya) experienced divergent malaria transmission patterns after 2004. Using 15 years of longitudinal data from 123 children with weekly malaria surveillance and annual serological sampling, they measured antibody responses to multiple pathogens using a protein microarray technology and ELISA.

      Strengths:

      (1) Extensive longitudinal data collection with weekly malaria surveillance, enabling precise exposure classification.

      (2) Use of a natural experiment design that allows for causal inference about malaria's immunological effects.

      (3) Broad panel of antigens tested, demonstrating generalized rather than antigen-specific effects.

      (4) Within-cohort analysis in Ngerenya controls for geographic and environmental factors.

      (5) Validation of key findings using both serologic microarray and ELISA.

      (6) Important public health implications for vaccine strategies in malaria-endemic regions.

      Weaknesses:

      (1) Due to its nature, the study cannot determine the direction of the associations found between malaria exposure and IgG levels against unrelated pathogens.

      (2) No evaluation of the clinical implications of the reduced IgG levels observed in the area with high malaria exposure.

      Assessment of Claims:

      The data appear to support the authors' primary claims. The strength of the evidence is limited by the observational nature of the study and the results should be interpreted in that light. Together with the currently available evidence of P. falciparum's impact on the host's immune function, this natural experiment design provides further evidence for a relationship between early malaria exposure and reduced antibody responses to other pathogens and vaccine-derived antigens. The within-Ngerenya analysis controls for geographic factors and thus enhances the quality of the evidence; there is limited physical, nutritional, and socio-economic information on factors that may have driven the observed changes.

      Impact and Utility:

      This work has fundamental implications for understanding vaccine effectiveness in malaria-endemic regions and may help inform vaccination strategies. The findings, if confirmed, would suggest that children in areas of high malaria transmission may require modified immunization approaches. The dataset provides a valuable resource for future studies of malaria's immunological legacy.

      Context:

      This study builds on prior work showing acute immunosuppressive effects of malaria but uniquely attempts to demonstrate the durability of these effects years after exposure. The natural experiment design addresses limitations of previous observational studies by providing a more controlled comparison.

    1. Reviewer #1 (Public review):

      Sebag et al. addressed the role of ADH5 in BAT in the development of aging and the metabolic derangements associated with it. This is a follow-up study after the authors' demonstration of the role of BAT ADH5 in glucose homeostasis, obesity, and cold tolerance. By ablating ADH5 specifically in brown adipocytes or pharmacologically modulating ADH5 through activation of its transcription factor, the authors conclude that preservation of BAT function is crucial for healthy aging and ADH5 is causally involved in this process. The topic is appealing given the rise in the aging population and the unclear role of BAT function in this process. Overall, the study uses several techniques and addresses several physiological and molecular manifestations of aging. Therefore, the findings contribute to the growing body of literature pointing to the biological role of BAT activity in aging.

      Comments on revised version:

      I have no further comments other than to congratulate the authors on the nice piece of work.

    2. Reviewer #2 (Public review):

      Summary:

      This study investigates the role of the enzyme Alcohol Dehydrogenase 5 (ADH5) in brown adipose tissue (BAT) during aging. BAT is crucial for thermogenesis and energy balance, but its function and mass diminish with age, contributing to metabolic dysfunction and age-related diseases. ADH5, also known as S-nitrosoglutathione reductase, regulates nitric oxide (NO) signaling by removing damaging S-nitrosylation modifications from proteins. The authors show that aging in mice leads to increased protein S-nitrosylation associated with a combination of increased Nos2 expression and reduced ADH5 expression in BAT, resulting in impaired metabolic and cognitive functions. Deletion of ADH5 in BAT accelerates tissue senescence and systemic metabolic decline. Mechanistically, aging suppresses ADH5 via downregulation of heat shock factor 1 (HSF1), a master regulator of protein homeostasis. Importantly, pharmacologically boosting HSF1 improves BAT function and mitigates both metabolic and cognitive declines in aged mice. The findings highlight a critical HSF1-ADH5 pathway in BAT that protects against aging-related dysfunction, suggesting that targeting this pathway may offer new therapeutic strategies for improving metabolic health and cognition during aging.

      Strengths:

      This research provides insight into the interplay between redox biology, proteostasis, and metabolic decline in aging. By showing that age regulates genes that control SNO status in BAT and further developing a therapy to target ADH5 in BAT to prevent age-related decline, the authors have identified a putative mechanism to combat age-related decline in BAT function.

      Weaknesses:

      None identified.

      Comments on revised version:

      Congratulations to the authors for this interesting manuscript. I don't want to pat myself on the back, but I found the increased Nos2 expression in Figure 1C of the revised manuscript very satisfying, as it reinforces the shift in the regulation of SNO status that happens in BAT with aging. I appreciate the authors addressing this suggestion.

    1. Reviewer #1 (Public review):

      Summary:

      In their manuscript, Metz Reed and colleagues present an exceptionally thorough analysis of three-dimensional genome reorganization during breast cancer progression using the well-characterized MCF10 model system. The integration of high-resolution Micro-C contact maps with multi-omics profiling provides compelling insights into stage-specific dynamics of chromatin compartments, TAD boundaries, and looping events. The discovery that stable chromatin loops enable epigenetic reprogramming of cancer genes while structural changes selectively drive metastasis-associated pathways represents a significant conceptual advance. This work substantially deepens our understanding of genome topology in malignancy.

      Strengths:

      This work sets a benchmark for integrative 3D genomics in oncology. Its methodological sophistication and conceptual advances establish a new paradigm for studying nuclear architecture in disease.

      Comments on revised version:

      The authors made a significant effort to improve the manuscript. My comments were sufficiently addressed.

    2. Reviewer #2 (Public review):

      Using the MCF10 breast cancer progression sequence, the authors combined high-resolution Micro-C chromatin conformation capture with RNA-seq and ChIP-seq to depict the sequential reorganization of compartments, topologically associated domains (TADs), and long-range loops in benign, pre-tumor, and metastatic states, and coupled these three-dimensional changes with gene expression and enhancer activity. Four main findings were: (i) chromatin structure was largely quiescent, still limiting gene output differentiation, with upregulated sites being most significantly affected; (ii) enhancer-promoter contact strength covaried with transcriptional amplitude; (iii) 127 genes gained expression with increasing chromatin contact; and (iv) progression-related genes acquired altered histone markers in distal enhancers, which remained connected by stable loops. These conclusions are widely accepted and provide strong justification for the publication of this paper.

    3. Reviewer #3 (Public review):

      Summary:

      The authors tackle an important problem: defining the topological changes that occur during tumorigenesis. To study this, they use an established stepwise cell model of breast cancer. A strength of their study is a careful, robust differential analysis of topological features across each cell state that is presented clearly and rigorously. They define changes in compartmentalization, TAD structure and chromatin looping. Intriguingly, when the authors integrate differential gene expression with chromatin looping, they see that most differentially regulated genes are not involved in loop changes, suggesting that changes in promoter or enhancer chromatin marks may play a bigger role in regulating transcription than differential loops. The differential topology analysis and its integration with transcription is very well done, one of the best versions of this I have read in the 3D genome field! However, the paper is framed largely as a cancer biology study and it teaches us much less about this. I am worried that some of the trends for each topologic feature are not going to be consistent across the pre-malignant-malignant-metastatic spectrum and would like the authors to soften some of their claims a bit regarding how this clarifies our understanding of cancer evolution.

      Updated comments on revision:

      There are still some issues with this paper. First, it reads descriptively. It is a series of comparisons with limited biologic insight, as changes are always seen in genomics and, in this case, they're often not tied back to transcription or gene regulation in cancer. Cell lines do not represent cancer faithfully and in this case should not be argued to represent malignant transformation broadly. The authors did not really soften their language as much as I think is required. I would caution the authors to further qualify their results in the context of a single, clonal cell line that has undergone stepwise transformation. This is not a patient cohort analysis or frank progression. This matters because there is likely to be much more noise, not pertinent to transformation, in a cell line model. It doesn't negate the validity of the study, but this language should be adjusted appropriately. It was nice to see the authors compare gene expression data from their model to the primary tumor data; however, the limited overlap raises the concern that, at the least, patterns of transcriptional regulation in their model are not faithful to primary tumors. If this is the case, it raises concern that the topological changes are also not generalizable to cancer.

      The authors declined a number of functional assays to validate their observations (which are purely correlative). And while I see that the burden of extra experiments may be beyond the scope of this study, they must soften their language to justify the observed relationships.

    1. Reviewer #1 (Public review):

      Summary:

      Patients with STX11 mutations develop familial hemophagocytic lymphohistiocytosis Type 4, a fatal immune disorder marked by defective T and NK cell cytotoxicity and cytokine storm. The conventional explanation attributes this to impaired cytotoxic granule release, but this has never fully accounted for the broader disease picture. This study proposes an alternative mechanism. The authors show that STX11 is required for store-operated calcium entry through ORAI1 channels, which are essential for both cytotoxic killing and NFAT-driven gene expression in T cells. In STX11-deficient cells, ORAI1 currents drop, NFAT nuclear translocation fails, IL-2 expression is suppressed, and degranulation is impaired. These defects are largely rescued by ionomycin or a constitutively active ORAI1 mutant, placing the primary lesion at calcium signaling rather than the fusion machinery. Mechanistically, STX11 binds the C-terminal tail of ORAI1 via its Habc domain and maintains ORAI1 in a state competent for productive assembly prior to STIM1-dependent gating, a step the authors call "priming."

      Strengths:

      The paper identifies a novel and disease-relevant role for STX11 in calcium channel regulation and raises the possibility of using channel agonists as a therapeutic strategy in the disease. The biochemical and functional data are of high quality and generally consistent with the interpretation. The proposal that a non-conventional syntaxin directly interacts with ion channels to prime its activation is novel and interesting.

      Weaknesses:

      For readers to appreciate the value of patient experiments derived from a single individual, the authors should quote prior studies showing that STX11 protein levels are abolished in all known human STX11 mutations. The priming model, while functionally well-supported, rests on indirect structural evidence, and the precise conformational transition involved remains to be defined. These are acknowledged limitations, but alternate mechanisms have not been explored and formally excluded. More direct evidence should be provided to exclude the possibility that STX11 could act as a conventional SNARE and sustain calcium fluxes by promoting the delivery of additional ORAI1 channels from vesicles.

    2. Reviewer #2 (Public review):

      Summary:

      Vig's lab delineates a critical role for STX11 in CRAC channel function, particularly in the context of the fatal immune disorder familial hemophagocytic lymphohistiocytosis type 4 (FHL4). They demonstrate that Syntaxin 11 directly binds and regulates Orai1, and that STX11 depletion abolishes CRAC currents and downstream signaling. Loss of STX11 reduces IL2 gene expression and impairs degranulation, both of which are rescued by the constitutively active Orai1 mutant H134S, whereas a gain‑of‑function mutant targeting the C‑terminus fails to restore these defects. The authors conclude that STX11 primes Orai1 for optimal local assembly that is independent of STIM1 yet required for CRAC channel gating.

      Strengths:

      This study is firmly grounded in disease biology and demonstrates that STX11 downregulation leads to profound functional defects. Using a comprehensive suite of methods and analyses, the authors interrogate the co-regulation of STX11 and Orai1 and present a near-complete view of STX11's modulatory role in CRAC channel function and downstream signaling pathways. The figures are clear, and the statistical analyses are rigorous and convincing.

      Weaknesses:

      The authors conclude that Syntaxin 11 directly binds Orai1. This conclusion is well supported by a multifaceted approach, including co-immunoprecipitation (co-IP), molecular dynamics simulations, co-localization/FRET assays, and targeted mutational analysis, all of which are thoroughly executed. While the interaction appears reasonably strong in co-IP experiments, the STX11-Orai1 interaction is comparatively weaker in pull-down assays, which the authors attribute to instability of the purified His-STX11 protein. A remaining gap is direct evidence of interaction in live cells; this is understandably challenging given that fluorescent tagging of STX11 is not feasible. Fully resolving this question lies beyond the scope of the present study and will require more advanced approaches to capture STX11 binding dynamics.

    1. Reviewer #1 (Public review):

      Summary:

      This paper by Boni and colleagues presents the engineering of a multi-step differentiation program in Escherichia coli based on synthetic gene circuits. The motivation behind the study was to engineer a system capable of undergoing differentiation in a step-wise manner without the presence of external spatial cues and without inducers added during the differentiation process. To achieve this, the authors created several synthetic gene circuits, one being a toggle switch, and the others being quorum-sensing-mediated gene expression modules. The outputs of the differentiation process are fluorescent proteins, which allowed the authors to quantify the behavior of the system using fluorescence intensity measurements. The authors additionally built a multi-component mathematical model which is able to reproduce the experimental data.

      The data presented are convincing and support the claims; the work is well executed.

      Strengths:

      (1) The differentiation process proceeds autonomously after the initial step in liquid culture in the presence of external inducers.

      (2) It is indeed a step-wise process.

      (3) The mathematical model predicts the outcome (% of green, blue and red FP-expressing cells in the population) when changing the initial ratio of green:blue FP-expressing cells.

      Weaknesses:

      (1) No spatial pattern emerges. There are some isolated colonies that turn on the downstream FPs, but I do not see a pattern, really. Nonetheless, some colonies do differentiate (i.e. they turn on additional FPs).

      (2) The mathematical model appears somewhat superfluous. While it can clearly reproduce the data, it is not used to make interesting predictions, changing parameters (and not initial conditions) that guide further experimental implementations.

      Future directions

      The utility of this differentiation process (e.g. in metabolic engineering or for the study of biofilm formation and antibiotic resistance) will become clearer once the FPs are substituted with functional proteins that exert an effect on the cells.

    2. Reviewer #2 (Public review):

      In this manuscript, the authors implement a three-step genetic programme in E. coli that converts an initially homogeneous population into spatially structured sender, receiver, and "matured" receiver colonies on agar without externally supplied positional information. They combine a TetR/LacI toggle switch for symmetry breaking, LuxI/LuxR quorum sensing for a paracrine signalling step, and CinI/CinR for an autocrine signalling-like maturation step, and complement the experiments with a mathematical model that qualitatively reproduces pattern formation over a range of initial conditions.

      While the article has many strengths such as a clear conceptual framing using Waddington landscapes, a modular and carefully optimised circuit design, thorough experimental characterisation of the toggle and quorum-sensing modules, integration of spatial modelling with experiments, and generally clear writing and figures, I think it will benefit the article to clarify the definition and stability of "differentiated" states, clarify several quantitative and modelling aspects, better explain how fitted curves and promoter engineering were done, and improve some figure design and wording to avoid ambiguity.

      Detailed comments below:

      (1) P5-8 / and more generally: A major concern is that producing a reporter output is not, by itself, differentiation. For a state to be credibly called "differentiated", it should be stable (self-maintained) over relevant timescales, ideally in the absence of the inducing context. As written, the manuscript sometimes seems to equate cell type with reporter expression. I strongly suggest adding a short subsection explicitly defining state versus output, and for each claimed state, stating whether it is stable/bistable or unstable/reversible, with evidence. Concretely, the authors should enumerate:<br /> a) Toggle-derived sender versus receiver: stable? under what conditions (inducer ranges, hysteresis window)?<br /> b) Paracrine-induced "red" receivers: is this a stable differentiated state, or a context-dependent induction requiring proximity to senders?<br /> c) "Mature" (yellow) state: does it persist after removal from the spatial signal field? If not, it should be described as an induced output programme rather than a mature lineage state.

      At present, later sections (and the "maturation" language) risk over-stating what is demonstrated.

      (2) Figure 2d: It is unclear whether this panel is intended to be qualitative (schematic/illustrative) or generated from quantitative data. The legend should explicitly state the origin (e.g., representative image, averaged data, simulation output, schematic) and, if quantitative, what was measured, how many replicates, and how the visualisation was constructed.

      (3) Figure 2e: The cross-sectional line is described as meant to be comparable, yet the leftmost plot appears to have a different slope from the others. The authors should explain whether this reflects a different scaling/normalisation, a different underlying dataset/condition, or simply a plotting artefact. If these are fitted trends, report the fit function (see also the comment on fitted lines below).

      (4) Around P7-8 (saddle/separatrix description): When describing the saddle or separatrix between the two valleys, it would be helpful to briefly connect this more directly to a quantitative dynamical-systems perspective: for instance, the intersection of nullclines and how nullcline geometry changes under IPTG/aTc induction. This will make the landscape picture more complete for readers familiar with the original genetic toggle switch work (Gardner et al., 2000).

      (5) P9, lines 157-159: The current phrasing ("in absence of noise, the system would be fully deterministic... in living cells, however, stochastic bursts... change the trajectory") risks conflating predicting population-level percentages with predicting colony-level trajectories. It would help to clearly separate (i) the ability to predict the overall fraction of ON/OFF (green/blue) colonies from inducer conditions (which is largely deterministic at the population level) from (ii) the intrinsically stochastic choice of state made by any given founder cell and its colony.

      (6) P11, lines 193-195 (promoter engineering): The main text currently only refers to screening variants and choosing pLux76; I suggest briefly stating in the main text (not only in the supplement) what was changed (for example, promoter box variants, core promoter strength modifications) and what design criteria were used (reduced leakiness, increased dynamic range).

      (7) Use of fitted lines (Figures 2, 4, 5, 7): Wherever fitted curves are overlaid on data, the authors should indicate in the figure legend the explicit form of the fit as well as the fit equation/parameters. As a reader, it is difficult to interpret what is empirical smoothing versus what is a mechanistic functional form.

      (8) P13, lines 232-235: The comparison between induction directly with C6-HSL and induction from sender colonies is qualitative ("significantly smaller range"). The authors should provide distances (for example, in mm) for the induction range in each case and, if possible, approximate total HSL amounts or concentrations, so that the reader can appreciate the magnitude of the difference.

      (9) P13, lines 259-262: The authors model the transition to the stationary phase via a monotonically decreasing sigmoid in time for biosynthetic capacity. What is the rationale or literature basis for this approach to model entry into the stationary phase? The authors should cite prior work and clarify why this form is appropriate here, versus alternatives (nutrient diffusion limitation, logistic growth with resource depletion, etc.).

      (10) Figure 6c: Are the areas of the plate shown in each column the same field of view across conditions/time, or are these simply representative regions selected per condition (possibly from different plates)? The caption/legend should clarify whether these are matched locations and how images were chosen.

      (11) Figure 7a: The combination of solid, dashed, and dash-dot arrows/lines is visually hard to read. I suggest replacing the dash-dot line with a fully dotted line or using different colours (if consistent with journal style) to improve readability.

      (12) Figure 7e and similar analyses: The authors should explain in the Methods and/or captions how "distance from sender colonies" is computed when multiple senders exist. Is the distance always measured to the nearest sender, and how are cases handled where a receiver is in the overlapping influence of several senders? This clarification is important for interpreting the fitted curves.
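
      The ambiguity raised in comment (12) can be made concrete with a short sketch. This is one possible convention only (Euclidean distance to the nearest sender), not the authors' stated method; the function names and coordinates are hypothetical, and positions are assumed to be colony centroids in mm.

```python
import math

Point = tuple[float, float]

def nearest_sender_distance(receiver: Point, senders: list[Point]) -> float:
    """Euclidean distance from a receiver colony to its closest sender colony."""
    if not senders:
        raise ValueError("at least one sender is required")
    return min(math.dist(receiver, s) for s in senders)

def senders_within(receiver: Point, senders: list[Point], radius: float) -> int:
    """Count senders whose centre lies within `radius` of the receiver."""
    return sum(1 for s in senders if math.dist(receiver, s) <= radius)

# Hypothetical layout: two senders and one receiver between them.
senders = [(0.0, 0.0), (10.0, 0.0)]
receiver = (3.0, 4.0)

d_nearest = nearest_sender_distance(receiver, senders)
n_overlap = senders_within(receiver, senders, radius=9.0)
print(d_nearest, n_overlap)
```

      A nearest-sender distance ignores additive signal from other senders, so a receiver with a count above one inside the induction radius is exactly the overlapping case the legend or Methods would need to address.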

    3. Reviewer #3 (Public review):

      This manuscript presents an engineered 3-step circuit in E. coli that combines toggle-switch-based symmetry breaking with quorum-sensing interactions to generate colony-scale spatial patterns. The work is interesting as a synthetic circuit integration study and as a demonstration of self-organized patterning across physically separated colonies. The authors provided a compelling demonstration of the characterization/tuning of parts to guide the overall system engineering. A notable strength is the demonstration that a single circuit can generate a range of self-organized spatial patterns across separate colonies.

      However, I think the paper needs to tone down the extent to which the system demonstrates multi-step differentiation or morphogenesis, which is not critical for making the paper valuable. Only the first step of their circuit design (Figure 1), the toggle switch, generates stable alternative states. The latter steps are mainly signal-dependent reporter activation states layered on top of the blue receiver state, rather than true fate transitions. The authors explicitly state that red expression is added without replacing the blue identity, and they also acknowledge that red cells lose their identity upon restreaking unless they remain near sender cells. That substantially weakens the differentiation analogy and makes the Waddington framing too strong.

      A related concern is that the 3rd step does not introduce a new spatial organizing rule. The authors show that the second signal remains confined to cells already receiving the first signal, and explicitly conclude that it functions only as an autocrine cue rather than a second paracrine layer. As a result, the 3-step system seems more like an added local readout or maturation layer. Overall, the main 2-step outcome is sparse green sender colonies surrounded by red-expressing blue receivers, with distant receivers remaining blue. That is a valid engineered pattern, but it is still a local, threshold-response circuit architecture.

      The autonomy claim should be toned down and stated more precisely. The plate patterning occurs without externally imposed spatial gradients, which is a strength. However, by design, the overall system behavior depends strongly on pre-culture inducer conditions that set the sender:receiver ratio, and this externally imposed history is central to the final pattern. This property is tied to how the circuit is designed: steps 2 and 3 largely respond to the symmetry breaking introduced in step 1, which depends on both history and initialization on the plate. In particular, the pattern formation process is currently quite variable (e.g. Figure 5), depending on how different colonies flip the toggle switch and, consequently, how many become senders and how many become receivers. It would have been fascinating if they could also demonstrate differentiation within individual colonies, leading to intra-colony patterns. This aspect should at least be discussed.

      The mathematical model is useful in guiding both the characterization of parts, modules and the overall system. However, the claims around its quantitative predictive power should also be made narrower. The simulations are built from multiple fitted and partly hand-tuned components, including toggle-switch response curves, colony-growth rules, diffusion, reporter-response functions, and activity decline. This supports a calibrated qualitative reconstruction of the observed patterns, but not a strong predictive or mechanistic validation.

      Other specific points:

      (1) Given the topic of the work, the authors should cite closely relevant studies in programming pattern formation, including:<br /> Cao et al, Cell 2016 Collective space-sensing coordinates pattern scaling in engineered bacteria<br /> Rajasekaran et al, Cell 2024 A programmable reaction-diffusion system for spatiotemporal cell signaling circuit design<br /> Lu et al, BioRxiv 2024 Discovery of interpretable patterning rules by integrating mechanistic modeling and deep learning

      (2) The model assumes identical diffusion coefficients for C6-HSL and C14-HSL despite their substantially different molecular sizes and hydrophobicities. This assumption could distort kinetic lag with differential diffusion in explaining the autocrine confinement of the third step. Its impact should at least be explored in the simulations.

      (3) The mCherry response parameters change significantly between the 2-step and 3-step systems. The authors acknowledged this change but did not provide a clear explanation.

      (4) The 3-step system is evaluated at only a single condition with no simulation comparison, in contrast to the systematic 11-condition validation of the 2-step system.

    1. Reviewer #1 (Public review):

      Summary:

      The metabolic profiles of immune cells under steady-state or immune-activated conditions remain poorly characterized. The authors find that embryonically derived hemocytes in Drosophila larvae predominantly utilize mitochondrial respiration to generate energy and exhibit minimal glycolysis rates under unchallenged conditions. Hemocytes developmentally elevate ATP production rates. Mitochondrial respiration drives metabolic activation in larval hemocytes. More specifically, lamellocytes exhibit unique metabolic activities, including enhanced trehalose catabolism and mitochondrial remodeling, required for their encapsulation response.

      Strengths:

      The study shows the metabolism that is most likely to operate in different immune cells in Drosophila during development and also during infection. This is related to mitochondrial organization and proliferation and/or differentiation state.

      Weaknesses:

      Even though there is a rigorous analysis of mitochondrial activity using the Seahorse analyzer, the analysis of diverse mitochondrial activities in the different immune cell types across development and in infection could be carried out using microscopy. ROS, mitochondrial membrane potential, NADH/+ and FADH/+ levels in vivo are likely to give a more specific readout of changes in cellular activities. The activities of mitochondrial fusion and fission need to be collectively tested to understand their role in development and also in infection. The relevance of the change in mitochondrial activity for development or immunity remains to be tested.

    2. Reviewer #2 (Public review):

      Summary:

      This study presents an analysis of the metabolism of Drosophila larval immune cells during development and activation. The authors compared the utilization of glycolysis and oxidative phosphorylation for energy metabolism. Although this topic has been widely discussed and well-studied in immune cell research, particularly in mammals, it has received little attention in insects. The authors demonstrated that quiescent and activated larval Drosophila immune cells predominantly use mitochondrial oxidative phosphorylation to produce energy. This finding is significant for the emerging field of insect immunometabolism research and is interesting in comparison to mammalian immunity, where immune cell activation is often associated with a shift toward greater reliance on glycolysis.

      Strengths:

      Using the Agilent Seahorse system, the authors developed and fine-tuned a method to measure the energy metabolism of Drosophila immune cells, obtaining high-quality, robust data. Through genetic manipulations targeting immune cells specifically, they analyzed metabolic changes in cells with different activations, going beyond developmental changes. They convincingly demonstrated ATP production, primarily in the mitochondria of immune cells, at various developmental stages and in various activated states. The results presented mostly support the conclusions drawn. This methodology and its results are valuable for further studies of insect immunometabolism. In a broader context, they are also valuable for comparing the metabolism of immune cells across different animal groups.

      Weaknesses:

      The genetic manipulations used were suitable for obtaining immune cells of various types and activation states, such as proliferation, differentiation, and immune activation. However, this method has limitations: the mixture of different cell types was always analyzed, and the specific type of interest was often a minority cell population. Had the other cells remained in their initial control state, the observed change in metabolism could have been primarily attributed to the desired cell type. However, the remaining cells that did not transform into the desired type were also usually influenced or activated in some way, making it difficult to determine to which group the observed change should be attributed. For example, consider the induction of lamellocyte differentiation using Hml>Hop[tum]. There are approximately 1,000 lamellocytes per larva, but according to Supplementary Figure 4, there are still about 5,000 Hml+ cells, and even these cells have activated Jak/Stat signaling. Therefore, it can be assumed that they are also activated. After a real infection, the proportion of lamellocytes is greater, but the remaining plasmatocytes are also activated. The authors should mention these limitations more clearly. However, as the authors correctly note, solving this problem will require single-cell approaches, which current technologies still limit. I see this as a problem when interpreting the proliferation effect. The crucial question is what percentage of the analyzed cells induced by Hml>Ras[V12] were actually in the division stage. Not all hemocytes are Hml+, so not all are induced. Of those that are induced, how many are in the division stage at the time of analysis? Meanwhile, those that were not dividing at that moment also had activated Ras, which triggers many processes besides division. Information on what percentage of the analyzed cells were dividing is missing. 
      This information is important because the finding that dividing Drosophila immune cells primarily use mitochondria and oxidative phosphorylation to produce ATP contrasts with the debated significance of the Warburg effect in dividing mammalian cells. This finding would be significant, but unfortunately, it is not robustly supported by the presented data.

    3. Reviewer #3 (Public review):

      Summary:

      This study investigates the metabolic profiles of hemocytes across multiple stages and conditions and suggests that hemocytes act as regulators of metabolism rather than merely receivers of metabolic cues. The authors show that hemocytes rely primarily on mitochondrial respiration, which is further enhanced during proliferation in development or upon genetic manipulation of plasmatocytes, but not crystal cells.

      Metabolic respiration is also activated in lamellocytes, and this activation correlates with changes in mitochondrial morphology. The authors further attempt to identify mechanisms underlying this activation, proposing that mitochondrial fission may contribute to the ability of lamellocytes to encapsulate wasp eggs.

      Strengths:

      This work provides detailed and valuable insights into the metabolic phenotypes of hemocyte populations at different developmental stages and under both physiological and pathological conditions. The authors perform a longitudinal assessment of hemocyte metabolism and compare metabolic states across contexts.

      Importantly, they provide evidence that hemocytes regulate metabolism to perform essential immunological functions, such as wasp egg encapsulation. This reinforces the view that hemocytes are key regulators and communicators that adapt their metabolic programs according to developmental and environmental demands.

      Weaknesses:

      The results presented are insightful, although several controls and validations could strengthen the conclusions. It would be preferable to also include responder transgenes alone as a control for leakiness, and the scRNA-seq findings would benefit from in vivo validation.

      Some conclusions appear inconsistent or insufficiently supported. For instance, although mitochondrial respiration in plasmatocytes peaks at 96 h AEL, this increase is not accompanied by detectable mitochondrial rearrangement, which remains constant between 96 h AEL and 120 h AEL.

      In general, the authors should temper some statements or provide further data.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      This manuscript addresses an important methodological issue, the fragility of meta-analytic findings, by extending fragility concepts beyond trial-level analysis. The proposed EOIMETA framework provides a generalizable and analytically tractable approach that complements existing methods such as the traditional Fragility Index and Atal et al.'s algorithm. The findings are significant in showing that even large meta-analyses can be highly fragile, with results overturned by very small numbers of event recodings or additions. The evidence is clearly presented, supported by applications to vitamin D supplementation trials, and contributes meaningfully to ongoing debates about the robustness of meta-analytic evidence. Overall, the strength of evidence is moderate to strong.

      Strengths:

      (1) The manuscript tackles a highly relevant methodological question on the robustness of meta-analytic evidence.

      (2) EOIMETA represents an innovative extension of fragility concepts from single trials to meta-analyses.

      (3) The applications are clearly presented and highlight the potential importance of fragility considerations for evidence synthesis.

    2. Reviewer #3 (Public review):

      Summary and strengths:

      In this manuscript, Grimes presents an extension of the Ellipse of Insignificance (EOI) and Region of Attainable Redaction (ROAR) metrics to the meta-analysis setting, as metrics for evaluating the fragility and robustness of meta-analyses. The author applies these metrics to three meta-analyses of vitamin D and cancer mortality, finding substantial fragility in their conclusions. Overall, I think this extension/adaptation is a conceptually valuable addition to meta-analysis evaluation, and the manuscript is generally well-written.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have provided new data and text that addresses all of the reviewers' comments on the previous versions in a wholly satisfactory way.]

      Summary:

      This study presents evidence that addition of the two GTPases EngA and ObgE to reactions composed of rRNAs and total ribosomal proteins purified from native bacterial ribosomes can bypass the requirements for non-physiological temperature shifts and Mg2+ ion concentrations for in vitro reconstitution of functional E. coli ribosomes.

      Strengths:

      This advance allows ribosome reconstitution in a fully reconstituted protein synthesis system containing individually purified recombinant translation factors, with the reconstituted ribosomes substituting for native purified ribosomes to support protein synthesis. This represents a significant development in the long-term effort to produce synthetic cells.

    2. Reviewer #2 (Public review):

      This study has developed a single-step method to assemble active bacterial ribosomes under near-physiological conditions by using the GTPase factors EngA and ObgE. These factors eliminate the need for the traditional, harsh manipulations of temperature and magnesium levels. This integration is an important step toward the bottom-up construction of synthetic cells.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      The paper from Hudait and Voth details a number of coarse-grained simulations as well as some experiments focused on the stability of HIV capsids in the presence of the drug lenacapavir. The authors find that LEN hyperstabilizes the capsid, making it fragile and prone to breaking inside the nuclear pore complex.

      Comments on previous round of revisions:

      I found that the authors addressed my concerns satisfactorily. The other reviewer raised a number of important points regarding the nuances of the model and the interpretation of the simulations, which the authors rebutted. I think the paper in its current form now is a worthwhile addition to the literature.

    2. Reviewer #3 (Public review):

      This is a technically sophisticated study that integrates coarse-grained modeling with live-cell imaging to address an important and timely question regarding HIV-1 capsid inhibition by lenacapavir.

      In summary, in my view, the manuscript represents a solid contribution to the field.

    1. Reviewer #1 (Public review):

      Summary:

      Zinn and colleagues investigated the effect of proteases 2A and 3C of enterovirus D68 (EVD68), an emerging pathogen associated with outbreaks of acute flaccid myelitis (AFM), a polio-like disease, on nucleocytoplasmic trafficking in different systems, including human neurons derived from pluripotent cells. They found that 2A specifically cleaved Nup98 and POM121. Using reporter proteins and RNA synthesis and trafficking assays in cells expressing viral proteases, they showed that 2A induces broad loss of the nuclear pore barrier function, but, surprisingly, RNA export appears to be minimally affected. Since nucleocytoplasmic trafficking defects are known to be associated with neuropathologies, they propose the hypothesis that 2A-dependent cleavage of nucleoporins in motoneurons underlies the development of EVD68-induced AFM. They further show that a 2A-specific inhibitor increases the survival of human neurons differentiated from stem cells upon EVD68 infection.

      Strengths:

      Use of multiple methods to investigate the effect of 2A and 3C expression on nucleoporin cleavage and nucleocytoplasmic trafficking.

      Comments on revisions:

      The following issues remain unresolved:

      First, the authors still do not show representative images confirming specific nucleoporin degradation (Fig. 1), which is the main focus of the work.

      Second, the conclusion that 2A-mediated degradation of the nucleo-cytoplasmic barrier does not affect export of the RNA from the nucleus is not supported by the presented data. The representative images shown in Fig. 3C do not have the signal for GFP (like in Fig. 2), and therefore it is impossible to see if those cells indeed express EVD68 proteases.

      Moreover, to show RNA export, not only the decrease of nuclear EU signal should be quantified, but also the increase of the cytoplasmic signal. The diminishing of the nuclear staining may not necessarily reflect RNA export, but may well be explained by nuclease activity, all the more relevant in cells expressing 2A, where the nuclear-cytoplasmic barrier is disrupted and cytoplasmic nucleases may enter the nucleus.

      The same applies to images in Fig. 3D. There are no markers of infection; moreover, the experiment description indicates that EU labeling began at 24 h post-infection with an MOI of 5, i.e., essentially all cells should have been infected. This is difficult to believe as the replication cycle of most EVD68 strains in HeLa cells is no longer than 12 h, yet the images do not show any signs of CPE, and demonstrate a strong EU signal, inconsistent with the expected inhibition of nuclear transcription, a known attribute of enterovirus infections.

      The claim that nuclear transcription and RNA export remain unaffected in conditions of 2A-mediated disruption of the nucleo-cytoplasmic barrier is very strong and requires equally strong evidence.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript investigates the role of EV-D68 proteases 2A and 3C in nuclear pore complex (NPC) dysfunction and their contribution to motor neuron toxicity. The authors demonstrate that both proteases cleave only a limited number of nucleoporins, with 2A^pro showing the strongest impact by inhibiting nuclear import and export of proteins and disrupting NPC permeability without affecting RNA export. Importantly, treatment with the 2A^pro inhibitor telaprevir reduced neuronal cell death in a dose-dependent manner, achieving neuroprotection at concentrations below those required to inhibit viral replication. The study addresses a relevant mechanism underlying EV-D68-induced neuropathology and explores a potential therapeutic intervention.

    3. Reviewer #3 (Public review):

      Summary:

      The authors expressed the viral proteases 2Apro and 3Cpro of EV-D68 and showed that they cleaved specific components of the nuclear pore complex (Nup98 and POM121 by 2Apro), and that 2A but not 3C expression altered nuclear import and export. Similar nucleocytoplasmic transport deficits are observed in EV-D68-infected RD cells and iPSC-derived motor neurons (diMNs). The 2A inhibitor telaprevir partially rescued the nucleocytoplasmic transport deficits and suppressed neuronal cell death after infection. While it's clear that 2A can cleave NPC proteins and affect nuclear transport, the link to neurotoxicity after EV-D68 infection is less convincing.

      This study opens up a very intriguing hypothesis: that EV-D68 2Apro could be directly responsible for motor neuron cell death, mediated by POM121 and possibly Nup98 cleavage, that ultimately results in paralysis known as acute flaccid myelitis. This hypothesis notably does run counter to other published data showing that human neuronal organoids derived from iPSCs can support productive EV-D68 infection for weeks without cell death and that EV-D68-infected mice can have paralysis prevented by depletion of CD8 T cells, still with EV-D68 infection of the spinal cord. However, even if 2Apro is not ultimately responsible for motor neurons dying in human infections, that does not exclude the possibility that cleavage of nups could still disrupt motor neuron function. Notably, most children with AFM have some amount of motor function return after their acute period of paralysis, but most still have some residual paralysis for years to life. It is possible that 2Apro could mediate the acute onset of weakness, while T cells killing neurons could determine the amount of long-term, residual paralysis.

      Strengths:

      The characterization of nuclear pore complex components that appear to be targets of both poliovirus and EV-D68 proteases is quite thorough and expansive, so this data set alone will be useful for reference to the field. And the process by which the authors narrowed their focus to EV-D68 2Apro reducing Nup98 and POM121 as consequential to both import and export of nuclear cargo but not RNA was technically impressive, thorough, and convincing. As will be detailed below, when the authors move from studying over-expressed proteases in transformed cell lines to studying actual virus infection in both transformed cell lines and iPSC-derived neurons, some of the data only indirectly support their conclusions; however, the quality of the experiments performed is still high. So even if the claim that 2Apro causes neurotoxicity is circumstantial, the data certainly are intriguing and certainly justify further study of the effects of EV-D68 2Apro on the NPC and how this impacts pathogenesis. This is a convincing start to an intriguing line of inquiry.

      Comments on revisions:

      The authors have returned a stronger revised manuscript, being responsive to most of the combined reviewers' comments. It was especially important to add the clarity and specificity that the data in this manuscript did not establish a direct link for 2Apro causing AFM. The authors have clarified this language adequately, such that it is appropriate to remove the "incomplete" portion of the short assessment as they have requested. Adding in experiments with EV-D68 virus infection to complement their work with recombinant proteases also strengthened their conclusions.

      There are still some areas where discrepancies remain, although these are minor and can mostly be acknowledged as limitations of their approach rather than needing more experiments, unless the authors choose to do the additional experiments. To try to make this understandable, I have copied from the rebuttal letter (*) original comment, (**) author's rebuttal, and (***) a reply to the rebuttal:

      (*)(2) Telaprevir was able to rescue nucleocytoplasmic transport in RD cells at low concentrations (Figure 4A). It is not shown if this correlates with its antiviral effect in RD cells, or could this correlate with inhibition of 2A cleavage of Nup98 or POM121, which is never measured.

      (**) In the aforementioned new experiment in Figure 4A, we have also included a dose-response curve for telaprevir showing its inhibition of POM121 and Nup98 cleavage.

      (***) Fig. 4A is in diMNs, not RD cells. The EC50 of telaprevir could be very different in RD cells vs. diMNs. This question remains unanswered.

      (*) (3) Building off of the prior point, the authors' claim that the neuroprotective effect of telaprevir is independent of its antiviral effect is not well-founded. Figure 4E (neuroprotection) was done with MOI 5, and Figure 4G (virus growth) was MOI 0.5. Telaprevir neuroprotection is not shown at MOI 0.5, nor is the neuroprotective effect correlated with inhibition of 2A cleavage of Nup98 or POM121.

      (**) The selection of MOIs for these two experiments was limited by technical considerations. If the viral growth curve were to be performed at MOI 5, it would be confounded by cell death. Further, a low MOI is required in order to allow multiple rounds of infection, and is therefore more sensitive for assaying the effect of telaprevir on viral replication. On the other hand, at MOI 0.5 diMN death is very gradual, and in the neuroprotection assay we would have lacked the statistical power to determine whether a rescue of this small magnitude of toxicity is significant. The EC50 of telaprevir is not expected to vary at different MOIs.

      (***) This should be discussed in the Discussion as a limitation of the experiment.

      (**) We have also now correlated the inhibition of 2Apro cleavage of Nup98 and POM121 with the neuroprotective effect at comparable concentrations of telaprevir, as described above.

      (***) Unless you quantify this, my eye disagrees with you. In Fig. 4A, cleavage of NUP98 is rescued by 3 µM telaprevir, but that does not seem to be the case for POM121.

      Additionally, in Fig. 4D, why is only NLS but not NES impaired in diMNs? This should be discussed.

    1. Reviewer #1 (Public review):

      Summary

      Fogel & Ujfalussy report an extension of a visualization tool that was originally designed to enable an understanding of detailed biophysical neuron models. Named "extended currentscape", this new iteration enables visual assessment of individual currents across a neuron's spatially extended dendritic arbor with simultaneous readout of somatic currents and voltage. The overall aim was to permit a visually intuitive understanding of how a model neuron's inputs determine its output. This goal was worthwhile and the authors achieved it. Demonstrating the utility of extended currentscape, the authors leverage their models to generate interesting and detailed biophysical insights into widely studied neurophysiological phenomena with clear behavioral relevance. Overall, this study provides a valuable and well-characterized biophysical modeling resource to the neuroscience community.

      Strengths

      The authors significantly extended a previously published open-source biophysical modeling tool. Beyond providing important new capabilities, the potential impact of extended currentscape is boosted by its integration with preexisting resources in the field.

      In keeping with the authors' goal to provide an approachable platform with intuitive visualizations of how current flows through neurons, the manuscript is approachable to non-computationalists. In particular, a dedicated glossary and elegant illustrations in Figure 2 boost accessibility for biologists.

      Extended currentscape produces intriguing and detailed predictions spanning neurophysiological phenomena such as local dendritic spikes, complex spike generation, and feature selectivity (hippocampal place fields). By triggering analysis of modeled synaptic inputs on these events, the authors trace their origins from dendritic integration to synaptic input patterns.

      The authors cleverly apply a graph theoretical approach to efficiently model bidirectional current flow throughout a neuron's dendritic arbor. As a result, extended currentscape can run on a standard personal computer.

      The code is well-documented and freely available via GitHub.

      Weaknesses

      While extended currentscape meets its objective of modeling and illustrating the propagation of axial currents throughout a model neuron in great detail, it requires simulation and measurement of synaptic input currents. For this reason, there currently exists a very high technical barrier to conclusively test its intriguing predictions: simultaneous readout of synaptic inputs throughout a neuron's dendritic arbor. Mitigating this weakness, the authors propose a relatively more feasible alternative approach in Discussion: simultaneous voltage imaging of dendrites and their soma while estimating synaptic inputs from the distributions of voltage dynamics along individual dendritic branches.

    2. Reviewer #2 (Public review):

      The electrical activity of neurons and neuronal circuits is dictated by the concerted activity of multiple ionic currents. Because directly investigating these currents experimentally is not possible with current methods, researchers rely on biophysical models to develop hypotheses and intuitions about their dynamics. Models of neural activity produce large amounts of data that are hard to visualize and interpret. The currentscape technique helps visualize the contributions of currents to membrane potential activity, but it is limited to model neurons without spatial properties. The extended currentscape technique overcomes this limitation by tracking the contributions of the different currents from distant locations. This extension allows tracking not only the types of currents that contribute to the activity in a given location, but also visualizing the spatial region where the currents originate. The procedure is first illustrated in a simple setting that allows testing its validity in an intuitive situation where a cell with an apical trunk and two dendritic branches responds to synaptic inputs. The procedure is then applied to study the initiation of complex spike bursts in a model hippocampal place cell.

      The extended currentscape method represents a significant improvement over the original technique, which is already utilized by several research groups. By enabling the analysis of current contributions in spatially extended models, this technique provides a new lens for investigating neuronal and circuit dynamics and will be of use to the modeling community.

      Comments on revisions:

      The changes in Figure 2 greatly improved the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      In this article, Xiao et al. aimed to identify the precise targets by which magnesium isoglycyrrhizinate (MgIG) functions to improve liver injury in response to ethanol treatment. The authors found through a series of in-vivo and molecular approaches that MgIG treatment attenuates alcohol-induced liver injury through a potential SREBP2-IDI1 axis. The revised manuscript adds to a previous set of literature showing MgIG improves liver function across a variety of etiologies, and also provides mechanistic insight into its mechanism of action. All major weaknesses were addressed in the revised submission.

      Strengths:

      (1) The authors use a combination of approaches from both in-vivo mouse models to in-vitro approaches with AML12 hepatocytes to support the notion that MgIG does improve liver function in response to ethanol treatment.

      (2) The authors use both knockdown and overexpression approaches, in-vivo and in-vitro, to support most of the claims provided.

      (3) Identification of HSD11B1 as the protein target of MgIG, as well as confirmation of direct protein-protein interactions between HSD11B1/SREBP2/IDI1 is novel.

      Weaknesses:

      The authors addressed all my concerns.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors investigated magnesium isoglycyrrhizinate (MgIG)'s hepatoprotective actions in chronic-binge alcohol-associated liver disease (ALD) mouse models and ethanol/palmitic acid-challenged AML-12 hepatocytes. They found that MgIG markedly attenuated alcohol-induced liver injury, evidenced by ameliorated histological damage, reduced hepatic steatosis, and normalized liver-to-body weight ratios. RNA sequencing identified isopentenyl diphosphate delta isomerase 1 (IDI1) as a key downstream effector. Hepatocyte-specific genetic manipulations confirmed that MgIG modulates the SREBP2-IDI1 axis. The mechanistic studies suggested that MgIG could directly target HSD11B1 and modulate the HSD11B1-SREBP2-IDI1 axis to attenuate ALD. This manuscript is of interest to the research field of ALD.

      Strengths:

      The authors have performed both in vivo and in vitro studies to demonstrate the action of magnesium isoglycyrrhizinate on hepatocytes and an animal model of alcohol-associated liver disease.

      Original comment (1):

      In Supplemental Figure 1A, all the treatment arms (A-control, MgIG-25 mg/kg, MgIG-50 mg/kg) showed body weight loss compared to the untreated controls. However, Figure 1E showed body weight gain in the treatment arms (A-control and MgIG-25 mg/kg), why? In Supplemental Figure 1A, the mice with MgIG (25 mg/kg) showed the lowest body weight, compared to either A-control or MgIG (50 mg/kg) treatment. Can the authors explain why MgIG (25 mg/kg) causes more body weight loss than MgIG (50 mg/kg)? What about the other parameters (ALT, AST, NAS, etc.) for the mice with MgIG (50 mg/kg)?

      Author's response:

      We agree that this observation does not strictly follow a dose-dependent pattern. In vivo responses to pharmacological interventions, particularly in metabolic and liver disease models, are not always linear. The relatively greater body weight reduction observed in the 25 mg/kg group may be influenced by inter-individual variability, differences in metabolic adaptation, or sample size-related variation. Importantly, these differences in body weight were not statistically significant. Therefore, we selected the 50 mg/kg dose for subsequent animal experiments, as it demonstrated more consistent and stable improvements across multiple parameters, including body weight, ALT, AST, TG, and TC.

      New comment:

      My first question: All the treatment arms (A-control, MgIG-25 mg/kg, MgIG-50 mg/kg) showed significant body weight loss compared to the untreated controls (Supplemental Figure 1A), but the body weight significantly increased in the treatment arms (A-control and MgIG-50 mg/kg) compared to the untreated controls (Figure 1E). Why?

      My second question: Mice with MgIG (25 mg/kg) showed the lowest body weight, compared to either A-control or MgIG (50 mg/kg) treatment. According to the authors' explanation, the body weight loss caused by MgIG (25 mg/kg) is attributed to inter-individual variability, differences in metabolic adaptation, or sample size-related variation. Did these differences occur in the MgIG (25 mg/kg) group only, or in all other groups? The mouse group assignment should be randomized; however, a large variation in body weight was seen in the MgIG (25 mg/kg) group. It is not convincing for the authors to select the MgIG (50 mg/kg) group for subsequent animal experiments because of a large variation in the MgIG (25 mg/kg) group and because the MgIG (50 mg/kg) group demonstrated more consistent and stable improvements across multiple parameters. The authors should reanalyze and compare all the raw data between the MgIG (50 mg/kg) and MgIG (25 mg/kg) groups, address the issues pointed out, and justify the rationale for the animal group assignment.

      Original comment (2):

      IL-6 is a key pro-inflammatory cytokine significantly involved in ALD, acting as a marker of ALD severity. Can the authors explain why MgIG 1.0 mg/ml shows higher IL-6 gene expression than MgIG (0.1-0.5 mg/ml)? Same question for the mRNA levels of lipid metabolic enzymes Acc1 and Scd1.

      Author's response:

      Thank you for this important comment. We agree that IL-6, as well as lipid metabolism-related genes such as Acc1 and Scd1, are key indicators in ALD. The relatively higher expression observed at 1.0 mg/mL MgIG compared to lower concentrations (0.1-0.5 mg/mL) may be related to experimental constraints associated with the MgIG formulation used in this study. Specifically, to maintain consistency with our in vivo experiments, we used a clinically available liquid formulation of MgIG (5 mg/mL), which is approved for intravenous administration in China. Due to its relatively low stock concentration, achieving higher working concentrations (e.g., 1.0 mg/mL) in vitro required a larger volume of the MgIG solution, thereby proportionally reducing the volume of culture medium. This reduction in effective culture conditions may adversely affect hepatocyte viability and function. Supporting this, our CCK-8 and LDH assays indicated that higher MgIG concentrations were associated with subtle cytotoxicity or impaired cell status.
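      The dilution argument in this response can be checked with simple arithmetic. The sketch below is illustrative only and is not the authors' protocol: it assumes the 5 mg/mL stock is added directly to culture medium with no other diluent, and the function name `stock_fraction` is hypothetical.

```python
# Illustrative dilution arithmetic (C1 * V1 = C2 * V2); not the authors' protocol.
# Assumption: the stock is mixed directly with culture medium, no other diluent.
STOCK_MG_PER_ML = 5.0  # stock concentration stated in the response


def stock_fraction(working_mg_per_ml, stock_mg_per_ml=STOCK_MG_PER_ML):
    """Return the fraction of the final volume that must be stock solution."""
    if not 0 <= working_mg_per_ml <= stock_mg_per_ml:
        raise ValueError("working concentration must lie between 0 and the stock concentration")
    return working_mg_per_ml / stock_mg_per_ml


# Working concentrations mentioned in the review thread.
for dose in (0.1, 0.5, 1.0):
    frac = stock_fraction(dose)
    print(f"{dose:.1f} mg/mL: {frac:.0%} stock, {1 - frac:.0%} medium")
```

      Under these assumptions, the 1.0 mg/mL condition replaces one fifth of the medium with stock solution, compared with 2% at 0.1 mg/mL, which is the confound the response describes.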

      New comment:

      The authors' response did not answer my question. If the authors believe the result could reflect experimental constraints associated with the MgIG formulation, then the use of this formulation in all other associated experiments is questionable. The experiments, at least those that used this MgIG formulation, need to be repeated.

      Original comment (3):

      For the qPCR results of Hsd11b1 knockdown (siRNA) and Hsd11b1 overexpression (plasmid) in AML-12 cells (Figure 5B), what is the description for the gene expression level (Y axis)? Fold changes versus GAPDH? Hsd11b1 overexpression appeared inefficient (20-23 units on the Y axis), even lower than the Hsd11b1 knockdown (above 50 units on the Y axis). The authors need to explain this. For the plasmid-based Hsd11b1 overexpression, why does the scramble control inhibit Hsd11b1 gene expression (less than 2 units on the Y axis)? Again, this needs to be explained.

      Author's response:

      Thank you for this important comment, and we apologize for the lack of clarity in the Y-axis labeling, which may have led to misunderstanding.

      As shown in Figures 5A and 5B, we have revised the Y-axis description to clearly indicate that gene expression levels are presented as relative expression normalized to GAPDH (fold change relative to the control group).

      New comment:

      The authors explained that relative expression was normalized to GAPDH (fold change), but they did not answer my question, which concerns Figure 5B. In Figure 5B (left, Hsd11b1-KD), the scramble control showed over 100 (units); however, in Figure 5B (right, Hsd11b1-OE), the scramble control showed only 0.5-1 (units). The data suggest that the authors used the same scramble control for both KD and OE. If so, they should provide more details of the KD and OE experiments and explain why this happened. If they used a plasmid as the OE control, they also need to clarify this. In addition, qPCR alone is not a good assay to show the success of KD or OE; Western blotting should be done to provide convincing evidence of successful KD or OE.

    1. Reviewer #2 (Public review):

      In this paper, Biswas et al. describe the role of acetylcholine (ACh) signaling in protection against chronic oxidative stress in C. elegans. They showed that disruption of ACh signaling in either unc-17 mutant or gar-3 mutants led to sensitivity to toxicity caused by chronic paraquat (PQ) treatment. Using RNA seq, they found that approximately 70% of the genes induced by chronic PQ exposure in wild type failed to upregulate in these mutants. The overexpression of gar-3 selectively in cholinergic neurons was sufficient to promote protection against chronic PQ exposure in an ACh-dependent manner. The study points to a previously undescribed role for ACh signaling in providing organism-wide protection from chronic oxidative stress likely through the transcriptional regulation of numerous oxidative stress-response genes. The paper is well-written, and the data are robust. While the study identifies the muscarinic ACh receptor gar-3 as an important regulator of the response to PQ, the specific neurons in which gar-3 functions were not unambiguously identified, and the sources of ACh that regulate GAR-3 signaling and the identities of the tissues targeted by gar-3 remain unknown.

      Comments on revisions:

      No further comments.

    1. Reviewer #1 (Public review):

      Summary:

      In the manuscript "Conformational Variability of HIV-1 Env Trimer and Viral Vulnerability", the authors study the fully glycosylated HIV-1 Env protein using an all-atom forcefield. It combines long all-atom simulations of Env in a realistic asymmetric bilayer with careful data analysis. This work clarifies how the CT domain modulates the overall conformation of the Env ectodomain and characterizes different MPER-TMD conformations. The authors also carefully analyze the accessibility of different antibodies to the Env protein.

      Strengths:

      This paper is state-of-the-art given the scale of the system and the sophistication of the methods. The biological question is important, the methodology is rigorous, and the results will interest a broad eLife audience. The authors also establish strong connections to previous literature and acknowledge the limitations of the CT-truncated protein construct, which enhances the manuscript's relevance to the community.

    2. Reviewer #2 (Public review):

      In this work, the authors elucidate how a viral surface protein behaves in a membrane environment and how its large-scale motions influence the exposure of antibody-binding sites. Using long-timescale, all-atom molecular dynamics simulations of a fully glycosylated, full-length protein embedded in a virus-like membrane, the study systematically examines the coupling between ectodomain motion, transmembrane orientation, membrane interactions, and epitope accessibility. Multiple model variants differing in cleavage state, initial transmembrane configuration, and presence of the cytoplasmic tail are compared to identify general features of protein-membrane dynamics relevant to antibody recognition.

      A major strength of this study is the scope and ambition of the simulations. The authors perform multiple microsecond-scale simulations of a highly complex, biologically realistic system that includes the full ectodomain, transmembrane region, cytoplasmic tail, glycans, and a heterogeneous membrane. The finding that the ectodomain explores a wide range of tilt angles while the transmembrane region remains more constrained, with limited correlation between the two, offers useful conceptual insight into how global motions may be accommodated without large rearrangements at the membrane anchor. The explicit consideration of membrane and glycan steric effects on antibody accessibility further strengthens the study.

      The main limitations relate to sampling and model dependence inherent to simulations of this size and complexity. The analysis of antibody accessibility is based on geometric and steric criteria, which do not capture potential conformational adaptations of antibodies or membrane remodeling during binding; the authors have appropriately noted this as a limitation.

      In the revised manuscript, the authors have addressed all previously raised concerns. Time series plots of the tilt angles have been added, figure captions and visual encodings have been clarified, quantitative descriptions of angular distributions have been strengthened, and the distance metric for MPER exposure is now accompanied by temporal data. The overall presentation is substantially improved, and the conclusions are well supported by the data as presented.

    3. Reviewer #3 (Public review):

      Summary:

      This study uses large-scale all-atom molecular dynamics simulations to examine the conformational plasticity of the HIV-1 envelope glycoprotein (Env) in a membrane context, with particular emphasis on how the transmembrane domain (TMD), cytoplasmic tail (CT), protomer cleavage, and membrane environment influence ectodomain orientation and antibody epitope exposure. By comparing Env constructs with and without the CT, explicitly modeling glycosylation, and embedding Env in an asymmetric lipid bilayer, the authors aim to provide an integrated view of how membrane-proximal regions and lipid interactions shape Env antigenicity, including epitopes targeted by MPER-directed antibodies.

      Strengths:

      The authors have made a genuine effort to address the concerns raised in the first round of review, and the revised manuscript is substantively improved. The addition of dynamical cross-correlation maps, expanded citation of prior computational work, clarification of the membrane composition rationale, data deposition to Zenodo, and the new discussion contextualizing the independence of ectodomain and TMD motions are all welcome. Several scientifically interesting aspects of the work merit highlighting before the remaining concerns are addressed.

      A key strength of this work remains the scope, scale, and realism of the simulation systems. The authors construct a very large, nearly complete-Env-scale model that includes a glycosylated Env trimer embedded in an asymmetric bilayer, enabling analysis of membrane-protein interactions that are difficult to capture experimentally. The inclusion of specific glycans at reported sites, and the focus on constructs with and without the CT or cleavage, are well motivated by existing biological and structural data.

      The observation that R696 orientation and its interacting partners give rise to asymmetric protomer conformations and distinct TMD tilts is a notable finding. The statement that interactions between R696 and lipid headgroups or CT residues can be strong enough to introduce a kink into the TMD is well-supported by representative snapshots and consistent with prior isolated-TMD simulations. The use of two initialization depths ("high" and "low") to probe R696 leaflet preference is methodologically interesting and the authors' interpretation - that there is a slight bias toward cytoplasmic leaflet interactions, but that these contacts could be highly dynamic over the course of viral entry - is appropriately cautious. It would be valuable to explicitly frame this as a hypothesis with testable predictions that future experimental or enhanced-sampling work could address. Similarly, the equilibration-driven kinking of the TMD core, consistent with prior isolated-TMD studies, represents a useful validation that extends those earlier observations to the intact trimeric context.

      The simulations reveal substantial tilting motions of the ectodomain relative to the membrane, with angles spanning roughly 0-30° (and up to ~40° in some analyses), while the ectodomain itself remains relatively rigid. This framing, that much of Env's conformational variability arises from rigid-body tilting rather than large internal rearrangements, is an important conceptual contribution. The authors also provide interesting observations regarding asymmetric bilayer deformations, including localized thinning and altered lipid headgroup interactions near the TMD and CT, which suggest a reciprocal coupling between Env and the surrounding membrane.

      The analysis of antibody-relevant epitopes across the prefusion state, including the V1/V2 and V3 loops, the CD4 binding site, and the MPER, is another strength. The study makes effective use of existing experimental knowledge in this context, for example by focusing on specific glycans known to occlude antibody binding, to motivate and interpret the simulations.

      Finally, the revised discussion provides more context that situates the study's findings and discrepancies within the broader literature, strengthening the manuscript's clarity and interpretability.

      Weaknesses:

      The revised work is much improved, but it still has substantive writing issues, including organizational problems such as run-on paragraphs, as well as citation issues. Improving these would help readers make the most of this important study.

      The revised Introduction now includes a paragraph summarizing prior MD work, which is an improvement. However, the paragraph remains structured around the limitations and setup of previous studies (e.g., "early studies were constrained by limited computational resources", short trajectory lengths, isolated constructs) rather than their findings. Readers benefit most from understanding what those studies showed - and where the present work confirms, extends, or diverges from those results. The current framing inadvertently positions prior work as deficient scaffolding rather than as independent data points converging on shared conclusions. The Introduction could be revised to briefly summarize the key biological conclusions from prior MD studies alongside their technical context, which could then be revisited in their appropriate place alongside key results.

      The authors have verified that PDB entries are cited at first mention, and this is noted. However, a recurring issue remains: key literature-supported conclusions appear in the Results and Discussion sections without accompanying citations at each point of use. Passages that summarize experimental or computational findings, particularly those used to validate or contextualize the authors' own results, require citation at every point of claim, not only at first introduction of a reference. This is not a minor stylistic preference. Downstream readers, systematic reviewers, and automated tools that map literature to claims (e.g., scite) rely on co-occurrence of claims and citations within the same passage. A citation appearing several paragraphs earlier does not carry attribution forward. As a practical example, the statement that "MPER-targeting antibodies bind effectively only after the gp120-gp41 trimer undergoes major conformational rearrangements toward a fusion-intermediate or post-fusion state (Frey et al., 2008; Alam et al., 2009; Chen et al., 2014; Lee et al., 2016)" is appropriately cited. That same standard of inline attribution should be applied throughout, including in Results and Discussion subsections where prior experimental findings are mentioned without citation.

      Additionally, cited literature should be framed to highlight convergence with the authors' conclusions, not primarily to point out the limitations of previous studies. Where prior studies independently support a finding, this should be stated explicitly. Independent replication across methods and systems is one of the strongest arguments for ground truth; treating it as such would improve the manuscript's scientific standing.

      Finally, the dynamical cross-correlation maps assess ectodomain-TMD coupling, and the authors appropriately acknowledge that microsecond simulations capture only the closed ground state. However, the revised manuscript does not address the question raised in the first review regarding CT-TMD and CT-ectodomain correlations. The Results section states that "very weak correlations between the ectodomain and the TMD" were found, but it is not clear whether the CT was included in this analysis or whether analogous correlation maps for CT-TMD and CT-ectodomain pairs were computed for the full-length systems. Additional analyses of the authors' deposited MD trajectories, such as probing for exposure of cryptic epitopes and potential allosteric coupling, could serve as valuable extensions of this work.

    1. Reviewer #1 (Public review):

      Summary:

      There is evidence that some genes encode mRNAs from which separate processed transcripts may arise, separating the coding sequence (CDS) from the 3'-UTR, and with both mRNA elements remaining stable in the cell. However, the functional consequences of these mRNA fragments have not been firmly established. In the manuscript by Yang et al., the authors probe the mRNA domain architecture of Nanog in the context of embryonic stem cell colonies and blastocysts. The authors detect spatial separation of Nanog CDS-containing mRNA from abundant Nanog 3'-UTR RNAs depending on the cell position in 2D embryonic stem cell colonies or in blastocysts.

      Strengths:

      The phenotypic analyses of the Nanog mRNA hold promise for revealing distinct roles for the Nanog encoded protein and a separate RNA encompassing the Nanog 3'-UTR.

      Weaknesses:

      There are a number of questions about the molecular nature of the mRNA species that the authors should address in order for the results to be firmly established, as noted below.

      (1) It is not clear how the authors verified that their probes are specific for Nanog CDS or 3'-UTR regions. Especially for the 3'-UTR probe, it is confusing why colonies show green only regions, suggesting only the CDS is present. I would expect the CDS and 3'-UTR probes to colocalize in the interior cells. Is it possible that the 3'-UTR probe is targeting another RNA?

      (2) It would help for the authors to include a graphic similar to Figure 3, Figure Supplement 1A, that diagrams the location of the CDS and 3'-UTR probes (this should also be done for Oct4 and Sox2). This graphic could also show all potential polyadenylation signals.

      (3) I think, based on the fluorescence patterns, there is evidence that the signal for the Nanog 3'-UTR probe is nuclear (images with DAPI staining), but I could not find any comment on this. This should be discussed, as nuclear retention has implications for the noncoding function of the 3'-UTR fragment.

      (4) Figure 2, Figure Supplement 1A needs a better explanation. It's not clear how the reads map to the different regions of the Nanog mature mRNA. The authors should show examples at different ratios of CDS to 3'-UTR. Do the reads have a sharp boundary at the junction of where the isolated 3'-UTR is thought to occur?

      (5) I looked in the Zenbu browser at human NANOG CAGE mapping in the FANTOM5 dataset. I could not see evidence for substantial capping of a 3'-UTR fragment when filtering for embryonic cell types. Given the strong signal for the 3'-UTR in border cells, I would expect to see evidence for capping if the RNA were indeed capped. This suggests that if it exists, it is likely uncapped and (as noted in point 3) is likely nuclear retained.

      (6) Are there predicted polyadenylation signals near the end of the CDS that would generate a short 3'-UTR, and are these signals conserved across mammals?

      (7) It would help to see a zoomed-in view of the region targeted by one of the guide RNAs in the 3'-UTR, and where that site is relative to the polyadenylation signal. Is the polyadenylation signal upstream, i.e., CDS proximal?

      (8) A final note, the use of green and red together will be challenging for those who are colorblind. Providing a different false color palette would be helpful.

      I am refraining from comments on the cell biology and morphological insights, as they are remote from my core expertise.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript shows that the coding sequence (CDS) and 3' untranslated region (3'UTR) of mRNA transcripts from the Nanog gene have distinct expression patterns and functions. In both human and mouse embryonic stem cell colonies and blastocysts, these domains are spatially segregated, with 3'UTR-enriched cells occupying the borders and CDS-enriched cells residing in the interior. CDS mRNA expression is correlated with the expected regulation of transcription and epigenetics associated with the Nanog protein. Interestingly, expression of the 3'UTR appears to play an independent role in cell behavior and colony morphogenesis. Indeed, deletion of the 3'UTR causes specific defects in cell spreading and protrusive activity, with alteration in the localization of adhesion and cytoskeleton-associated proteins. Remarkably, a large proportion of those defects are rescued upon ROCK inhibition. Deletion of either Nanog CDS or 3'UTR leads to distinct modifications in the differentiation competence.

      Strengths:

      The independent role of 3'UTR mRNA domains, although identified in neurosciences a couple of years ago, is a novel and exciting field relatively unexplored in early development.

      The manuscript offers a multilayered series of experiments in ES cell colonies, blastocysts, and embryoid bodies, including imaging, -omics, genetic and pharmacological challenges, and differentiation experiments, thereby unveiling very convincingly the role of the Nanog 3'UTR in morphogenesis.

      Weaknesses:

      The pathways leading to the generation of those distinct transcript domains are unknown. Although the functional differential roles are well demonstrated, whether the expression patterns are a cause or a consequence of the cells' localisation in the embryo remains to be explored.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Yang et al reported distinct functions of the protein-coding sequence (CDS) and the 3' untranslated region (UTR) in the Nanog mRNA in pluripotent stem cells. They first observed different localization patterns for the CDS and 3' UTR in embryonic stem cells and in blastocyst embryos, and this pattern correlates with cell populations in different pluripotent states based on single-cell sequencing data. To characterize the potentially distinct functions of these regions, the authors generated knockout (KO) cell lines in which either the CDS or the 3' UTR was genetically ablated. These deletions led to different phenotypes in multiple assays. These results provided evidence that the CDS and 3' UTR of an mRNA could have distinct functions. Although these results are potentially interesting, several questions need to be addressed before the validity of their conclusion can be confirmed.

      Strengths:

      This study provides evidence for distinct functions of the protein-coding sequence and 3' untranslated region of an mRNA in pluripotent stem cells. The concept could be more broadly applied.

      Weaknesses:

      The initial observation (distinct localization of CDS and 3' UTRs) and the causal relationship between the KO and phenotype need further validation.

      Major points:

      (1) The authors showed distinct localization patterns of the CDS and 3' UTRs in human and mouse ESCs and blastocysts, and the overlap between their signals was minimal (Figure 1). Does this mean that the CDS and 3' UTR RNAs exist separately? For example, in cells that only showed signals for 3' UTRs, do these RNAs only contain 3' UTRs and lack CDS? Was this confirmed by RNA-seq experiments? If so, how are they generated (i.e., by transcription from a novel promoter or partial degradation of the full-length mRNAs)? This is a key question. Without a clear characterization of these RNAs, the rest of the study cannot be substantiated.

      (2) To confirm that the phenotypes of CDS or 3' UTR KO cells were caused by the deleted regions instead of other artifacts, rescue experiments should be performed.

      (3) As over-expression of the 3' UTR showed a phenotype, important regions within it should be identified, and also the possibility that the 3' UTR contains open reading frame(s) and is translated should be tested.

    1. Reviewer #1 (Public review):

      Summary:

      Dalben et al. grafted the fusion loop mature (FLM) modification, based on a previously reported D2-FLM, to another serotype DENV4, and adapted them to replicate in Vero cells for live attenuated vaccine (LAV) manufacturing while retaining favorable antigenic profiles, generating two new strains: D2-vFLM and D4-vFLM. Deep sequencing revealed adapted mutations at the junction of envelope domains I and II (EDI and EDII), and both D2-vFLM and D4-vFLM showed no evidence of ADE in the presence of FL-targeting Abs. Sera from D2-vFLM immunized mice displayed strong homotypic and reduced heterotypic neutralization compared to wild-type viruses, with minimal to no ADE potential in vitro. Moreover, D2-vFLM immunization completely protected AG129 mice from lethal challenge with mouse-adapted D220. They demonstrate that the FLM modification platform is transferable across serotypes and yields strains with favorable immunogenicity and reduced ADE risk. The FLM approach provides a promising path toward the development of a safer tetravalent DENV LAV.

      Strengths:

      The authors carried out a series of experiments to generate and characterize two new strains (D2-vFLM and D4-vFLM) of FLM-modified viruses, and showed their antigenic and immunogenic profiles. The observation that the FLM modification platform is transferable across serotypes and yields strains with favorable immunogenicity and reduced ADE risk is interesting.

      Weaknesses:

      However, one concern is the total number of mutations (including originally introduced and compensatory mutations) in this FLM vaccine platform, and the future directions for the proof-of-concept vaccine in this study are not clear.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, YR Dalben et al describe the generation of DENV2 and DENV4 strains with mutations in the fusion loop (FL) of the E protein and pre-membrane (prM) protein to limit potential antibody-dependent enhancement (ADE) resulting from vaccination with live-attenuated vaccines and adapted these strains for growth in Vero cells. They show that the DENV2 version D2-vFLM is immunogenic and generates neutralizing serum against DENV2 and DENV4 after 2 boosts and is protective against lethal challenge. Serum from D2-vFLM also showed no ADE against DENV4.

      Strengths:

      Overall, the paper is well written and presented, and the data presented support most of the conclusions made. Grafting D2-FLM mutations to DENV4 and adapting both to growth in Vero cells is a good step to show that this method could be used to generate production-level LAV. The growth and stability data are clear and well-conducted.

      Weaknesses:

      However, there are several weaknesses, mostly in regard to the immunogenicity data, that limit the overall impact. The FLM mutations were only grafted to DENV4 but not to the other Dengue serotypes. The authors acknowledge that this is a proof-of-concept, but generating mutants of the other serotypes would strengthen the idea that this could be used to develop a tetravalent LAV. Immunizations in mice were only performed for D2-vFLM but not D4-vFLM. Immunogenicity data for D4-vFLM would strengthen this work if it shows that it can be immunogenic, protective, and limit ADE, as is shown for D2-vFLM. ADE from D2-vFLM was only tested against DENV4; does it also limit ADE from the other serotypes? This would better show that these mutations do limit ADE across serotypes and not just a single one.

      Additionally, some of the immunization data likely need to be repeated:

      The authors should describe why they pooled the sera from the mice and whether they purified total IgG or not (Figure 5). They should also probably repeat the challenge experiment since it was 4 mice (D2) against 5 (D2-vFLM), and it is unclear if there is a statistical difference between the results obtained. It is not even mentioned in the Results section (D2 result vs D2-FLM), and thus unclear if using D2-FLM is an improvement in the way the data is currently presented.

    1. Reviewer #1 (Public review):

      Summary:

      The authors present a simplified neural bursting model with explicitly controllable parameterization of oscillator dynamics designed for neural circuit modeling involved in rhythm generation.

      Strengths:

      (1) The purpose of the model and applied abstractions are well articulated and justified (2D model, independent parameter control).

      (2) Explicit control of burst duration, inter-burst interval, amplitude, resetting-behavior/entrainment. This allows modelers to focus on circuit interactions and is especially useful when details of intrinsic currents and bursting mechanisms are unknown. One could even imagine a scenario where this model would help identify predictions on key underlying burst generation mechanisms.

      (3) The model is well described and validated with simulations and comparisons to the base model and one alternative model.

      (4) Circuit-level validation is convincing, as it reproduces more than just trivial examples.

      (5) The underlying mechanism in phase space is well reasoned and justified, and it extends previous work, e.g., by McKean, by improving usability.

      Weaknesses:

      (1) The paper heavily relies on numerical demonstrations but does not provide a formal analysis of stability, bifurcations, or entrainment. While appropriate for the intended purposes, a more formal footing could strengthen the model.

      (2) Lots of nice demonstrations are shown, but it is less clear how model parameterization was chosen, how behavior depends on parameterization, and in what parameter ranges certain behavior can be expected. A more detailed description of parameterization/exploration of parameter space would greatly benefit anyone using this model in the future.

      (3) Some claims regarding the reproduction of a prior locomotor CPG model and the production of "more biologically realistic activity" by the presented model are overstated. The key feature of the locomotor CPG models cited was that they reproduced not only the speed-dependent gait expression of intact mice, but also changes in gait expression after silencing/removal of specific commissural and long propriospinal interneurons (e.g., selective loss of trot after deletion of V0V; changes in gait expression and step-to-step variability after silencing of descending long-propriospinal neurons or ascending V3 LPNs). While this is likely (at least partially) feasible with the model formulation, the model has not been shown to reproduce the effects of silencing/ablating these neuron classes. Importantly, though, the authors do not appear to show how the model behaves in general under the influence of noise, which is key to reproducing LPN silencing.

    2. Reviewer #2 (Public review):

      Summary:

      The authors propose a reduced model for intrinsically bursting neurons. The model simply consists of exponential decay of an adaptation variable in a phenomenological silent phase, an exponential growth of that variable in an active phase, and imposed thresholds for jumps between these phases, with some add-ons to allow for effects such as input-dependence.
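      The structure summarized above can be sketched in a few lines. This is a minimal illustration of the reviewer's description only, not the authors' model or code: the function `simulate_two_phase`, all parameter names and values, and the choice of an exact exponential update per time step are assumptions, and input dependence and the activity variable are omitted.

```python
import math


def simulate_two_phase(t_end=10.0, dt=0.001, tau_silent=1.0, tau_active=0.5,
                       low=0.1, high=1.0):
    """Rule-based two-phase oscillator (hypothetical parameters throughout).

    The adaptation variable `a` decays exponentially in the silent phase and
    grows exponentially in the active phase; imposed thresholds `low` and
    `high` trigger the jumps between phases. Returns (time, "on"/"off") events.
    """
    a = high            # start at the top of the silent phase
    active = False
    events = []
    for i in range(int(round(t_end / dt))):
        t = (i + 1) * dt
        if active:
            a *= math.exp(dt / tau_active)    # exponential growth while bursting
            if a >= high:                     # upper threshold ends the burst
                active = False
                events.append((t, "off"))
        else:
            a *= math.exp(-dt / tau_silent)   # exponential decay while silent
            if a <= low:                      # lower threshold starts a burst
                active = True
                events.append((t, "on"))
    return events


events = simulate_two_phase()
print(events[:4])
```

      In this sketch the silent and active durations are approximately `tau_silent * ln(high / low)` and `tau_active * ln(high / low)`, so each can be set independently through its own time constant, which is the kind of separate control of state durations discussed in this review.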

      Strengths:

      The model could be used as a controller for an artificial system that needs to switch between on and off states with separate control of state durations. It has some flexibility to allow for variable levels of the activity variable during the active phase. The authors show that the model can be tuned to capture phase response properties of neurons and patterns generated by small networks of neurons.

      Weaknesses:

      The proposed approach lacks biological relevance, practicality, and originality.

      (1) Biological relevance:

      Central pattern generators and other bursting neurons use specific physical principles to generate their bursts of activity. These principles place constraints on the tuning of these bursts, including relationships between active and silent phase durations and other properties. By discarding these relationships, the proposed model risks losing key constraints that affect performance in biologically relevant scenarios. The proposed model does not allow for the emergence of interesting dynamical phenomena, which occur naturally in neurons and neuronal networks.

      It is also important to note that spikes within bursts can be important and of interest. Biophysical models allow for easy extension to include spikes via fast sodium and potassium currents. The proposed model does not allow for such extensibility.

      Finally, as shown in the seminal early-2000s work of Izhikevich, building on fast-slow decomposition work by Rinzel and others, there is a wide variety of possible neuronal bursting patterns. At the very least, several of these have been observed in neuronal recordings. The authors' model is specific to square-wave bursting.

      (2) Practicality:

      The model makes use of various cut-off functions and other aspects that are implemented as rules. Combining rules with differential equations makes for an awkward modeling framework that is inconvenient to implement, conceptualize, and analyze (e.g., from a bifurcation perspective). Moreover, the authors add more and more adjustments to their basic framework to capture additional features, but these add-ons simply make the model more, and unnecessarily, complicated and awkward. It's worth noting that the authors argue for their model based on the idea that more biophysical models are difficult to tune, yet they compare their model to a biophysical one that they were able to tune to achieve the various patterns that they study. They do not give any indication of how easy or hard it was to tune their own model, nor do they compare simulation times between the two models. I do note that the biophysical model seems to have 22 parameters, whereas the simplified one has 21 in Table 2, which is essentially the same number. Finally, although the authors give some extensions of the model to match observed data, their model does not seem useful for predicting performance in never-before-tested scenarios.

      (3) Originality:

      As the authors note, the use of low-dimensional, specifically planar, neural models dates back to early authors such as FitzHugh and Nagumo. What the authors fail to acknowledge is that Rinzel, Terman, Kopell, and others did seminal work on neuronal activity, including phenomena such as post-inhibitory rebound and fast threshold modulation, using a relaxation oscillation framework, starting several decades ago. Their work included applications to central pattern generators (e.g., see Terman and collaborators on respiratory CPGs). It is astonishing that the authors don't seem to be aware of this work and do not mention it at all. Moreover, I don't see any advantage of the proposed framework over the earlier relaxation oscillator setting, where many important mechanistic principles have already been analyzed, including extensions to networks. On a related note, even though they propose a piecewise linear model, the authors do not cite the substantial existing work on piecewise linear models (e.g., Hahnloser, Neural Networks, 1998, for an early example; 2024 SIAM Review article by Coombes et al. and references therein for much more), including work specifically on bursting, nor do they cite various other previous efforts to capture bursting with simplified models, including work on piecewise linear maps by Aguirre et al.

    3. Reviewer #3 (Public review):

      This computational modeling study introduces the methodology of replacing bursting neurons in a model circuit with a simplified piecewise-linear model with an "active" and a "quiet" state representing, respectively, the burst of spikes and the inter-burst interval. The shape of the active state loosely represents the intra-burst firing rate. Because (piecewise) linear systems are explicitly solvable, the transitions from quiet to active and vice versa can be calculated explicitly to match exactly what a biophysically realistic model or a biological neuron does in different conditions. The base piecewise-linear model is built to represent a 2D biophysical neuron with a cubic v-nullcline. The simplicity of the model allows for matching the kinetics of more complex models with a tractable simplified set of equations, as exemplified by approximations of burst duration and amplitude, phase-response curves, entrainment, and, finally, mimicking the activities of two CPG circuit models using this simplified representation.

      Major comments

      (1) The use of piecewise linear approximations to explicitly estimate properties of biophysical neurons is a well-known and common technique. This study adds nothing to the technique in terms of novelty.

      (2) Although the model explicitly matches active and inactive durations of a circuit neuron, the dynamics are explicitly "clamped" by the user because the reduced model parameters explicitly depend on the input. There are cases where this is useful, for example, when we are interested in the dynamics of _other_ neurons (B, C, D, ...) within the context of activity, and we "clamp" the dynamics of neuron A. One should note that this is no better than having a look-up table. Effectively, to give a comparison, it is like using a sine wave to represent a pacemaker neuron and explicitly defining its frequency at different input levels so that it responds "dynamically". However, the neuron is restricted to what the user puts in, and therefore, calling it a dynamical system is entirely wrong. I am afraid that the use of this crude tool is not described well enough in the manuscript to warn a naïve user not to fall into this trap.

      (3) The phase resetting curves are used incorrectly. PRCs are useful when the perturbation is weak (soft), which would demonstrate the nature of the vector field near the limit cycle and therefore inform us of the nature of its stability or instability. A hard PRC would always reset the cycle to the fixed offset from the perturbation phase and is therefore uninformative in understanding dynamics. (It is, however, useful experimentally in identifying which neurons are part of the CPG.) The authors clearly know that the dynamics of the system away from the limit cycle do not conserve those of a biophysical neuron. So what is the point?

      (4) I work on the STG, one of the systems exemplified here. Even in the small and relatively regular CPGs of the STG, the definition of the active and quiet parts of a burst is often less clear than what the authors suggest. Bursting neurons often do multiple bursts in a cycle, and therefore, substituting the burst envelope is a subjective matter. This is even more problematic in bursting neurons in the brain, where there is often no quiet period. This should be discussed.

    1. Reviewer #1 (Public review):

      The idea is super interesting, and the subsequent work is potentially significant because it links peripheral inflammation to remodelling of perinodal adipose tissue and draining lymph nodes. This suggests an antigen-independent manner by which local tissue inflammation can communicate with and reshape immune organ structure and tissue metabolism. However, the evidence is suggestive. For instance, many conclusions rely on correlational weight/cellularity relationships, models with confounders (spontaneous wounding; potentially systemic IMQ), and macrophage dependence inferred from a single pharmacologic approach without definitive depletion/lineage or tracer-based causal link.

      Major Comments:

      (1) "Wounding/fighting" evidence is confounding.

      Unless I am mistaken, a large part of the argument for inflammation-driven perinodal fat pad atrophy and LN expansion relies on spontaneous fighting injuries in co-housed CCR2-/- males, including animals "culled...due to excessive wounding." Because wound severity, duration, infection load, stress, and cage dynamics are uncontrolled, isn't it difficult to assign causality to "cutaneous inflammation"?

      (2) The "CCR2-independent macrophage" conclusion.

      The manuscript interprets persistence/accumulation of macrophages despite reduced inflammatory monocytes as CCR2-independent recruitment or local proliferation. However, CCR2 deficiency can alter immune baselines and long-term tissue remodelling. Perhaps consider bone marrow chimeras (WT to CCR2-/-, CCR2-/- to WT) or an inducible CCR2 deletion approach to separate developmental/systemic effects from acute inflammation-driven mechanisms. If "in situ proliferation" is proposed, include a direct readout (e.g., Ki67 in ATMs in the fat pad).

      (3) IMQ and systemic effects.

      The work relies on topical Aldara/imiquimod as an "inflammation without antigen" driver of distal LN/fat-pad remodelling. But IMQ is well known (and cited by the authors) to enter circulation and drive systemic responses, which could blur whether effects are truly draining-site specific vs systemic metabolic/inflammatory effects. It would be ideal to provide systemic context: plasma cytokines and/or metabolic readouts (e.g., circulating FFAs) to distinguish local vs systemic drivers.

      (4) Macrophage dependence is inferred from CSF1R inhibitor treatment.

      However, validation of macrophage depletion and specificity is incomplete. The manuscript uses AZD7507 (CSF1R inhibitor) and observes partial rescue of fat pad/LN phenotype while skin severity (PASI) is unaffected. But, to this reviewer, the data shown do not clearly quantify actual macrophage depletion efficiency in the target fat pad and LN at endpoint, and CSF1R blockade can affect multiple myeloid populations. Therefore, show absolute macrophage counts (and likely other myeloid populations) in fat pad and LN with/without AZD7507 at the analysed timepoints, not only outcome weights. (The methods describe dosing but not endpoint depletion quantification?)

      (5) Fat pad atrophy/LN expansion is a correlation.

      The paper emphasises negative correlations between fat pad and LN weights/cellularity at baseline and with inflammation. But correlation does not establish whether fat pad lipolysis drives LN expansion, whether LN changes drive fat remodelling, or whether both reflect systemic mediators. Add tissue-level evidence distinguishing true adipocyte loss vs other contributors to "weight change" (e.g., oedema/fibrosis).

      (6) Evidence for "fatty acid donation" from fat pad to LN.

      The lipid data are described as "exemplary," and the inference that LN fatty acids originate from the fat pad is based on temporal ordering and relative abundance. This does not rule out plasma spillover, LN-intrinsic metabolism, or altered lymph flow.

    2. Reviewer #2 (Public review):

      The authors aim to demonstrate skin inflammation is associated with fat pad atrophy and lymph node expansion. They further propose that these phenotypes are driven by the recruitment and lipid metabolism of CCR2-independent macrophages.

      The authors took advantage of two skin inflammation models, fight-induced and imiquimod-induced skin inflammation, and analyzed multiple tissues, including skin, fat pads, and lymph nodes. Using a macrophage-depletion method (e.g., CSF-1R inhibitor), the authors further suggest the inverse correlation between fat pad atrophy and lymph node expansion is macrophage-dependent. While the study identifies this intriguing inverse correlation during skin inflammation, the causal pathway linking fat pad atrophy and lymph node enlargement has not been clearly established.

      To improve the rigor of the manuscript, the authors should address the following concerns:

      (1) CCR2-deficient mice showed reduced inflammatory monocytes and monocyte-derived macrophages (PMID:16462739; 16341265). During tissue inflammation, CCR2+ classical monocytes are typically recruited to the injured peripheral tissues, including skin, where they differentiate into monocyte-derived macrophages (PMID:38474365). While inflammatory monocytes were reduced in the skin (Figure 3d) and fat pads (Figure 4a, S2D) of CCR2-deficient mice, macrophage numbers were significantly increased in these mice. It remains unclear whether CCR2-independent macrophages were newly recruited from alternative sources or whether tissue-resident macrophages underwent local self-proliferation to compensate for the loss of CCR2+ monocyte-derived macrophages.

      (2) In line 258, the authors state that there was "a significant reduction in CD11C- CD206+ anti-inflammatory macrophages (Figure 4b i-iii)". However, the quantification data in Figure 4b iii do not appear to show any reduction in anti-inflammatory macrophages in either males or females. Please reconcile this discrepancy between the text and the figure.

      (3) Although CD11C and CD206 were historically used as markers of inflammatory and anti-inflammatory macrophages, respectively, these markers are no longer considered sufficient to define the macrophage polarization state, particularly in adipose tissue, where they are constitutively expressed by resident macrophages (PMID:34210853). Numerous studies have demonstrated substantial macrophage diversity/heterogeneity across iWAT, eWAT, and brown fat tissues. The authors should discuss adipose macrophage diversity beyond the outdated M1/M2 frame.

    1. Joint Public Review:

      Summary:

      Calle-Schuler et al. reconstruct all the neurons pre- and post-synaptic to the bristle mechanosensory neurons (BMNs) on the adult fly head to understand whether neural circuits support the parallel mechanosensory pathways, which could be instrumental in shaping the sequential motor patterns during fly grooming. They find that most presynaptic neurons, interneurons and excitatory postsynaptic neurons are also somatotopically organized, such that each neuron is more connected to bristle mechanosensory neurons that are closer on the head and less connected to bristle mechanosensory neurons that are further away. These include the direct BMN-BMN circuits, excitatory interneurons, as well as the inhibitory networks. They also identify that one entire hemilineage, 23b, forms excitatory postsynaptic circuits with BMNs, highlighting how these circuits, and hence their function, could be developmentally determined.

      Strengths:

      This is a complete map of all the neurons that make 5 or more pre- and post-synaptic connections with the fly head BMNs. Using this, the authors have identified various trends, for example that ascending neurons provide most of the GABAergic inhibitory input, which could provide the presynaptic inhibition essential for the parallel model for sequential grooming generation. Moreover, they identified that the entire cholinergic hemilineage 23b is postsynaptic to BMNs. Both their excitatory postsynaptic connectivity and inhibitory presynaptic connectivity demonstrate core motifs of the parallel circuits necessary for the hierarchical suppression model of grooming sequence.

      Weaknesses:

      Somatotopic organization with hierarchical suppression is an elegant mechanism to generate a motor sequence during grooming. Yet anatomical connectivity alone, in the absence of functional connectivity, cannot explain the grooming motor sequences. Future work should be aimed at mapping the functional connectivity with behavioral sequence.

      Closing statement:

      The authors have addressed the major concerns regarding clarity, scope, and interpretation. The manuscript is now significantly improved and is clearly framed as an anatomical resource that identifies circuit motifs consistent with existing models of grooming control.

    1. Reviewer #1 (Public review):

      Summary

      The strength of this manuscript lies in the behavior: mice use a continuous auditory background (pink vs brown noise) to set a rule for interpreting an identical single-whisker deflection (lick in W+ and withhold in W− contexts) while always licking to a brief 10 kHz tone. Behaviorally, animals acquire the rule and switch rapidly at block transitions and take a few trials to fully integrate the context cue. What's nice about this behavior is the separate auditory cue, which shows the animals remain engaged in the task, so it's not just that the mice check out (i.e., become disengaged in the W- context). The authors then use optical tools, combining cortex-wide optogenetic inactivation (using localized inhibition in a grid-like fashion) with widefield calcium imaging to map what regions are necessary for the task and what the local and global dynamics are. Classic whisker sensorimotor nodes (wS1/wS2/wM/ALM) behave as expected with silencing reducing whisker-evoked licking. Retrosplenial cortex (RSC) emerges as a somewhat unexpected, context-specific node: silencing RSC (and tjS1) increases licking selectively in W−, arguing that these regions contribute to applying the "don't lick" policy in that context. I say somewhat because work from the Delamater group points to this possibility, albeit in a Pavlovian conditioning task and without neural data.

      The widefield imaging shows that RSC is the earliest dorsal cortical area to show W+ vs W− divergence after the whisker stimulus, preceding whisker motor cortex, consistent with RSC injecting context into the sensorimotor flow. A "Context Off" control (continuous white noise; same block structure) impairs context discrimination, indicating the continuous background is actually used to set the rule (an important addition!). Pre-stimulus functional-connectivity analyses suggest that there is some activity correlation that maps to the context, presumably due to the continuous background auditory context. Simultaneous opto+imaging projects perturbations into a low-dimensional subspace that separates lick vs no-lick trajectories in an interpretable way.

      In my view, this is a clear, rigorous systems-level study that identifies an important role for RSC in context-dependent sensorimotor transformation, thereby expanding RSC's involvement beyond navigation/memory into active sensing and action selection. The behavioral paradigm is thoughtfully designed, the claims related to the imaging are well defended, and the causal mapping is strong.

      Comments on revisions:

      The authors have been responsive to the prior review and I think the manuscript is a valuable and important addition to the literature.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aim to understand the neural basis of context-dependent sensory processing and decision-making.

      Strengths:

      They used an innovative behavioral paradigm where the action-outcome association changes independent of the sensory stimulus. This allowed the authors to disentangle the effect of behavioral context on sensory processing in RSC. Using this approach combined with optogenetic silencing, they discover that RSC activity is necessary for suppressing a lick response when the stimulus switches to the unrewarded context. The authors provide compelling evidence that the RSC is an important node of context-dependent sensory processing.

      Weaknesses:

      Sensory processing appears to be entangled with jaw/tongue movement initiation. Nonetheless, it is clear that RSC and motor cortex convey contextual signals with a very short latency.

      Comments on revisions:

      Thank you for updating the manuscript. Good work.

    1. Reviewer #1 (Public review):

      Summary:

      The study examined the extent to which children's word recognition skill improves across early development, becoming faster, more accurate and less variable, and the extent to which word recognition skill is related to children's concurrent and later vocabulary knowledge.

      The main strength of the study comes from the dataset, which recycles previously collected data from 24 studies to examine the development of word recognition skill using data from 1963 children. This maximizes the impact of previously collected data while also allowing the study to reliably ask big-picture questions on the development of word recognition skill and its relation to chronological age and vocabulary knowledge. Data analysis is rigorous, well thought through and very clearly described. Data and code necessary to reproduce the manuscript are shared on the project's GitHub. The limitations of the study are acknowledged and the manuscript does well to tone down the causal implications of its results.

    2. Reviewer #2 (Public review):

      Summary:

      This paper presents a series of analyses of a large dataset combining many prior studies of early word recognition (Peekbank). The analyses demonstrate that the speed, accuracy and consistency of word learning improve with age. Moreover, the speed of word learning early in development was related to vocabulary growth over time.

      Strengths:

      A key strength of the paper is the use of a large multi-study dataset. This is particularly valuable in the field of early cognitive development, which has (due to practical limitations) often been based on small-scale studies that necessarily provide a shaky foundation for conclusions. The analyses are also well-motivated.

      Weaknesses:

      In an earlier version of the manuscript, the meaning of "word recognition ability" was ambiguous and could have referred to either (A) an intrinsic ability that matures, or (B) knowledge of the common, concrete words typically used in these studies that increases with experience. The revised version of the manuscript identifies these two interpretations and acknowledges that they cannot be teased apart in the current work.

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors investigate the role of the microtubule-binding protein EML3 during cortical development through the generation and characterization of an Eml3 mouse mutant. The authors focus mainly on the effects of EML3 loss on brain development, although Eml3 mouse mutants also present with developmental delay and growth restriction, and die perinatally due to respiratory distress caused by delayed maturation of the lungs. The main finding in the developing cortex is the presence of focal neuronal ectopias, which contain neurons from all cortical layers, as revealed by immunostaining. The authors use electron microscopy to show that ectopias seem to be caused by disruption to the pial basement membrane at early stages of development, which allows neurons to breach through it. To find a functional link between EML3 and the observed phenotype, studies are conducted that demonstrate expression of EML3 in radial glia cells and mesenchymal cells, both cell types involved in the formation and maintenance of the pial basement membrane. Furthermore, interaction partners for EML3 are identified through coIP-MS analysis, including tubulin beta-3, 14-3-3 proteins and cytoplasmic dynein light chain. However, mice carrying a mutant EML3 allele engineered to abolish the interaction between EML3 and cytoplasmic dynein light chain do not recapitulate any of the symptoms of complete EML3 loss.

      Strengths:

      The manuscript offers several important strengths that contribute significantly to the field. This study presents the first characterization of Eml3 knockout animals, providing novel insights into the role of Eml3 in vivo. Information on Eml3 function so far was restricted to cell culture data, so the results in this manuscript start to fill an important gap in our knowledge about this microtubule-binding protein. The experimental approach is carefully designed, with appropriate controls that ensure the reliability of the data. Moreover, the authors have addressed a key challenge in the analysis, namely the developmental delay of the knockout animals. By implementing a strategy to match developmental stages between wild-type and knockout groups, they allow for meaningful and valid comparisons between the two genotypes. Importantly, the authors have successfully generated three different Eml3 mutant mouse lines (knockout, floxed and with disrupted binding to cytoplasmic dynein light chain), which are very valuable tools for the broader scientific community to further study the roles of this gene in development and disease in the future.

      Weaknesses:

      While the manuscript presents valuable data, there are also several weaknesses that limit the overall impact of the study. Most notably, there is no clear mechanistic link established between the loss of Eml3 function and the observed phenotype, leaving the biological significance of the findings somewhat speculative, as it is not straightforward how a microtubule-associated protein can have an impact on the stability of the pial basement membrane. In this respect, but also in general for the whole manuscript, there seems to be a considerable amount of experimental work that has been conducted but is not presented, possibly due to the negative nature of the results. Additionally, the phenotype reported appears to be dependent on the genetic background, as it is absent in the CD1 strain. This observation raises concerns as to how robust the results are and how much they can be generalized to other mouse strains, but, more importantly, to humans.

    2. Reviewer #3 (Public review):

      Summary:

      This work aims to understand the role of Echinoderm Microtubule-associated Protein-like 3 (EML3) in embryogenesis and neocortical development. Importantly, this work shows that depletion of EML3 causes focal neuronal ectopias by disrupting the structural integrity of the pial basement membrane, describing a new model of cobblestone brain malformation. Another member of the EML family, EML1, has already been shown to trigger neuronal migration disorders, particularly subcortical band heterotopia, by affecting cell polarity. The results presented here point to a different mechanism of action. The authors show that EML3 is expressed in radial glia cells and mesenchymal cells in the pial region and that, upon EML3 depletion (i.e., in Eml3 mutant mice), the pial basement membrane is structurally damaged, allowing migrating neuroblasts to ectopically migrate through. This indicates that, in this case, the weakening of the pial basement membrane is a prerequisite for focal neuronal ectopias. The authors provide a meticulous characterization of the Eml3 mutant mice, strengthening the conclusions of the results.

      Strengths:

      The authors provide a very detailed analysis of the defects observed in Eml3 mutant mice, by providing not only results by inferred day of conception but by classifying embryos by their number of somite pairs.

      Weaknesses:

      Most of the weaknesses originally raised by the reviewer have been addressed.

    1. Reviewer #1 (Public review):

      [Editor's note: this version has been assessed by the Reviewing Editor with further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      Urination requires precise coordination between the bladder and external urethral sphincter (EUS), while the neural substrates controlling this coordination remain poorly understood. In this study, Li et al. identify estrogen receptor 1-expressing neurons (ESR1+) in Barrington's nucleus as key regulators that faithfully initiate or suspend urination. Results from peripheral nerve lesions suggest that BarEsr1 neurons play independent roles in controlling bladder contraction and relaxation of the EUS. Finally, the authors performed region-specific retrograde tracing, claiming that distinct populations of BarEsr1 neurons target specific spinal nuclei involved in regulating the bladder and EUS, respectively.

      Strength:

      Overall, the work is done with high quality. The authors integrate several cutting-edge technologies and sophisticated, thorough analyses, including opto-tagged single unit recordings, combined optogenetics and urodynamics, particularly those following distinct peripheral nerve lesions.

      Comments on revised version:

      During the revision, the authors have adequately addressed my concerns and made the suggested changes accordingly. I have no additional comments.

    2. Reviewer #2 (Public review):

      Summary:

      The authors have performed a rigorous study to assess the role of ESR1+ neurons in the PMC in coordinating bladder and sphincter muscles during urination. This is an extension of previous work defining the role of these brainstem neurons, and convincingly adds to the understanding of their role as master regulators of urination. This is a thorough, well-done study that clarifies how the pontine micturition center coordinates different muscle groups for efficient urination, but there are some questions and considerations that remain.

      Strengths:

      These data are thorough and convincing in showing that ESR1+ PMC neurons exert coordinated control over both the bladder and sphincter activity, which is essential for efficient urination. The anatomical distinction between pelvic and pudendal control is clear, and it's an advance to understand how this coordination occurs. This work offers a clearer picture of how micturition is driven.

      Weaknesses:

      The dynamics of how this population of ESR1+ neurons is engaged in natural urination events remain unclear. Not all ESR1+ neurons are always engaged, and it is not measured whether this is simply variation in population activity, or if more neurons are engaged during more intense starting bladder pressures, for instance. In particular, the response dynamics of single- and doubly-projecting neurons are not defined. Additionally, the model for how these neurons coordinate with CRH+ neuron activity in the PMC is not addressed, although these cell types seem to be engaged at the same time. Lastly, it would be interesting to know how sensory input might modulate the activity of these neurons, but this is perhaps a future direction.

    3. Reviewer #3 (Public review):

      Summary:

      The paper by Li et al. explored the role of estrogen receptor 1 (Esr1)-expressing neurons in the pontine micturition center (PMC), a brainstem region also known as Barrington's nucleus (Hou et al. 2016, Keller et al. 2018). First, the authors conducted bulk Ca2+ imaging/unit recording from PMCESR1 neurons to investigate the correlations of PMCESR1 neural activity with voiding behavior in conscious mice and with bladder pressure/external urethral muscle activity in urethane-anesthetized mice. Next, the authors conducted optogenetic inactivation/activation of PMCESR1 neurons to confirm their contribution to voiding behavior, and also conducted peripheral nerve transection together with optogenetic activation to confirm the independent control of bladder pressure and urethral sphincter muscle.

      Comments on revised version:

      No concerns. All my major questions were addressed.

    1. Reviewer #1 (Public review):

      The authors point out that the fitness estimates obtained from different experimental assays (monoculture, pairwise competition or bulk competition) are not generally equivalent, not even with regard to the fitness ranking of different genotypes. Using a computational model based on experimentally measured growth phenotypes for knockout strains in yeast, as well as data from Lenski's Long Term Evolution Experiment (LTEE), they derive a set of best practice rules aimed at extracting the optimal amount of information from such experiments.

      The study is very complete on a technical level, and the conceptual weaknesses raised in the first round of reviews have been fully addressed in the revision.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript "Quantifying microbial fitness in high-throughput experiments" provides a comprehensive analysis of the various approaches to quantifying fitness in microbial evolution, focusing on three primary factors: encoding of relative abundance, time scale of measurement, and the choice of reference subpopulation. The authors systematically explore how these choices impact fitness statistics and provide recommendations aimed at standardizing practices in the field. This manuscript aims to highlight the impact of differing fitness definitions and the methodologies utilized for analysis and how that can significantly alter interpretations of mutant fitness, affecting evolutionary predictions and the overall understanding of genetic interactions in the experiments.

      Strengths:

      The choices for quantifying fitness in evolution experiments are critical and highly relevant given the increasing prevalence of high-throughput experiments in evolutionary biology. The authors methodically categorize fitness statistics and their implications, providing clarity on a complex subject. This structured approach aids in understanding the nuances of fitness measurement. The manuscript effectively highlights how different choices in fitness measurement can influence fitness rankings and the understanding of epistasis, which is important for modeling evolutionary dynamics.

      Comments on revisions:

      The authors have comprehensively addressed all previous comments and suggestions. In particular, the addition of the new methods section: 'A guide to calculate pairwise relative fitness under the logit encoding from bulk competition data' - significantly improves the clarity of the implementation and helps in the overall interpretation of the framework.

    3. Reviewer #3 (Public review):

      Summary:

      The authors present analyses of different fitness measures derived from empirical data from yeast knock-out mutants and the long-term evolution experiment (LTEE) with Escherichia coli to explore discrepancies and identify preferred methods to estimate relative fitness in high-throughput experiments. Their work has three components. They first discuss the different "encodings" of relative abundance data and conclude that logit-transformations are preferred, because they transform nonlinear abundance trajectories into linear trajectories with greater predictive power. Next, they compare per-generation with per-growth cycle relative fitness estimates inferred from simulations of pairwise competitions based on published growth traits for the yeast strains and on published pairwise competition measurements for the LTEE data. Both data sets show quantitative and qualitative (i.e. rank order) discrepancies of estimates across different time scales, which are highlighted by considering possible underlying causes (i.e. trade-offs between growth traits) and consequences (i.e. epistasis among mutations affecting different growth traits). Finally, the authors compare simulated pairwise and bulk (i.e. where many mutants compete during a growth cycle in a single environment) competition assays based on the yeast knock-out mutants and demonstrate an optimal ratio of collective mutants to wild-type strains that minimizes both sampling error and overestimation of fitness estimates when compared with pairwise competitions.
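      The point about logit encoding can be illustrated with a minimal sketch. It assumes the simplest case, in which the mutant's relative abundance x follows logistic dynamics dx/dt = s*x*(1-x) with a constant selection coefficient s; the function names and parameter values below are illustrative and are not taken from the manuscript.

```python
import math

def logit(x):
    """Log-odds of a relative abundance x in (0, 1)."""
    return math.log(x / (1.0 - x))

def mutant_frequency(x0, s, t):
    """Closed-form solution of dx/dt = s*x*(1-x) (assumed dynamics)."""
    odds = x0 / (1.0 - x0) * math.exp(s * t)
    return odds / (1.0 + odds)

# Illustrative values: 1% initial mutant frequency, s = 0.5 per unit time.
x0, s = 0.01, 0.5
times = [0, 2, 4, 6, 8]

freqs = [mutant_frequency(x0, s, t) for t in times]
logits = [logit(x) for x in freqs]

# Slope between consecutive time points, in raw and in logit encoding.
raw_slopes = [(freqs[i + 1] - freqs[i]) / (times[i + 1] - times[i])
              for i in range(len(times) - 1)]
logit_slopes = [(logits[i + 1] - logits[i]) / (times[i + 1] - times[i])
                for i in range(len(times) - 1)]

print("raw slopes:  ", raw_slopes)    # change from interval to interval
print("logit slopes:", logit_slopes)  # constant, equal to s
```

      Under this assumption the raw frequency trajectory is sigmoidal, so its slope depends on when it is measured, whereas the logit-encoded trajectory is a straight line whose slope recovers s.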

      Strengths:

      The study deals with a highly relevant topic. Fitness is central to general evolutionary theory, but also poorly defined and implies different traits for different organisms and conditions. For microbes, which are often used in evolution experiments, high-throughput experiments may yield different measures to quantify abundance over time, from individual growth traits to bulk competition experiments. Hence, it is relevant to consider discrepancies among those measures and identify preferred measures with respect to predicting population dynamic and evolutionary processes. The present study contributes to this aim by (i) making readers aware of differences among commonly used fitness estimates, (ii) showing that simulated (yeast) and calculated (E. coli) competitive fitness may differ across time scales, and (iii) showing that bulk competitions may yield relative fitness estimates that are systematically higher than pairwise competitions. The study is rather thorough on the theory side, with extensive derivations and analyses of various fitness measures using their resource competition model in the Supplementary Information. The study ends with a few practical recommendations for preferred methods to infer relative fitness estimates, that may be useful for experimentalists and stimulate further investigations.

      Weaknesses:

      The study has a few limitations. Perhaps the most apparent limitation is the lack of a clear answer to the question which fitness measure is best "in the light of first principles". The authors show clear discrepancies between fitness estimates across different time scales or using different reference genotypes in bulk competition and provide useful recommendations based on practical considerations (e.g. using pairwise competitions as "golden standard"), but it remains unclear whether these measures provide the greatest value for the questions researchers may want to answer with them (e.g. predict shifts in genotype frequencies). -- The authors have convinced me in their response that their recommendations were fundamentally related to the resource competition model, and the changes in introduction and discussion help to appreciate the choice of fitness measure in relation to the research question.

      A second limitation is that the authors analyse fitness differences arising solely from resource competition, whereas microbes often interact via other mechanisms, e.g. the production of anticompetitor toxins, cross-feeding of metabolites or lack of growth to enhance their persistence in stress conditions. Without simulations of these processes, understanding discrepancies among fitness measures is necessarily limited. In addition, the analysis of trade-offs between growth traits causing these discrepancies during resource competition seems confounded by biases in measurement error or parameter estimation, at least for growth rate and lag time (Fig. 2B), where the replicate estimates for the wildtype show a similar negative correlation. -- The use of a resource competition model for fitness inference is generally well motivated now. I accept their argument that resource competitive differences are most important for microbial strains with small genetic differences (e.g. from mutant libraries or from the same evolution experiment). However, it is relevant to note that this ignores situations that are rather common, where the wild-type strain produces an anticompetitor toxin or causes growth inhibition through metabolite products that lower the pH (and derived strains will likely contain resistant mutations).

      Third, the study does not validate relative fitness predictions from growth traits (as is done for the yeast mutants) with measured relative fitness estimates using competition assays, while such data are available, e.g. for the LTEE. This would strengthen their inferences about preferred fitness measures. -- In their response, the authors explain that their aim was different, i.e. to provide "proof of principle" that the choices of fitness measure can produce discrepancies even when they follow the same growth model.

      Fourth, the analysis of epistasis between mutations affecting different growth traits (shown in Fig. 3) based on the LTEE data could be better introduced and analysed more comprehensively. Now, the examples given in panels C-F seem rather idiosyncratic and readers may wonder how general these consequences of using fitness estimates based on different time scales are. -- The authors have made extensive improvements to address how different growth parameters, especially lag and growth rate, differently affect apparent epistasis based on measures at different time scales (per generation vs per cycle). These provide a more comprehensive analysis of downstream consequences for epistasis detection.

      Finally, the study is generally less accessible to experimentalists due to the extensive and principled treatment of specific population dynamic models and fitness inferences. This may distract from the overarching aim to identify fitness measures that are most accurate and useful for predictions of population dynamic and evolutionary processes. In this light, the motivation for the initial discussion of the importance of how to best encode relative abundance (Fig. 1) is unclear. Also, the conclusion, that logit encoding is preferred, because it linearizes logistic growth dynamics and "improves the quality of predictions", is not further motivated. Experimentalists using non-linear models to infer fitness from growth curves or competition assays may miss the relevance of this discussion. -- Thanks for this explanation (indeed, I confused "logistic dynamics" with "logistic growth model"); the additional explanations and text reductions have improved accessibility for experimentalists.

      Comments on revisions:

      I appreciate the thorough and effective response to all recommendations and have no further comments.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents evidence that addition of the two GTPases EngA and ObgE to reactions comprised of rRNAs and total ribosomal proteins purified from native bacterial ribosomes can bypass the requirements for non-physiological temperature shifts and Mg2+ ion concentrations for in vitro reconstitution of functional E. coli ribosomes.

      Strengths:

      This advance allows ribosome reconstitution in a fully reconstituted protein synthesis system containing individually purified recombinant translation factors, with the reconstituted ribosomes substituting for native purified ribosomes to support protein synthesis. This represents a significant development in the long-term effort to produce synthetic cells.

      Weaknesses:

      - The authors carried out additional experiments indicating that ~60% of the reconstituted ribosomes are functional and that a significant proportion are capable of synthesizing GFP from the correct initiation codon to the correct stop codon, and also of producing an enzymatically active protein at appreciable levels. Their SDS-PAGE and MS analyses of N-terminally tagged GFP are also quite useful but did not assess the frequency of initiation at the wrong start codon, termination at the incorrect stop codon, or the frequency of frameshifting during elongation. This would require examining additional reporters designed to examine dependence on a Shine-Dalgarno sequence or the impact of an in-frame stop codon to assess the fidelity of initiation and termination events, respectively, and one with a programmed frameshift site to assess the elongation fidelity of their reconstituted ribosomes.

      - Past reconstitution studies have succeeded using all-recombinant, individually purified RPs; this approach, if successful here, would have eliminated the possibility that one or more unknown ribosome assembly factors that co-purify with native ribosomes were added to the reconstitution reactions.

    2. Reviewer #2 (Public review):

      This study has developed a single-step method to assemble active bacterial ribosomes under near-physiological conditions by using the GTPase factors EngA and ObgE. These factors eliminate the need for the traditional, harsh manipulations of temperature and magnesium levels. This integration is an important step toward the bottom-up construction of synthetic cells.

      Comments on revisions:

      The authors have addressed my concerns in the previous round of review.

    1. Reviewer #1 (Public review):

      Summary:

      This is an important study that describes the consequences of the DNMT3A mutation in human neuronal development for the first time. The selective impact of DNMT3A function on GABAergic interneurons is interesting and an important feature for future therapeutics. The claims made in the manuscript are supported by strong evidence for the most part, and the data are generally of high quality and well presented.

      Strengths:

      The strengths of the work include the characterization of multiple DNMT3A loss-of-function alleles, including two missense variants, R882H and P904L, and a deletion allele. The missense mutation lines both include an ideal control with the same genetic background. CRISPRi-mediated DNMT3A knockdown has also been included. The study identifies the mTOR-PI3K pathway as a factor in the overgrowth found in the mutant organoids. Bulk mRNA sequencing and whole-genome bisulfite sequencing identify hypomethylated genomic regions associated with gene expression repression. Again, this is more pronounced in the ventral organoid compared to the dorsal organoid. In addition, the extensive electrophysiological characterizations with a high-density microelectrode array support the more mature status of mutant interneurons.

      Weaknesses:

      Although a strong study overall, some weaknesses are noted. These include:

      (1) The lack of validation data for the generated iPSCs and hESCs, such as the chromosomal contents, ploidy, and pluripotency states.

      (2) Other weaknesses relate to data interpretation and insufficient discussion of related matters, as detailed in the recommendations to the authors.

      (3) Also, some errors are noted and detailed in the recommendation section.

    2. Reviewer #2 (Public review):

      Summary:

      Chapman, Determan et al. investigate how pathogenic mutations in DNMT3A, which cause Tatton-Brown-Rahman Syndrome (TBRS), disrupt human cortical developmental processes using a comprehensive panel of human pluripotent stem cell models spanning DNMT3A loss-of-function severity. The authors aim to identify the cellular and molecular mechanisms underlying TBRS-associated brain overgrowth and intellectual disability, and to test whether mechanistic convergence exists between TBRS and other overgrowth-intellectual disability disorders (OGIDs) caused by mutations in EZH2 (Weaver syndrome) or PIK3CA pathway components. Their central conclusion is that GABAergic interneuron development is selectively vulnerable to DNMT3A mutation, where reduced DNA methylation causes premature de-repression of neuronal and synaptic genes, driving precocious neuronal maturation and hyperactivity sufficient to disrupt neuronal network synchrony. This report adds to a growing literature supporting the vulnerability of GABAergic interneurons in NDDs and further provides a mechanistic view of this vulnerability, potentially convergent across OGIDs. The mechanistic claims around H3K27me3 compensation and mTOR-based therapeutic convergence, while promising, rest on more preliminary evidence and would benefit from the distinction between correlation and mechanism being made more explicit in the text. Overall, this is a compelling study with a rigorous experimental design and novel findings with a potential impact on a better understanding of the OGID pathophysiology.

      Strengths:

      (1) A major strength of this work is the breadth and rigor of the disease modeling approach. Four independent TBRS model systems are used in tandem: a patient-derived iPSC line with isogenic CRISPR-corrected control (R882H), a knock-in hESC model (P904L) with its wild-type isogenic, patient deletion iPSC lines (Del1/2), and CRISPRi knockdown models (G1/G2), collectively spanning a range of DNMT3A loss-of-function that correlates with phenotypic severity. This allelic series design substantially strengthens causal inference beyond what any single isogenic pair could provide.

      (2) The multi-omic integration across matched developmental stages provides a strong mechanistic foundation for the cellular phenotyping and significantly enhances novelty. RNA-seq, whole-genome bisulfite sequencing, and H3K27me3 CUT&Tag are combined in the same cell types and timepoints to show that DNMT3A loss reduces CG methylation at neuronal and synaptic gene loci, leading to premature transcriptional activation.

      (3) The selective vulnerability of ventral (GABAergic) versus dorsal (glutamatergic) progenitors is one of the study's most important findings. This lineage specificity is consistently observed across all model systems and in both 2D and organoid formats, where ventral NPCs show increased proliferation, premature neuronal gene expression, and increased neurogenesis, while dorsal NPCs are largely unaffected at the transcriptomic and cellular level despite exhibiting comparable DNA methylation changes. This adds to a body of emerging work showing GABAergic interneuron vulnerability in NDDs where ubiquitously expressed genes such as chromatin modifiers are perturbed, and provides additional molecular insights into potential mechanisms of "resilience" of dorsal populations.

      (4) The functional characterization follows a logical progression from single-neuron electrophysiology (demonstrating GABAergic hyperactivity with increased action potential amplitude and firing rate) to network-level analysis using high-density multi-electrode arrays. The HD-MEA experimental design - pairing TBRS or control GABAergic neurons with a constant background of control iGlut neurons - cleanly isolates GABAergic dysfunction as the driver of network hypersynchrony.

      Weaknesses:

      (1) The concomitant induction of proliferation and differentiation in TBRS V-NPCs is conceptually striking, since these are generally considered antagonistic developmental programs. The authors partially address this tension by noting that DNMT3A LOF alone is insufficient to initiate neuronal differentiation, i.e., V-NPCs upregulate neuronal and synaptic genes while retaining progenitor identity, implying that transcriptomic priming and commitment to differentiation are decoupled. However, the relationship between the proliferative phenotype and the epigenetic priming phenotype remains mechanistically unresolved. The manuscript documents mTOR pathway upregulation at the protein level and identifies shared DEGs that include proliferative regulators, but it does not establish whether mTOR-driven proliferation and mCG-loss-driven neuronal gene de-repression/enhanced differentiation are causally linked or represent two independent consequences of DNMT3A LOF.

      (2) Relatedly, the rapamycin rescue experiment is a valuable proof-of-concept for the PIK3/AKT/mTOR convergence but is limited to a single dose in a single model (882) with a single readout (Ki67+ proliferation). Given the prominence of mTOR pathway convergence in the manuscript as a potential shared therapeutic avenue across OGIDs, the data supporting this claim are somewhat preliminary. It remains unknown whether mTOR inhibition rescues downstream phenotypes (neurogenesis, gene expression, neuronal maturation) or whether less severe TBRS models respond similarly. Addressing this might also help with the first comment above: for example, if mTOR inhibition rescued proliferation but not the transcriptomic priming, that would support two independent mechanisms.

      (3) The claim that H3K27me3 compensates for mCG loss is an important mechanistic point, but the current data do not distinguish between active compensation, in which EZH2 is recruited in response to methylation loss, and functional redundancy, in which H3K27me3 is independently established and becomes the dominant repressive mark once DNA methylation is reduced. The EZH2 knockdown/inhibition experiments show that H3K27me3 is sufficient to maintain repression at hypo-DMR sites, but they do not establish that H3K27me3 gain is itself a response to methylation loss. Because H3K27me3 profiling was performed only in the severe 882 model, it is also unclear whether H3K27me3 gain scales with DNMT3A LOF severity, as a compensatory model would predict. Finally, the EZH2 overexpression rescue is performed in V-NPCs, whereas the compensation model is developed primarily in D-NPCs, making it difficult to assess whether the same mechanism operates in the lineage where it was originally inferred.

      (4) The narrative framing of dorsal neuron development as unaffected by DNMT3A LOF is somewhat at odds with the data presented. The 882 D-NPCs show substantial DNA methylation changes, and TBRS D-INs exhibit what the authors describe as "substantive transcriptomic differences" involving persistent expression of pluripotency and progenitor genes, which seems to be a distinct but potentially significant phenotype. The impact of DNMT3A loss between ventral and dorsal lineages might be more accurately framed as divergent in nature rather than specific to a certain population.

      (5) The SST stainings are not entirely convincing. They appear mostly nuclear, and in some instances localize to rosettes in organoids, whereas the protein is largely confined to processes and is expected to be found outside progenitor-rich zones like rosettes.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, the authors investigated TBRS etiology by using new human pluripotent stem cell models, modeling varying levels of TBRS-associated loss of DNMT3A function. They identified increased lineage-specific proliferation of precursors in TBRS ventral MGE-like progenitors, which they propose was related to increased signaling through the PIK3/AKT/mTOR pathway. Furthermore, they show that reduced DNA methylation during MGE-like progenitor differentiation into GABAergic interneurons can cause premature expression of neuronal and synaptic genes, triggering precocious neuronal maturation. In conclusion, they propose that TBRS-derived GABAergic neurons exhibit hyperactivity that can alter the development and structure of neuronal networks.

      Strengths:

      Overall, the data presented is convincing, from an early developmental point of view, given that the iPSC-derived 2D cultures or organoids used do not get to reach a mature state. Nonetheless, the data clearly show the effects that deleterious mutations in TBRS can cause during the period of neurogenesis, which was missing in the field.

      Weaknesses:

      (1) Li et al., 2022 (referred to in the manuscript) seems to already show the interplay between H3K27me3 and Dnmt3a discussed in this study i.e., that in the absence of DNA methylation, there is an expansion of polycomb-like repression. These data should be better acknowledged in the paragraph 'Repressive H3K27me3 compensates for severe loss of DNA methylation' (page 9), given it supports the data presented in this manuscript and suggests this as a common mechanism in the interplay between these two repressive marks, as it is well established in the literature.

      (2) The authors should acknowledge that the omics data come from a mixed population of cells.

      (3) The authors are encouraged to further discuss whether the overgrowth observed in ventral GABAergic cultures or organoids compares to the overgrowth observed in diseased patients. One expects MRIs to have been performed in patients and that these could be harnessed to discern if overgrowth occurs in the cortex or ventral regions of the brain.

    1. Reviewer #1 (Public review):

      Kawamura et al. investigated the role of circumferential smooth muscle contractions in chick gut tube elongation, addressing the hypothesis that "peristaltic activity generated by the gut promotes its own elongation during embryogenesis". Although not acknowledged in the current manuscript, this interesting premise was, in fact, previously demonstrated.

      Indeed, the experiments in the present manuscript closely parallel a previous study (Khalipina et al, 2019: "Smooth muscle contractility causes the gut to grow anisotropically") that also cultured chick gut tissue and performed time-lapse analyses to quantify peristalsis. Both studies showed that inhibiting peristalsis with Ca-channel blockers induces a switch from elongational to radial growth in the gut.

      However, one of the main strengths of the current study is the innovative use of optogenetic manipulation to rescue gut lengthening in drug-inhibited gut tissue by re-stimulating peristaltic contractions. In addition, the authors use aphidicolin to show that peristalsis-mediated gut elongation is independent of cell division. They also track individual smooth muscle cells and show that they divide circumferentially, but become redistributed along the length of the gut tube with peristalsis.

      While these data are solidly quantitative, they do not provide mechanistic insight into how peristaltic contractions cause smooth muscle cells to be redistributed.

      The evidence presented in this manuscript supports the main conclusion that peristalsis plays a critical role in embryonic gut elongation, but this conclusion itself is not novel. In addition to corroborating previous work, this manuscript provides some useful additions to our existing knowledge of the role of mechanical forces in embryonic gut morphogenesis and illustrates the utility of a previously published optogenetic manipulation technique.

    2. Reviewer #2 (Public review):

      Summary:

      This study uses the chicken caecum ex vivo culture to show that embryonic peristaltic activity is a key mechanical factor for gut elongation. It is shown that pharmacological inhibition arrests intestinal growth, while optogenetic restoration rescues longitudinal elongation. The authors propose a two-step mechanism in which circular smooth muscle cells proliferate circumferentially, but peristalsis pushes them toward longitudinal rearrangement, which explains the anisotropic growth of the gut.

      Strengths:

      The experiments combine loss-of-function (peristalsis inhibition) with gain-of-function (optogenetic rescue) experiments and quantifiable readouts in an embryonic gut culture model. The work is clearly presented with nice microscopy videos and offers a potentially valuable conceptual framework linking tissue-scale mechanics to smooth muscle cell behaviors during development.

      Weaknesses:

      Some results appear conceptually inconsistent with the claim of peristalsis-essential rearrangement (e.g., longitudinal separation of daughter cells even without peristalsis), and the mechanistic link would benefit from clearer quantification and reconciliation. The study largely overlooks contributions from other gut layers and the ECM (and aphidicolin affects all proliferating cells), limiting interpretation of how smooth muscle rearrangement translates into whole-wall elongation.

    3. Reviewer #3 (Public review):

      Summary:

      The authors noted a steep increase in the rate of growth with the onset of more frequent peristaltic-like movements and hypothesized that peristaltic activity rearranges the orientation of cell growth from circumferential to longitudinal. This study sought to alter peristalsis and then (1) carefully examine the growth of the chick cecum relative to the frequency of peristaltic-like movements and (2) examine the orientation of cells relative to the circumferential and longitudinal axes to determine whether peristalsis is required for cecum lengthening. To alter peristaltic-like movements, contraction was inhibited through treatment with nifedipine (a calcium channel blocker that acts to relax smooth muscle) or Ani9 (inhibits Ca-activated chloride channels), and contractions were induced through activation of a blue light-activatable channel rhodopsin 2 (introduced through electroporation).

      Strengths:

      (1) Use of multiple methods to alter peristalsis in initial studies.

      (2) Live imaging.

      (3) Careful measurements.

      (4) Nicely presented figures.

      Weaknesses:

      (1) Only Nifedipine inhibition was examined for cell positional changes.

      (2) Ki67 was not carefully analysed, and apoptosis was not shown at all.

      (3) The results shown are suggestive of a role for peristalsis in the lengthening of the cecum. Demonstration that increased peristalsis could further increase lengthening would be helpful.

      (4) The novelty of this work is incremental for the field in that the reagents used and the model of smooth muscle driving gut lengthening in mouse and chick small intestines have both previously been published. This manuscript does suggest that the role of smooth muscle in longitudinal growth may extend to other tubular organs (chick cecum).

    1. Reviewer #1 (Public review):

      Summary:

      Zhang et al. report on an ambitious study that investigates multiple aspects of the neural and behavioral underpinnings of auditory-motor surprisal in the context of an auditory-motor learning paradigm (piano keyboard). Using an intricate design comprising several sub-parts and control procedures, they report that early ERPs (50-100 ms latency) reflect violations of established key-pitch mappings.

      Strengths:

      This is a carefully devised and executed study. The paradigm is quite intricate and, at the same time, addresses multiple aspects of auditory-motor learning, and does so in a rigorous way.

      Weaknesses:

      Perhaps because of the exhaustive approach, it is sometimes difficult to follow which parts of the experimental design the results come from; there are some questions regarding appropriate statistical methods, the inclusion/treatment of musical background in participants, and the nature (latency & extent) of the identified neural components that detect auditory-motor violations.

    2. Reviewer #2 (Public review):

      Summary:

      Zhang et al. report an EEG study (n=18) of participants playing a keyboard where the correspondence between keys and pitches is varied to introduce sensory-motor mismatches (discrepancies between sensory inputs and expected sensory consequences of motor commands). They find that the auditory N100 amplitude is enhanced for the initial keystroke following a mapping switch but rapidly attenuates for subsequent keystrokes (showing rapid updating of the forward model), whereas the motor-related P50 amplitude only differentiates trained versus untrained mappings after 30 minutes of goal-directed practice (potentially showing timescales of inverse model updating). Using parallel univariate and mTRF decoding analyses, they conclude that forward models (mapping action to predicted sound) update almost instantly to track short-term context, while inverse models (mapping sound to motor commands) update slowly and require extended, targeted practice.


      Strengths

      (1) Methodological innovation: The study utilizes an interesting, continuous auditory-motor paradigm that moves beyond standard trial-by-trial oddball designs, offering a more ecologically valid measure of trial-to-trial adaptation.

      (2) Analytical elegance and rigor: The combination of traditional univariate ERP analyses with multivariate temporal response function (mTRF) decoding is elegant, allowing the authors to successfully dissociate overlapping auditory and motor variance streams.

      (3) The dissociation between the rapid adaptation of the N100 forward model and the slower adaptation of the P50 inverse model is interesting.

      Weaknesses

      (1) Confounded passive listening baseline: The passive listening control condition lacks an orthogonal behavioural task (e.g., an occasional oddball detection task). Active playing inherently necessitates focused attention on auditory feedback to monitor performance, whereas passive playback does not. The globally weaker stimulus-evoked pattern at electrode Fz during passive listening strongly suggests that the absence of an N100 effect in this condition may simply reflect a lower state of attention, rather than isolating the absence of a motor-driven forward prediction. This is of particular concern because the pure sensory surprisal was also enhanced for "first" notes, which could by itself lead to a stronger N1, but this effect may be masked.

      (2) Overclaimed theoretical novelty: The conceptual framing leans excessively on the authors' specific "MirrorNet" framework, presenting foundational, decades-old tenets of the motor control literature (i.e., unsupervised exploration for forward models vs. supervised skill acquisition for inverse models; Wolpert, Jordan, both in the nineties) as their own novel "conjectures." This theory-heavy introduction obscures the paper's actual empirical contribution to the design and the interesting question regarding the distinct temporal adaptation scales of forward versus inverse models. I think some rewriting can improve the paper.

      (3) Misplaced surprisal terminology: In a similar vein, I find the use of the term "auditory-motor surprisal" more theoretical grandstanding than actually useful. The significance statement claims to "extend this principle from sensory processing", but in fact the concept of sensory motor unexpectedness is again a staple of the forward motor literature. Moreover, nowhere in the paper do the authors actually estimate sensorimotor surprisal. While they compute surprisal for their auditory baseline using IDyOM, their central sensorimotor analysis relies entirely on a simple categorical mismatch (first vs. subsequent keystrokes). The phenomenon can equally be referred to by its established nomenclature: "sensorimotor mismatch" or "sensory motor unexpectedness".

      (4) Incremental conceptual advance regarding the N100: The paper frames the N100 finding as a major discovery, but as far as I know, the attenuation of the auditory N1 to self-generated sounds via accurate motor prediction, and its enhancement during sensorimotor mismatch, is one of the most heavily documented phenomena in the auditory-motor literature (e.g., Timm et al., 2013; Bendixen et al., 2012, 2013). In my view, the authors should clarify that the novelty lies in the elegant design, which provides a new way to correct for non-sensory-specific motor-induced attenuation, and in characterizing the distinct adaptation timescales of forward versus inverse models, not in demonstrating N100 modulation by sensorimotor mismatch, which is well documented.

    1. Reviewer #1 (Public review):

      Summary:

      Osswald and colleagues aim to show how motor units of the first dorsal interosseous (FDI) are flexibly recruited across two functionally different movements: index finger abduction and index finger flexion. They motivate this by arguing that FDI is the prime mover in abduction but acts as a synergist in flexion, alongside flexor digitorum profundus (FDP) and flexor digitorum superficialis (FDS) as the prime movers. This is a worthwhile question because it speaks to how descending neural inputs to the spinal cord flexibly control movement.

      The authors claim that recruitment order and recruitment threshold of FDI motor units differ between abduction and flexion, and that beta-band intramuscular coherence is reduced when FDI acts as a synergist. However, there are significant methodological concerns that undermine the results and conclusions.

      Strengths:

      The study certainly aims to address a central question in motor neuroscience - how flexible recruitment of motor units occurs across movements where the same muscle changes its functional role. They correctly identify the FDI as a multi-functional muscle and use intramuscular high-density EMG arrays to record several motor units simultaneously, which is a major technical strength. They also track individual motor units between conditions and, therefore, have generated a potentially valuable dataset for studying spinal motor control across different movements.

      Weaknesses:

      The key limitation comes from the authors' interpretation of "neural drive" to FDI. The authors acknowledge that global EMG during flexion is smaller than that during abduction (for the same force), and surmise that the FDI receives different amounts of neural drive between these two movements, which is a potential confound for their analyses. To match the neural drive (i.e., global EMG), the authors ask participants to generate the same global EMG in flexion as in abduction; the forces generated by FDI are significantly different (2-3 N for abduction and 1.8-6.2 N for flexion). From this, they find changes in recruitment order, recruitment threshold, and beta coherence. However, different FDI motor units (and different muscle fibres) are active during abduction versus flexion. Using global EMG as a proxy for neural drive ignores this spatial separation of EMG generation during abduction and flexion, such that some amount of global EMG generated by one part of FDI (during abduction) is considered the same (from a neural drive perspective) as the same amount of EMG generated by a completely different part of FDI (during flexion). But these two global EMGs (during abduction and flexion) are not biologically equivalent because they are generated by different motor units and muscle fibres. Consequently, neural drive during flexion and abduction is not equivalent, which makes biological interpretation less clear. Furthermore, it is difficult to tell if abduction-versus-flexion differences are due to task role (prime mover vs synergist) or differences in force/mechanical demands, multi-muscle coordination, and spatial sampling limits of intramuscular recordings.

      As mentioned, we think that the question asked is a very interesting one and framed appropriately to investigate the behaviour of motor units during prime mover and synergist roles. Simultaneously recording the prime movers for index flexion (FDP and FDS) would significantly improve the completeness of the study and allow for multi-muscle comparisons that are more relevant to how the motor system resolves prime mover vs synergist roles.

      The authors use motor unit action potential as a proxy for motor unit size. This is not suitable because muscle fibres closer to the electrode will appear larger, independent of their true size. We advise that the authors remove analyses pertaining to motor unit size if it cannot be accurately measured.

      Finally, several mechanistic interpretations in the discussion (e.g., spinal interneuronal suppression, reduced corticospinal input, proprioceptive mechanisms) read as more speculative than the current data can support without added controls or citations.

    2. Reviewer #2 (Public review):

      In this study, the authors examine whether the structure of motor unit (MU) recruitment and firing varies across movement directions in the human first dorsal interosseous (FDI) muscle. While task-dependent changes in MU recruitment have been reported previously (e.g., Thomas et al. 1986), these findings were largely based on recordings from a limited number of isolated single motor units. By applying high-density intramuscular electromyography and decomposition techniques, the authors demonstrate similar phenomena at the level of larger MU populations, thereby providing a useful consolidation of prior observations. In addition, they show that recruitment thresholds shift across tasks while the inverse relationship between discharge rate and recruitment threshold (the "onion-skin" organization) is preserved, suggesting that the overall structure of inputs to the motoneuron pool remains stable despite changes in recruitment order. Furthermore, by analyzing intramuscular coherence across MU firing, the authors attempt to characterize differences in the extent of synchronization among frequency components of neural inputs between abduction and flexion of the index finger. In particular, they report reduced beta-band coherence during flexion compared to abduction, indicating decreased synchronization in this frequency range (13-30 Hz). This observation is noteworthy, as it points to potential differences in the neural inputs underlying these task-dependent changes.

      A key strength of the study is that it extends prior work on task-dependent MU recruitment to larger populations using state-of-the-art recording and decomposition approaches. This represents a meaningful technical and conceptual advance over earlier studies limited to small numbers of units. The finding that recruitment shifts between flexion and abduction occur consistently across MUs, independent of motor unit size, further strengthens the robustness and generality of the observed phenomenon. Together, these results provide convincing evidence that MU recruitment is not strictly fixed by a rigid size principle across functional contexts and thus make a valuable contribution to the literature on motor control.

      However, several aspects of the mechanistic interpretation are less well supported. The authors interpret their findings as reflecting a "redistribution" of net excitatory input to the motoneuron pool across tasks. While this is a plausible interpretation of the observed changes in recruitment thresholds and recruitment order, it is not directly demonstrated by the analyses presented. The current data do not clearly distinguish redistribution of inputs from alternative explanations, such as task-dependent modulation of shared versus independent inputs, or changes in the effective gain of existing pathways. As such, the evidence for a specific redistribution of input remains incomplete.

      The interpretation of the intramuscular coherence analysis represents a further key weakness. By computing frequency-specific coherence across MUs during abduction (as a prime mover) and flexion (as a synergist), the authors report reduced beta-band coherence during flexion and interpret this as evidence for attenuated corticospinal input and increased involvement of spinal circuits. However, the relationship between changes in downstream coherence and the magnitude of upstream neural drive is not well established. Coherence reflects the synchronization of inputs rather than their net strength, and therefore, a reduction in coherence cannot be directly interpreted as a decrease in input from a specific source. Moreover, coherence measures alone do not permit identification of the origin of the inputs, and thus do not provide sufficient evidence to attribute the observed differences to descending or spinal pathways. While the difference between tasks is clear and potentially informative, the mechanistic interpretation appears overstated and should be treated more cautiously.
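
      The distinction between synchronization and net input strength can be illustrated with a toy calculation. The following is a minimal sketch under strong assumptions that are not taken from the manuscript: two signals each receive the same white common input plus independent white noise of equal variance, and the mixing is linear. In that case the magnitude-squared coherence is flat across frequency and equals the squared Pearson correlation. The function names and variance values are hypothetical, and real motor unit spike trains are not linear Gaussian signals.

```python
import math
import random


def expected_coherence(common_var, indep_var):
    """Magnitude-squared coherence of x = c + n1 and y = c + n2,
    where c, n1, n2 are independent white signals (toy linear model)."""
    return (common_var / (common_var + indep_var)) ** 2


def simulated_coherence(common_var, indep_var, n=20000, seed=0):
    """Estimate the same quantity as the squared Pearson correlation.

    For white signals the coherence is the same at every frequency and
    equals r**2, so no spectral estimation is needed in this toy case.
    """
    rng = random.Random(seed)
    sd_common, sd_indep = math.sqrt(common_var), math.sqrt(indep_var)
    xs, ys = [], []
    for _ in range(n):
        c = rng.gauss(0.0, sd_common)            # shared (common) input
        xs.append(c + rng.gauss(0.0, sd_indep))  # unit 1: common + own noise
        ys.append(c + rng.gauss(0.0, sd_indep))  # unit 2: common + own noise
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Doubling all inputs leaves coherence unchanged; adding independent input
# lowers coherence even though the common input is exactly as strong.
baseline = expected_coherence(1.0, 1.0)
all_inputs_doubled = expected_coherence(2.0, 2.0)
more_independent_input = expected_coherence(1.0, 3.0)
```

      In this toy model, a drop in coherence can arise purely from an increase in independent input with the common input unchanged, which is why a reduction in beta-band coherence does not by itself indicate a weaker input from any particular source.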

      A related issue concerns the interpretation of the preserved RT-DR relationship. While this finding supports the presence of a stable common input structure across tasks, the additional claim that proprioceptive feedback contributes significantly to maintaining this organization is not clearly justified by the presented data. No direct evidence is provided to dissociate afferent from descending inputs, and the absence of task-dependent differences in lower-frequency coherence further limits support for this interpretation. As such, the proposed role of proprioceptive feedback appears speculative.

      Overall, the authors successfully achieve their primary aim of demonstrating task-dependent flexibility in MU recruitment at the population level, and the results provide useful empirical support for this phenomenon using modern techniques. The study is likely to be of interest to researchers in motor control and neuromuscular physiology, particularly given the increasing relevance of MU-level analyses in both basic and applied contexts. However, the broader mechanistic conclusions regarding the nature and origin of the underlying neural inputs are not fully supported by the data and would benefit from more cautious interpretation or additional experimental evidence.

    1. Reviewer #1 (Public review):

      Summary:

      In this work, the authors study the migration of isolated cells and of cells in ensembles. They quantify several aspects of the corresponding migration patterns and investigate how these quantities depend on molecules that are known to play an important role in migration. Furthermore, they study the effect of external cues on these migration processes.

      Strengths:

      The authors provide a clean and uniform setting for comparing the migration of isolated cells and of cells in an ensemble in control and mutant conditions, and in the presence and absence of external cues. This allows for a meaningful comparison between different conditions. In this way, the authors obtain useful data that link the migration of isolated cells to that of cells in collectives.

      Weaknesses:

      A major weakness of the manuscript is that the authors do not properly introduce the quantities and concepts they are working with. In this way, it is hardly accessible for a reader who does not have a thorough background in cell migration and anomalous transport. In addition, the manuscript uses some notions that are not standard, for example, vinculin or FA stability, which are not properly introduced. Most strikingly, "collective directional memory" is not defined.

      The authors infer relationships between different quantities, but they remain qualitative, even though the authors use a language that suggests otherwise. For example, "The combination of Focal Adhesion stability and force transmission from the cytoskeleton predicts the migration speed of single cells" (p 2). I am not sure what is meant by prediction, but this heading suggests that knowledge of FA stability and force transmission yields the migration speed. Reading this line, I expect that if I give you values for FA stability and force transmission, you would give me a value for the migration speed. Such a quantitative mapping is not provided. In fact, it cannot be provided, because - as mentioned before - these quantities are not properly defined, so I would not know how to measure them. I do not even know their units.

      Furthermore, the authors do interpret some of their results without explaining or justifying the basis for their interpretation. For example, they use the FRET index of vinculin - another notion that is not properly introduced - to make statements about mechanical stress.

      It also seems that the figures could be improved. Some of the sketches are, in my opinion, not helpful. Examples are Figure 3A (how could a cell move while the hexagonal arrangement of the cells is maintained?) or Figures 2F, 4F, and 6F (what do the colored ellipses indicate?). In Figures 1B, 1D, 2A, 2E, 3B, 3D-F, 4A, 4F, 5B-D, it is not clear which lines merely connect data points and which lines are fits to the data.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Canever et al was assessed by three Referees at another journal, who brought up a range of critical points. I will not repeat a summary of the work; this can be found in the first-round reviews.

      Strengths:

      In their revised manuscript, the authors include substantial changes and additional reasoning. Along with their rebuttal letter, I think they make a very convincing case. While the claims are well supported by the analysis, I do not see that the findings need to be universal to be relevant. It might be rather surprising to me if there existed such a universality, in fact. I think that the findings are solid and interesting in their own right and are worthy of publication, especially with the amended discussion in this revision.

      Weaknesses:

      However, while the more bio-oriented parts are not fully accessible to me, I do have a few points from the data analysis point of view that need amendment.

      (1) The mathematical models used need to be specified more precisely. First, the authors confuse Levy flights and walks. These are distinct processes in the sense that a Levy flight does not have a finite variance and thus no finite speed. The proper model here would be Levy walks. As in a large body of the literature, both notions are used interchangeably here. Then the authors speak about a "superdiffusive model", for which I do not find a proper definition. There exists an entire range of superdiffusive models, each with a different physical background, so this needs more clarity. The authors may consult one of the standard reviews for more details, e.g., Soft Matter 8, 9043 (2012) or Phys Chem Chem Phys 16, 24128 (2014). Overall, a few equations (maybe in the Supplement) would help to be more specific.

      (2) For fractional Brownian motion, the authors should check the displacement correlation function; it should show slowly decaying, positive correlations. More details on the practical analysis of FBM can be found, e.g., in Phys Chem Chem Phys 27, 14350 (2025). These correlations should decay as a function of the bin time, e.g., as discussed for the opposite case of subdiffusion in Phys Rev E 88, 010101(R) (2013) [cf Fig 3b]. In general, FBM was determined to be a highly relevant process for a number of systems, including amoeba cells at shorter times; see the detailed analysis in Phys Rev Res 4, 033055 (2022). That paper also details different ways to characterise the motion in terms of scaling exponents.

      (3) Some relevant approaches discussed in literature that should be discussed in the context of this work: eLife 9, e52224 (2020); Rep Prog Phys 86, 126601 (2023); Chaos 35, 023145 (2025). In the context of non-Gaussianity for active particles: Phys Rev E 104, 064615 (2021); New J Phys 25, 013010 (2023).

      (4) In the abstract, I am having some issues with the formulation in the sentence: "This directional memory emerges from fractional Brownian motion". It sounds as if FBM were a fully clarified phenomenon. I would prefer some statement along the lines that the data are consistent with such a mathematical modelling approach.
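
      For point (2), the check could look roughly like the following. This is a minimal sketch, not the analysis of the cited papers: it assumes a one-dimensional coordinate sampled at equal time steps, uses non-overlapping displacements over a chosen bin time, and compares them with the standard autocorrelation of FBM increments (fractional Gaussian noise). The function names and the example trajectory are hypothetical.

```python
def fgn_autocorrelation(lag, hurst):
    """Normalized autocorrelation of FBM increments at an integer lag
    (in units of the bin time): 0.5*(|k+1|^2H - 2|k|^2H + |k-1|^2H)."""
    k = abs(lag)
    h2 = 2.0 * hurst
    return 0.5 * ((k + 1) ** h2 - 2 * k ** h2 + abs(k - 1) ** h2)


def displacement_autocorrelation(positions, bin_size, max_lag):
    """Empirical displacement autocorrelation for lags 0..max_lag.

    Displacements are taken over non-overlapping windows of `bin_size`
    samples and are assumed to have zero mean (no net drift).
    """
    coarse = positions[::bin_size]                  # positions at bin edges
    steps = [b - a for a, b in zip(coarse, coarse[1:])]
    mean_square = sum(s * s for s in steps) / len(steps)
    result = []
    for lag in range(max_lag + 1):
        pairs = list(zip(steps, steps[lag:]))       # (step_j, step_{j+lag})
        result.append(sum(a * b for a, b in pairs) / len(pairs) / mean_square)
    return result


# Superdiffusive FBM (H > 1/2): positive, slowly decaying correlations.
theory_superdiffusive = [fgn_autocorrelation(k, 0.75) for k in range(1, 6)]

# Toy trajectory that reverses direction at every step: anticorrelated steps.
zigzag = [0, 1] * 50
zigzag_correlation = displacement_autocorrelation(zigzag, bin_size=1, max_lag=2)
```

      Plotting the empirical curve for several bin times against `fgn_autocorrelation` with the fitted Hurst exponent would show whether the data are consistent with FBM or whether the correlations decay faster than the power law predicts.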

      After fixing these points, I think the manuscript will clearly warrant being shared.

    3. Reviewer #3 (Public review):

      This manuscript focuses on the presence/origin of directional memory during epithelial cell migration. It starts by analyzing single cells and then moves to more complex systems (confluent layers and scratch assays). The paper first demonstrates that the migration in all of these systems is well-described by persistent random walks, which likely emerge from fractional Brownian motion. This is an important demonstration, as it implies orientation memory in the systems. Then the paper proceeds to attempt to discern the origin of this memory and claims to establish key roles for adherens junctions and vinculin dimerization. While for the most part the manuscript is well-written, there are some significant overinterpretations in experimental results. The largest issue is demonstrating the role of vinculin dimerization, which is not a well-studied phenomenon inside living cells, as all data is reliant on a single point mutation (Y1065E). Additionally, the authors seem to be over-interpreting several of the assays; the statistical analysis does not seem to encompass all comparisons made, and the molecular model proposed does not clearly explain the observed results. The discussion could also be strengthened by considering other aspects of vinculin behavior (e.g., vinculin catch bonding) as well as discussing some other recent similar papers.

      (1) Likely the most significant issue with the manuscript is the interpretation of the vinculin Y1065E variant and the assumption that the only defect the mutations cause is a lack of dimerization. Vinculin dimerization is mediated by a conformational change in the vinculin tail domain induced by F-actin binding (Thompson, FEBS Letters, 2013). Dimerization of the vinculin tail domain has been clearly demonstrated in in vitro systems involving purified proteins, as the authors point out in the manuscript. However, the dimerization of full-length vinculin has not been well characterised in living cells. There are several reasons to suspect dimerization is potentially not prevalent in cells. For instance, in the presence of other actin-binding proteins, there may not be sufficient binding sites available on neighboring actin filaments to facilitate dimerization. Additionally, pY1065 vinculin and vinculin Y1065E have been associated with increased vinculin activation (Huang, JBC, 2014), so other effects seem possible. While the Y1065E variant clearly has an effect on the tension sensor readout and vinculin dynamics, further experimental evidence is needed to show that these effects are due to a lack of dimerization in living cells. To justify the definitive claims made in the manuscript, the authors likely need to develop, or employ, an assay for detecting vinculin dimerization in living cells. The authors could choose between intermolecular FRET, proximity labeling assays (i.e., antibodies with DNA for signal amplification), bimolecular fluorescent complementation (i.e., split GFP) based approaches, or some other approach. It should be noted that working with full-length vinculin, not just Vt, and designing an assay that can incorporate vinculin Y1065 variants (Y1065E and potentially Y1065A/F) would strengthen results. Also, the authors should be aware that the observation of strong dimerization may invalidate the use of FRET-based tension sensors in this system or at least necessitate intermolecular FRET control experiments.

      (2) The authors seem to have assumed that FRAP and adhesion stability are interchangeable. To this reviewer's knowledge, this is not the standard in the field. FRAP informs about molecular dynamics. Stability assays, which probe the spatial position of an entire focal adhesion over time (Zaidel-Bar, JCS, 2007, although other approaches are equally suitable), are typically used for assessing adhesion stability. If the authors wish to make strong claims about the stability of the adhesions, non-FRAP-based assays should be employed. Alternatively, the authors could interpret the FRAP data simply in terms of vinculin dynamics.

      (3) A major conclusion in the manuscript is that in response to overexpression of a specific vinculin construct, focal adhesions behave the same in single cells, confluent cells, and collectively migrating cells for all the mutants but Y1065E. However, outside of the FRET measurements, there is not much evidence to support this claim. The authors should perform a greater comparison of the focal adhesions between the systems used in the manuscript (single cell, confluent cells, collectively migrating cells). Key measurements would include focal adhesion number per cell, focal adhesion size, focal adhesion orientation, vinculin dynamics (e.g., FRAP), focal adhesion stability, and some indicators of focal adhesion composition. For the last aspect, focusing on focal adhesion components that also have roles in adherens junctions, such as VASP, seems appropriate. Without such characterization, it is an overinterpretation to assume that focal adhesions are the same in each system and, therefore, effects are due to vinculin behavior in the adherens junctions.

      (4) What is shown in Figure 3G is not clear. How are P/Po and alpha shown on different areas of the same plot?

      (5) It seems that an insufficient statistical test was used in many experiments. There are comparisons being made between systems (cell migration speed, FRET index...) that are not directly compared in a statistical test. Statistical tests are limited to differences from control (over-expression of full-length vinculin), and consistent increases or decreases (not quantitative values) are taken as evidence of similarity across systems. It seems that a more rigorous and standard approach would be to use an ANOVA/MANOVA with a suitable post-hoc test to perform all of these.

      (6) It is unclear how a lack of vinculin dimerization at adherens junctions perturbs epithelial migration, but the complete lack of the vinculin tail, which also cannot dimerize, does not. In other words, how can TL "have no other role in cell migration at confluence than those at FAs as in single cells"? Notably, the authors do not include the tailless variant in the schematic model figures. These results should be included and explained.
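
      Regarding point (5), an omnibus test across all conditions could be set up along the following lines. This is a minimal sketch only: it computes the one-way ANOVA F statistic and obtains a p-value by permutation rather than from the F distribution, and it does not include the post-hoc pairwise test, for which a dedicated statistics package would be the natural choice. The group values and function names are hypothetical placeholders, not data from the manuscript.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(groups, n_perm=2000, seed=0):
    """Share of label shufflings whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                      # reassign values to groups
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)             # add-one to avoid p = 0


# Hypothetical measurements (e.g., a speed or FRET index) in three conditions.
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
example_f = f_statistic(example_groups)
example_p = permutation_p_value(example_groups)
```

      Only if the omnibus test indicates a difference would the pairwise comparisons between systems and constructs be examined, with a correction for multiple testing.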

    1. Reviewer #2 (Public review):

      [Editors' note: this version has been assessed by the Senior Editor without further input from the original reviewers. The authors have addressed the minor comments raised in the previous round of review.]

      Summary:

      This study uses dental traits of a large sample of Chinese mammals to track evolutionary patterns through the Paleocene. It presents and argues for a 'brawn before bite' hypothesis -- mammals increased in body size disparity before evolving more specialized or adapted dentitions. The study makes use of an impressive array of analyses, including dental topographic, finite element, and integration analyses, which help to provide a unique insight into mammalian evolutionary patterns.

      Strengths:

      This paper helps to fill in a major gap in our knowledge of Paleocene mammal patterns in Asia, which is especially important because of the diversification of placentals at that time. The total sample of teeth is impressive and required considerable effort for scanning and analyzing. And there is a wealth of results for DTA, FEA, and integration analyses. Further, some of the results are especially interesting, such as the novel 'brawn before bite' hypothesis and the possible link between shifts in dental traits and arid environments in the Late Paleocene. Overall, I enjoyed reading the paper and I think the results will be of interest to a broad audience.

      Weaknesses:

      For the original draft of the manuscript, I had four major concerns with the study, especially related to the sampling, diet, and evidence for the 'brawn before bite' hypothesis. I still believe that the original issues that I raised may be weaknesses of the study. For example, there is still limited discussion on diets (even though the dental topographic analyses used in the study are designed for inferring diets). And I find the results a little challenging to interpret because teeth of multiple positions are included in the same samples, which seems problematic. That said, the authors have addressed each of my previous concerns and have made major revisions, including running new analyses, and thus I support the paper.

    1. Reviewer #1 (Public review):

      The revised manuscript includes several useful additions, and I appreciate the efforts to clarify parts of the analysis. The dataset remains valuable. However, several key issues raised previously are not yet fully resolved and continue to limit the clarity of the main conclusions.

      (1) I appreciate that the authors guide the reader to the relevant regions in the analysis of chromosome fusions (Fig. 2b). However, these subtelomeric regions are not clearly visualized, making it difficult to compare fused and unfused profiles, even though the conclusions rely largely on visual inspection of them. A more direct comparison between fused and unfused ends, together with quantitative summaries (e.g., binned Red1 enrichment and comparisons with internal regions), would make this experiment more convincing.

      (2) The SK1/S288c comparison (Fig. 2c) is an excellent approach, but is currently presented just as profiles, which again requires substantial effort from the reader to extract the relevant information. A systematic analysis across all informative chromosome ends (for example, comparing Red1 levels in syntenic regions using binned log2 fold-change) would more directly test the proposed in cis effect (L168) and clarify the contribution and range of Y'-associated effects. Other factors (e.g., distance from chromosome ends) could also be assessed within this framework.

      Related to this, it is unclear if Y' elements themselves exhibit lower Red1 binding than the genome average. Providing the mean Red1 signal per Y' element would clarify this point and may also aid interpretation of the relationship between coding density and Red1 enrichment.

      (3) The Dot1-Sir3 section is now simpler. However, I still find it difficult to follow the underlying rationale. In particular, it is unclear why a Dot1 function dependent on H3K79 methylation is introduced, given that the data in the previous section suggest H3K79 methylation is dispensable for subtelomeric Red1 depletion. A clearer statement of the authors' working model would be helpful.
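
      The binned comparison suggested in point (2) could be computed along these lines. This is a minimal sketch under stated assumptions: the two signal tracks are already aligned to the same syntenic coordinates and have equal length, and the bin size and pseudocount are arbitrary illustrative choices. The function names and the toy values are hypothetical, not data from the manuscript.

```python
import math


def bin_means(signal, bin_size):
    """Mean signal in consecutive, non-overlapping bins (last bin may be short)."""
    means = []
    for start in range(0, len(signal), bin_size):
        window = signal[start:start + bin_size]
        means.append(sum(window) / len(window))
    return means


def binned_log2_fold_change(signal_a, signal_b, bin_size, pseudocount=0.01):
    """Per-bin log2(A / B) for two tracks aligned to the same coordinates.

    The pseudocount guards against division by zero in empty bins.
    """
    means_a = bin_means(signal_a, bin_size)
    means_b = bin_means(signal_b, bin_size)
    return [
        math.log2((a + pseudocount) / (b + pseudocount))
        for a, b in zip(means_a, means_b)
    ]


# Toy tracks: strain A has twice the signal of strain B in the first bin only.
track_a = [2.0, 2.0, 4.0, 4.0]
track_b = [1.0, 1.0, 4.0, 4.0]
fold_changes = binned_log2_fold_change(track_a, track_b, bin_size=2, pseudocount=0.0)
```

      Summarising these per-bin values as a function of distance from the chromosome end, separately for ends with and without Y' elements, would give the quantitative comparison that the profiles alone do not provide.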

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Raghavan and his colleagues sought to identify cis-acting elements and/or protein factors that limit meiotic crossover at chromosome ends. This limitation is important for avoiding chromosome rearrangements and preventing chromosome mis-segregation.

      By comparing protein axis recruitment in the SK1 and S288C backgrounds, which differ in their number and distribution of Y' elements, the authors show that Y' elements have a limited impact on axis protein enrichment. Genetic analyses coupled with ChIP experiments revealed that the differential binding of the Red1 protein in subtelomeric regions requires the methyltransferase Dot1. Interestingly, the lack of Red1 depletion in subtelomeric regions in this mutant does not impact DSB formation. Another surprising finding is that deleting DOT1 has no effect on Red1 loading in the absence of the silencing factor Sir3. Unlike Dot1, Sir3 directly impacts DSB formation, probably by limiting promoter access to Spo11. As now clearly stated in the abstract and the discussion, this explains only a small part of the low levels of DSBs forming in subtelomeric regions, and the main mechanisms suppressing crossover close to the ends of chromosomes remain to be deciphered.

      Strengths:

      This work provides intriguing observations, such as the impact of Dot1 and Sir3 on Red1 loading and the uncoupling of Red1 loading and DSB induction in subtelomeric regions.

      The separation of axis protein deposition and DSB induction observed in the absence of Dot1 is interesting because it rules out the possibility that the binding pattern of these proteins is sufficient to explain the low level of DSB in subtelomeric regions.

      The demonstration that Sir3 suppresses the induction of DSBs by limiting the openness of promoters in subtelomeric regions is convincing.

      Weaknesses:

      The section examining the impact of Dot1 and Sir3 remains complex, which is partly inherent to the intricate relationship between Dot1 and Sir3. However, the authors conclude that Dot1 acts independently of its catalytic activity based on the phenotype of the H3K79R mutant. Although this is possible, it is not fully demonstrated, as the H3K79R mutant may exhibit its own phenotype independently of Dot1. Unless the authors test the impact of the catalytically dead mutant Dot1-G401R on axis protein enrichment at subtelomeres, they cannot claim that Dot1 acts independently of its catalytic activity.

      Sir3's impact on DSB induction is compelling, yet it only accounts for a small proportion of DSB depletion in subtelomeric regions. Thus, the main mechanisms suppressing crossover close to the ends of chromosomes remain to be deciphered.

    1. Reviewer #1 (Public review):

      In this paper, Solyga, Zelechowski & Keller study human visuomotor mismatch responses as an alternative instantiation of prediction errors to classic oddball paradigms. Using VR, they created a condition in which participants were moving around thereby creating a visuomotor coupling between physical movement and visual flow. To attempt to isolate the contribution of specifically movement-related predictions in this condition, they contrasted it to a condition in which participants were seated and rewatching their movement trajectory during the 'active' condition. Visuomotor mismatches were created by temporarily decoupling movement and visual experience by halting the VR display as participants continued to move.

      The core finding of the paper is that participants exhibit a positively-valenced response to the visuomotor decoupling in the active but not in the passive condition. Since walking speed only insignificantly slows down following decoupling events in the active condition, the authors argue that this difference cannot be accounted for by "changes in participants' behavior or to simple visual offset responses" with the latter being equal across both conditions. The following reinstatement of the coupling in turn does not differ between the two conditions. The authors additionally show that this mismatch response differs from visual onset responses elicited by checkerboard inversions and that it's "qualitatively" stronger than more commonly studied auditory oddball mismatch responses.

      The design with its focus on ecological validity is impressive, well-rationalized and the results are well illustrated. I additionally appreciate the control analyses with regards to changes in walking speed and playback DOF and, now added, additional participants who experience the passive condition before the active. I have a couple of questions/comments.

      My main question in round 1 regarded the isolation of visuomotor mismatch. Although the comparison with a seated control seems like a very sensible way to control for simple visual responses, there seem to be more differences than just a break in visuomotor coupling between the conditions. I therefore wonder whether the reduced offset response in the seated condition may be, in part, explained differently. For example, given that participants always conduct the active condition before rewatching their movement in the seated condition, it seemed likely that there is a component of learning across the session that flow will sometimes be halted. This is confirmed by the analyses. The explanation that there is a visuomotor component here is given further weight by the testing of an additional group of participants who performed the conditions in the reverse order, so this has strengthened the manuscript considerably. However, it does of course remain an imperfect control because the visual stimulus is now different between the conditions for these participants. It's the best that can be achieved with this type of paradigm though, and of course it yields a great deal of ecological validity.

      I was also wondering whether the authors may consider the findings in frontal electrodes more closely given that the title of the paper focuses on a specifically occipital effect. Their further analyses have confirmed that there are likely interesting frontal effects. From a theoretical point of view, the spatial dissociation in adaptation effects, which were stronger in frontal and weaker in occipital areas, seems interesting and perhaps worth discussing, especially given the interpretation that "mismatch processing may initially arise in sensory visual areas before engaging higher-order frontal regions." How come the frontal decrease in responses is not accompanied by an analogous decrease in its supposed occipital source? Could these two responses reflect different kinds of prediction error signals (i.e. objective vs subjective)?

      I remain concerned that the authors fight too defensively that they have absolutely isolated visuomotor prediction mechanisms with this paradigm. It's a nice, informative study, but it seems odd to argue there are no other possible explanations. One picks a design to optimize some features but they will always come at some cost to others. Prioritising ecological validity, which is a justifiable aim, necessarily usually weakens some control over confounds.

      To outline my reasoning fully: My concerns wrt generic influences of action on perception are reflected in Fig 1. The P1 is smaller when walking than sitting. It seems likely that the mismatch response reflects something about extrapolation or prediction, because it is larger when walking. However, it's not necessarily sensorimotor prediction. Even if you remove action from the equation, the flow can be extrapolated or predicted most of the time in a way it cannot so well when the video is halted. Of course the sitting condition somewhat controls for it, but when it came second the visual flow disruptions were more predictable here. A reduction in effects over time is indeed confirmed with their analyses. They now have conducted a study with the conditions in the reverse order and they find the same thing. But of course this necessitates non-identical visual flow because the sitting condition is playing the previous participant's flow. So it is likely that across all of these comparisons, it is the visuomotor mismatch that is especially salient. It's just that each comparison is a bit messy/confounded. It would strengthen the manuscript if there were some consideration given to the other processes likely at play here.

      As a more minor point in response to our previous review, whether particular accounts represent an 'orthodox' view at present does not determine whether they raise logical issues in need of consideration. The authors may have missed that the papers in question consider mechanisms underlying the attenuation of particular pieces of information *from perception*. Not perceptual processing. We have one percept at any one moment in time and must understand how different population types synergistically generate that percept.

      Similarly a little strange is the way in which the authors aggressively defend the position that self-generated motion is 'the strongest' type of prediction. Sure, we probably experience the effects of our actions more often than ambulances. But what about objects obeying laws of gravity or others' faces being structured and moving in systematic ways? It is hard to quantify, such that presumably many scientists would be skeptical of such a claim, and it is not needed logically to justify the importance of examining mechanisms enabling action to shape perceptual processing. I'd assume it better to fight the battles you need to (and can) fight, such that the robust claims carry more weight.

      Hope these comments are helpful.

    2. Reviewer #2 (Public review):

      Summary:

      This study investigates whether visuomotor mismatch responses can be detected in humans. By adapting paradigms from rodent studies, the authors report EEG evidence of mismatch responses during visuomotor conditions and compare them to visual-only stimulation and mismatch responses in other modalities.

      Strengths:

      - Authors use a creative experimental design to elicit visuomotor mismatch responses in humans.

      - The study provides an initial dataset and analytical framework that could support future research on human visuomotor prediction errors.

      Weaknesses:

      - Methodological issues (e.g., volume conduction) make it difficult to confidently attribute the observed mismatch responses to activity in visual cortical regions. This could be alleviated by increasing the number of channels.

      The authors successfully demonstrate that visuomotor mismatch paradigms can, in principle, be applied in human EEG. This approach provides a translational bridge between rodent and human work on predictive processing.

    3. Reviewer #3 (Public review):

      Solyga, Zelechowski, and Keller present a concise report of an innovative study demonstrating clear visuomotor mismatch responses in ambulating humans, using a mobile EEG setup and virtual reality. Human subjects walked around a virtual corridor while EEGs were recorded. Occasionally, motion and visual flow were uncoupled, and this evoked a mismatch response that was strongest in occipitally placed electrodes and had a considerable signal to noise ratio. It was robust across participants and could not be explained by the visual stimulus alone.

      This is an important extension of their prior work in mice, and represents an elegant translation of those previous findings to humans, where future work can inform theories of e.g. psychiatric diseases that are believed to involve disordered predictive processing. For the most part, the authors are appropriately circumspect in their interpretations and discussions of the implications. The paper in its current form represents an important addition to the literature.

      The authors have included analyses of the auditory mismatch using temporal electrodes, referenced to Cz (and therefore should exhibit a mismatch positivity). This added data clearly and convincingly shows that the sensorimotor mismatch is, indeed, stronger than the passive auditory MMN.

      - The reference electrode placed at Cz makes it difficult to interpret relative differences between frontal and occipital electrode responses, as the occipital electrodes are placed farther away from the Cz reference than the frontal electrodes. Similarly, signal occurring cortically near the Cz reference might only appear as though it is occipitally distributed in this montage. It is common in EEG research to re-montage the data to an averaged common reference in order to better interpret the scalp distributions. As the electrode coverage was sparse for some subjects, this could be challenging, and this reviewer does not feel that it is necessary to do this analysis step, or even to drastically rewrite the body of the paper. We only request that some discussion, however brief, be included in the discussion section or the methods that recommends denser electrode coverage in the future to better interpret scalp distributions and potential meso-scale sources.

      - This is just a suggestion. The authors are encouraged to analyse (and report) time-frequency power and phase locking for these mismatch responses, as is common in much of the literature (see Roach et al 2008 Schizophrenia Bulletin). This is not to say that doing so will yield insights into oscillations per se, but converting the data to the time-frequency domain provides another perspective that has some advantages. It fosters translation to rodent models, as ERP peaks do not map well between species, but e.g. delta-theta power does (see Lee et al 2018 Neuropsychopharmacology; Javitt et al 2018 Schizophrenia Research; Gallimore et al 2023 Cereb Ctx). Further, ERP peaks can be influenced by the actual neuroanatomy of an individual (especially for quantifying V1 responses). Time-frequency analyses may aid in interpreting the "early negative deflection with a peak latency of 48 ms" finding as well. As it stands, the report is complete, and it would be acceptable if the authors chose to save this type of analysis for a future publication.
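
      The re-montaging to an averaged common reference mentioned in the first point above can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the function name `rereference_to_average` and the channels-by-samples array layout are assumptions made for the example, and it presumes the recording reference (here Cz) can be restored as an all-zero channel before averaging.

```python
import numpy as np


def rereference_to_average(data, include_original_reference=True):
    """Re-reference EEG data to the common average (hypothetical helper).

    data: array of shape (n_channels, n_samples), recorded against a
          single reference electrode such as Cz.
    """
    data = np.asarray(data, dtype=float)
    if include_original_reference:
        # The recording reference measures zero against itself, so it can be
        # restored as an all-zero row before the average is computed.
        zeros = np.zeros((1, data.shape[1]))
        data = np.vstack([data, zeros])
    # Subtract the instantaneous mean across channels from every channel.
    return data - data.mean(axis=0, keepdims=True)


# Toy usage: 4 recorded channels, 10 samples of synthetic data.
rng = np.random.default_rng(0)
recorded = rng.normal(size=(4, 10))
average_referenced = rereference_to_average(recorded)
```

      Differences between any two channels are unchanged by this step; only the common offset is removed. With sparse electrode coverage the channel average is a poor estimate of a neutral reference, which is consistent with the reviewer's caution that this step may be hard to apply here.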

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. We appreciate the revisions and the authors addressed all of the remaining minor concerns listed by the reviewers. We have no further suggestions for revision.]

      Summary:

      Rolland and colleagues investigated the interaction between Vibrio bacteria and Alexandrium algae. The authors found a correlation between the abundance of the two in the Thau Lagoon and observed in the laboratory that Vibrio grows to higher numbers in the presence of the algae than in monoculture. Timelapse imaging of Alexandrium in coculture with Vibrio enabled the authors to observe Vibrio bacteria in proximity to the algae and subsequent algae death. The authors further determine the mechanism of the interaction between the two and point out similarities between the observed phenotypes and predator prey behaviours across organisms.

      Strengths:

      The study combines field work with mechanistic studies in the laboratory and uses a wide array of techniques ranging from co-cultivation experiments to genetic engineering, microscopy and proteomics. Further, the authors test multiple Vibrio and Alexandrium species and claim that the observed phenotypes are widespread.

      Comments on revisions:

      I thank the authors for their additional work on the manuscript. My comments were addressed to my satisfaction.

    2. Reviewer #2 (Public review):

      Goal summary:

      The authors sought to (i) demonstrate correlations between the dynamics of the dinoflagellate Alexandrium pacificum and the bacterium Vibrio atlanticus in natural populations, (ii) demonstrate the occurrence of predation in laboratory experiments, (iii) demonstrate that predation is induced by predator starvation, and (iv) test for effects of quorum sensing and iron-uptake genes on the predation process.

      Strengths include:

      - Data indicating correlated dynamics in a natural environment that increase the motivation for study of in vitro interactions

      - Experimental design allowing clear inference of predation based on population counts of both prey and predators in addition to microscopy-based evidence

      - Supplementation of population-level data with molecular approaches to test hypotheses regarding possible involvement of quorum sensing and iron uptake in predation

      Weaknesses include:

      - A quantitative analysis of effects of manipulating V. atlanticus density on rates of predation would have been valuable

      Appraisal:

      The authors convincingly demonstrate that V. atlanticus can prey on A. pacificum, provide strongly suggestive evidence that such predation is induced by starvation and clearly demonstrate that both iron availability and correspondingly the presence of genes involved in iron uptake strongly influence the efficacy of predation.

      Discussion of impact:

      This paper will interest those interested in the diversity of forms of microbial predation and how microbial predatory behavior responds to environmental fluctuations. It will also interest those investigating bacteria-algae interactions and potential ecological controls of algal blooms. It may also interest researchers of microbial cooperation in light of the suggestion of communication between predator cells.

    1. Reviewer #1 (Public review):

      Summary:

      This paper presents an ambitious and technically impressive attempt to map how well humans can discriminate between colours across the entire isoluminant plane. The authors introduce a novel Wishart Process Psychophysical Model (WPPM) - a Bayesian method that estimates how visual noise varies across colour space. Using an adaptive sampling procedure, they then obtain a dense set of discrimination thresholds from relatively few trials, producing a smooth, continuous map of perceptual sensitivity. They validate their procedure by comparing actual and predicted thresholds at an independent set of sample points. The work is a valuable contribution to computational psychophysics and offers a promising framework for modelling other perceptual stimulus fields more generally.

      Strengths:

      The approach is elegant and well-described, and the data are of high quality. The writing throughout is clear and the figures are clean (elegant in fact) and do a good job of explaining how the analysis was performed. The whole paper is tremendously thorough and the technical appendices and attention to detail are impressive (for example, a huge amount of data about calibration, variability of the stim system over time etc). This should be a touchstone for other papers that use calibrated colour stimuli.

      Comments on revised version:

      The authors have addressed all the issues I raised to my satisfaction.

    2. Reviewer #3 (Public review):

      Summary:

      This study presents a powerful and rigorous approach for characterizing stimulus discriminability throughout a sensory manifold, and is applied to the specific context of predicting color discrimination thresholds across the chromatic plane.

      Strengths:

      Color discrimination has played a fundamental role in studies of human color vision and for color applications, but as the authors note, remains poorly characterized. The study leverages the assumption that thresholds should vary smoothly and systematically within the space, and validates this with their own tests and comparisons with previous studies.

      Comments on revised version:

      My comments have been addressed.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors investigate whether glycogen phosphorylase is a potential molecular target of benzoylphenylurea insecticides and examine the physiological consequences of inhibiting glycogen breakdown in the diamondback moth Plutella xylostella. The authors express and characterize recombinant glycogen phosphorylase, test its inhibition by a mammalian glycogen phosphorylase inhibitor and by the insecticide diflubenzuron, and assess the physiological effects of glycogen phosphorylase inhibition through chemical exposure and RNA interference. Based on these experiments, the authors conclude that benzoylphenylurea insecticides do not target glycogen phosphorylase and propose that insects compensate for glycogen phosphorylase inhibition through activation of gluconeogenesis, allowing them to maintain glucose homeostasis and complete development despite strong suppression of the enzyme.

      Strengths:

      The study addresses an interesting and long-standing question in insect toxicology regarding the mechanism of action of benzoylphenylurea insecticides. The authors combine several complementary approaches, including recombinant enzyme characterization, inhibitor assays, RNA interference, gene expression analyses, and metabolite measurements. The biochemical characterization of the recombinant glycogen phosphorylase and the demonstration that the tested glycogen phosphorylase inhibitor can strongly inhibit enzyme activity represent important technical strengths. In addition, the study integrates biochemical and physiological observations to explore how insects might compensate for disruptions in central carbohydrate metabolism.

      Weaknesses:

      Several aspects of the central conclusions rely on indirect evidence and would benefit from additional validation. The proposed compensatory mechanism (gluconeogenesis supported by amino acid mobilization) is inferred primarily from transcriptional changes in gluconeogenic genes, reduced protein levels, and changes in metabolite concentrations. While these observations are consistent with increased gluconeogenic activity, they do not directly demonstrate metabolic flux through this pathway. Direct measurements of gluconeogenic flux would be required to confirm that carbon derived from non-carbohydrate substrates contributes to glucose production.

      Some interpretations are also speculative. For example, the lack of glycogen accumulation following glycogen phosphorylase knockdown is attributed to alternative glycogen degradation pathways, such as α-amylase or glycogen debranching enzymes, but these possibilities are not experimentally examined. Measuring the expression or activity of these enzymes would help evaluate whether such pathways contribute to the observed metabolic response.

      The physiological consequences of the proposed metabolic compensation are also not fully explored. If proteins are mobilized to support gluconeogenesis, this shift might be expected to affect organismal traits such as adult body size, flight capacity, or reproductive performance. Assessing these traits could provide valuable insight into whether the proposed compensatory metabolism carries fitness costs.

      Finally, some conclusions extend beyond the direct evidence presented. The study shows that diflubenzuron does not inhibit glycogen phosphorylase in vitro, but broader conclusions regarding the mechanism of action of benzoylphenylurea insecticides as a class may require additional evidence. In addition, some biochemical and cell-based observations would benefit from confirmation in whole insects, given that metabolic regulation can differ substantially between isolated enzyme or cell-based systems and intact larvae, where hormonal signaling, tissue interactions, and nutrient availability influence metabolic responses.

    2. Reviewer #2 (Public review):

      (1) Significance of the findings and strength of the evidence

      This manuscript evaluates the hypothesis that benzoylphenylurea (BPU) insecticides exert their effects through inhibition of glycogen phosphorylase rather than chitin synthase (CHS). The central premise, that structural similarity among acylurea compounds implies shared molecular targets, is not supported by existing evidence.

      Extensive genetic and biochemical studies, including Reference 5, demonstrate that chitin synthase is the primary insecticidal target of BPUs. In particular, amino acid substitutions at a single site in CHS confer high levels of resistance to diflubenzuron and related compounds, with causality established through CRISPR/Cas9 editing in Drosophila melanogaster. This body of evidence substantially weakens the rationale for proposing glycogen phosphorylase as an alternative primary target.

      The manuscript reports that an acylurea compound previously identified as an inhibitor of mammalian glycogen phosphorylase also inhibits glycogen phosphorylase from Plutella xylostella, while diflubenzuron does not. This observation is consistent with prior work showing that glycogen phosphorylase inhibition among acylureas depends on specific side chain substitutions rather than the shared acylurea core. Consequently, the finding does not support the broader inference that acylurea structure predicts common biological function.

      The manuscript further argues that inhibition of glycogen phosphorylase is not insecticidal and attributes this to metabolic compensation through alternative glucose producing pathways. While it is well established that eukaryotic cells possess multiple mechanisms for maintaining glucose availability, the evidence provided here does not fully support the broader claim that this mechanism explains the lack of insecticidal activity. In particular, the conclusion that the study "resolves" the primary hypothesis is not justified by the data presented.

      Overall, while some experimental observations are sound in isolation, the overarching conclusions are not supported by the strength of the evidence. The significance of the findings is therefore limited.

      (2) Interpretation in the context of existing literature

      The introduction states that the molecular target of BPU insecticides remains a major unresolved controversy. However, multiple prior studies, including References 1, 4, and 5, provide strong genetic evidence that CHS is the primary and essential target of BPUs. These results demonstrate causality rather than simple correlation, particularly through targeted gene editing approaches.

      The manuscript further claims that biochemical studies have failed to demonstrate CHS inhibition by BPUs in cell free assays. However, the cited references (6-9) did not express CHS in such assays and therefore do not directly address this question. As a result, the suggested discrepancy between genetic and enzymatic evidence is not well founded.

      Structural analysis of acylurea compounds indicates that biological activity depends on side chain composition rather than the conserved acylurea core. Prior screening studies (Reference 11) show substantial variability in glycogen phosphorylase inhibition among acylureas despite a shared core structure. This undermines the proposal that the acylurea moiety itself constitutes a meaningful clue to a shared molecular mechanism.

      Regarding implications for pesticide design, targeting chitin synthesis remains an attractive strategy because chitin is essential for arthropods and absent in mammals, providing both efficacy and specificity. By contrast, metabolic enzymes such as glycogen phosphorylase are widely conserved, making them less suitable targets from a toxicological and safety perspective.

      (3) Specific technical comments

      The manuscript uses the term "dataology," which is neither defined nor contextualized within the text. As currently used, the term appears unrelated to the subject matter and may be confusing to readers. Clarification or removal would improve clarity.

    1. Reviewer #1 (Public review):

      Summary:

      The authors aimed to determine whether dietary conditioning of fecal microbiota donors can influence the therapeutic efficacy of fecal microbiota transplantation (FMT) in alcohol-associated liver disease (ALD). Specifically, they tested whether donor diets enriched in vegetable or egg-derived proteins alter microbiota composition and function in ways that enhance recovery from alcohol-induced liver injury. Using a murine ALD model, the study integrates microbiome profiling, metabolomics, proteomics, and functional assays to identify mechanisms underlying improved outcomes. The authors propose that vegetable protein-conditioned microbiota promote beneficial microbial remodeling and increased production of caproic acid, which in turn activates hepatic PPARα signaling and enhances fatty acid β-oxidation, thereby reducing steatosis and inflammation.

      Strengths:

      The study is ambitious and methodologically comprehensive. The central idea, that donor diet can modulate FMT efficacy in ALD, is compelling and potentially impactful. It combines in vivo disease models, microbiome analysis (16S rRNA sequencing), metabolomics and proteomics, pharmacological inhibition experiments, and in vitro validation in hepatocytes. This multi-layered approach is a clear strength and allows the authors to explore the gut-liver axis. The comparison between different protein sources (vegetable vs egg) is very interesting, and the PPARα inhibition experiments provide relatively strong functional support for the involvement of host metabolic signaling pathways in mediating the observed effects.

      Weaknesses:

      Despite the comprehensive scope of the manuscript, several aspects of the study limit the strength of its mechanistic conclusions. The causal attribution to caproic acid remains incomplete. While caproic acid is identified and functionally tested, there is no direct demonstration that it is necessary for the Veg-FMT phenotype in vivo. The metabolomics data suggest multiple candidate metabolites, but these are not systematically explored. The study identifies specific bacterial taxa and, separately, key metabolites, but does not establish a direct connection between microbial composition and metabolite production. The use of GW6471 supports involvement of PPARα but does not fully establish specificity, as off-target effects cannot be excluded. Finally, it is not fully clear whether effects are exclusively microbiota-driven or could partially reflect the transfer of diet-derived metabolites.

      The authors successfully demonstrate that donor dietary conditioning influences the therapeutic efficacy of FMT in a murine model of ALD. The data convincingly show that vegetable protein-conditioned microbiota is associated with improved liver injury, reduced inflammation, and enhanced intestinal barrier integrity compared with controls or an egg protein-enriched diet. While the proteomic and gene expression data suggest activation of pathways related to fatty acid β-oxidation, these measurements do not directly demonstrate increased metabolic flux. The use of the PPARα antagonist GW6471 provides important functional support for the involvement of this pathway, as inhibition attenuates the protective effects of Veg-FMT. However, this approach primarily establishes pathway dependency rather than directly confirming enhanced β-oxidation activity. The authors may therefore wish to moderate their interpretation or clarify this distinction, particularly given the relatively modest fold changes observed in several targets. The role of caproic acid as a central mediator is plausible but not definitively established. Finally, the link between microbiota composition, metabolic function, and host signaling remains partly correlative. Overall, the study achieves its primary aim at a phenotypic level, but some of the mechanistic claims would benefit from more cautious interpretation or additional validation.

      Likely impact of the work on the field, and the utility of the methods and data to the community:

      The work addresses an important and underexplored question: how donor characteristics influence FMT efficacy. By introducing donor diet as a modifiable variable, the study has potential implications for optimizing microbiota-based therapies. The datasets (microbiome, metabolomics, and proteomics) may also be valuable to the community, as they provide a resource for exploring gut-liver metabolic interactions. The translational impact will, however, depend on validation in human systems and a clearer identification of causal mechanisms.

    2. Reviewer #2 (Public review):

      The manuscript explores a valuable strategy for optimizing Fecal Microbiota Transplantation (FMT) efficacy in alcoholic liver disease through donor dietary intervention. I have identified several critical logical gaps, missing links in the evidence chain, and methodological ambiguities that require detailed explanation and supplementation.

      (1) While the Methods section states that each recipient mouse group consisted of 16 animals, microbiome sequencing was performed on only 4 samples per group. This sample size is insufficient, and the high inter-individual variability observed reduces the statistical power and representativeness of the data. I recommend increasing the sequencing sample size or, at a minimum, explicitly acknowledging the risk of false positives due to the small sample size in the Discussion.

      (2) The layout of Figure 4 should be adjusted. Panel A should be enlarged for better visibility, while Panel B should be reduced in size to balance the figure composition.

      (3) A rationale should be provided for the selection of egg white protein as the animal protein control. Does this adequately represent animal proteins in general? Could the results differ if casein or whey protein were used? The current choice limits the generalizability of the conclusions, and this limitation should be addressed.

      (4) The ALD model was established over 12 weeks, yet the FMT intervention consisted of only 3 administrations with a 1-week observation period. In the context of such a severe liver injury model, a 1-week recovery period appears insufficient to observe genuine fibrosis reversal, which typically requires a longer timeframe. The authors should discuss whether short-term FMT can truly induce structural remodeling or if the observed effects are transient.

      (5) The results rely heavily on PICRUSt2 for functional prediction. As prediction does not equate to factual validation, the authors should exercise caution in their wording within the Discussion. Alternatively, I recommend supplementing the study with shotgun metagenomic sequencing to verify the existence of these pathways rather than relying solely on predictive algorithms.

      (6) Although Egg-FMT was less effective than Veg-FMT, it performed better than the standard FMT or abstinence groups. Why is the effect of egg white protein intermediate? Is this due to rapid digestion resulting in insufficient substrate, or differences in metabolite production? A deeper comparative analysis of the Egg-FMT group is required, rather than treating it merely as a negative control.

      (7) Relying solely on the "inhibitor blocking effect" proves only that caproic acid's function is dependent on the PPARα pathway, not that it directly acts on PPARα. To claim direct activation, the authors must demonstrate direct binding between caproic acid and the PPARα protein (e.g., via SPR or MST assays). Alternatively, a luciferase reporter assay driven specifically by PPARα response elements (PPRE) should be conducted. If caproic acid induces luminescence, it would confirm transcriptional activation of PPARα rather than mere downstream activation.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The review comments were minor and constructive, and the authors have been very responsive.]

      Summary:

      This brief piece by Swartz and colleagues outlines the complexities surrounding the choice of clinical specialty for physician-scientists. It is, in general, clear and well-written, and it will be useful to research-oriented medical students choosing a path and to the mentors who are guiding them.

      Strengths:

      The writing is clear. The points made are not profound, but they are important and will be of use to the intended audience.

    2. Reviewer #2 (Public review):

      Summary:

      This article is a useful compendium of advice for MD/PhD students (and research-focused MD students) to consider when it is time to decide on a clinical field for residency training. The authors are a distinguished group of physician-scientists and program directors who are drawing on published data and their own experience as mentors to provide advice and resources to students about to make what can be a career-defining choice. It makes an effective argument for considering important differences between clinical fields in their ability to sustain research integration, provide mentorship, meet lifestyle expectations, and foster a long-term career as a research-focused physician-scientist.

      Strengths:

      (1) A lot has been written about physician-scientists as an endangered species. Given the important role that physician-scientists can play if they engage in research that is informed by experience in patient care, not nearly enough has been written about the choices that students make during training that can keep them on track or throw them off.

      (2) The article provides not only general advice, but specific information in the 2 tables that can help trainees to weigh their priorities and consider their options.

      (3) Among the best advice is to weigh clinical demands, maintenance of procedural skills, recognition of the impact of research time on salary, and the impact of high salaries on the tension between research effort and clinical effort in clinical departments, which is where most physician-scientists in academia are employed.

    1. Reviewer #1 (Public review):

      Summary:

      This paper describes a deep learning toolbox that can be used to automatically estimate functional topographic maps directly from human brain anatomy. Building on the first author's earlier work, which demonstrated the feasibility of using deep learning for this purpose, the new version of the toolbox now requires only a single anatomical MRI scan to generate predictions, eliminating the need for a myelin scan. This represents a significant practical improvement.

      Strengths:

      Having such a toolbox is very useful, since manual annotation and delineation of functional visual field maps is a laborious process that also requires deep expertise. The toolbox can save researchers substantial amounts of time and money, and also allows less experienced researchers to now perform this type of analysis. Notably, for certain participants and patients, the time they are able to reside in the scanner might be limited. Being able to focus on the primary research question, rather than the essential yet basic topographic information, could boost data quality and evaluation and might limit the number of participants that need to be included.

      Weaknesses:

      In the paper, the authors compare the performance of their new version to two previous approaches. Figure 2b shows that the new toolbox performs similarly to the previous deep-learning-based toolbox, but requires only an anatomical scan, which is a significant improvement. They also compare it to an older method that uses an atlas without requiring deep learning. For eccentricity and pRF size predictions, both deep-learning methods perform better than the older approach. For polar angle, a critical parameter for delineating visual field maps, the gain is substantially less. Moreover, the comparison to the atlas method (Benson2014) is not entirely fair, as, to our knowledge, there is also a more advanced atlas version that uses Bayesian fitting methods and already performs better than the old method. To better understand the gain of using deep learning, it would be beneficial if the authors also made the comparison to this more recent atlas-based approach. Moreover, it would be useful to know the correlations for the representative participant. Some examples of relatively "bad" maps would also be useful to have (and could be provided as supplementary information).

      Figure 2b shows that the toolbox is quite good at estimating eccentricity and polar angle parameters, but less good at estimating the population receptive field (pRF) size. I will return to this latter point.

      An interesting feature is that while the toolbox is trained on a specific data set (HCP), it can, "out-of-the-box", be applied to different existing data sets, without the need to retrain the model. This is quite important for the general utility of the method. The results for this are shown in Figure 3. Again, in panel b, it can be seen that the toolbox does a good job at estimating eccentricity and polar angle values, but performs rather poorly for pRF size: the deepRetinotopy toolbox has a strong tendency to only estimate very small pRFs, particularly when applying it across different datasets. For this reason, at the moment, these estimates appear hardly useful. It would be very helpful for readers if the authors could clarify or elaborate on this point, particularly regarding the limitations of pRF size predictions. They explain that this could be due to the use of different types of stimuli, but even within the same (HCP) dataset, the predictions primarily suggest tiny pRFs, even though the training dataset also contains larger ones (which can be better seen in supplementary Figure 4). Showing the predictions for higher-order brain areas, which have larger pRFs on average, could serve a similar evaluation purpose. Presumably, the underlying reasons are complex and could relate to the use of different stimuli, different analysis toolboxes, and how the deep learning model is currently being trained. Possibly, the abundance of small pRFs at lower eccentricity in the training set (which is usually the case in any empirical analysis) has given the model a very strong bias toward predicting small pRFs.

      There would be various ways to verify which of these components is critical. For example, the model could be trained only on the bar stimuli of the HCP dataset, or the pRFs for all stimuli and datasets could be estimated using the same software tool. The latter seems important. For example, Supplementary Figure 4 indicates a high correlation between the Stanford and NYU cohorts that have used the same stimulus and analysis package, despite having different resolutions and scanners. Further investigation into the underlying reasons for these discrepancies would strengthen the paper. It would also provide valuable guidance for users of the toolbox on which toolbox predictions to trust and which not, as well as how well the model generalizes to other stimulus types, scanners, and image resolutions.

      An aspect that is not directly apparent from the title, abstract, and introduction is that the deepRetinotopy toolbox does not by itself produce estimates of visual area labels or boundaries. It predicts only polar angle and eccentricity values. To predict labels and boundaries, the authors combine the toolbox with an atlas (the aforementioned Bayesian atlas). For visual areas V1 - V3, it does a very good job, in that the predictions are as good as the empirical ones. Notably, the authors indicate that the predictions for V2 and, in particular, V3 are worse than for V1, but Figure 4 clearly shows that predictions are as good as the empirical ones. More cannot be expected from a model that is trained on such empirical data.

      Irrespective of the limitations with respect to predicting pRF size, the toolbox opens up functionally oriented analyses of very large cohorts of healthy participants for which only anatomical data are available. The authors present an example of this by confirming the existence of differences in horizontal and vertical asymmetries in the field maps of the visual cortex of children and adults. While Figure 5 confirms the existence of differences, the analysis could be expanded to provide deeper insights, such as normalized developmental trajectories for both asymmetries, given the size of the dataset. This would better highlight the true power of their approach.

      While the authors address limitations with respect to studying experience-dependent atypical functional organization, they do not address how the deepRetinotopy toolbox would handle (acquired) brain lesions. Addressing this, even if only speculative, would be welcome. Another welcome addition would be to see the predictions for additional brain areas, even if those would (presumably) be worse at present. Such information would nevertheless be essential for users considering applying this toolbox. Moreover, this could be a valuable resource serving as a benchmark for future iterations of either deepRetinotopy or other approaches.

    2. Reviewer #2 (Public review):

      Summary:

      The authors introduce the deepRetinotopy toolbox, a deep learning-based software package that allows for user-friendly automatic delineation of visual areas based on anatomical (T1-weighted) MRI scans. This is an important evolution over a prior published version of the software, which required myelin maps additionally. The new version will hence allow many more users to obtain high-fidelity field-map delineations based on existing data or using standard protocols, providing a huge advance to the field. The authors exploited this strength and mapped visual field maps (for areas V1-V3) in 11060 human MRI scans covering different age classes to quantify changes of retinotopic organization across age groups, showing that previously functionally identified imbalances of early visual cortex field maps can now be identified on the basis of anatomical scans alone.

      Strengths:

      Overall, this is a tremendously important methodological contribution, primarily of high practical and applied value. It allows functional imaging labs to delineate human cortical visual field maps with confirmed high fidelity using anatomical T1-weighted scans only. This will save the expensive functional imaging and time-consuming analyses that were previously required to achieve nearly the same result, and it yields far better results than prior model-based approaches offered.

      Also, the quantification of the accumulated very large dataset is meticulous and provides impressively detailed results of the field map changes for areas V1-V3 as a function of age.

      Weaknesses:

      (1) The weak point of the contribution is the choice to limit anatomical quality assessments and error quantifications to just three early regions, V1-V3, even though the deepRetinotopy toolbox can delineate over 20 regions (including parietal, ventral, and lateral regions, such as IPS0-5, hV4, VO1-2, V3A, PHC1-2, LO1-2, and TO1-2).

      (2) The limit is fine for their large-scale application of the toolbox to age groups, as here, a clear hypothesis on early cortex variability was tested.

      (3) However, the introduction of the toolbox itself warrants quality assessments and comparisons to prior models and ground truth beyond V1-V3, just like the authors did in their prior publication of the predecessor model.

      (4) This is important as the vast majority of applications of this toolbox will likely go beyond V1-V3 to delineate dorsal, ventral, and lateral regions.

      (5) For the present paper, this will require only 1 or 2 additional figures, or extending their present figures 2 and 4 along the lines of their previous figure 7 (Ribeiro et al 2021), which included error measures for high-level regions. Ideally, the authors would provide sub-graphs separately for early visual, dorsal, ventral, and lateral regions.

      (6) Going beyond V1-V3 is important for several reasons: first, future studies applying the software beyond V3 will need quantification for reassurance and justification. Second, for the sake of transparency, even if results are noisy or on par with prior models. Third, as a benchmark or reference point for future approaches.

    3. Reviewer #3 (Public review):

      Summary:

      This valuable study presents a tool that uses brain anatomy to predict the layout and size of early visual maps, and it is strengthened by testing across a large and diverse collection of scans. The work will be useful for researchers who want to estimate likely visual map layout from standard anatomical scans and to relate anatomical differences to differences in visual organization across groups. The evidence is solid for the general usefulness of the approach, but incomplete for broader claims about prediction accuracy and use across datasets, particularly for estimates of map size and for showing that the model improves on repeated functional measurements.

      Strengths:

      The paper addresses a useful and important problem: estimating early visual map organization from anatomical measurements alone. Tools that predict these types of functional data from anatomical measurements were introduced more than a decade ago by Benson and colleagues, and the present authors have significantly extended that work. That is a real strength of the manuscript, because there is genuine value in having a practical tool that can estimate likely visual organization from standard anatomical scans.

      Another major strength is the rigorous cross-dataset benchmarking and the accumulation of multiple datasets. The authors assembled a large and diverse set of scans and assessed model performance across different scanners, field strengths, and visual stimuli, which gives the reader a much better sense of how broadly the approach may apply. The retrospective analysis of more than 11,000 scans is especially notable and creates an unusual opportunity to ask how anatomical variation may relate to population differences in visual organization.

      I also think the paper does a good job of showing why such a tool could matter in practice. A complete tool could be used in several ways. First, it could help users identify the locations of activations measured in other experiments with respect to the typical V1-V3 maps. Second, maps measured from an individual subject or patient could be compared with the predictions from the tool to ask whether they differ meaningfully from a standard anatomy-based map. Third, the tool can be used, as the authors have done here, to examine differences in anatomy across populations and interpret these differences with respect to retinotopic maps. Of these uses, the first already seems well supported by the current presentation.

      Weaknesses:

      (1) Quantification of the Analysis

      My main concern is that the analysis relies heavily on global summary measures such as correlation and Dice score. Those measures are useful, but the paper would be more informative if it also quantified boundary differences in millimeters, especially for comparisons such as the V1/V2 boundary in Figure 2. That kind of analysis would help readers understand how large the errors are in physically meaningful terms.
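
      As a rough illustration of the distinction drawn above, the sketch below contrasts a Dice overlap score with a boundary distance expressed in the units of the vertex coordinates (millimeters, if the surface is in mm). The function names and toy inputs are hypothetical and are not taken from the manuscript or the toolbox. The distance used here is a straight-line (Euclidean) nearest-neighbor distance, which is a simplifying assumption; a geodesic distance along the cortical surface would likely be more appropriate in practice.

```python
import numpy as np


def dice_score(mask_a, mask_b):
    """Dice overlap between two boolean vertex masks of equal length."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    total = mask_a.sum() + mask_b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * np.logical_and(mask_a, mask_b).sum() / total


def mean_boundary_distance(points_a, points_b):
    """Symmetric mean nearest-neighbor distance between two boundaries.

    points_a, points_b: arrays of shape (n, 3) and (m, 3) holding the
    coordinates of the vertices on each boundary (hypothetical inputs).
    The result is in the same units as the coordinates.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (n, m)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Average the two directed distances so the measure is symmetric
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())
```

      Two delineations can share a high Dice score while their shared boundary is displaced by several millimeters, which is why reporting both measures would be informative.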

      (2) Model fitting methods

      I also think the discussion of prediction failures for pRF size should be more explicit. The mismatch is likely influenced by the fact that the training data and several evaluation datasets were fit with different models and different analysis software. In particular, the network was trained on non-linear size estimates from the HCP data, while the comparison datasets were derived using other packages and, in some cases, different model assumptions. That likely contributes to the spread in Figure 3b and should be discussed more directly. It is important to discuss that the pRF parameters were derived using different software tools.

      - HCP dataset (training data): analyzePRF (Compressive Spatial Summation model)

      - NYU dataset: vistasoft

      - Stanford dataset: vistasoft

      - New Zealand dataset: SamSrf

      - CHN dataset: Custom MATLAB software

      (3) Clarifying Model Accuracy

      If deepRetinotopy generates a true "noise-removed" representation of functional mapping based on anatomy, then fitting it to one fMRI measurement should predict a second, independent fMRI run better than the noisy data from the first run does.

      The authors possess the exact data for this test. For the HCP dataset, the empirical fMRI data were explicitly separated into two halves: "fit 2" (the first half of the fMRI runs) and "fit 3" (the second half). They correlated these two halves to establish a "noise ceiling," the maximum possible reliability of the data. Looking at their results in Figure 2b, the correlation of the deepRetinotopy predictions falls below this noise ceiling. This means that the noisy functional Half 1 actually predicts functional Half 2 better than the anatomical model does.

      The authors should state this explicitly. A side-by-side plot of Half 1 predicting Half 2 versus deepRetinotopy predicting Half 2 would show that the anatomical model regularizes map location well, but misses reliable subject-specific variation that anatomy alone cannot capture.
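
      The comparison suggested above could be sketched as follows. This is only a minimal illustration under stated assumptions: the function and variable names are hypothetical, the inputs are per-vertex values for one participant, and a plain Pearson correlation is used, which suits eccentricity or pRF size but would need a circular statistic for polar angle.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two equally long 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


def compare_to_noise_ceiling(half1, half2, model_pred):
    """Compare how well Half 1 and the anatomical model each predict Half 2.

    half1, half2: empirical per-vertex estimates from the two halves of
    the functional data (hypothetical inputs).
    model_pred: per-vertex predictions from the anatomy-based model.
    """
    return {
        "noise_ceiling": pearson_r(half1, half2),  # Half 1 vs Half 2
        "model": pearson_r(model_pred, half2),     # model vs Half 2
    }
```

      If the "model" value falls below the "noise_ceiling" value across participants, the first half of the functional data predicts the second half better than the anatomical model does.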

      (4) The Hemodynamic Response Function

      The assumptions used to generate the original empirical maps are permanently baked into the deep learning model. However, the authors explicitly mention the hemodynamic response function (HRF) only once, noting in the Methods that the modeled time series was "convolved with a canonical hemodynamic response function."

      Beyond this single mention, there is no direct discussion of how the assumption of a single canonical HRF across all 161 HCP training subjects might have systematically impacted or biased the network's predictions. The authors address cross-dataset differences broadly under the umbrella of "experimental design" and "fMRI preprocessing pipeline" biases, but the HRF is a core biological property that mediates the connection between the anatomy and the data. The authors should explicitly discuss how this canonical assumption limits or biases the resulting deepRetinotopy network.

      (5) Scoping the Input Data and Normative Use

      The authors use FreeSurfer to generate a mean curvature map for the entire midthickness cortical surface. This full-hemisphere curvature map is resampled to a standard template surface space (32k_fs_LR), acting as the data frame that feeds input features into the neural network. However, while the network receives the full geometric structure of the hemisphere, it is explicitly trained to predict retinotopic parameters only within a restricted posterior ROI, based on the Wang et al. atlas and containing roughly 3,200 vertices per hemisphere.

      A useful experiment to try, and perhaps the authors have already considered this, would be to restrict the input features exclusively to the posterior vertices. Including all anterior vertices may make it harder for the network to fit the localized visual data. A brief commentary on why the full hemisphere was retained as input could be highly informative for researchers adapting this geometric deep learning pipeline.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Emperador-Melero et al. seek to determine whether recruitment of endocytic machinery to the periactive zone is activity-dependent or tethered to delivery of active zone machinery. They use genetic knockouts and pharmacological block in two model synapses - cultured mouse hippocampal neurons and Drosophila neuromuscular junctions - and use super-resolution imaging to determine how well endocytic machinery localizes after chronic inhibition or acute depolarization. They find that acute depolarization in both models has minimal to no effect on the localization of endocytic machinery at the periactive zone, suggesting that these proteins are constitutively maintained rather than upregulated in response to evoked activity. Interestingly, chronic inhibition slightly increases endocytic machinery levels, implying a potential homeostatic upregulation in preparation for rebound depolarization. Using genetic knockouts, the authors show that localization of endocytic machinery to periactive zones occurs independently of proper active zone assembly, even in the absence of upstream organizers like Liprin-α.

      Overall, they propose that the constitutive deployment of endocytic machinery reflects its critical role in facilitating rapid and reliable membrane internalization during synaptic functions beyond classical endocytosis, such as regulation of the exocytic fusion pore and dense-core vesicle fusion. Although many experiments reveal limited changes in the localization or abundance of endocytic machinery, the findings are thorough, and the data substantially support a model in which endocytic components are organized through a pathway distinct from that of the active zone. This work advances our understanding of synaptic dynamics by supporting a model in which endocytic machinery is constitutively recruited and regulated by distinct upstream organizers compared to active zone proteins. It also highlights the utility of super-resolution imaging across diverse synapse types to uncover functionally conserved elements of synaptic biology.

      Strengths:

      The study's technical strengths, particularly the use of super-resolution microscopy and rigorous image analyses developed by the group, bolster their findings.

      Weaknesses:

      One limitation, acknowledged by the authors, is the persistence of spontaneous activity at these synapses, which could still impact the organization of these regions.

      Comments on revisions:

      The authors have addressed all of my previous comments.

    2. Reviewer #2 (Public review):

      Summary:

      This study examines whether the localization of endocytic proteins to presynaptic periactive zones depends on synaptic activity or active zone scaffolds. Using genetic and pharmacological perturbations in both Drosophila and mouse neurons, the authors show that key endocytic proteins remain localized to periactive zones even when evoked release or active zone architecture is disrupted. While the findings are largely negative, the study is methodologically solid and provides useful constraints for current models of synaptic vesicle recycling.

      Strengths:

      The experimental design is careful and systematic, spanning both fly and mammalian systems. The use of advanced genetic models, including Liprin-α quadruple knockout mice, is a notable strength. High-resolution imaging approaches (STED, Airyscan) are appropriately applied to assess nanoscale organization. The study clarifies that strict activity dependence of endocytic recruitment may not be a general principle.

      Weaknesses (largely addressed in revision):

      Several initial concerns have been satisfactorily addressed in the revised manuscript. In particular, the inclusion of EndoA/Dap160 experiments and the expanded discussion improve the work. Some limitations remain, including the reliance on Tetanus toxin at the Drosophila NMJ, which does not fully abolish presynaptic fusion, and the still limited insight into the mechanistic basis of periactive zone organization. The biological interpretation of small changes in protein levels upon silencing also remains somewhat unclear.

      Comments on revisions:

      I thank the authors for the careful revision of the manuscript. The additional experiments, in particular the inclusion of EndoA and Dap160 at the Drosophila NMJ, as well as the extended discussion of limitations, are appreciated and address important points raised in the first round.

      While the principal conclusions of the study remain unchanged, and the manuscript is still largely based on negative results, I find that the authors now present these data in a more balanced and transparent manner. The discussion of activity-dependence is improved and more nuanced, especially with regard to possible contributions of spontaneous release and homeostatic effects.

      In my opinion, despite the mostly negative nature of the findings, the work provides a valuable and relevant contribution, as it defines important constraints on current models of periactive zone organization. The study is technically strong, carefully executed, and systematically performed across different model systems.

      Overall, the revised manuscript is clearly improved and represents a solid and well-executed piece of work that will be of interest to the field.

    3. Reviewer #3 (Public review):

      Summary:

      This study examines how synaptic endocytic zones are positioned using a combination of cultured neurons and the Drosophila neuromuscular junction. The authors test whether neuronal activity, active zone assembly, or liprin-α function is required to localize endocytic zone markers, including Dynamin, Amphiphysin, Nervous Wreck, PIPK1γ, and AP-180. None of the manipulations tested caused a coordinated disruption in the localization or abundance of these markers, leading to the conclusion that endocytic zones form independently of synaptic activity and active zone scaffolds.

      Strengths:

      The work is systematic and carefully executed, using multiple manipulations and two complementary model systems. The authors consistently examine multiple molecular markers, strengthening the interpretation that endocytic zone positioning is robust to changes in activity and structural assembly.

      Weaknesses:

      The main limitation is that the study does not test whether the methods used are sensitive enough to detect subtle functional disruption, and no condition tested produces clear disorganization of the endocytic zone. As a result, the conclusion that these zones assemble independently is supported by negative data, without a strong positive control for disassembly or mislocalization.

      This paper addresses a longstanding question in synaptic biology and provides a well-supported boundary on the types of mechanisms that are likely to govern endocytic zone localization. The conclusions are well justified by the data, though additional evidence would be needed to define the assembly mechanism itself.

      Comments on revisions:

      The authors responded to the initial review with care. They both revised the manuscript and conducted new experiments to address each reviewer's concern. The responses to the review were effective, and I think that the revised manuscript provides significant new insights. In my view, it does not require additional revisions.

    1. Reviewer #3 (Public review):

      In this manuscript, the authors use HiC to study the 3D genome of CD14+ CD16+ monocytes from the blood of healthy individuals and of patients with alcohol-associated hepatitis (AH).

      Overall, the authors perform a cursory analysis of the HiC data and conclude that there are a large number of changes in 3D genome architecture between healthy and AH patient monocytes. They highlight some specific examples that are linked to changes in gene expression. The analysis is of such a preliminary nature that I would usually expect to see the data from all figures in just one or two figures.

      In addition, I have a number of concerns regarding the experimental design and the depth of the analyses performed that I think must be addressed.

      (1) There is a myriad of literature that describes the existence of cell-type-specific 3D genome architecture. In this manuscript, there is an assumption by the authors that the CD14+ CD16+ monocytes represent the same population from both the healthy and diseased patients. Therefore, the authors conclude that the differences they see in the HiC data are due to disease-related changes in the equivalent cell types. However, I am concerned that the AH patient monocytes may have differentiated due to their environment so that they are in fact akin to a different cell type and the 3D genome changes they describe reflect this. This is supported by published articles, for example: Dhanda et al., Intermediate Monocytes in Acute Alcoholic Hepatitis Are Functionally Activated and Induce IL-17 Expression in CD4+ T Cells. J Immunol (2019) 203 (12): 3190-3198, in which they show an increased frequency of CD14+ CD16+ intermediate monocytes in AH patients that are functionally distinct.

      I suggest that if the authors would like to study the specific effects of AH on 3D genome architecture, then they should carefully FACS-sort the equivalent monocyte populations from the healthy individuals and AH patients.

      (2) The analysis of the HiC data is quite preliminary. In the 3D genome field, it is usual to report the different scales of genome architecture, for example, compartments, topologically associated domains (TADs) and loops. I think that reporting this information and how it changes in AH patients in the appropriate cell types would be of great interest to the field.

      Comments on revisions:

      In the revision, the authors did not respond to my concerns, which I believe remain valid and compromise the authors' conclusions of AH-specific effects on genome architecture.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript by Wang et al. describes the development of an optimized soluble ACE2-Fc fusion protein, B5-D3, for intranasal prophylaxis against SARS-CoV-2. As shown, B5-D3 conferred protection not only by acting as a neutralizing decoy, but also by redirecting virus-decoy complexes to phagocytic cells for lysosomal degradation. The authors showed complete in vivo protection in K18-hACE2 mice and investigated the underlying mechanism by a combination of Fc-mutant controls, transcriptomics, biodistribution studies, and in vitro assays.

      Strengths:

      The major strength of this work is the identification of a novel antiviral approach with broad-spectrum activity that goes beyond simple neutralization. Mutant ACE2 enables broad and potent binding to the S proteins of SARS-CoV-2 variants, while the fused Fc part mediates phagocytosis to clear the viral particles. The conceptual advance of this ACE2-Fc combination is convincingly validated by in vivo protection data and by the completely abrogated protection of the Fc LALA mutant.

      Additionally:

      In the Discussion, the authors consider a previously reported ACE2 decamer (DOI: 10.1080/22221751.2023.2275598) and compare it with the ACE2-Fc fusion protein developed in this study. The authors also tested off-target activity and found no evidence of toxicity in vivo.

    2. Reviewer #2 (Public review):

      Summary:

      Wang et al. engineered an ACE2 mutant by introducing two mutations (T92Q and H374N), and fused this ACE2 mutant to human IgG1-Fc (B5-D3). Experimental results suggest that B5-D3 exhibits broad-spectrum neutralization capacity and confers effective protection upon intranasal administration in SARS-CoV-2-infected K18-hACE2 mice. Transcriptomic analysis suggests that B5-D3 induces early immune activation in lung tissues of infected mice. Fluorescence-based bio-distribution assay further indicates rapid accumulation of B5-D3 in the respiratory tract, particularly in airway macrophages. Further investigation shows that B5-D3 promotes viral phagocytic clearance by macrophages via an Fc-mediated effector function, namely antibody-dependent cellular phagocytosis (ADCP), while simultaneously blocking ACE2-mediated viral infection in epithelial cells. These results provide some insights into improving decoy treatments against SARS-CoV-2 and other potential respiratory viruses.

      Strengths:

      The protective effect of this ACE2-Fc fusion protein against SARS-CoV-2 infection has been evaluated in a reasonable way.

      Weaknesses:

      (1) Some of the mouse experiments suffer from insufficient sample numbers, which affects the statistical power and reliability of the results. The authors acknowledged this weakness, noting that the supply of aged mice was limited, while arguing that, although the sample size is small, the data from these mice are consistent.

      (2) Compared to 6 hours, intranasal administration of B5-D3 at 24 hours before viral infection results in reduced protective efficacy. However, only survival and body weight data are provided, with no supporting evidence from virological assays such as viral titer measurement. The author acknowledged that such data would be more comprehensive and attributed the limitation to constraints in animal services.

      (3) The efficacy of the B5-D3-LALA group was not as good as that of the B5-D3 group. The author suggested that there might be a certain degree of viral variation, and viral infection in the lungs may be uneven in the B5-D3-LALA group.

    3. Reviewer #3 (Public review):

      Strengths:

      The core strength of this study lies in its innovative demonstration that an engineered sACE2-Fc fusion redirects virus-decoy complexes to Fc-mediated phagocytosis and lysosomal clearance in macrophages, revealing a distinct antiviral mechanism beyond traditional neutralization. Its complete prophylactic protection in animal models and precise targeting of airway phagocytes establish a novel therapeutic paradigm against SARS-CoV-2 variants and future respiratory viruses.

      Weaknesses:

      The study attributes the complete antiviral protection to Fc-mediated phagocytic clearance, a central claim that requires more rigorous experimental validation. The observation that abrogating Fc functions compromises protection could be confounded by potential alterations in the protein's stability, half-life, or overall structure. To firmly establish this mechanism, it is crucial to include a control molecule with a mutated Fc region that lacks FcγR binding while preserving the Fc structure itself. Without this critical control, the conclusion that phagocytic clearance is the primary mechanism remains inadequately supported.

      The strategy of deliberately targeting virus-decoy complexes to phagocytes via Fc receptors inherently raises the question of Antibody-Dependent Enhancement (ADE) of disease. While the authors demonstrate a lack of productive infection in macrophages, this only addresses one facet of ADE. The risk of Fc-mediated exacerbation of inflammation (ADE) remains a critical concern. The manuscript would be significantly strengthened by a direct discussion of this risk and by including data, such as cytokine profiling from treated macrophages, to more comprehensively address the safety profile of this approach.

      The exclusive use of the K18-hACE2 mouse model, which exhibits severe disease, limits the generalizability of the findings. The "complete protection" observed may not translate to models with more robust and naturalistic immune responses or to human physiology. Furthermore, data against circulating SARS-CoV-2 variants of concern are lacking.

      The concept of sACE2-Fc fusion proteins as decoy receptors is not novel, and numerous similar constructs have been previously reported. The manuscript would benefit from a clearer demonstration of how the optimized B5-D3 mutant represents a significant advance over existing sACE2-Fc designs. A direct comparative analysis with previously published benchmarks, particularly in terms of neutralizing potency, Fc effector function strength, and in vivo efficacy, is necessary to establish the incremental value and novelty of this specific agent.

      Comments on revised version:

      The author has successfully addressed the raised issue.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have appropriately addressed the comments raised in the previous round of review.]

      Summary:

      The study by Lemen et al. represents a comprehensive and unique analysis of gene networks in rat models of opioid use disorder, using multiple strains and both sexes. It provides a time-series analysis of Quantitative Trait Loci (QTLs) in response to morphine exposure.

      Strengths:

      A key finding is the identification of a previously unknown morphine-sensitive pathway involving Oprm1 and Fgf12, which activates a cascade through MAPK kinases in D1 medium spiny neurons (MSNs). Strengths include the large-scale, multi-strain, sex-inclusive design; the time-series QTL mapping, which provides dynamic insights; and the discovery of an Oprm1-Fgf12-MAPK signaling pathway in D1 MSNs, which is novel and relevant.

    2. Reviewer #2 (Public review):

      Summary:

      This highly novel and significant manuscript re-analyzes behavioral QTL data derived from morphine locomotor activity in the BXD recombinant inbred panel. The combination of interacting behavioral-pharmacology (morphine and naltrexone) time course data, high-resolution mouse genetic analyses, genetic analysis of gene expression (eQTLs), cross-species analysis with human gene expression and genetic data, and molecular modeling approaches with Bayesian network analysis produces new information on loci modulating morphine locomotor activity.

      Furthermore, the identification of time-wise epistatic interactions between the Oprm1 and Fgf12 loci is highly novel and points to methodological approaches for identifying other epistatic interactions using animal model genetic studies.

      Strengths:

      (1) Use of state-of-the-art genetic tools for mapping behavioral phenotypes in mouse models.

      (2) Adequately powered analysis incorporating both sexes and time course analyses.

      (3) Detection of time and sex-dependent interactions of two QTL loci modulating morphine locomotor activity.

      (4) Identification of putative candidate genes by combined expression and behavioral genetic analyses.

      (5) Use of Bayesian analysis to model causal interactions between multiple genes and behavioral time points.

      Appraisal:

      The authors largely succeeded in reaching goals with novel findings and methodology.

      Significance of Findings:

      This study will likely spur future direct experimental studies to test hypotheses generated by this complex analysis. Additionally, the broad methodological approach incorporating time course genetic analyses may encourage other studies to identify epistatic interactions in mouse genetic studies.

    3. Reviewer #3 (Public review):

      Summary:

      This is a clearly written paper that describes the reanalysis of data from a BXD study of the locomotor response to morphine and naloxone. The authors detect significant loci and an epistatic interaction between two of those loci. Single-cell data from outbred rats is used to investigate the interaction. The authors also use network methods and incorporate human data into their analysis.

      Strengths:

      One major strength of this work is the use of granular time-series data, enabling the identification of time-point-specific QTL. This allowed for the identification of an additional, distinct QTL (the Fgf12 locus) in this work compared to previously published analysis of these data, as well as the identification of an epistatic effect between Oprm1 (driving early stages of locomotor activation) and Fgf12 (driving later stages).

    1. Reviewer #1 (Public review):

      Summary:

      Many studies have investigated adaptation to altered sensorimotor mappings or to an altered mechanical environment. This paper asks a different but also important question in motor control and neurorehabilitation: how does the brain adapt to changes in the controlled plant? The authors addressed this question by performing a tendon transfer surgery in two monkeys, during which they swapped the tendons flexing and extending the digits. They then monitored changes in task performance, muscle activation and kinematics post-recovery over several months, to assess changes in putative neural strategies.

      Strengths:

      (1) The authors performed complicated tendon transfer experiments to address their question of how the nervous system adapts to changes in the organisation of the neuromusculoskeletal system, and present very interesting data characterising neural (and in one monkey, also behavioural) changes post tendon transfer over several months.

      (2) The fact that the authors had to employ two slightly different tasks (one more artificial, the other more naturalistic) in the two monkeys and yet found qualitatively similar changes across them makes the findings more compelling. After all, these are very challenging experiments!

      (3) The paper is well written, the analyses are sound, and the authors interpret the data appropriately, acknowledging the key limitations.

      Weaknesses:

      None of note.

    2. Reviewer #3 (Public review):

      Summary:

      In this study, Philipp et al. investigate how a monkey learns to compensate for a large, chronic biomechanical perturbation--a tendon transfer surgery, swapping the actions of two muscles that flex and extend the fingers. After performing the surgery and confirming that the muscle actions are swapped, the authors follow the monkeys' performance on grasping tasks over several months. There are several main findings:

      - There is an initial stage of learning (around 60 days), where monkeys simply swap the activation timing of their flexors and extensors during the grasp task to compensate for the two swapped muscles.

      - This is (seemingly paradoxically) followed by a stage where muscle activation timing returns almost to what it was pre-surgery, suggesting that monkeys suddenly swap to a new strategy that is better than the simple swap.

      - Muscle synergies seem remarkably stable through the entire learning course, indicating that monkeys do not fractionate their muscle control to swap the activations of only the two transferred muscles.

      - Muscle synergy activation shows a similar learning course, where the flexion synergy and extension synergy activations are temporarily swapped in the first learning stage and then revert to pre-surgery timing in the second learning stage.

      - The second phase of learning seems to arise from making new, compensatory movements (supported by other muscle synergies) that get around the problem of swapped tendons.

      Strengths:

      This study is quite remarkable in scope, studying two monkeys over a period of months after a difficult tendon-transfer surgery. As the authors point out, this kind of perturbation is an excellent testbed for the kind of long-term learning that one might observe in a patient after stroke or injury, and provides unique benefits over more temporary perturbations like visuomotor transformations and over studying learning through development. Moreover, while the two-stage learning course makes sense, I found the details to be genuinely surprising--specifically the fact that: 1) muscle synergies continue to be stable for months after the surgery, despite being maladaptive; and 2) muscle activation timing reverts to pre-surgery levels by the end of the learning course. These two facts together initially make it seem like the monkey simply ignores the new biomechanics by the end of the learning course, but the authors do well to explain that this is mainly because the monkeys develop a new kind of movement to circumvent the surgical manipulation.

      I found these results fascinating, especially in comparison to some recent work in motor cortex, showing that a monkey may be able to break correlations between the activities of motor cortical neurons, but only after several of coaching and training (Oby et al. PNAS 2019). Even then, it seemed like the monkey was not fully breaking correlations but rather pushing existing correlations harder to succeed at the virtual task (a brain-computer interface with perturbed control).

      Weaknesses:

      I found the analysis to be reasonably well considered and relatively thorough. The authors have also suitably addressed my comments on the previous version. One minor weakness that remains (understandably so) is that the two animals in the study performed different tasks, and the results of the secondary synergy analysis seem to be quite different (Figure 10). That said, I don't think this weakness reduces the impact of the study, and though multiple replications of the same results would provide more convincing evidence, I don't think it's necessary to make the points that the authors are making.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors set up a pipeline to predict insect repellents that are pleasant and safe to humans. This is done by daisy-chaining a new classification model for predicting repellents with a published model for predicting human perception. The models use a feature-engineered selection of chemical features to make their predictions. The predicted molecules are then validated against a proxy humanoid (heated brick), and their safety is tested by molecular assays of human cells. The humanistic approach to modeling that the authors have taken (which considers cosmetic/aesthetic appeal and safety) is novel and a necessary step for consumer usage. However, the importance of pleasantness over effectiveness is still up for debate (DEET is unpleasant but still used often), and the generalization of the safety tests is unknown and assumed. The effectiveness of the prediction models also still needs to be established. They pass the authors' own behavioral tests, but their contribution to the field is unknown, as both models (new and published) have not been rigorously benchmarked against previous models. Moreover, the authors' coverage of the literature in this field is sparse, ignoring directly related studies.

      Strengths:

      The humanistic approach to modeling considers pleasantness and safety. Chaining models can help limit the candidate odorants from the vastness of odor space.

      Weaknesses:

      The current models need to be benchmarked against leading models predicting similar outcomes. Similarly, many of these papers need to be addressed and discussed in the introduction. The authors might even consider their data sources for model training to increase performance and lexical categorization for interoperability. For instance, the Dravnieks data lexicon, currently used in the human perception lexicon, has been highly criticized for its overlapping and hard-to-interpret descriptive terms ("FRAGRANT", "AROMATIC").

      Human Perception:

      Khan, R. M., Luk, C. H., Flinker, A., Aggarwal, A., Lapid, H., Haddad, R., & Sobel, N. (2007). Predicting odor pleasantness from odorant structure: pleasantness as a reflection of the physical world. Journal of Neuroscience, 27(37), 10015-10023.

      Keller, A., Gerkin, R. C., Guan, Y., Dhurandhar, A., Turu, G., Szalai, B., ... & Meyer, P. (2017). Predicting human olfactory perception from chemical features of odor molecules. Science, 355(6327), 820-826.

      Gutiérrez, E. D., Dhurandhar, A., Keller, A., Meyer, P., & Cecchi, G. A. (2018). Predicting natural language descriptions of mono-molecular odorants. Nature communications, 9(1), 4979.

      Lee, B. K., Mayhew, E. J., Sanchez-Lengeling, B., Wei, J. N., Qian, W. W., Little, K. A., ... & Wiltschko, A. B. (2023). A principal odor map unifies diverse tasks in olfactory perception. Science, 381(6661), 999-1006.

      Related cleaned data: https://github.com/BioMachineLearning/openpom

      Insect Repellents:

      Wright, R. H. (1956). Physical basis of insect repellency. Nature, 178(4534), 638-638.

      Katritzky, A. R., Wang, Z., Slavov, S., Tsikolia, M., Dobchev, D., Akhmedov, N. G., ... & Linthicum, K. J. (2008). Synthesis and bioassay of improved mosquito repellents predicted from chemical structure. Proceedings of the National Academy of Sciences, 105(21), 7359-7364.

      Bernier, U. R., & Tsikolia, M. (2011). Development of Novel Repellents Using Structure-Activity Modeling of Compounds in the USDA Archival Database. In Recent Developments in Invertebrate Repellents (pp. 21-46). American Chemical Society.

      Wei, J. N., Vlot, M., Sanchez-Lengeling, B., Lee, B. K., Berning, L., Vos, M. W., ... & Dechering, K. J. (2022). A deep learning and digital archaeology approach for mosquito repellent discovery. bioRxiv, 2022-09.

      The current study assumes that insect repellents repel via their odor valence to the insect, but this is not accurate. Insect repellents also mask the body odor of humans, making them hard to locate. The authors need to consult the literature to understand the localization and landing mechanisms of insects to their hosts. Here, they will understand that heat alone is not the attractant, as their behavioral assay would have you believe. I suggest the authors test other behavioral assays to show more convincing evidence of effectiveness. See the following studies:

      De Obaldia, M. E., Morita, T., Dedmon, L. C., Boehmler, D. J., Jiang, C. S., Zeledon, E. V., ... & Vosshall, L. B. (2022). Differential mosquito attraction to humans is associated with skin-derived carboxylic acid levels. Cell, 185(22), 4099-4116.

      McBride, C. S., Baier, F., Omondi, A. B., Spitzer, S. A., Lutomiah, J., Sang, R., ... & Vosshall, L. B. (2014). Evolution of mosquito preference for humans linked to an odorant receptor. Nature, 515(7526), 222-227.

      Wei, J. N., Vlot, M., Sanchez-Lengeling, B., Lee, B. K., Berning, L., Vos, M. W., ... & Dechering, K. J. (2022). A deep learning and digital archaeology approach for mosquito repellent discovery. bioRxiv, 2022-09.

      Comments on revisions:

      The revisions made to the manuscript do not fully address the concerns raised in the previous round of review. The authors are encouraged to consider the following points to strengthen the work.

      The benchmarking of the human perception models against Keller et al. (2017) and Gutiérrez et al. (2018) is insufficient, as the field has progressed considerably in the last five years with newer approaches using larger data sources. Benchmarking against more recent models would better situate the contribution of this work.

      The exclusion of human repellency data from the preprint by Boyle et al. (2016) is worth reconsidering. For a study that takes an explicitly human-centric modeling approach, human behavioral data on repellency, pleasantness, and usage intent would directly support the central claims of the manuscript.

      The key claims regarding repellency and consumer acceptability would be considerably strengthened by the addition of these data.

    2. Reviewer #2 (Public review):

      Summary:

      This is an interesting study that seeks to identify novel mosquito repellents that smell attractive to humans. This is the second time I have reviewed, and the authors have not done anything to address the weaknesses. Although the subject matter may provide important new information for the development of new repellents, its current breadth is limited without additional assays. Arm-in-cage assays, testing the longevity of the new repellents, other ML analyses and confusion matrices, would strengthen the manuscript and demonstrate innovation. The lack of cohesion and new experimental results weakens the manuscript.

      Strengths:

      The combination of standard machine learning methods with mosquito behavioral tests is a strength.

      Weaknesses:

      The study would be strengthened by describing how other modern ML approaches (RF, decision trees) would classify and identify other potential repellents.

      A comparison of the repellent activity between DEET and the top ten hits identified in this new study indicates little change in repellent activity (~3%), suggesting that DEET remains the gold standard. Without additional toxicity tests and longevity tests, the study is arguably incremental. The study's novelty should be better clarified.

      The Methods in the repellency tests are sparse, and more information would be useful. Testing the top repellents at low doses (<<1%) and for long periods (2-12 h) would strengthen the manuscript. Without this information, the manuscript is lacking in depth.

      Testing human subjects on their olfactory percept of the repellents would also increase the depth and utility of the manuscript. Without additional experiments, the authors' conclusions lack support and have limited impact on the state-of-the-art.

      This manuscript is a mix of different approaches, which makes it lack cohesion. There is the ML method for classifying new repellents that smell good, but no testing of the repellents on human volunteers. The repellents are not tested at realistic concentrations and durations. And the calcium mobilization test is strange, and makes little sense in the context of the other experiments and framing of the manuscript.

      Comments on revisions:

      The authors have a potentially strong manuscript. However, I would urge the authors to address the reviewer comments in a substantive manner.

    1. Reviewer #1 (Public review):

      Summary:

      The paper describes a biologically plausible version of JEPA using recurrent neural networks, called RPL (recurrent predictive learning). Given an embedding z_t, a recurrent neural network processes these inputs as c_{t+1} = RNN(c_t, z_t). The predictive network f then predicts future inputs by minimizing || f(c_t) - stop_grad(z_{t+Δt}) ||^2. I understand that a prediction error is defined as e = z_{t+Δt} - f(c_t) to model cortical measurements in the oddball task.
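
      As a rough illustration of the formulation summarized above, the recurrence, the prediction, and the prediction error can be sketched in a few lines of plain Python. This is only a sketch under stated assumptions, not the authors' implementation: the tanh recurrence, the linear predictor, the weight names (W_c, W_z, W_f), the helper names, and the toy input sequence are all hypothetical choices made for this example. The stop-gradient appears only as a comment, since no gradients are computed here.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix W (list of rows) by a vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rnn_step(c, z, W_c, W_z):
    """Context update c_{t+1} = tanh(W_c c_t + W_z z_t) (tanh is an assumption)."""
    return [math.tanh(a + b) for a, b in zip(matvec(W_c, c), matvec(W_z, z))]


def rpl_errors(zs, W_c, W_z, W_f, delta=1):
    """Return prediction errors e_t = z_{t+delta} - f(c_t) for an embedding sequence.

    The target z_{t+delta} is treated as a constant, which is what stop_grad
    means in the forward pass: a learning rule would not push gradients into it.
    """
    c = [0.0] * len(W_c)  # initial context c_0
    errors = []
    for t in range(len(zs) - delta):
        prediction = matvec(W_f, c)  # f(c_t), a linear predictor (assumption)
        target = zs[t + delta]
        errors.append([z_i - p_i for z_i, p_i in zip(target, prediction)])
        c = rnn_step(c, zs[t], W_c, W_z)  # c_{t+1} = RNN(c_t, z_t)
    return errors


def rpl_loss(errors):
    """Squared norm of the prediction error, averaged over time steps."""
    return sum(sum(e_i * e_i for e_i in e) for e in errors) / len(errors)


def random_matrix(rows, cols, rng, scale=0.5):
    """Small random weight matrix used only for the demonstration below."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


# Toy usage: a 2-D embedding moving on a circle, with a 3-D context vector.
rng = random.Random(0)
dim_z, dim_c = 2, 3
W_c = random_matrix(dim_c, dim_c, rng)
W_z = random_matrix(dim_c, dim_z, rng)
W_f = random_matrix(dim_z, dim_c, rng)
zs = [[math.cos(0.3 * t), math.sin(0.3 * t)] for t in range(10)]
errors = rpl_errors(zs, W_c, W_z, W_f, delta=1)
print("mean squared prediction error:", rpl_loss(errors))
```

      Written this way, the error e exists only as a quantity computed from z and f(c); it is not itself a state that the recurrent units exchange.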

      The RPL model is also shown to build an internal world model, with "real-world" data like the movement of moving animals or speech signals. The representation is then compared to V1 data and expected prediction error signals in an oddball setting. In a stacked hierarchy of RNN learning with RPL, the higher layers appear to learn high-level latent variables, although gradients are not propagated downward to the lower layers.

      Strengths:

      (1) The paper tackles an open question: Self-supervised learning is thought to be a fundamental principle to explain how computation is structured in the brain. Cortical data suggest qualitatively that prediction error is a core principle of representation learning in the brain, but the field is still looking for a simple yet expressive model that would explain how the cortex learns its representations. RPL contributes in that direction by making a useful link between cortical representation learning in RNN models and the JEPA learning algorithm that was demonstrated to scale to large world model learning from video data by LeCun's group. It is very useful to connect this popular deep learning algorithm to cortical data.

      (2) The model formalism is relatively elegant and simple: Next-input prediction objectives are conceptually simple but not necessarily trivial to build at scale. There is a clear benefit in comparison with contrastive or IL methods, because such objectives are free from dataset-specific data augmentation and negative samples, thereby moving the comp neuro field towards conceptually simpler models of representation in the cortex. Yet predictive-only models (and in particular predictive models in latent space instead of pixel space) are not easy to build in a stable fashion. The JEPA family is basically intended to solve this question; it is very nice and timely to bring this to comp neuro.

      (3) The methodology combining comp neuro and deep learning makes sense: The conceptual and qualitative analogy with cortical prediction errors is relevant and consistent with what is expected as a model of self-supervised learning in cortical models. The comparison of RPL with IL and CL is methodologically meaningful and grounded: showing, for instance, how some of the models fail to represent some latent structure in some toy datasets is interesting.

      (4) h-RPL: The h-RPL is perhaps the most creative departure from the JEPA model family. It would be interesting to say more about what was particularly difficult to see in the latent variables emerging in the hierarchical model. I often find it magical that layer-wise learning rules of this type are not learning redundant representations. Any explanation of why this is not the case here would be valuable.

      Weaknesses:

      In general, I fully support the type of question and ideas that the paper is putting forward. It is, however, very hard in this research field to gain insight into specific conceptual contributions or specific bits of experimental data that the model puts forward. In pointing to the following weaknesses, I am encouraging the authors to lay out more clearly what the unique hypothesis is or the contribution of the RPL model that we should remember it for.

      (1) The devil is in the details:

      1a) Comparison with JEPA variants: JEPA variants integrate different details into the learning algorithm, for instance "masking" of the latent encoder targets, or EMA in the style of BYOL or Siamese networks, for the predicted representations. It is great that RPL does not seem to need any of those (next input prediction is a natural implementation of masking, and EMA does not seem to be used). It is notoriously hard for the JEPA model to work without these features. Since some of these details are sometimes surprisingly crucial for a simulation to work, it would be good to report which of the other important details were key to doing without EMA and masking. Is it the difference in learning rate, for instance? Or maybe the tasks considered are simply easy enough for any model to work; if so, it could be useful to acknowledge to what extent this is true.

      1b) Comparison with IL and CL: On a high level, the comparison with IL and CL algorithms is written as conclusive. I suspect that the failure modes of IL and CL that are described are not due to the algorithms themselves, but rather to the construction of invariance statistics or the choice of negative sample sets (the sets of samples among which variance 1 is requested by VICReg). For instance, if the variance (or negative sample set) is taken only across time, the variance across object identity is expected to collapse. Similarly, if the variance is taken across object identity, the variance across time can collapse. So I wonder if the failure of IL and CL is induced by the construction of the variance definition.

      (2) Prediction error: When the model is compared to the recordings of cortical activity in Figure 7, it is not obvious from the figure which latent space we are talking about mathematically. Is it the vector z, c, or the prediction error e? This is rather important from a neuroscientific point of view, because the prediction error e is expected to explain the neuronal data. On the other hand, the prediction error e is only used in the learning algorithm to define the loss function, but it is not the communication medium between the RNN units c (or with the encoder z).

      In the brain, since the measurements are recorded as neural activity, they are communication channels between specific units (z or c). It is probably c or z that would already explain the oddball prediction error. I believe that other models, like Forward-forward of Nejad et al., have tried quite hard to address this apparent tension. Whether or not this is resolved by RPL, I think it would be beneficial to state the problem and clarify how the algorithm addresses or ignores the issue.

      (3) Successor representation without value? I believe the term successor representation is historically relevant in a reinforcement learning (RL) setting and has a precise mathematical definition. Without RL, I feel that learning successor representation is conceptually identical to learning a transition matrix (aka, a primitive world model). I therefore wonder if the pitch for high-level framing of the successor representation is appropriately described or trivial.

      (4) Learning in RNN: Learning with recurrent networks appears to be key in the model presented here (it is in the algorithm name). Yet, this aspect of the model and the literature on biologically plausible learning rules for RNNs are not really discussed.
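
      On point (3) above, the standard textbook definition makes the relationship concrete. For a fixed policy with state-transition matrix T and discount factor γ in [0, 1), the successor representation M is the discounted sum of powers of T, so M and T determine each other whenever γ > 0. The symbols T, γ, and M are notation introduced here for illustration and are not taken from the manuscript.

```latex
M \;=\; \sum_{k=0}^{\infty} \gamma^{k} T^{k} \;=\; (I - \gamma T)^{-1},
\qquad
T \;=\; \frac{1}{\gamma}\left(I - M^{-1}\right) \quad (\gamma > 0).
```

      The series converges because T is row-stochastic (spectral radius 1) and γ < 1. In this sense, learning M without a reward or value term carries the same information as learning the one-step transition structure, only expressed over a longer temporal horizon.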

    2. Reviewer #2 (Public review):

      This is a very interesting manuscript, which proposes a novel idea on how cortical networks may learn useful representations of sensory stimuli. The model implementing this idea is thoroughly tested in multiple experimental paradigms. The manuscript is very clearly written. I feel it may have a significant impact on our understanding of cortical circuitry.

    3. Reviewer #3 (Public review):

      Summary:

      This paper presents Recurrent Predictive Learning (RPL), a self-supervised model conceptually similar to Joint-Embedding Predictive Architecture (JEPA) models. RPL sequentially observes dynamic scenes to predict subsequent observations. A central claim of the work is that the model's trained representations are simultaneously invariant and equivariant to transformations such as movement, properties that emerge without explicit supervision. These representational qualities are demonstrated through three experiments utilizing two simulated datasets and one naturalistic dataset. Furthermore, the latent embeddings are qualitatively compared with neural data, showing that the model reproduces the successor representation observed in human V1 and the local/global oddball effect in the monkey Prefrontal Cortex.

      Strengths:

      (1) The paper addresses a fundamental question relevant to both computational neuroscience and machine vision: how the brain learns representations that are simultaneously invariant and equivariant to transformations. The manuscript is well-written, easy to follow, and supported by clear visualizations.

      (2) While JEPA-style models have recently gained significant traction in the artificial intelligence community, this paper nicely bridges the gap to neuroscience. By framing these architectures as a theory for visual learning in the brain, the authors provide valuable insights into how predictive frameworks can explain cortical processing.

      (3) The qualitative alignment with V1 and PFC data is a particularly strong contribution, as it offers a potential mechanistic explanation for observed neural phenomena through the lens of self-supervised learning.

      Weaknesses:

      (1) The central claim, that both invariance and equivariance emerge spontaneously, requires further scrutiny (see Ghaemi et al., NeurIPS, 2025; Garrido et al., arXiv, 2024). In particular, the synthetic "moving animal" dataset used in this paper may be too simple to fully support this claim. In latent space prediction, a model must predict both the scene content and the dynamics of movement. Because movement (whether ego-motion or external) is often highly uncertain (or multi-modal), predictive models in naturalistic settings often "collapse" toward learning purely invariant representations, ignoring the hard-to-predict dynamics. In the provided simulations, the movements are extremely predictable. In more complex scenarios, the model would likely prioritize content (invariance) over dynamics (equivariance) unless aided by action-conditioning or explicit factor estimation (Zhang et al., ICLR, 2026). The authors' results in Figure 5 using naturalistic video seem to reflect this limitation, given the lower performance on the naturalistic videos compared to the synthetic datasets.

      (2) The framing of the RPL model as an entirely new theory of representation learning is slightly overstated. The focus on prediction in representation space rather than input space is the defining characteristic of JEPA and various other Self-Supervised Learning (SSL) models, even sequential prediction. While this paper clarifies the connection between these AI frameworks and cortical circuits, the work would be strengthened by more explicitly positioning RPL within the context of existing JEPA-style models and prior SSL theories of the visual system.

      (3) A significant challenge in latent-space SSL is avoiding "representational collapse" (where the model provides a trivial constant output). While the paper alludes to JEPA-like solutions, it lacks a detailed explanation (in both the text and the architectural schematics) of the specific technique used to prevent collapse. Consequently, it is difficult to evaluate the authors' claim of "biological plausibility," as the biological equivalents of common machine learning techniques (such as stop gradient) are not discussed.

      (4) Recent work has shown that the capacity (size) of the predictor significantly influences the learned representations in a JEPA-type world model (Garrido et al., 2024). In simpler scenarios, a large enough predictor can allow a model to "memorize" dynamics rather than learning generalized equivariant features. It would be beneficial to see how the ratio of predictor size to encoder size affects the emergence of these features.
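
      To make the collapse concern in weakness (3) concrete, the following minimal Python sketch shows one simple diagnostic for a trivial constant output: the per-dimension spread of latent vectors across a batch of inputs. The function names, the tolerance, and the toy embeddings are illustrative assumptions and are not taken from the manuscript; the check only detects collapse and is not a mechanism for preventing it.

```python
from statistics import pstdev


def embedding_spread(embeddings):
    """Return the population standard deviation of each latent dimension across a batch."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    # zip(*...) transposes the batch so each tuple holds one latent dimension.
    return [pstdev(dim) for dim in zip(*embeddings)]


def is_collapsed(embeddings, tol=1e-6):
    """True when every latent dimension is (nearly) constant across inputs."""
    return all(spread < tol for spread in embedding_spread(embeddings))


# Hypothetical toy batches: three inputs, two latent dimensions each.
collapsed_batch = [[0.5, -1.0], [0.5, -1.0], [0.5, -1.0]]
healthy_batch = [[0.1, -1.0], [0.9, 0.3], [0.4, 1.2]]

print(is_collapsed(collapsed_batch))  # True
print(is_collapsed(healthy_batch))    # False
```

      Reporting a statistic of this kind over training would let readers judge whether the chosen anti-collapse technique is doing its job, independently of how biologically plausible that technique is.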

      Methodological Clarifications:

      (1) The authors mention a contrastive learning comparison but provide few details. Since contrastive learning is primarily a technique to avoid collapse, it would be a more rigorous baseline if implemented within the same architecture as RPL to isolate the effect of the predictive objective.

      (2) In the PFC data comparison (Figure 7f), there appears to be a discrepancy: the local and global conditions show nearly identical results in PFC, whereas they show different dynamics in the model. It is unclear if this is a visualization error or a genuine model deviation.

      (3) The criteria for selecting specific model variables for comparison with V1 versus PFC are not explicitly defined. Clarification is needed on whether the same latent variables were used for both brain regions or if different layers were selected.

    1. Reviewer #1 (Public review):

      Summary:

      The authors combine discriminative auditory fear conditioning with longitudinal in vivo calcium imaging to ask how prelimbic (PL) representations of learned and generalized threat evolve across recent and remote memory time points. Using two different CS+ frequencies and a no-shock control group, they report that PL population activity tracks graded behavioral generalization, that population similarity is highest for tones eliciting strong threat responding, and that distinct subnetworks can be identified that appear to encode tone-specific sensory features versus learned threat-related response structure.

      To my knowledge, this may be the first study to comprehensively examine neural encoding of fear generalization in prelimbic cortex (PL). The manuscript is ambitious and technically interesting, and several aspects are potentially important. In particular, the suggestion that neurons showing graded, learning-related response patterns become selectively stabilized over time is intriguing. The inclusion of two CS+ training conditions and a no-shock control also strengthens the case that at least some of the reported effects are related to associative learning rather than simple sensory differences. However, in its current form, the manuscript does not yet fully support the strength of the conceptual claims. Several issues limit confidence in the interpretation, including the possibility that repeated testing itself contributes to changes across days, uncertainty about the relationship between neural activity and freezing behavior, limited quantitative documentation of longitudinal cell registration, and a number of problems in figure clarity and statistical framing. Overall, the study contains promising observations, but the claims should be narrowed, and several analyses or controls would be needed to fully support the proposed framework.

      Detailed Comments

      (1) A general concern is that the repeated test procedure itself may contribute to extinction. Because the animals are exposed to multiple CS frequencies across multiple test days, and each tone is presented three times per session, some of the reported changes in behavior and neural activity across days could reflect extinction or repeated nonreinforced retrieval rather than the passage of time per se. This is especially relevant given that the manuscript makes claims about recent versus remote representations and representational drift over 30 days. At a minimum, the authors should discuss this limitation explicitly and temper claims about time-dependent changes. Ideally, they would include a control group in which animals are tested only once or twice (e.g., at an early and later time point with fewer CS frequencies), or a reduced-frequency testing design that minimizes extinction while still allowing evaluation of recent versus remote memory.

      (2) More generally, some of the reported learning-related neural differences may be driven by behavioral differences, particularly freezing, rather than by learning or generalization per se. For example, animals that freeze more to certain frequencies may show corresponding neural response differences simply because freezing alters PL activity. The authors should examine this possibility more directly. Analyses testing whether recorded cells encode freezing behavior, or whether tone frequency-related neural differences remain robust when comparing high- and low-freezing epochs, would help determine whether the reported effects reflect learned stimulus value rather than behavioral state differences.

      (3) A central feature of the manuscript is the analysis of neural response properties over an extended period of time, up to 30 days after learning. However, aside from a brief mention in the Methods that spatial registration was used, the manuscript provides very little quantitative information about this critical aspect of the study. The paper would be strengthened by including explicit metrics describing longitudinal cell tracking, such as the number and proportion of ROIs retained across all sessions, distributions of spatial-footprint correlations or centroid distances across days, and representative examples of matched imaging fields over time. Without this information, it is difficult to assess how strongly the longitudinal claims are supported.

      (4) The text states that "Figs. 1c and 1d show GCaMP6f expression in PL, representative calcium footprints, and activity traces". However, the figure as presented does not clearly show all of these elements, at least not in a way that matches the description in the Results. The correspondence between text and figure should be corrected.

      (5) The labeling of Figure 2a is insufficient for interpretation. The legend states that the panel shows raster plots of sound responsiveness, but the axes and scaling are not clearly defined. It is not clear from the figure what the x-axis represents, whether the y-axis corresponds to individual neurons, where the CS period occurs, or what the activity scale at the right denotes. Also, the term 'rasters' implies that spikes were analyzed. It seems that the spike inference approach (CASCADE) was only used for later analyses. Perhaps 'heat-plot' would be more accurate here? Generally, this figure should be annotated more clearly so that the reader can understand it without referring back to the Methods.

      (6) In relation to Figure 3, the analysis of population-averaged responses across tone frequencies is useful, but the manuscript would be stronger with additional statistical analyses across time and across groups. For example, if the authors want to argue that learning induces graded changes in neural responses and that these evolve across time, they should directly compare within-group responses across days and also compare matched frequencies between the conditioned groups and the no-shock controls. These analyses would help establish whether the observed differences are genuinely learning dependent and whether they change significantly over time.

      (7) The inclusion of two different CS+ frequencies and a no-shock control is a strength of the study and substantially improves the interpretation that graded neural responses are related to learning and generalization rather than to simple sensory processing or passage of time. That said, I am not entirely comfortable with the use of the term "inference" throughout the manuscript. What is being measured here appears closer to sensory generalization than inference in a stronger cognitive sense. The current task does not clearly require that animals infer hidden structure or stimulus value through abstract reasoning; rather, the generalized stimulus may simply be treated as similar to the conditioned cue. The terminology should therefore be reconsidered or softened.

      (8) I also found the use of the term "valence" somewhat problematic. The manuscript appears to use valence to refer to graded responding across tones with different aversive significance, but valence typically refers more broadly to distinctions between appetitive and aversive value. Here, terms such as "threat value" or "aversive value" may be more precise. The authors should consider revising this language throughout.
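
      As an illustration of the longitudinal registration metrics requested in comment (3), the following minimal Python sketch computes a spatial-footprint correlation and a centroid distance for one ROI matched across two sessions. The footprints are hypothetical 4x4 toy arrays and the function names are assumptions for illustration; the manuscript's actual registration pipeline is not described in enough detail to reproduce here.

```python
import math


def centroid(footprint):
    """Intensity-weighted centroid (row, col) of a 2D footprint given as a list of rows."""
    total = sum(sum(row) for row in footprint)
    if total == 0:
        raise ValueError("footprint has no signal")
    row_c = sum(i * v for i, row in enumerate(footprint) for v in row) / total
    col_c = sum(j * v for row in footprint for j, v in enumerate(row)) / total
    return row_c, col_c


def centroid_distance(a, b):
    """Euclidean distance in pixels between the centroids of two footprints."""
    (r1, c1), (r2, c2) = centroid(a), centroid(b)
    return math.hypot(r1 - r2, c1 - c2)


def footprint_correlation(a, b):
    """Pearson correlation between two flattened footprints of equal shape."""
    x = [v for row in a for v in row]
    y = [v for row in b for v in row]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy)


# Hypothetical ROI on day 1 and the same ROI shifted one pixel to the right on day 30.
day1 = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
day30 = [[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]

print(centroid_distance(day1, day30))      # 1.0
print(round(footprint_correlation(day1, day30), 3))
```

      Distributions of these two quantities for matched versus nearest non-matched ROI pairs, reported per session pair, would let readers assess how reliably the same cells were followed out to 30 days.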

    2. Reviewer #2 (Public review):

      Summary:

      The following points are those that occurred to me across readings of the paper. They are listed in what I take to be the order of their significance. Many of the points relate to the loose use of language and invocation of concepts that are not warranted, given the study design and results obtained.

      Major Comments:

      (1) The concept of ensemble turnover is interesting - the way it is introduced and discussed implies some type of spontaneous change in the neural underpinnings of fear discrimination and generalization in the PL. But, of course, every trial involves an opportunity to learn about the threat CS or the generalization test stimuli, and I am troubled by the thought that stability in the neural underpinnings of fear discrimination and generalization will actually reflect the level of defensive behaviours evoked on different trial types and/or the discrepancy between those behaviours and the outcome of a given trial in the generalization test. That is, stability in the neural underpinnings may be related to an animal's certainty or uncertainty in the contingency between a stimulus and danger; or, put another way, an animal's confidence that danger will or won't occur given the presence of some stimulus. This is not uninteresting. It is, however, not considered anywhere in the paper, which is overloaded with references to inferred threat values and integration of information across different types of stimuli. The protocol is not one that requires inference about anything or integration across anything.

      (2) I appreciate the link to Gu and Johansen in paragraph 3 of the Introduction, but the type of generalization under investigation here is not the same as the type of 'generalization' studied by Gu and Johansen [who used a sensory preconditioning protocol]. Nonetheless, the authors have forced the language used by Gu and Johansen into their paper, and this has created tension [at least for this reader] as the concepts introduced by Gu and Johansen [inference, integration] are simply not relevant given the generalization protocol used here. Here are a few examples of points where the tension might interfere with a reader's understanding:

      a. 'We hypothesized that generalization to novel stimuli depends on stable subnetwork organization that enables comparisons between learned and inferred valence, as well as population-level features that reduce variability across related representations.'

      I understand the words in the hypothesis, but can't form a representation of what is being said because of the reference to terms that stand in need of clarification [inferred valence, variability across related representations], but, ultimately, won't be clarified. This needs to be re-expressed so that the reader can appreciate what is being said.

      b. 'Our results show that stable cortical subnetworks integrate the emotional "gist" of memory and inferred valence for novel cues over time, despite ongoing ensemble reorganization, and that population-level firing rate similarity across stimulus presentations determines threat generalization.'

      Again, what does this mean? How is the gist of a memory integrated with inferred valence for novel cues over time? The statement simply doesn't make sense. This needs to be rewritten for clarity.

      c. 'In CS⁺15 mice, positively modulated sound-responsive neurons exhibited graded tone activity reflecting the contingency learned valence as well as the inferred valence of novel tones across testing days...'.

      Can this be rewritten as 'In CS⁺15 mice, positively modulated sound-responsive neurons exhibited graded activity to the tone CS and its variants that were used to assess generalization.'? The overloading of the text with references to 'contingency learned valence' and 'inferred valence' is unnecessary and makes it much harder to understand what has been shown in the results.

      (3) Re the same passage of text as in 2c:

      Is it the case that these neurons are simply tracking the expression of freezing to the various tones? The same question applies to the results obtained for the CS+3 mice. If this is the case, then why should the results be taken to support the banner statement that 'Sound-modulated PL population responses encode learned and inferred valence' - these analyses do not support that statement. And, as indicated, I don't believe that the language of learned and inferred valence is appropriate to such statements, given the nature of the protocol used and results obtained. It is a study looking at how populations of neurons in the PL respond during presentations of auditory stimuli that were subject to discriminative conditioning, and during tests of generalized freezing to other [intermediate] auditory stimuli.

      (4) It is stated that:

      'In no-shock controls, although both positive and negative responses were present, population activity was not modulated by tone frequency or valence'.

      What does this mean? I can understand that population activity was not modulated by tone frequency. But what does it mean to say that it was not modulated by valence? Why should it have been when none of the tones were conditioned in this group and, hence, mice were responding to all the tones equally? And given that this is true, I don't understand the use of 'valence' here, or the subsequent statements in this paragraph that 'graded responses require associative learning' and that 'PL population responses encode graded sound-valence associations that reflect both learning and inference, closely matching behavioral generalization.' The latter statement is particularly unwarranted and, again, highlights a major issue with the paper. It could and should be rewritten as 'PL population responses reflect behavioral generalization.' There is nothing in the additional language that adds to the reader's understanding of what has been shown. The reference to 'graded sound-valence associations that reflect both learning and inference' is completely unwarranted, given the nature of this study. It is anathema to the vast literature on stimulus generalization. If the authors wished to make statements of this sort, they should have taken a different approach, perhaps using protocols like those featured in Gu and Johansen.

      (5) The section titled, 'Consistently active neurons preserve valence representations as newly recruited neurons sharpen remote memory traces' ends with the following summary:

      'Together, these results indicate that consistently active neurons maintain stable representations of learned and inferred sound associations across time, whereas neurons recruited after conditioning progressively acquire graded tuning at later retrieval stages. This dynamic refinement suggests that cortical memory representations become increasingly selective during systems consolidation, while a stable neuronal subpopulation preserves the core emotional content of the memory.'

      Once again, the summary is not in keeping with the results obtained. The 'dynamic refinement' of representations is far more likely to reflect the repeated testing across days 1, 15, and 30 rather than anything to do with systems consolidation - at the very least, it is the simplest interpretation of the results. The impact of repeated testing is evident in the sharpening of generalization gradients over time, which is contrary to what is otherwise observed in the literature - the incredibly well-documented broadening of generalization gradients with time. Given this impact of repeated testing, surely the changes in the neuronal population that underlie performance are more likely to reflect the learning that occurs on days 1, 15, and 30, which is reflected in reduced freezing to the non-conditioned tones. If this is a reasonable take on the results, then I don't see the basis for invoking systems consolidation at all, and I don't see the basis for inferring a stable neuronal subpopulation that preserves the emotional content of the memory. Rather, non-reinforced presentations of 'never-reinforced' tones result in recruitment of additional neurons that result in suppression of freezing responses to those stimuli.

      (6) In the section titled, 'Population vector similarity at stimulus onset determines degree of generalization', it is stated that:

      'Because population similarity peaked shortly after stimulus onset, we quantified similarity during the first 5 s after tone onset relative to the CS⁺. In CS⁺15 mice, population similarity was highest for 15/15 and 15/11 tone pairs with no differences between them.'

      Isn't this consistent with the view that the population response in the PL simply reflects the level of freezing? Freezing to the 15-15 and 15-11 tones is most likely to be similar on their first presentation prior to the effects of extinction on the 11 kHz tone; hence the results obtained. That is, these results appear to clearly indicate that neuronal responses in the PL reflect the degree of stimulus generalization, as evidenced in freezing behavior. Given all that we know about the involvement of the PL in expressing fear responses, it is not appropriate to claim that 'population vector similarity at stimulus onset *determines* the degree of generalization'. The PL responses simply reflect the varying levels of performance displayed to the different types of tones. What have I missed that could be taken to support additional statements?

      Later in the same section, it is stated that 'population-level similarity at stimulus onset scales with behavioral threat generalization and is maximal for tones associated with robust threat responses.' For simplicity and, therefore, clarity, this should be rewritten as 'population-level similarity at stimulus onset reflects behavioral threat generalization.'

      (7) In the section titled, 'Different subnetworks encode acoustic versus learned properties of sound association', it is stated that:

      'Our previous analyses show that learned and inferred associations are represented at the population level. However, these results do not resolve whether graded responses arise from pooled activity of frequency-selective neurons or from subnetworks encoding integrated learned valence across tones.'

      What does it mean to say 'integrated learned valence across tones'? As it presently stands, the meaning of the phrase is unclear. It only makes sense if one supposes that generalized freezing responses to the 11 and 7 kHz tones reflect separate associations between those tones and the aversive foot shock US. This supposition is inconsistent with the rich literature on generalization of Pavlovian conditioned fear responses. Specifically, it is inconsistent with the many theories of fear generalization, which attribute the reduction in fear as one moves away from the specific conditioned stimulus to a decrement in the ability of the test stimulus to activate the trained CS-US association. My strong impression is that the authors would do well to ground their findings in theories of stimulus/fear generalization, of which there are many. This would better serve the results obtained [and the reader's appreciation of them] - at present, the unnecessary invocation of concepts does very little to enhance the reader's appreciation or understanding of what has been found in the study.

      (8) Another example of what has been a common theme in this review:

      '...we hypothesized that the PL active ensemble segregates into functionally distinct subnetworks: one encoding tone-specific sensory features with dynamic characteristics, and another responding to all frequencies encoding stable core memory content and inferred emotional valence.'

      What does it mean to say 'all frequencies encoding stable core memory content and inferred emotional valence'? Do the authors mean to say '...and another that tracks freezing/defensive responses regardless of whether they were elicited by the trained CS or one of the generalization test stimuli'?

      (9) It is stated that - 'Graded clusters encode emotional valence but constitute only a fraction of the active population; yet valence coding at the population level remains accurate and precise. This indicates that neurons newly recruited into the population - likely frequency-selective and organized within learning-independent clusters - can be shaped by associative processes through modulation of firing activity.'

      What does this mean? Are the authors trying to say that - 'Some clusters of PL neurons track freezing responses. In spite of the fact that these are only a fraction of the total active neuronal population, the population-level response of PL neurons also tracks the levels of fear to the trained tone and its variants used in the test for generalization.' If this is what one wants to say, then the final statement in the reproduced section does not follow. That is, there is no indication that 'neurons newly recruited into the population - likely frequency-selective and organized within learning-independent clusters - can be shaped by associative processes through modulation of firing activity.' As noted, the characteristics of other ensembles that become active across the repeated tests on days 1, 15, and 30 are more likely to reflect learning from non-reinforcement that occurs within and across those sessions. Perhaps this is what is meant by the phrase, 'shaped by associative processes'? If so, it should be stated explicitly instead of left to the reader to work out.

      (10) The following points all relate to the Discussion and reiterate many of the points above.

      a. 'A subset of neurons remains consistently active across sessions, preserving core components of the memory trace and supporting inference of emotional valence for novel sounds, while neurons recruited after conditioning progressively acquire valence selectivity at remote time points.'

      'Inference of emotional valence' is unclear and unwarranted for all of the reasons provided above regarding the use of language.

      b. '...Our data reconcile these views by demonstrating that cortical representations of emotional valence emerge rapidly after learning and persist within stable subnetworks, even as the broader population undergoes substantial turnover. This architecture preserves core mnemonic content while allowing flexibility in the surrounding ensemble.'

      These statements assume that the PL neuronal responses reflect something more than the levels of freezing behavior to the different stimuli; what are the grounds for this assumption?

      c. 'Importantly, these subnetworks encode both learned contingencies and the inferred valence of novel stimuli along a graded representational axis, suggesting that strong recurrent connectivity provides a stable scaffold for emotional memory representations.'

      What is a graded representational axis, and what part of the first statement suggests that 'strong recurrent connectivity provides a stable scaffold for emotional memory representations'? If the authors' goal was to make statements about emotional memory representations vis-à-vis emotional memory content, they should have used protocols that allowed them to probe such content. The auditory fear conditioning protocol used here [followed by tests for generalization to other auditory stimuli that differ in frequency from the conditioned tone] is not one that lends itself to analysis of emotional memory representations or content.

      d. 'Dynamic tone-selective responsive neurons emerge independently of learning, as they are present in both control and experimental mice, reflecting pre-existing PL sensory-driven properties (Hockley & Malmierca, 2024; Zikopoulos & Barbas, 2006).'

      Maybe. They are also likely to have developed as a consequence of the repeated testing on days 1, 15, and 30, which involved intermixed exposures to the tones of different frequencies. That is, rather than 'pre-existing PL sensory-driven properties', the responses of these neurons might reflect the emergence of discrimination between the various tones across testing, and greater suppression of freezing to the non-trained tones compared to the trained tone across the various test intervals.
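
      For readers unfamiliar with the quantity discussed in major comment (6), the following minimal Python sketch shows one common way to compute population-vector similarity between the response to a test tone and the response to the CS+. Cosine similarity is used here as an assumption; the manuscript's exact similarity measure may differ, and the per-neuron activity values are hypothetical.

```python
import math


def mean_population_vector(trials):
    """Average each neuron's activity across trials (trials: list of per-neuron lists)."""
    n_trials = len(trials)
    return [sum(neuron) / n_trials for neuron in zip(*trials)]


def cosine_similarity(u, v):
    """Cosine of the angle between two population vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("similarity is undefined for an all-zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical mean activity of three neurons in the first 5 s after tone onset,
# three presentations per tone.
cs_plus_trials = [[2.0, 0.5, 1.0], [2.2, 0.7, 1.0], [1.8, 0.3, 1.0]]
near_tone_trials = [[1.9, 0.6, 1.1], [2.1, 0.4, 0.9], [2.0, 0.5, 1.0]]
far_tone_trials = [[0.2, 1.5, 1.0], [0.1, 1.7, 1.2], [0.3, 1.6, 0.8]]

cs_plus = mean_population_vector(cs_plus_trials)
for name, trials in [("near", near_tone_trials), ("far", far_tone_trials)]:
    similarity = cosine_similarity(cs_plus, mean_population_vector(trials))
    print(name, round(similarity, 3))
```

      A measure of this kind is descriptive: a high value for tones that also elicit strong freezing is what one would expect if PL activity tracks the behavior, which is why it cannot by itself show that similarity determines generalization.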

    3. Reviewer #3 (Public review):

      Summary:

      Normandin et al. explore the coding of stimuli predicting an aversive event in the prelimbic cortex. Stimuli could either be explicitly paired, explicitly unpaired, or novel but with an inferred association with the aversive event (generalization). Long-term tracking of GCaMP-positive neurons allowed them to examine how coding evolves out to a month following training. In general, they found two types of ensemble codes. One was ensembles coding for each stimulus independently, but with enhanced responding to the one eliciting a freezing response. The other was ensembles that responded to all stimuli in proportion to their similarity to the stimulus paired with the aversive event, either increasing or decreasing their activation with the degree of freezing elicited by a stimulus. Importantly, this second set of ensembles was more stable across days, potentially providing a memory trace.

      Strengths:

      (1) The authors track ensembles in prelimbic cortex over long time scales, providing valuable information on the consolidation of neural codes.

      (2) Neural coding of generalization is examined, which is under-examined in the field.

      Weaknesses:

      (1) It is difficult to determine whether responses treated as encoding stimulus valence are instead driven by the behavior that the stimulus elicits, namely freezing.

      (2) The study implies that the identified ensembles are causally related to valence memory, but no experimental interventions are performed to justify this.

    1. Reviewer #1 (Public review):

      The authors demonstrate an innovative approach to investigate the effect of cone dropout on visual acuity using their newly developed olo system. By systematically reducing the coverage of real-world input to the cone photoreceptor mosaic ("cone dropout condition"), the authors are able to assess how having fewer cones leads to reduced vision, in comparison to existing approaches ("pixel dropout condition").

      The capture of a rich dataset, including cone imaging and eye motion, is valuable. Benchmarking with the prior literature, suggesting that good visual acuity can be maintained despite a 50% loss in cone density, is impressive. However, it is known that cone density varies dramatically from the peak cone density location in the foveal center to even a location a few degrees outside of the fovea. In addition, there is a high degree of subject-to-subject variation in peak cone density. Given that the C stimulus is hollow in the middle, the stimulus does not actually hit the location of the peak cone density but must land slightly outside of it. Therefore, considering the actual cone density of where the stimulus lands will be important to discuss and/or analyze.

      The observation of visual acuity maintenance with cone dropout has been a longstanding mystery since the 2013/2018 papers by Ratnam and Foote. The authors should be commended for their approach to addressing this important question. However, there are some simplifications and assumptions being applied to make this jump (i.e., that a 50% reduction in cone stimulation in a healthy eye is comparable to a 50% reduction in cone density in a patient). It seems unlikely that, in a patient's eye, with cone dropout, there will be gaps in the mosaic. Not considering any other non-photoreceptor-related reasons for visual acuity loss, which can occur in patients, the cone aperture acceptance angle may be different due to changes in cone size or packing; the sensitivity of individual cones may also be reduced due to deficits in the visual cycle recovery, which could be affected in disease. Some of these limitations could be addressed and acknowledged more explicitly.

      Overall, this is an impressive study incorporating state-of-the-art technology to probe the fundamental limits of human vision.

    1. Reviewer #1 (Public review):

      Summary:

      Fujita and colleagues investigated two selective peripheral nerve voltage-gated sodium channel inhibitors targeting either Nav1.7 or Nav1.8 on the excitability of human dorsal root ganglion neurons. The authors discovered that Nav1.8 inhibition is more effective at suppressing repetitive firing of DRG neurons, and this may explain the greater clinical efficacy observed for suzetrigine.

      Strengths:

      The study is interesting, and the findings are conceptually satisfying in that they may explain one aspect of Nav1.7 vs Nav1.8 targeting success.

      Weaknesses:

      (1) The use of postmortem human DRG neurons provides translational relevance, but the use of these cells is also a liability, given their high degree of variability. Of note are the 10 to 20-fold differences in baseline properties among cells, which dwarf the effects of the test compounds. The experiments may suffer from undersampling.

      (2) A potential confounder when using post-mortem human DRG neurons is heterogeneity of cell types. The methods clearly state that the cells selected for recording were of 'generally' small size, but specific criteria for what constitutes 'small' or other unstated selection criteria were not provided. A table of individual cell capacitance and input resistance values, along with information about individual donors (age, sex, ethnicity), is important to include. Additionally, some discussion of how DRG neuron heterogeneity impacts the findings would be helpful. This relates to concern #1 about sample size determination and how cell heterogeneity factored into this calculation.

    2. Reviewer #2 (Public review):

      Summary:

      The authors examine the functional role of Nav1.7 voltage-gated sodium channels in human sensory neuron electrogenesis using a Nav1.7 selective inhibitor and human dorsal root ganglion neurons obtained from organ donors. Patch-clamp electrophysiology is used at physiological temperature to measure the impact of Nav1.7 inhibition on sensory neurons' action potential firing. This is an important topic as Nav1.7 and Nav1.8 have been identified as therapeutic targets for the treatment of pain, but there has been mixed success with isoform-specific inhibitors in clinical trials. The data suggest that Nav1.7 and Nav1.8 have overlapping yet complementary functions in nociceptor neurons and that targeting both may be most effective for reducing nociception.

      Strengths:

      The data are of high quality. Action potential properties are measured at 37 degrees Celsius. Threshold is measured using brief pulses. The Nav1.7 inhibitor has been reported to be highly selective for Nav1.7 over Nav1.8 and moderately selective for Nav1.7 over Nav1.1 and Nav1.6. Data are collected using identical conditions and protocols to a previous study on the role of Nav1.8 in similar neurons.

      Weaknesses:

      The study relies on a single Nav1.7 inhibitor that has not been extensively characterized. One prior study indicates that the IC50 is around 140 nM; thus, the 600 nM concentration used in this study could be predicted to reduce Nav1.7 currents by about 80%. However, there is no voltage-clamp data in the current study to confirm this, and therefore, it is unclear if the batch of AM-2099 is as potent as reported in the paper that initially described its selectivity. The impact of Nav1.7 inhibition is compared to data from a previous study by this lab, and this is a minor concern. It would have been interesting to see if the combined inhibition of Nav1.7 and Nav1.8 completely blocked action potential generation in the human DRG neurons.
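
      The roughly 80% figure can be reproduced with a short sketch. It assumes a simple one-site Hill model with a Hill coefficient of 1 and no state dependence; neither assumption is stated in the review, and the function name `fractional_block` is hypothetical. Only the 140 nM IC50 and the 600 nM test concentration come from the text above.

```python
def fractional_block(conc_nm: float, ic50_nm: float, hill: float = 1.0) -> float:
    """Fraction of current blocked under a simple Hill model.

    Assumes one-site binding with no state dependence; hill=1.0 is an
    illustrative default, not a measured value for AM-2099.
    """
    ratio = (conc_nm / ic50_nm) ** hill
    return ratio / (1.0 + ratio)


# Values quoted in the review: IC50 ~140 nM, test concentration 600 nM.
predicted = fractional_block(600.0, 140.0)
print(f"Predicted fractional block: {predicted:.2f}")
```

      Under these assumptions the predicted block is close to 0.8. A state-dependent compound could deviate substantially from this estimate in native neurons, which is why direct voltage-clamp confirmation would be informative.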

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Fujita/Jo/Stewart/Osorno et al. investigate the contribution of Nav1.7 in regulating the excitability and firing properties of human dorsal root ganglion (hDRG) neurons in vitro. The authors characterize the effects of a previously reported Nav1.7-selective blocker AM-2099 in cultured hDRG neurons from postmortem organ donors. The authors observed modest changes in many of the properties expected by inhibiting Nav channels, including decreased action potential upstroke rate and amplitude, while increasing the voltage and current thresholds for spike generation. However, AM-2099 did not change the maximum number of APs in response to suprathreshold stimulation, leading the authors to conclude that Nav1.7 inhibition alone has limited efficacy in reducing the firing properties of hDRG neurons and that Nav1.7 blockers may have limited efficacy as analgesics. This is surprising, given that patients with loss-of-function mutations in Nav1.7 suffer from congenital insensitivity to pain. While it may indeed be true that pharmacological inhibition of Nav1.7 is unlikely to produce analgesia, the present study was limited to a single concentration of AM-2099. The manuscript would be significantly strengthened by a more careful and thorough pharmacological characterization of this compound, which has not been widely used or validated in native human DRG neurons.

      Strengths:

      Experiments are well-designed and executed, and the results presented are convincing. The focus on voltage-gated sodium channels in native human DRG neurons is highly relevant to recent efforts to develop safer analgesic options for chronic pain in people.

      Weaknesses:

      Only a single concentration of AM-2099 was used for all experiments. This compound was reported to be selective for cloned human Nav1.7 channels in heterologous systems, but has not been validated in other studies after the original publication in 2016. Since the original study reported a substantial state-dependent block of recombinant Nav1.7 channels, more detailed pharmacological characterization of AM-2099 is needed in human DRG neurons to fully support these claims. This study would be significantly strengthened by the inclusion of dose-response curves to assess how much of the sodium current is inhibited at this concentration, confirming selectivity in hDRG, and whether maximal inhibition of Nav1.7 still has limited efficacy in reducing the firing of native human sensory neurons.

    1. Reviewer #1 (Public review):

      The manuscript shows that different traits of adults and larvae correlate with Red List status. The authors argue that this shows a big gap in the conservation of amphibians and that the traits of all life stages should be taken into account in amphibian conservation. Specifically, amphibian conservation should do more for the habitats where the larvae live.

      The manuscript is well written and easy to understand. The methods are sound.

      While the study will make an interesting contribution to conservation science, there are many things that I disagree with.

      I don't think that amphibian larvae and their requirements are a "blind spot" as the title suggests. When reading the manuscript, I didn't learn how conservation practice should change in response to the results.

      I wonder whether the relationship between species traits and extinction risk is of great importance for conservation. If a species is Data Deficient on the IUCN Red List, then species traits could be used to predict its Red List category. However, for other conservation projects, I don't see how this would work. How would traits be linked to captive breeding, conservation translocation, pond construction or habitat management in general? In some cases, I can envision a link between species traits and pond hydroperiod.

      Species traits are body size and morphological traits. That makes sense. However, one of the species traits was microhabitat. I find it far-fetched to call habitat a species trait. This is standard habitat ecology. It is well known that habitats matter and that different habitat types face different threats, and consequently so do the species that live in those habitats. Furthermore, habitat and morphology may be confounded. For example, tadpoles in lentic and lotic habitats have very different morphologies. So is it habitat or morphology?

      I don't know how the threat status of Chinese amphibians is determined. IUCN has multiple reasons why a species can be Red Listed. One reason is range size, and another reason is population decline. Personally, I don't think they should be pooled in an analysis because they are fundamentally different reasons why a species has a high extinction risk. A reduction in population size of greater than 30% in 10 years or 3 generations is not the same thing as a small distribution range. Another issue is that IUCN developed the Green Status of species. The Green Status shows that even a species which is LC on the Red List may be significantly depleted.

      The species traits in Table 1 are mostly functional/morphological and body size related (and microhabitat). While there may be correlations between traits and Red List status, it is unknown whether this is correlation or causation. In addition, it is difficult to know the conservation interventions that may be necessary now that we know that relative head width and Red List status are correlated.

      In the discussion, the authors explain why body size and other traits may affect extinction risk and whether there is a causal relationship. I agree that body size may have a direct effect because larger species are harvested more frequently (it was interesting to learn that tadpoles are harvested as well). However, as macroecological studies show, smaller species often have larger populations than larger species. Abundance may matter.

      I found it much harder to understand why relative head length and tympanum size correlated with Red List status. I wasn't convinced by the arguments in the discussion. Tympanum size may be related to hearing and anthropogenic noise. Several studies are cited which show that frogs alter their calling behaviour in response to noise. Crucially, however, they describe changes in behaviour or properties of the advertisement call, yet none show that noise has effects on population viability. If some anthropogenic stressor affects individuals, then this does not mean that it will cause a population decline. When IUCN published the second global amphibian assessment, did they list noise as a major threat to amphibians?

      There are statements that the tadpole stage is the most important stage: "a critical period for amphibian survival" (lines 78-79). While there is high mortality in the tadpole stage, tadpole survival is rather unlikely to affect population survival. Many population models show this. See, for example, Biek et al. 2002 in Conservation Biology. Other papers have argued that the postmetamorphic juvenile stage is most important (Petrovan and Schmidt 2009 Biological Conservation).

      The authors repeatedly make the statement that amphibian conservation should focus more on the tadpole stage. I don't understand why this statement is made. For example, a major activity in amphibian conservation is the restoration and de novo construction of ponds (see Calhoun et al. 2014 PNAS, Moor et al. 2022 PNAS). Ponds are habitats for tadpoles. Others removed fish from amphibian breeding sites because fish prey on tadpoles (and adults; see Vredenburg 2004 PNAS). Semlitsch (2002 in Conservation Biology) argued that the management of pond hydroperiod is a critical element of amphibian recovery plans. Ponds should be temporary because this effectively removes predators that consume tadpoles. Clearly, the tadpole stage is not a neglected stage in amphibian conservation.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors tried to examine whether there are differences in the association between functional traits and extinction risk in adult and tadpole stages in Chinese anurans.

      Strengths:

      Overall, I think the basic idea of the study is interesting and important. It can be applied to other taxa with complex life cycles throughout the animal kingdom.

      Weaknesses:

      I do not think the authors achieve their aims, as the results only partially support their conclusions. The study has several drawbacks that need to be clarified or revised, including the unclear threat categories for tadpoles, model selection and model averaging, the potential problem of AIC, and the omission of other important species traits.